Episode 24: How to handle imbalanced datasets

from Data Science at Home · host Francesco Gadaleta <frag>

In machine learning, and in data science in general, it is very common to run into imbalanced datasets and skewed class distributions. This is the typical case where the number of observations belonging to one class is significantly lower than the number belonging to the other classes. It happens all the time and in several domains, from finance to healthcare to social media, just to name a few I have personally worked with. Think about a bank detecting fraudulent transactions among millions or billions of daily operations, or, similarly, the identification of rare disorders in healthcare. In genetics, and also in clinical lab tests, this is the normal scenario: fortunately, very few patients are affected by a disorder, so there are very few cases with respect to the large pool of healthy (unaffected) patients. No algorithm takes the class distribution, or the number of observations in each class, into account unless it is explicitly designed to handle such situations. In this episode I explain how to deal with these common and challenging scenarios, covering some effective techniques for handling imbalanced datasets and advising on the most appropriate method for a given dataset or problem. This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit datascienceathome.substack.com
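
To make the fraud-detection scenario above concrete, here is a minimal sketch in plain Python. The 990-to-10 class split, the function names, and the choice of inverse-frequency class weighting are all assumptions made for illustration; the description does not say which techniques the episode recommends. The sketch shows why plain accuracy is misleading on an imbalanced dataset, and one common way to tell a learner about the class distribution.

```python
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positives that the predictions recover."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical dataset: 990 legitimate transactions (0), 10 fraudulent (1).
labels = [0] * 990 + [1] * 10

# A "classifier" that ignores the input and always predicts the majority class.
always_majority = [0] * len(labels)

baseline_accuracy = accuracy(labels, always_majority)  # looks excellent
baseline_recall = recall(labels, always_majority)      # finds no fraud at all
weights = class_weights(labels)                        # rare class weighs more

print(baseline_accuracy, baseline_recall, weights)
```

The majority-class baseline scores 99% accuracy while catching none of the fraudulent cases, which is why recall (or a similar per-class metric) is the more informative number here. The weights could be passed to any learner that accepts per-class or per-sample weights, so that mistakes on the rare class cost more during training.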

Frequently Asked Questions

How long is this episode of Data Science at Home?

This episode is 21 minutes long.

When was this Data Science at Home episode published?

This episode was published on October 9, 2017.
