PODCAST · technology

Your Data Teacher Podcast

A podcast about data science, machine learning, artificial intelligence, statistics and everything related to data. Home Page: https://www.yourdatateacher.com

  1. 7

    Episode 7 - A Python library to remove collinearity

    Collinearity is a serious problem in machine learning. It increases the dimensionality of a dataset without increasing the amount of information it carries. That's why I've created a Python library that can be used to remove collinearity from a dataset. I talk about this library in this episode. Article: https://www.yourdatateacher.com/2021/06/28/a-python-library-to-remove-collinearity/ PyPI package: https://pypi.org/project/collinearity/ GitHub repo: https://github.com/gianlucamalato/collinearity
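As a rough illustration of the idea, and not the API or algorithm of the `collinearity` package, here is a minimal sketch that greedily keeps a feature only if its absolute Pearson correlation with every feature already kept stays below a threshold. The function names, the 0.9 threshold and the toy columns are all assumptions made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / (spread_x * spread_y)

def select_non_collinear(columns, threshold=0.9):
    """Hypothetical helper: keep a column only if it is not strongly
    correlated with any column kept so far (greedy, order dependent)."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy data: "b" is an exact multiple of "a", "c" is only weakly related.
columns = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 1, 4, 2, 3],
}
selected = select_non_collinear(columns, threshold=0.9)
print(selected)
```

With this toy data the redundant column `b` is dropped because it carries the same information as `a`. The result of a greedy pass depends on the order of the columns, which is one of the design choices a real library has to make.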

  2. 6

    Episode 6 - Checking the distribution of your data using Q-Q plot

    In this episode, I'm talking about the Q-Q plot and how to use it to check whether our dataset follows a particular distribution. Instead of relying on more complex hypothesis tests such as the Kolmogorov-Smirnov test, this simple plot lets us check whether a dataset follows a particular distribution, or whether two datasets have been generated from the same distribution. Link to the article: https://www.yourdatateacher.com/2021/06/16/how-to-use-q-q-plot-for-checking-the-distribution-of-our-data/
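To make the idea concrete, here is a minimal sketch of how the points of a normal Q-Q plot could be computed with the Python standard library alone. The plotting positions `(i - 0.5) / n` are one common convention and are an assumption here, as is the helper name `normal_qq_points`; the episode and article may use a different approach.

```python
from statistics import NormalDist

def normal_qq_points(sample):
    """Hypothetical helper: pair each sorted sample value with the
    standard normal quantile at the same plotting position."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

points = normal_qq_points([3, 1, 2])
for theoretical_q, sample_q in points:
    print(f"{theoretical_q:+.3f}  {sample_q}")
```

If the sample comes from a normal distribution, plotting these pairs should give points that lie close to a straight line; strong curvature suggests a different distribution.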

  3. 5

    Episode 5 - Tuning the threshold in binary classification tasks

    In this episode, I'll talk about tuning the threshold in binary classification tasks. The usual value for the threshold is 0.5, but it's useful to optimize it so that the model fits our needs. I talk about optimizing according to the ROC curve and maximizing the balanced accuracy. Link to the article: https://www.yourdatateacher.com/2021/06/14/are-you-still-using-0-5-as-a-threshold/
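A minimal sketch of threshold tuning, assuming we already have true labels and predicted scores for a validation set: every distinct score is tried as a threshold and the one with the highest balanced accuracy is kept. The function names and the toy data are assumptions for illustration, not code from the article.

```python
def balanced_accuracy(y_true, y_score, threshold):
    """Mean of the true positive rate and the true negative rate when
    scores at or above the threshold are predicted as class 1."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0
    tnr = tn / (tn + fp) if tn + fp else 0.0
    return (tpr + tnr) / 2

def best_threshold(y_true, y_score):
    """Hypothetical helper: try each distinct score as a threshold and
    return the one that maximizes balanced accuracy."""
    candidates = sorted(set(y_score))
    return max(candidates, key=lambda t: balanced_accuracy(y_true, y_score, t))

# Toy, imbalanced validation set (4 negatives, 2 positives).
y_true = [0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.35, 0.4, 0.8]

default_score = balanced_accuracy(y_true, y_score, 0.5)
tuned = best_threshold(y_true, y_score)
print(default_score, tuned, balanced_accuracy(y_true, y_score, tuned))
```

Since balanced accuracy equals (TPR + 1 - FPR) / 2, maximizing it is the same as picking the point on the ROC curve with the largest TPR - FPR. In practice the threshold should be chosen on held-out data rather than on the training set.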

  4. 4

    Episode 4 - Ensemble models. Bagging and boosting

    In this episode, I'm going to talk about ensemble models, particularly bagging and boosting. Bagging is very useful for reducing variance, while boosting is used for reducing bias. The most common bagging algorithm is Random Forest; the most common boosting algorithm is Gradient Boosting, whose most common implementations are XGBoost, LightGBM and CatBoost. Home Page: https://www.yourdatateacher.com

  5. 3

    Episode 3 - Precision, recall, accuracy. How to choose?

    In this episode, I talk about accuracy, precision and recall. We're going to focus on what they are and when to use them in machine learning projects. Link to the article: https://www.yourdatateacher.com/2021/06/07/precision-recall-accuracy-how-to-choose/
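As a small illustration of how the three metrics differ, here is a sketch that computes them from the confusion-matrix counts of a binary problem. The helper name and the toy labels are assumptions made for this example.

```python
def classification_metrics(y_true, y_pred):
    """Hypothetical helper: accuracy, precision and recall for binary
    labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # how many predicted positives are right
    recall = tp / (tp + fn) if tp + fn else 0.0     # how many real positives are found
    return accuracy, precision, recall

# Toy, imbalanced example: 3 positives out of 10 records.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

accuracy, precision, recall = classification_metrics(y_true, y_pred)
print(accuracy, precision, recall)
```

Here the accuracy looks higher than precision and recall because the many correctly predicted negatives dominate it, which is why accuracy alone can be misleading on imbalanced data.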

  6. 2

    Episode 2 - How to explain neural networks using SHAP

    Today we're going to talk about how we can explain neural networks. Neural networks are like black boxes that hide the way they model and represent data. That's why explaining them is very difficult. A very powerful approach is called SHAP. Using this method, we can calculate the impact of a feature according to a given model independently of the type of model we're using. It's very useful for black boxes like neural networks. Home page: https://www.yourdatateacher.com Link to the article: https://www.yourdatateacher.com/2021/05/17/how-to-explain-neural-networks-using-shap/

  7. 1

    Episode 1 - How accurate is your accuracy?

    Today we're going to talk about the standard error on proportions. In data science, it's very important to compute the standard error of every estimate, both to see whether finite-size effects are lowering the precision too much and to compare two different measurement results with each other. Home page: https://www.yourdatateacher.com Link to the article: https://www.yourdatateacher.com/2021/05/31/how-accurate-is-your-accuracy/
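For a proportion such as an accuracy measured on n records, the usual binomial estimate of the standard error is sqrt(p(1 - p) / n). The sketch below applies that formula and compares two proportions with a simple z-score, assuming the two samples are independent; the function names and the numbers are illustrative assumptions.

```python
import math

def proportion_standard_error(p_hat, n):
    """Standard error of a proportion p_hat estimated from n records."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must be between 0 and 1")
    return math.sqrt(p_hat * (1.0 - p_hat) / n)

def z_score_difference(p1, n1, p2, n2):
    """Hypothetical helper: difference of two independent proportions
    measured in units of its combined standard error."""
    se1 = proportion_standard_error(p1, n1)
    se2 = proportion_standard_error(p2, n2)
    return (p1 - p2) / math.sqrt(se1 ** 2 + se2 ** 2)

# An accuracy of 0.5 on 100 records has a standard error of 0.05.
se = proportion_standard_error(0.5, 100)

# Two models with accuracies 0.9 and 0.8, each measured on 100 records.
z = z_score_difference(0.9, 100, 0.8, 100)
print(se, z)
```

A z-score of about 2 means the two accuracies differ by roughly two combined standard errors. With a smaller test set the same difference would be much less convincing, which is the finite-size effect mentioned above.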

ABOUT THIS SHOW

A podcast about data science, machine learning, artificial intelligence, statistics and everything related to data. Home Page: https://www.yourdatateacher.com

HOSTED BY

Your Data Teacher

Frequently Asked Questions

How many episodes does Your Data Teacher Podcast have?

Your Data Teacher Podcast currently has 7 episodes available on PodParley. New episodes are automatically indexed when they're published to the podcast feed.

What is Your Data Teacher Podcast about?

A podcast about data science, machine learning, artificial intelligence, statistics and everything related to data. Home Page: https://www.yourdatateacher.com

How often does Your Data Teacher Podcast release new episodes?

Your Data Teacher Podcast has 7 episodes. Check the episode list to see recent publication dates and frequency.

Where can I listen to Your Data Teacher Podcast?

You can listen to Your Data Teacher Podcast on PodParley by clicking any episode. We provide an embedded audio player for direct listening, and you can also subscribe via your preferred podcast app using the RSS feed.

Who hosts Your Data Teacher Podcast?

Your Data Teacher Podcast is created and hosted by Your Data Teacher.