All Episodes

Data Skeptic — 595 episodes

#
Title
1

Student Spotlight: Aaron Payne, Data Analyst

2

The Future is Agentic in Recommender Systems

3

Book Ratings and Recommendations

4

Disentanglement and Interpretability in Recommender Systems

5

Collective Altruism in Recommender Systems

6

Niche vs Mainstream

7

Healthy Friction in Job Recommender Systems

8

Fairness in PCA-Based Recommenders

9

Video Recommendations in Industry

10

Eye Tracking in Recommender Systems

11

Cracking the Cold Start Problem

12

Designing Recommender Systems for Digital Humanities

13

DataRec Library for Reproducible in Recommend Systems

14

Shilling Attacks on Recommender Systems

15

Music Playlist Recommendations

16

Bypassing the Popularity Bias

17

Sustainable Recommender Systems for Tourism

18

Interpretable Real Estate Recommendations

19

Why Am I Seeing This?

20

Eco-aware GNN Recommenders

21

Networks and Recommender Systems

22

Network of Past Guests Collaborations

23

The Network Diversion Problem

24

Complex Dynamic in Networks

25

Github Network Analysis

26

Networks and Complexity

27

Graphs for Causal AI

28

Power Networks

29

Unveiling Graph Datasets

30

Network Manipulation

31

The Small World Hypothesis

32

Thinking in Networks

33

Fraud Networks

34

Criminal Networks

35

Graph Bugs

36

Organizational Network Analysis

37

Organizational Networks

38

Networks of the Mind

39

LLMs and Graphs Synergy

40

A Network of Networks

41

Auditing LLMs and Twitter

42

Fraud Detection with Graphs

43

Optimizing Supply Chains with GNN

44

The Mystery Behind Large Graphs

45

Customizing a Graph Solution

46

Graph Transformations

47

Networks for AB Testing

48

Lessons from eGamer Networks

49

Github Collaboration Network

50

Graphs and ML for Robotics

51

Graphs for HPC and LLMs

52

Graph Databases and AI

53

Network Analysis in Practice

54

Animal Intelligence Final Exam

55

Process Mining with LLMs

56

Open Animal Tracks

57

Bird Distribution Modeling with Satbird

58

Ant Encounters

59

Computing Toolbox

60

Biodiversity Monitoring

61

Hacking the Colony

62

Primate Poses

63

Generating 3D Animals with YouDream

64

Weird Communication

65

Reducing the Impact of Ship Noise on Marine Mammals

66

Analysis of Unstructured Data

67

iNaturalist

68

Learn to Code

69

Animal Computer Interaction

70

Ape Gestures

71

Evaluating AI Abilities

72

HMMs for Behavior

73

Bioinspired Engineering

74

Modelling Evolution

75

Behavioral Genetics

76

Signal in the Noise

77

Pose Tracking

78

Modeling Group Behavior

79

Advances in Data Loggers

80

What You Know About Intelligence is Wrong (fixed)

81

Animal Decision Making

82

Octopus Cognition

83

Optimal Foraging

84

Memory in Chess

85

OpenWorm

86

What the Antlion Knows

87

AI Roundtable

88

Uncontrollable AI Risks

89

I LLM and You Can Too

90

Q&A with Kyle

91

LLMs for Data Analysis

92

AI Platforms

93

Deploying LLMs

94

A Survey Assessing Github Copilot

95

Program Aided Language Models

96

Which Programming Language is ChatGPT Best At

97

GraphText

98

arXiv Publication Patterns

99

Do LLMs Make Ethical Choices

100

Emergent Deception in LLMs

101

Agents with Theory of Mind Play Hanabi

102

LLMs for Evil

103

The Defeat of the Winograd Schema Challenge

104

LLMs in Social Science

105

LLMs in Music Composition

106

Cuttlefish Model Tuning

107

Which Professions Are Threatened by LLMs

108

Why Prompting is Hard

109

Automated Peer Review

110

Prompt Refusal

111

A Long Way Till AGI

112

Brain Inspired AI

113

Computable AGI

114

AGI Can Be Safe

115

AI Fails on Theory of Mind Tasks

116

AI for Mathematics Education

117

Evaluating Jokes with LLMs

118

Why Machines Will Never Rule the World

119

A Psychopathological Approach to Safety in AGI

120

The NLP Community Metasurvey

121

Skeptical Survey Interpretation

122

The Gallup Poll

123

Inclusive Study Group Formation at Scale

124

The PhilPapers Survey

125

Non-Response Bias

126

Measuring Trust in Robots with Likert Scales

127

CAREER Prediction

128

The Panel Study of Income Dynamics

129

Survey Design Working Session

130

Bot Detection and Dyadic Surveys

131

Reproducible ESP Testing

132

A Survey of Data Science Methodologies

133

Opinion Dynamics Models

134

Casual Affective Triggers

135

Conversational Surveys

136

Do Results Generalize for Privacy and Security Surveys

137

4 out of 5 Data Scientists Agree

138

Crowdfunded Board Games

139

Russian Election Interference Effectiveness

140

Placement Laundering Fraud

141

Data Clean Rooms

142

Dark Patterns in Site Design

143

Internet Advertising Bureau Media Lab

144

Your Mouse Reveals Your Gender and Age

145

Measuring Web Search Behavior

146

StrategyQA and Big Bench

147

Ad Blockers Effect on News Consumption

148

Your Consent is Worth 75 Euros a Year

149

Automated Email Generation for Targeted Attacks

150

Tribal Marketing

151

Nano-targeted Facebook Ads

152

Debiasing GPT-3 Job Ads

153

ML Ops in Production

154

Ad Network Tomography

155

First Party Tracking Cookies

156

The Harms of Targeted Weight Loss Ads

157

Podcast Advertising

158

Fairness in e-Commerce Search

159

Fraudulent Amazon Reviewers

160

Ad Targeting in Amazon Smart Speakers

161

Adwords with Unknown Budgets

162

ML Ops Best Practices

163

Affiliate Marketing Rabbithole

164

Monetization of Youtube Conspiracy Theorists

165

User Perceptions of Problematic Ads

166

Political Digital Advertising Analysis

167

Fraud Detection in Crowdfunding Campaigns

168

Artificial Intelligence and Auction Design

169

Privacy Preference Signals

170

Neural Architecture Search for CTR Prediction

171

Algorithmic PPC Management

172

Data Skeptic: Ad Tech

173

The Reliability of Mobile Phone Data

174

Haywire Algorithms

175

School Reopening Analysis

176

Modern Data Stacks

177

Emoji as a Predictor

178

Polarizing Trends in the Gig Economy

179

Remote Learning in Applied Engineering

180

Remote Productivity

181

Does Remote Learning Work?

182

Covid-19 Impact on Bicycle Usage

183

Learning Digital Fabrication Remotely

184

Remote Software Development

185

Quantum K-Means

186

K-Means in Practice

187

Fair Hierarchical Clustering

188

Matrix Factorization For k-Means

189

Breathing K-Means

190

Power K-Means

191

Explainable K-Means

192

Customer Clustering

193

k-means Image Segmentation

194

Tracking Elephant Clusters

195

k-means clustering

196

Snowflake Essentials

197

Explainable Climate Science

198

Energy Forecasting Pipelines

199

Matrix Profiles in Stumpy

200

The Great Australian Prediction Project

201

Water Demand Forecasting

202

Open Telemetry

203

Fashion Predictions

204

Time Series Mini Episodes

205

Forecasting Motor Vehicle Collision

206

Deep Learning for Road Traffic Forecasting

207

Bike Share Demand Forecasting

208

Forecasting in Supply Chain

209

Black Friday

210

Aligning Time Series on Incomparable Spaces

211

Comparing Time Series with HCTSA

212

Change Point Detection Algorithms

213

Time Series for Good

214

Long Term Time Series Forecasting

215

Fast and Frugal Time Series Forecasting

216

Causal Inference in Educational Systems

217

Boosted Embeddings for Time Series

218

Change Point Detection in Continuous Integration Systems

219

Applying k-Nearest Neighbors to Time Series

220

Ultra Long Time Series

221

MiniRocket

222

ARiMA is not Sufficient

223

Comp Engine

224

Detecting Ransomware

225

GANs in Finance

226

Predicting Urban Land Use

227

Opportunities for Skillful Weather Prediction

228

Predicting Stock Prices

229

N-Beats

230

Translation Automation

231

Time Series at the Beach

232

Automatic Identification of Outlier Galaxy Images

233

Do We Need Deep Learning in Time Series

234

Detecting Drift

235

Darts Library for Time Series

236

Forecasting Principles and Practice

237

Prerequisites for Time Series

238

Orders of Magnitude

239

They're Coming for Our Jobs

240

Pandemic Machine Learning Pitfalls

241

Flesch Kincaid Readability Tests

242

Fairness Aware Outlier Detection

243

Life May be Rare

244

Social Networks

245

The QAnon Conspiracy

246

Benchmarking Vision on Edge vs Cloud

247

Goodhart's Law in Reinforcement Learning

248

Video Anomaly Detection

249

Fault Tolerant Distributed Gradient Descent

250

Decentralized Information Gathering

251

Leaderless Consensus

252

Automatic Summarization

253

Gerrymandering

254

Even Cooperative Chess is Hard

255

Consecutive Votes in Paxos

256

Visual Illusions Deceiving Neural Networks

257

Earthquake Detection with Crowd-sourced Data

258

Byzantine Fault Tolerant Consensus

259

Alpha Fold

260

Arrow's Impossibility Theorem

261

Face Mask Sentiment Analysis

262

Counting Briberies in Elections

263

Sybil Attacks on Federated Learning

264

Differential Privacy at the US Census

265

Distributed Consensus

266

ACID Compliance

267

National Popular Vote Interstate Compact

268

Defending the p-value

269

Retraction Watch

270

Crowdsourced Expertise

271

The Spread of Misinformation Online

272

Consensus Voting

273

Voting Mechanisms

274

False Consensus

275

Fraud Detection in Real Time

276

Listener Survey Review

277

Human Computer Interaction and Online Privacy

278

Authorship Attribution of Lennon McCartney Songs

279

GANs Can Be Interpretable

280

Sentiment Preserving Fake Reviews

281

Interpretability Practitioners

282

Facial Recognition Auditing

283

Robust Fit to Nature

284

Black Boxes Are Not Required

285

Robustness to Unforeseen Adversarial Attacks

286

Estimating the Size of Language Acquisition

287

Interpretable AI in Healthcare

288

Understanding Neural Networks

289

Self-Explaining AI

290

Plastic Bag Bans

291

Self Driving Cars and Pedestrians

292

Computer Vision is Not Perfect

293

Uncertainty Representations

294

AlphaGo, COVID-19 Contact Tracing and New Data Set

295

Visualizing Uncertainty

296

Interpretability Tooling

297

Shapley Values

298

Anchors as Explanations

299

Mathematical Models of Ecological Systems

300

Adversarial Explanations

301

ObjectNet

302

Visualization and Interpretability

303

Interpretable One Shot Learning

304

Fooling Computer Vision

305

Algorithmic Fairness

306

Interpretability

307

NLP in 2019

308

The Limits of NLP

309

Jumpstart Your ML Project

310

Serverless NLP Model Training

311

Team Data Science Process

312

Ancient Text Restoration

313

ML Ops

314

Annotator Bias

315

NLP for Developers

316

Indigenous American Language Research

317

Talking to GPT-2

318

Reproducing Deep Learning Models

319

What BERT is Not

320

SpanBERT

321

BERT is Shallow

322

BERT is Magic

323

Applied Data Science in Industry

324

Building the howto100m Video Corpus

325

BERT

326

Onnx

327

Catastrophic Forgetting

328

Transfer Learning

329

Facebook Bargaining Bots Invented a Language

330

Under Resourced Languages

331

Named Entity Recognition

332

The Death of a Language

333

Neural Turing Machines

334

Data Infrastructure in the Cloud

335

NCAA Predictions on Spark

336

The Transformer

337

Mapping Dialects with Twitter Data

338

Sentiment Analysis

339

Attention Primer

340

Cross-lingual Short-text Matching

341

ELMo

342

BLEU

343

Simultaneous Translation at Baidu

344

Human vs Machine Transcription

345

seq2seq

346

Text Mining in R

347

Recurrent Relational Networks

348

Text World and Word Embedding Lower Bounds

349

word2vec

350

Authorship Attribution

351

Very Large Corpora and Zipf's Law

352

Semantic search at Github

353

Let's Talk About Natural Language Processing

354

Data Science Hiring Processes

355

Holiday Reading - Epicac

356

Drug Discovery with Machine Learning

357

Sign Language Recognition

358

Data Ethics

359

Escaping the Rabbit Hole

360

[MINI] Theorem Provers

361

Automated Fact Checking

362

[MINI] Single Source of Truth

363

Detecting Fast Radio Bursts with Deep Learning

364

Being Bayesian

365

Modeling Fake News

366

The Louvain Method for Community Detection

367

Cultural Cognition of Scientific Consensus

368

False Discovery Rates

369

Deep Fakes

370

Fake News Midterm

371

Quality Score

372

The Knowledge Illusion

373

Click Through Rates

374

Algorithmic Detection of Fake News

375

Ant Intelligence

376

Human Detection of Fake News

377

Spam Filtering with Naive Bayes

378

The Spread of Fake News

379

Fake News

380

Dev Ops for Data Science

381

First Order Logic

382

Blind Spots in Reinforcement Learning

383

Defending Against Adversarial Attacks

384

Transfer Learning

385

Medical Imaging Training Techniques

386

Kalman Filters

387

AI in Industry

388

AI in Games

389

Game Theory

390

The Experimental Design of Paranormal Claims

391

Winograd Schema Challenge

392

The Imitation Game

393

Eugene Goostman

394

The Theory of Formal Languages

395

The Loebner Prize

396

Chatbots

397

The Master Algorithm

398

The No Free Lunch Theorems

399

ML at Sloan Kettering Cancer Center

400

Optimal Decision Making with POMDPs

401

AI Decision-Making

402

[MINI] Reinforcement Learning

403

Evolutionary Computation

404

[MINI] Markov Decision Processes

405

Neuroscience Frontiers

406

Neuroimaging and Big Data

407

The Agent Model of Artificial Intelligence

408

Artificial Intelligence, a Podcast Approach

409

Holiday reading 2017

410

Complexity and Cryptography

411

Mercedes Benz Machine Learning Research

412

[MINI] Parallel Algorithms

413

Quantum Computing

414

Azure Databricks

415

[MINI] Exponential Time Algorithms

416

P vs NP

417

[MINI] Sudoku ∈ NP

418

The Computational Complexity of Machine Learning

419

[MINI] Turing Machines

420

The Complexity of Learning Neural Networks

421

[MINI] Big Oh Analysis

422

Data science tools and other announcements from Ignite

423

Generative AI for Content Creation

424

[MINI] One Shot Learning

425

Recommender Systems Live from FARCON 2017

426

[MINI] Long Short Term Memory

427

Zillow Zestimate

428

Cardiologist Level Arrhythmia Detection with CNNs

429

[MINI] Recurrent Neural Networks

430

Project Common Voice

431

[MINI] Bayesian Belief Networks

432

pix2code

433

[MINI] Conditional Independence

434

Estimating Sheep Pain with Facial Recognition

435

CosmosDB

436

[MINI] The Vanishing Gradient

437

Doctor AI

438

[MINI] Activation Functions

439

MS Build 2017

440

[MINI] Max-pooling

441

Unsupervised Depth Perception

442

[MINI] Convolutional Neural Networks

443

Multi-Agent Diverse Generative Adversarial Networks

444

[MINI] Generative Adversarial Networks

445

Opinion Polls for Presidential Elections

446

OpenHouse

447

[MINI] GPU CPU

448

[MINI] Backpropagation

449

Data Science at Patreon

450

[MINI] Feed Forward Neural Networks

451

Reinventing Sponsored Search Auctions

452

[MINI] The Perceptron

453

The Data Refuge Project

454

[MINI] Automated Feature Engineering

455

Big Data Tools and Trends

456

[MINI] Primer on Deep Learning

457

Data Provenance and Reproducibility with Pachyderm

458

[MINI] Logistic Regression on Audio Data

459

Studying Competition and Gender Through Chess

460

[MINI] Dropout

461

The Police Data and the Data Driven Justice Initiatives

462

The Library Problem

463

2016 Holiday Special

464

[MINI] Entropy

465

MS Connect Conference

466

Causal Impact

467

[MINI] The Bootstrap

468

[MINI] Gini Coefficients

469

Unstructured Data for Finance

470

[MINI] AdaBoost

471

Stealing Models from the Cloud

472

[MINI] Calculating Feature Importance

473

NYC Bike Share Rebalancing

474

[MINI] Random Forest

475

Election Predictions

476

[MINI] F1 Score

477

Urban Congestion

478

[MINI] Heteroskedasticity

479

Music21

480

[MINI] Paxos

481

Trusting Machine Learning Models with LIME

482

[MINI] ANOVA

483

Machine Learning on Images with Noisy Human-centric Labels

484

[MINI] Survival Analysis

485

Predictive Models on Random Data

486

[MINI] Receiver Operating Characteristic (ROC) Curve

487

Multiple Comparisons and Conversion Optimization

488

[MINI] Leakage

489

Predictive Policing

490

[MINI] The CAP Theorem

491

Detecting Terrorists with Facial Recognition?

492

[MINI] Goodhart's Law

493

Data Science at eHarmony

494

[MINI] Stationarity and Differencing

495

Feather

496

[MINI] Bargaining

497

deepjazz

498

[MINI] Auto-correlative functions and correlograms

499

Early Identification of Violent Criminal Gang Members

500

[MINI] Fractional Factorial Design

501

Machine Learning Done Wrong

502

Potholes

503

[MINI] The Elbow Method

504

Too Good to be True

505

[MINI] R-squared

506

Models of Mental Simulation

507

[MINI] Multiple Regression

508

Scientific Studies of People's Relationship to Music

509

[MINI] k-d trees

510

Auditing Algorithms

511

[MINI] The Bonferroni Correction

512

[MINI] Gradient Descent

513

Let's Kill the Word Cloud

514

2015 Holiday Special

515

Wikipedia Revision Scoring as a Service

516

[MINI] Term Frequency - Inverse Document Frequency

517

The Hunt for Vulcan

518

[MINI] The Accuracy Paradox

519

Neuroscience from a Data Scientist's Perspective

520

[MINI] Bias Variance Tradeoff

521

Big Data Doesn't Exist

522

[MINI] Covariance and Correlation

523

Bayesian A/B Testing

524

[MINI] The Central Limit Theorem

525

Accessible Technology

526

[MINI] Multi-armed Bandit Problems

527

[MINI] Structured and Unstructured Data

528

Measuring the Influence of Fashion Designers

529

[MINI] PageRank

530

Data Science at Work in LA County

531

[MINI] k-Nearest Neighbors

532

Crypto

533

[MINI] MapReduce

534

Genetically Engineered Food and Trends in Herbicide Usage

535

[MINI] The Curse of Dimensionality

536

Video Game Analytics

537

[MINI] Anscombe's Quartet

538

Proposing Annoyance Mining

539

Preserving History at Cyark

540

[MINI] A Critical Examination of a Study of Marriage by Political Affiliation

541

Detecting Cheating in Chess

542

[MINI] z-scores

543

Using Data to Help Those in Crisis

544

The Ghost in the MP3

545

Data Fest 2015

546

[MINI] Cornbread and Overdispersion

547

[MINI] Natural Language Processing

548

Computer-based Personality Judgments

549

[MINI] Markov Chain Monte Carlo

550

[MINI] Markov Chains

551

Oceanography and Data Science

552

[MINI] Ordinary Least Squares Regression

553

NYC Speed Camera Analysis with Tim Schmeier

554

[MINI] k-means clustering

555

Shadow Profiles on Social Networks

556

[MINI] The Chi-Squared Test

557

Mapping Reddit Topics with Randy Olson

558

[MINI] Partially Observable State Spaces

559

Easily Fooling Deep Neural Networks

560

[MINI] Data Provenance

561

Doubtful News, Geology, Investigating Paranormal Groups, and Thinking Scientifically with Sharon Hill

562

[MINI] Belief in Santa

563

Economic Modeling and Prediction, Charitable Giving, and a Follow Up with Peter Backus

564

[MINI] The Battle of the Sexes

565

The Science of Online Data at Plenty of Fish with Thomas Levi

566

[MINI] The Girlfriend Equation

567

The Secret and the Global Consciousness Project with Alex Boklin

568

[MINI] Monkeys on Typewriters

569

Mining the Social Web with Matthew Russell

570

[MINI] Is the Internet Secure?

571

Practicing and Communicating Data Science with Jeff Stanton

572

[MINI] The T-Test

573

Data Myths with Karl Mamer

574

Contest Announcement

575

[MINI] Selection Bias

576

[MINI] Confidence Intervals

577

[MINI] Value of Information

578

Game Science Dice with Louis Zocchi

579

Data Science at ZestFinance with Marick Sinay

580

[MINI] Decision Tree Learning

581

Jackson Pollock Authentication Analysis with Kate Jones-Smith

582

[MINI] Noise!!

583

Guerilla Skepticism on Wikipedia with Susan Gerbic

584

[MINI] Ant Colony Optimization

585

Data in Healthcare IT with Shahid Shah

586

[MINI] Cross Validation

587

Streetlight Outage and Crime Rate Analysis with Zach Seeskin

588

[MINI] Experimental Design

589

The Right (big data) Tool for the Job with Jay Shankar

590

[MINI] Bayesian Updating

591

Personalized Medicine with Niki Athanasiadou

592

[MINI] p-values

593

Advertising Attribution with Nathan Janos

594

[MINI] Type I / Type II Errors

595

Introduction