
EPISODE · Apr 29, 2026 · 1H 28M

Predicting the Future (risk stratification) with Shay Sayed

from Ops I did it again by Out of Pocket · host Shay Sayed, Alex Dou

What we cover

Risk stratification is ranking patients by probability of an adverse outcome. Traditional indices like the Charlson Comorbidity Index use clinician-designed scoring systems; ML-based approaches automate feature generation and let the model surface correlations that a heuristic would miss. The tradeoff is interpretability: with tens of thousands of computations per prediction, explaining to a clinician why the model ranked Patient A ahead of Patient B is... surprisingly hard to do. It's kind of like asking a human being "why did you do this thing X weeks ago?" Memory fades, and we humans have this "delightful" post-hoc rationalization feature.

Actually getting the data into good shape is harder than training the model. Schema differences between organizations are structural: different table names, different column types, different ways of representing the same event. ML tolerates directional imperfection in a way that population analytics does not, but the cleanup is still slow and dependent on tribal knowledge that data owners often can't fully explain.

"Feature engineering" in this context means building hypotheses the model can test. An example we discussed was “if I’m trying to risk stratify kidney stones, what hypotheses might I try? Maybe soda intake. Maybe dehydration. Maybe SDOH." Those three things are all “features” in this context. The platform ClosedLoop built could generate complex clinical features in about ten minutes, which was most of the competitive advantage.

Failure modes tend to be operational rather than about the accuracy of the algorithm. Buyers without a clear care management strategy can't actually impact the patients on the list. ROI attribution takes years, by which point patients may have reverted to the mean. And without tracking what the clinical program is actually doing, you can't separate a model problem from a workflow problem.

ETHOS is Epic's transformer trained on serialized clinical event histories from 300 million patients.
The way I think about this: if LLMs “predict the next word most likely to occur”, then ostensibly you could take a training set of healthcare events and “predict the next {event} most likely to occur”, where {event} is, say, a NICU stay.

Brought to you by Toboggan Labs: a consultancy for healthcare builders. If you have a health product that needs engineers, product people, or experienced operators to help you build or fix something, go talk to them at https://bit.ly/oop-readmission

For inquiries about sponsoring the podcast, email [email protected]

Find Shay: https://www.linkedin.com/in/shaayaan-sayed-8097b1100/

Timestamps

[02:07] Shay's background: training models from scratch at ClosedLoop.
[04:22] How Shay got into ML in high school by cold-emailing every professor in Houston. By contrast, Alex really got into Dynasty Warriors in high school.
[10:43] The CMS (Centers for Medicare & Medicaid Services) AI Health Outcomes Challenge. ClosedLoop won $1 million against some big names: Mayo Clinic, Geisinger, and Mathematica. The two components: predictive performance across 13 to 15 adverse outcomes, and interpretability for clinical teams.
[16:00] A layperson’s definition of risk stratification: a ranked patient list by probability of an adverse outcome. The Charlson Comorbidity Index as a standard example, and why ML outperforms it once you need more than one outcome.
[29:27] The data layer you need: claims, EHR (Electronic Health Record) dumps, SDOH (Social Determinants of Health) feeds, and ADT (Admission, Discharge, Transfer) data. This is hard because everybody has a different schema: payer one's data looks nothing like payer two's, and the data “owner” often can't explain their own tables.
[41:50] Feature engineering: building hypotheses the model can test. The difference between "feature" as a PM uses the word and "feature" as a data scientist uses it.
[47:52] Interpretability: being able to tell a human being why a patient ranked where they did.
Two structural issues: incomplete data and unknown causal frameworks.
[54:14] Failure modes: buyers without a care management strategy. Reversion to the mean within two years, so you don’t know whether you made a difference. Not knowing where to cut the list (Patient number 50 vs 51?). And a related issue: missing data on what the clinical program is actually doing, which makes it impossible to separate a bad model from a bad workflow.
[01:09:39] Whether anyone should still learn traditional ML, or just LLMs. Shay's answer: gradient boosted trees and transformers are on a spectrum, so it’s kind of a false dichotomy. Then: the ETHOS paper from Epic, a transformer trained on 300 million patient records that enables one model for many outcomes and counterfactual inference. And what Shay is watching next: robotics and the last-mile problem. AI can identify a list of people with fall risk, but something or someone still has to act on it.
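
For readers who want the "ranked patient list" definition made concrete, here is a minimal Python sketch. It is not ClosedLoop's or Epic's method, and nothing in it comes from the episode beyond the idea of scoring patients, sorting them, and cutting the list. The condition names, the weights, the Patient class, and the index_score and stratify functions are all hypothetical and chosen for illustration; the weights are not a clinical reference.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration only; do not treat these as
# the published weights of any clinical index.
ILLUSTRATIVE_WEIGHTS = {
    "diabetes": 1,
    "heart_failure": 1,
    "kidney_disease": 2,
    "metastatic_cancer": 6,
}


@dataclass(frozen=True)
class Patient:
    patient_id: str
    conditions: frozenset


def index_score(patient, weights=ILLUSTRATIVE_WEIGHTS):
    """Additive, clinician-style score: sum the weight of each recorded condition."""
    return sum(weights.get(condition, 0) for condition in patient.conditions)


def stratify(patients, score_fn, top_k):
    """Rank patients by descending score and keep the first top_k as the outreach list."""
    ranked = sorted(patients, key=score_fn, reverse=True)
    return ranked[:top_k]


patients = [
    Patient("A", frozenset({"diabetes"})),
    Patient("B", frozenset({"kidney_disease", "heart_failure"})),
    Patient("C", frozenset()),
    Patient("D", frozenset({"metastatic_cancer", "diabetes"})),
]

# Cutting the list at top_k is the "patient 50 vs 51" decision in miniature.
outreach = stratify(patients, index_score, top_k=2)
outreach_ids = [p.patient_id for p in outreach]
print(outreach_ids)
```

An ML-based approach would, under the same assumptions, swap index_score for a function that returns a model's predicted probability; the sorting and cutting steps stay the same, which is why the operational questions (where to cut, who acts on the list) apply to both.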

Shay has been working in healthcare AI since high school, when a cold email to a Houston lab professor accidentally landed him in an ML research group in 2016. He went on to ClosedLoop, a seed-stage healthcare ML platform out of Austin, where he spent years training models from scratch on claims and EHR (Electronic Health Record) data, embedding them into clinical workflows, and training nurses to use the outputs. More recently, he has been working in AI governance: advising health systems on which tools to deploy, how to evaluate vendors, and what due diligence on healthcare AI actually looks like. This episode covers risk stratification end to end: what it is, how it gets built, where it tends to fall apart, and what transformer-based clinical event models might mean for the whole field.

