EPISODE · Aug 7, 2024 · 10 MIN
A recipe for frontier model post-training
from Interconnects · host Nathan Lambert
Apple, Meta, and Nvidia all agree -- synthetic data, iterative training, human preference labels, and lots of filtering.This is AI generated audio with Python and 11Labs.Source code: https://github.com/natolambert/interconnects-toolsOriginal post: https://www.interconnects.ai/p/frontier-model-post-training00:00 Llama 3.1 post-training and the new normal for RLHF01:18 A new standard pipeline01:45 Human preference data02:59 Scaling RLHF05:03 Synthetic data06:10 The new normal06:51 Data quality is king07:18 Apple confirms the new normalFig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_018.pngFig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_020.pngFig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_031.pngFig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_033.pngFig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/frontier-rlhf/img_035.png This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe
NOW PLAYING
A recipe for frontier model post-training
No transcript for this episode yet
Similar Episodes
May 20, 2026 ·8m
May 12, 2026 ·4m
Apr 28, 2026 ·7m
Apr 22, 2026 ·8m