EPISODE · Jun 25, 2026 · 23 MIN
“Exploration: fine-tuning with parameter decomposition” by Lucius Bushnaq
TL;DR: We can destroy a 67M-parameter language model's ability to predict German text by fine-tuning a single number: the scalar prefactor on one German-related rank-1 parameter subcomponent. This is an early exploration into using parameter decomposition for a more targeted and interpretable form of model fine-tuning.

At small German-token budgets, fine-tuning the scalar prefactor of a single German-related parameter subcomponent beats rank-1 and rank-4 LoRA [1] fine-tunes on the trade-off between German performance removed vs. English performance retained. The single scalar fine-tune reaches nats cross-entropy on German, the score you'd get from a uniform distribution over all output tokens, with nats cross-entropy increase to English over the base model, from as few as ~4 German training tokens, compared to tokens for the LoRAs. In a sense this is cheating, though: we're indirectly exploiting the German tokens we already spent when we did the parameter decomposition and interpreted activating examples for the resulting subcomponents.

More interestingly, unlike the LoRAs, the scalar fine-tune consistently leaves French and Spanish almost untouched without us regularising for that. I found that out by accident. I didn't think to specify that performance on other languages should be retained, but the targeted nature [...]

Outline:
(02:04) Recap: Parameter subcomponents
(03:31) Idea: fine-tune by rescaling existing subcomponents
(05:42) Original plan
(07:11) The selected subcomponents
(07:39) Results: the 16-component edit vs. rank-1 LoRA
(08:50) A happy accident
(11:13) A privilege of not working with black boxes
(14:56) Rollouts
(15:27) Limitations
(16:02) Acknowledgments
(16:36) Appendix A. More LoRAs
(16:41) Rank-4 LoRAs
(18:17) Localised rank-1 LoRAs
(19:41) Appendix B. Protocol and hyperparameters

The original text contained 4 footnotes which were omitted from this narration.
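As a rough illustration of the idea in the TL;DR, here is a toy sketch of what "rescaling one rank-1 subcomponent" means for a single weight matrix. It is not the post's code: the names `build_weight`, `apply`, `us`, `vs` and `scales` are hypothetical, the vectors are hand-picked to be orthogonal, and the scalar is set by hand rather than fine-tuned on German tokens as in the post.

```python
# Toy illustration only (hypothetical names, hand-picked numbers).
# A weight matrix is written as a sum of rank-1 subcomponents:
#   W = sum_c scales[c] * outer(us[c], vs[c])
# The "edit" changes exactly one entry of `scales` and nothing else.

def build_weight(us, vs, scales):
    """Sum the rank-1 subcomponents, each weighted by its scalar prefactor."""
    rows, cols = len(us[0]), len(vs[0])
    W = [[0.0] * cols for _ in range(rows)]
    for u, v, s in zip(us, vs, scales):
        for i in range(rows):
            for j in range(cols):
                W[i][j] += s * u[i] * v[j]
    return W

def apply(W, x):
    """Matrix-vector product W @ x using plain lists."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]

# Two subcomponents that read from orthogonal input directions.
us = [[1.0, 0.0], [0.0, 1.0]]            # output directions
vs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # input directions
scales = [1.0, 1.0]                      # base model: all prefactors are 1

x_a = [2.0, 0.0, 0.0]  # input read only by subcomponent 0
x_b = [0.0, 3.0, 0.0]  # input read only by subcomponent 1

base_a = apply(build_weight(us, vs, scales), x_a)
base_b = apply(build_weight(us, vs, scales), x_b)

# Assumed edit: set subcomponent 1's prefactor to zero by hand.
edited = [1.0, 0.0]
edit_a = apply(build_weight(us, vs, edited), x_a)
edit_b = apply(build_weight(us, vs, edited), x_b)

print(base_a, edit_a)  # behaviour on x_a is unchanged
print(base_b, edit_b)  # behaviour on x_b is removed
```

In this toy setting the edit removes the computation that subcomponent 1 performs and leaves inputs that only touch subcomponent 0 untouched. In a real model the subcomponents are not exactly orthogonal, so some spillover onto other behaviour should be expected.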
First published: June 25th, 2026
Source: https://www.lesswrong.com/posts/ieoWstubDQWLrMnhH/exploration-fine-tuning-with-parameter-decomposition
Narrated by TYPE III AUDIO.