[WORK IN PROGRESS] Merged Language Model Feature Genealogy

May 1, 2024 · 13 min read

Practitioner of Machine Learning

This is an informal blog post about the stability of language model features. It uses mechanistic interpretability to trace the lineage of those features through fine-tuning and weight interpolation.