33. Mechanistic Interpretability

David Lohmann

Ch 33: Mechanistic Interpretability (10 min)

Original Google Slides (well-formatted). The Quarto version may have some slides missing or poorly formatted.

__Mechanistic Interpretability__: understanding the inner workings of LLMs by identifying sparse representations of “features” or “circuits” encoded in model weights.
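
To make "sparse representation" concrete, here is a minimal toy sketch rather than a method taken from these slides. It assumes a small hand-written dictionary of unit-length feature directions and uses greedy matching pursuit, a stand-in technique chosen for brevity, to explain one activation vector with only a few features. All names (`dictionary`, `feature_a` to `feature_d`, `sparse_code`) are hypothetical; real interpretability work would operate on activations from an actual model and would typically learn the dictionary instead of writing it by hand.

```python
def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def sparse_code(activation, dictionary, k):
    """Greedy matching pursuit: approximate `activation` as a weighted sum
    of at most k picks from `dictionary` (assumed unit-length directions)."""
    residual = list(activation)
    coeffs = {}
    for _ in range(k):
        # Score every feature by how well it aligns with what is left to explain.
        scores = {name: dot(residual, d) for name, d in dictionary.items()}
        best = max(scores, key=lambda name: abs(scores[name]))
        c = scores[best]
        if abs(c) < 1e-9:
            break  # nothing meaningful left to explain
        coeffs[best] = coeffs.get(best, 0.0) + c
        # Remove the chosen feature's contribution from the residual.
        residual = [r - c * x for r, x in zip(residual, dictionary[best])]
    return coeffs, residual


# Hypothetical overcomplete dictionary: 4 feature directions in a 3-d space.
dictionary = {
    "feature_a": [1.0, 0.0, 0.0],
    "feature_b": [0.0, 1.0, 0.0],
    "feature_c": [0.0, 0.0, 1.0],
    "feature_d": [0.6, 0.8, 0.0],
}

# Toy activation built as 2.0 * feature_d + 0.5 * feature_c.
activation = [1.2, 1.6, 0.5]

coeffs, residual = sparse_code(activation, dictionary, k=2)
print(coeffs)    # only two of the four features receive a coefficient
print(residual)  # close to zero: two features explain the activation
```

The point of the sketch is the shape of the answer: a dense vector is re-expressed as a short list of named directions with weights, which is the kind of object a "feature" refers to in the definition above.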