A mini-project for the Neel Nanda MATS track
8 min read · September 17, 2025
2025 · steering · mechanistic-interpretability