Linear Probes Mechanistic Interpretability, Understanding AI systems' inner workings is critical for ensuring value alignment and safety.