Ai Research
Applied Interpretability: Foundation-Sec-Instruct Goes Under the Microscope
Exploring mechanistic interpretability methods for understanding internal behavior of security-focused language models.
Exploring mechanistic interpretability methods for understanding internal behavior of security-focused language models.