wassname/awesome-interpretability

awesome-interpretability

Mechanistic interpretability libraries

Explainability, counterfactuals and probing

Adapters

See this lit review of Adapter intervention types

Steering

TODO format:

https://github.com/vgel/repeng

https://github.com/IBM/AISteer360

https://github.com/wassname/ssteer-eval-aware

https://github.com/IBM/activation-steering

https://github.com/chili-lab/Spherical-Steering

https://github.com/safety-research/weight-steering

Structured output

See more

About

Awesome tools for interpreting and manipulating the internals of deep neural networks.
