Python

eval-neuron-explanation

A framework for evaluating auto-interp pipelines, i.e., pipelines that generate natural language explanations of neurons.

Python, Emerging, causal-intervention, explanability, interpretability, neurons

Last push: 24mo ago
