A calculator for estimating the compute cost of building a sparse autoencoder layer into an LLM to make the concepts inside that LLM interpretable. Auto-published at https://huge.github.io/interpretable-layer-cost
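A minimal sketch of the kind of estimate the calculator targets, assuming the common "~6 FLOPs per parameter per token" rule of thumb for training and a one-hidden-layer sparse autoencoder (encoder, decoder, biases) over a single activation stream, as in the monosemantic-features setup. All names and default values below (`d_act`, `n_features`, the 7B base model, the token count) are illustrative assumptions, not measured settings of this project.

```python
def sae_params(d_act: int, n_features: int) -> int:
    """Parameter count of a sparse autoencoder over a d_act-dimensional
    activation vector with n_features dictionary features:
    encoder (d_act x n_features), decoder (n_features x d_act),
    plus feature and output biases."""
    return 2 * d_act * n_features + n_features + d_act

def training_flops(params: int, tokens: int) -> float:
    """Rough training cost: ~6 FLOPs per parameter per token
    (forward plus backward pass)."""
    return 6.0 * params * tokens

def activation_harvest_flops(llm_params: int, tokens: int) -> float:
    """Cost of running the base LLM forward to collect the activations
    the autoencoder trains on: ~2 FLOPs per parameter per token."""
    return 2.0 * llm_params * tokens

if __name__ == "__main__":
    d_act = 4096             # assumed activation width of the host LLM layer
    n_features = 8 * d_act   # assumed dictionary expansion factor of 8
    tokens = int(8e9)        # assumed number of training activations (tokens)
    llm_params = int(7e9)    # assumed base-model size (e.g. a 7B model)

    p = sae_params(d_act, n_features)
    total = training_flops(p, tokens) + activation_harvest_flops(llm_params, tokens)
    print(f"SAE parameters: {p:,}")
    print(f"Estimated total FLOPs: {total:.3e}")
```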
todo:
- add an intro about http://transformer-circuits.pub/2023/monosemantic-features/index.html#problem-setup (a short read)
- explain the parameters (as in http://transformer-circuits.pub/2023/monosemantic-features/index.html#problem-setup ) and draw the structure of the inserted layer
- sketch OSS replication efforts and ideas to be explored/developed on public-weights LLMs
- instead of a parameter for the raw number of training samples, allow setting a target precision from a scaling law (see the sketch after this list)
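A hedged sketch of the last idea, assuming a simple power-law fit of loss over training tokens, `loss(D) ≈ a * D**(-b)`, which can be inverted to turn a target loss into a sample count. The functional form and the coefficients `a`, `b` are placeholders to be fitted on real runs, not published values.

```python
def tokens_for_target_loss(target_loss: float, a: float, b: float) -> float:
    """Invert the assumed scaling law loss = a * D**(-b) for the token count D."""
    if target_loss <= 0 or a <= 0 or b <= 0:
        raise ValueError("target_loss, a and b must be positive")
    return (a / target_loss) ** (1.0 / b)

# Example with made-up coefficients: a=50.0, b=0.3 and a target
# reconstruction loss of 0.05 give roughly 1e10 training tokens.
print(f"{tokens_for_target_loss(0.05, a=50.0, b=0.3):.3e}")
```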