🧩 Model Weights for Towards Atoms of Large Language Models


This repository contains the model weights associated with the paper:

πŸ‘‰ Towards Atoms of Large Language Models

Specifically, it provides the weights of threshold-activated sparse autoencoders (SAEs) trained on activations across layers of Gemma2-2B, using the CounterFact dataset.
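As a rough illustration of what "threshold-activated" means here, the sketch below shows a minimal SAE forward pass in NumPy where latent units whose pre-activation falls below a threshold are zeroed. The shapes, the threshold value, and the gating rule are illustrative assumptions, not the paper's exact implementation; refer to the main codebase for the real architecture.

```python
import numpy as np

# Minimal sketch of a threshold-activated sparse autoencoder (SAE).
# All dimensions and the threshold are hypothetical placeholders.
rng = np.random.default_rng(0)

d_model, d_sae = 8, 32  # activation dim, dictionary size (hypothetical)
W_enc = rng.normal(size=(d_model, d_sae)) / np.sqrt(d_model)
W_dec = rng.normal(size=(d_sae, d_model)) / np.sqrt(d_sae)
b_enc = np.zeros(d_sae)
theta = 1.0             # activation threshold (hypothetical)

def encode(x):
    """Zero out latent units whose pre-activation is below the threshold."""
    pre = x @ W_enc + b_enc
    return np.where(pre > theta, pre, 0.0)

def decode(z):
    """Reconstruct the original activation from the sparse latent code."""
    return z @ W_dec

x = rng.normal(size=d_model)  # stand-in for a layer activation vector
z = encode(x)
x_hat = decode(z)
print("active latents:", int((z > 0).sum()), "of", d_sae)
```

The thresholding makes the latent code sparse: only a small fraction of the dictionary units fire for any given activation, which is what lets individual units be interpreted as candidate "atoms".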

Note that only the model weights are included in this repository.

For the complete implementation, including training scripts, data preprocessing, and evaluation pipelines, please refer to the main codebase:

πŸ‘‰ https://github.com/ChenhuiHu/towards_atoms
