SyntheticSAEBench Model Variations

This repository contains variations on the SynthSAEBench-16k model, organized into subdirectories by the attribute that differs from the original. Unless otherwise specified, all other attributes are identical to the original SynthSAEBench-16k model.

firing-magnitude-stdev

These models change the standard deviation (std) of the firing magnitude, setting it to the same constant for every feature in the model. The base model uses a random per-feature std with mean 0.5. Available variations:

  • std-0
  • std-0.1
  • std-0.5
  • std-2.5

superposition

These models change the hidden dimension of the model, which changes the level of superposition. A larger hidden dimension means less superposition. The base model has a hidden dimension of 768. Available variations:

  • d-512
  • d-1024
  • d-1536
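As a rough illustration of why a larger hidden dimension means less superposition, the sketch below computes the number of features per hidden dimension for each variation. Using this ratio as a proxy for superposition is an assumption made here for illustration, not a metric defined by this repository; the feature count of 16384 and the hidden dimensions are taken from this README.

```python
# Features-per-dimension ratio for each hidden size listed above.
# Treating this ratio as a proxy for superposition is an illustrative assumption.
NUM_FEATURES = 16384  # feature count of the base model, per this README

hidden_dims = {"d-512": 512, "base (768)": 768, "d-1024": 1024, "d-1536": 1536}


def features_per_dim(hidden_dim: int, num_features: int = NUM_FEATURES) -> float:
    """Return how many features share each hidden dimension on average."""
    return num_features / hidden_dim


ratios = {name: features_per_dim(d) for name, d in hidden_dims.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} features per dimension")
```

A higher ratio means more features are packed into each dimension, so d-512 is the most crowded variation and d-1536 the least.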

truncate-num-features

These models truncate the number of features in the original model, keeping the first N features. The base model has 16384 features. Available variations:

  • n-4096
  • n-8192

relative-firing-probability

These models scale all the firing probabilities of the original model by the given multiplier (1.0 would be identical to the base model). This also scales the L0 of the model. Available variations:

  • rel-p-0.1
  • rel-p-0.25
  • rel-p-0.5
  • rel-p-0.75
  • rel-p-1.25
  • rel-p-1.5
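To see why scaling the probabilities also scales L0, here is a minimal sketch on a toy model. It assumes each value is a feature's marginal firing probability, so the expected L0 is simply their sum; the probabilities in `base_probs` are hypothetical and are not taken from the actual model.

```python
def expected_l0(probs):
    """Expected number of active features: the sum of marginal firing probabilities."""
    return sum(probs)


def scale_probs(probs, multiplier):
    """Scale every firing probability by the same multiplier."""
    scaled = [p * multiplier for p in probs]
    if any(p > 1.0 for p in scaled):
        raise ValueError("a scaled probability exceeds 1.0")
    return scaled


base_probs = [0.2, 0.1, 0.05, 0.05]  # hypothetical toy values
base_l0 = expected_l0(base_probs)

for multiplier in (0.1, 0.25, 0.5, 0.75, 1.25, 1.5):
    scaled_l0 = expected_l0(scale_probs(base_probs, multiplier))
    print(f"rel-p-{multiplier}: expected L0 = {scaled_l0:.3f} (base {base_l0:.3f})")
```

In this toy setting the expected L0 changes by the same multiplier as the probabilities, since the sum is linear.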

misc

These models change several properties at once, typically using different hierarchy structures. The current models are designed to keep the L0 of the first 4096 features at around 25, matching the standard model. Available variations:

  • hierarchy-128-128-me-1.0-l0-40-4kl0-25
  • rand-hierarchy-16-4-32-me-0.75-l0-30-4kl0-24

In these model names, me-0.75 means that 75% of nodes in the hierarchy have mutually-exclusive children. The number after hierarchy is the number of root nodes. rand-hierarchy means each parent has a random number of children; e.g. rand-hierarchy-16-4-32 means 16 root nodes and a random number of child nodes, between 4 and 32, per parent. For full details of the settings of the misc models, it's best to look at the model config directly.
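The naming scheme can be unpacked with a small parser, sketched below. The function `parse_misc_name` is a hypothetical helper, not part of any library. Reading `l0-N` as the model's overall L0 and `4kl0-N` as the L0 of the first 4096 features is an assumption inferred from the description above, and any hierarchy numbers whose meaning is not documented here are returned unlabeled. The model config remains the authoritative source.

```python
import re

# Pattern for the misc model names listed above (hypothetical helper).
MISC_NAME = re.compile(
    r"^(?P<rand>rand-)?hierarchy-(?P<dims>\d+(?:-\d+)*)"
    r"-me-(?P<me>\d+(?:\.\d+)?)"
    r"-l0-(?P<l0>\d+)"
    r"-4kl0-(?P<l0_4k>\d+)$"
)


def parse_misc_name(name: str) -> dict:
    """Split a misc model name into its documented parts."""
    match = MISC_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized misc model name: {name}")
    dims = [int(x) for x in match.group("dims").split("-")]
    parsed = {
        "random_children": match.group("rand") is not None,
        "root_nodes": dims[0],  # documented: number after "hierarchy"
        "mutually_exclusive_fraction": float(match.group("me")),
        "l0": int(match.group("l0")),  # assumed: overall L0
        "l0_first_4096": int(match.group("l0_4k")),  # assumed: L0 of first 4096 features
    }
    if parsed["random_children"] and len(dims) == 3:
        # documented: random number of children between these bounds
        parsed["min_children"], parsed["max_children"] = dims[1], dims[2]
    else:
        # meaning not documented in this README, so left unlabeled
        parsed["other_hierarchy_numbers"] = dims[1:]
    return parsed


print(parse_misc_name("rand-hierarchy-16-4-32-me-0.75-l0-30-4kl0-24"))
print(parse_misc_name("hierarchy-128-128-me-1.0-l0-40-4kl0-25"))
```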

Usage

from sae_lens.synthetic import SyntheticModel

model = SyntheticModel.from_pretrained("chanind/synth-sae-bench-variations", model_path="model/path")
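As a sketch of how a specific variation might be selected, the helpers below build a `model_path` from a category and a variation name. Both `variation_path` and `load_variation` are hypothetical, and the `<subdirectory>/<variation>` layout is an assumption based on the directory organization described above; check the repository's file listing for the exact paths.

```python
def variation_path(subdir: str, variant: str) -> str:
    """Join a category subdirectory and a variation name (assumed layout)."""
    return f"{subdir}/{variant}"


def load_variation(subdir: str, variant: str):
    """Load one variation using the from_pretrained call shown above."""
    # Imported lazily so the path helper works without sae_lens installed.
    from sae_lens.synthetic import SyntheticModel

    return SyntheticModel.from_pretrained(
        "chanind/synth-sae-bench-variations",
        model_path=variation_path(subdir, variant),  # assumed path layout
    )


print(variation_path("superposition", "d-1024"))
```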