100k-row datasets filtered from https://huggingface.co/datasets/monology/pile-uncopyrighted. Doesn't include Books3, BookCorpus2, OpenSubtitles, YTSub
AI & ML interests
Singular Learning Theory & Developmental Interpretability
Attention-only transformers, sweep over head dimension
Attention-only transformers, sweep over number of heads (variable head dimension)
Attention-only transformers, sweep over number of heads (for fixed head dimension)
Attention-only transformers, sweep over number of layers
Attention-only transformers, sweep over model dimension