---
tags:
- babylm
- language-model
- coherence
license: mit
---
# babybabellm-multismall

This repository contains checkpoints for the `multismall` variant of BabyBabeLLM.
## Files

- `*_15_16.bin` – main model weights
- `*_15_16_ema.bin` – EMA-smoothed weights
- `*_15_16_state_dict.bin` – PyTorch state dict
- `pytorch_model.bin` – extracted EMA weights (for `AutoModel`)
- Config + tokenizer files for model loading (zipped in `shared_files.zip`)
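As a quick sketch, the raw checkpoints can also be inspected directly with PyTorch. The file name below is a hypothetical example (the actual prefix standing in for `*` depends on the checkpoint shipped in this repository):

```python
import torch

# Hypothetical file name: the prefix replacing "*" depends on the
# checkpoint actually shipped in this repository.
state_dict = torch.load("multismall_15_16_ema.bin", map_location="cpu")

# List a few parameter names and shapes stored in the checkpoint.
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```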
## Usage

```python
from transformers import AutoModel, AutoTokenizer

repo = "suchirsalhan/babybabellm-multismall"

# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

# Run a forward pass on a sample input.
inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
```
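The `AutoModel` forward pass returns hidden states rather than logits. One illustrative way to turn them into a fixed-size sentence representation is mean pooling over the attention mask; this is an assumption for demonstration, not a recipe documented for this model:

```python
import torch

# Mean-pool the final hidden states over non-padding tokens.
with torch.no_grad():
    outputs = model(**inputs)

hidden = outputs.last_hidden_state                    # (batch, seq_len, hidden)
mask = inputs["attention_mask"].unsqueeze(-1).float() # (batch, seq_len, 1)
embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(embedding.shape)                                # (batch, hidden)
```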
## Notes

- These are research checkpoints trained on BabyLM-style data.
- Model naming: `multismall` indicates the language/config variant.