suchirsalhan committed
Commit 228f5b1 · verified · 1 Parent(s): d32ebc7

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +39 -0
README.md ADDED
@@ -0,0 +1,39 @@
---
tags:
- babylm
- language-model
- coherence
license: mit
---

# babybabellm-multismall

This repository contains checkpoints for the **multismall** variant of **BabyBabeLLM**.

## Files
- `*_15_16.bin` – main model weights
- `*_15_16_ema.bin` – EMA-smoothed weights
- `*_15_16_state_dict.bin` – PyTorch state dict (see the inspection sketch below)
- `pytorch_model.bin` – extracted EMA weights (for `AutoModel`)
- Config and tokenizer files for model loading (zipped in `shared_files.zip`)
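
The `.bin` checkpoints above are plain PyTorch files, so they can be inspected before deciding how to load them. A minimal sketch, assuming the file is a standard state dict and using a placeholder name (substitute whatever actually matches the `*_15_16_state_dict.bin` pattern in this repository):

```python
import torch

# Placeholder filename: replace with the real file matching *_15_16_state_dict.bin
ckpt_path = "multismall_15_16_state_dict.bin"

# Load on CPU and list the first few parameter names and shapes
state_dict = torch.load(ckpt_path, map_location="cpu")
for name, tensor in list(state_dict.items())[:10]:
    print(name, tuple(tensor.shape))
```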

## Usage

```python
from transformers import AutoModel, AutoTokenizer

repo = "suchirsalhan/babybabellm-multismall"

# Load the tokenizer and the extracted EMA weights (pytorch_model.bin) from the Hub
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModel.from_pretrained(repo)

# Tokenize a sample sentence and run a forward pass
inputs = tokenizer("Hello world!", return_tensors="pt")
outputs = model(**inputs)
```
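
The snippet above returns the model's raw outputs. As a rough follow-up, assuming the checkpoint behaves like a standard encoder and exposes `last_hidden_state` (this README does not specify the output format; custom architectures may also require `trust_remote_code=True` in `from_pretrained`), a mean-pooled sentence embedding could be derived like this:

```python
# Mean-pool the token embeddings into one sentence vector, ignoring padding.
# Assumes `outputs.last_hidden_state` exists, as for standard encoder models.
hidden = outputs.last_hidden_state                      # (batch, seq_len, hidden_dim)
mask = inputs["attention_mask"].unsqueeze(-1).float()   # (batch, seq_len, 1)
sentence_embedding = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)                         # (batch, hidden_dim)
```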

## Notes
- These are research checkpoints trained on BabyLM-style data.
- Model naming: `multismall` indicates the language/config variant.

## Shared files

To unzip the shared config and tokenizer files, run `unzip shared_files.zip -d .` in the repository directory.
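
If a shell is not available, the same extraction can be done from Python with the standard library (assumes `shared_files.zip` is in the current working directory):

```python
import zipfile

# Extract the bundled config and tokenizer files into the current directory
with zipfile.ZipFile("shared_files.zip") as zf:
    zf.extractall(".")
```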