Upload README.md with huggingface_hub
        README.md
    ADDED
    
    | @@ -0,0 +1,39 @@ | |
| 1 | 
            +
            ---
         | 
| 2 | 
            +
            tags:
         | 
| 3 | 
            +
            - babylm
         | 
| 4 | 
            +
            - language-model
         | 
| 5 | 
            +
            - coherence
         | 
| 6 | 
            +
            license: mit
         | 
| 7 | 
            +
            ---
         | 
| 8 | 
            +
             | 
| 9 | 
            +
            # babybabellm-multismall
         | 
| 10 | 
            +
             | 
| 11 | 
            +
            This repository contains checkpoints for the **multismall** variant of **BabyBabeLLM**.
         | 
| 12 | 
            +
             | 
| 13 | 
            +
            ## Files
         | 
| 14 | 
            +
            - `*_15_16.bin` – main model weights  
         | 
| 15 | 
            +
            - `*_15_16_ema.bin` – EMA-smoothed weights (exponential moving average of the training weights)
         | 
| 16 | 
            +
            - `*_15_16_state_dict.bin` – PyTorch state dict  
         | 
| 17 | 
            +
            - `pytorch_model.bin` – EMA weights extracted for loading with `AutoModel`
         | 
| 18 | 
            +
            - Config and tokenizer files needed for model loading (zipped in `shared_files.zip`)
         | 
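For readers unfamiliar with the EMA files listed above, here is a minimal sketch of how an exponential moving average of weights is usually maintained. The update rule is the standard textbook form. The function name `ema_update`, the `decay` value, and the use of plain floats are assumptions for illustration, not taken from this repository's training code.

```python
def ema_update(ema_weights, new_weights, decay=0.999):
    """Blend the running average with the latest weights (standard EMA rule).

    Both arguments map parameter names to floats. Real checkpoints hold
    tensors, but the arithmetic is the same element-wise.
    """
    return {
        name: decay * ema_weights[name] + (1.0 - decay) * new_weights[name]
        for name in new_weights
    }


# Toy run: one scalar "parameter" observed at 1.0 for three steps.
# decay=0.5 is chosen only to keep the numbers easy to follow.
ema = {"w": 0.0}
for step_value in (1.0, 1.0, 1.0):
    ema = ema_update(ema, {"w": step_value}, decay=0.5)
```

With a decay close to 1, the averaged weights change slowly and smooth out step-to-step noise, which is the usual reason for shipping them alongside the raw weights.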
| 19 | 
            +
             | 
| 20 | 
            +
            ## Usage
         | 
| 21 | 
            +
             | 
| 22 | 
            +
            ```python
         | 
| 23 | 
            +
            from transformers import AutoModel, AutoTokenizer
         | 
| 24 | 
            +
             | 
| 25 | 
            +
            repo = "suchirsalhan/babybabellm-multismall"
         | 
| 26 | 
            +
             | 
| 27 | 
            +
            tokenizer = AutoTokenizer.from_pretrained(repo)
         | 
| 28 | 
            +
            model = AutoModel.from_pretrained(repo)
         | 
| 29 | 
            +
             | 
| 30 | 
            +
            inputs = tokenizer("Hello world!", return_tensors="pt")
         | 
| 31 | 
            +
            outputs = model(**inputs)
         | 
| 32 | 
            +
            ```
         | 
| 33 | 
            +
             | 
| 34 | 
            +
            ## Notes
         | 
| 35 | 
            +
            - These are research checkpoints trained on BabyLM-style data.
         | 
| 36 | 
            +
            - Model naming: `multismall` indicates the language/config variant.
         | 
| 37 | 
            +
             | 
| 38 | 
            +
            To unzip the shared files, run:
         | 
| 39 | 
            +
            `unzip shared_files.zip -d .`
         | 
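The unzip step above can also be done from Python, which helps on systems without the `unzip` command. This is a sketch, not the author's workflow: the helper names `extract_shared_files` and `download_and_extract` are hypothetical, and it assumes `shared_files.zip` sits at the root of the repository as the Files section suggests. `hf_hub_download` is the `huggingface_hub` function for fetching a single file from a repo.

```python
import zipfile
from pathlib import Path


def extract_shared_files(zip_path, dest="."):
    """Extract every member of the archive into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


def download_and_extract(repo_id, filename="shared_files.zip", dest="."):
    """Fetch the archive from the Hub, then extract it (needs network access)."""
    # Imported lazily so the extraction helper works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    zip_path = hf_hub_download(repo_id=repo_id, filename=filename)
    return extract_shared_files(zip_path, dest)
```

For example, `download_and_extract("suchirsalhan/babybabellm-multismall", dest="shared")` would place the archive's contents in a local `shared` directory, assuming the file name and location match the description above.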
