license: apache-2.0
datasets:
  - karpathy/fineweb-edu-100b-shuffle
language:
  - en
model-index:
  - name: chat-d20
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Challenge
          split: test
        metrics:
          - type: acc_norm
            value: 29.61
            name: normalized accuracy
        source:
          url: https://github.com/karpathy/nanochat
          name: nanochat
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: AI2 Reasoning Challenge (25-Shot)
          type: ai2_arc
          config: ARC-Easy
          split: test
        metrics:
          - type: acc_norm
            value: 42.59
            name: normalized accuracy
        source:
          url: https://github.com/karpathy/nanochat
          name: nanochat
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU (5-Shot)
          type: cais/mmlu
          config: all
          split: test
        metrics:
          - type: acc
            value: 32.5
            name: accuracy
        source:
          url: https://github.com/karpathy/nanochat
          name: nanochat
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GSM8k (5-shot)
          type: gsm8k
          config: main
          split: test
        metrics:
          - type: acc
            value: 4.32
            name: accuracy
        source:
          url: https://github.com/karpathy/nanochat
          name: nanochat
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: HumanEval
          type: openai_humaneval
          split: test
        metrics:
          - type: pass@1
            value: 5.49
            name: pass@1
        source:
          url: https://github.com/karpathy/nanochat
          name: nanochat
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: ChatCORE
          type: chatcore
          split: test
        metrics:
          - type: score
            value: 9.88
            name: ChatCORE metric
        source:
          url: https://github.com/karpathy/nanochat
          name: nanochat

NanoChat SFT

This is the SFT checkpoint from nanochat, Andrej Karpathy's full-stack project for building an LLM.

Usage

Install transformers from this specific branch:

pip install git+https://github.com/huggingface/transformers.git@nanochat-implementation

Then, you can run this inference snippet:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


model_id = "nanochat-students/d20-chat-transformers"
max_new_tokens = 64
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=False)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=False, dtype=torch.bfloat16).to(device)
model.eval()

conversation = [
    {"role": "user", "content": "What is the capital of France?"},
]

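# return_dict=True yields a mapping (input_ids, attention_mask) that can be
# unpacked into generate() and indexed as inputs.input_ids below.
inputs = tokenizer.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt"
).to(device)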

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
    )

# Decode only the generated tokens (excluding the input prompt)
generated_tokens = outputs[0, inputs.input_ids.shape[1]:]
print(tokenizer.decode(generated_tokens, skip_special_tokens=True))

vLLM Integration:

You can also serve the model with vLLM, using the same transformers branch installed above:

vllm serve nanochat-students/nanochat-d20 --enforce-eager

And then you can call the model like so:

curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "nanochat-students/nanochat-d20", "prompt": "What is the capital of France?", "max_tokens": 7, "temperature": 0}'
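
The same request can be sent from Python. This is a minimal sketch using only the standard library. It assumes the server started above is listening on localhost:8000 and that the response follows the OpenAI-style completions shape (choices[0].text). The helper names build_payload and complete are illustrative and not part of vLLM.

```python
import json
import urllib.request

# Assumed to match the `vllm serve` command above.
BASE_URL = "http://localhost:8000/v1/completions"
MODEL_ID = "nanochat-students/nanochat-d20"


def build_payload(prompt, max_tokens=7, temperature=0.0):
    """Build the JSON body for a completions request (hypothetical helper)."""
    return {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


def complete(prompt, url=BASE_URL, timeout=30):
    """POST the payload and return the text of the first completion."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        result = json.load(response)
    return result["choices"][0]["text"]
```

With the server running, complete("What is the capital of France?") returns the generated text as a string.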

Chat SFT Training Metrics

timestamp: 2025-10-14 20:17:42

  • run:
  • source: mid
  • dtype: bfloat16
  • device_batch_size: 4
  • num_epochs: 1
  • max_iterations: -1
  • target_examples_per_step: 32
  • unembedding_lr: 0.0040
  • embedding_lr: 0.2000
  • matrix_lr: 0.0200
  • weight_decay: 0.0000
  • init_lr_frac: 0.0200
  • eval_every: 100
  • eval_steps: 100
  • eval_metrics_every: 200
  • Training rows: 20,843
  • Number of iterations: 651
  • Training loss: 1.1904
  • Validation loss: 1.0664
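
The reported iteration count is consistent with the row count and the per-step target. The sketch below is a sanity check, not the training code: it assumes the trailing partial batch is dropped (floor division) and that gradient accumulation fills the gap between the per-device batch and the target. The GPU count is not stated in this card, so world_size = 8 is an assumption.

```python
# Values copied from the training metrics above.
training_rows = 20_843
target_examples_per_step = 32
device_batch_size = 4
num_epochs = 1

# Assumption: the number of GPUs is not reported here; 8 is illustrative.
world_size = 8

# Optimizer steps, assuming the trailing partial batch is dropped.
num_iterations = (training_rows // target_examples_per_step) * num_epochs

# Micro-batches accumulated per optimizer step on each device.
examples_per_micro_step = device_batch_size * world_size
grad_accum_steps = target_examples_per_step // examples_per_micro_step

print(num_iterations)    # 651, matching "Number of iterations"
print(grad_accum_steps)  # 1 under the assumed world_size of 8
```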

Chat evaluation sft

timestamp: 2025-10-14 20:29:59

  • source: sft
  • task_name: None
  • dtype: bfloat16
  • temperature: 0.0000
  • max_new_tokens: 512
  • num_samples: 1
  • top_k: 50
  • batch_size: 8
  • model_tag: None
  • step: None
  • max_problems: None
  • ARC-Easy: 0.4259
  • ARC-Challenge: 0.2961
  • MMLU: 0.3250
  • GSM8K: 0.0432
  • HumanEval: 0.0549
  • ChatCORE metric: 0.0988
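
The ChatCORE value is consistent with a baseline-centered mean of the five task accuracies. The sketch below is an assumption about how the metric is formed, not a quotation of nanochat's code. It takes 0.25 as the random-guess baseline for the multiple-choice tasks (ARC, MMLU) and 0.0 for the generative tasks (GSM8K, HumanEval), rescales each accuracy to (acc - baseline) / (1 - baseline), and averages the results.

```python
# Accuracies copied from the evaluation results above.
accuracies = {
    "ARC-Easy": 0.4259,
    "ARC-Challenge": 0.2961,
    "MMLU": 0.3250,
    "GSM8K": 0.0432,
    "HumanEval": 0.0549,
}

# Assumed random-guess baselines: 4-way multiple choice vs. open generation.
baselines = {
    "ARC-Easy": 0.25,
    "ARC-Challenge": 0.25,
    "MMLU": 0.25,
    "GSM8K": 0.0,
    "HumanEval": 0.0,
}


def centered(acc, baseline):
    """Rescale so the baseline maps to 0 and a perfect score maps to 1."""
    return (acc - baseline) / (1.0 - baseline)


chatcore = sum(
    centered(accuracies[task], baselines[task]) for task in accuracies
) / len(accuracies)

print(round(chatcore, 4))  # 0.0988, matching the reported ChatCORE metric
```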

Logs from training can be found here: https://huggingface.co/spaces/nanochat-students/trackio