DARWIN-Family Collection · VIDRAFT · 35 items
A Claude Opus distillation model based on Darwin V8 (2B parameters).
Model lineage:

- Base: `Qwen/Qwen3.5-2B`
- LoRA adapter: `FINAL-Bench/Qwen3.5-2B-Opus-Distill-v1` (all-linear, rank=16, α=32)
- `FINAL-Bench/lastbrain` → merged full-weight standalone

Usage:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "FINAL-Bench/lastbrain"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

messages = [
    {"role": "user", "content": "If a train travels 60 km in 45 minutes, what is its speed in km/h?"}
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=800,
        do_sample=False,
        pad_token_id=tok.eos_token_id,
    )

print(tok.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
Expected output:
To find the speed of the train in km/h, we need to convert the given time from minutes to hours.
**Given:**
- Distance = 60 km
- Time = 45 minutes
**Step 1: Convert time to hours**
Since there are 60 minutes in 1 hour:
45 minutes = 45/60 = 0.75 hours
**Step 2: Calculate speed**
Speed = Distance / Time = 60 km / 0.75 h = 80 km/h
**Final Answer:** The speed of the train is **80 km/h**.
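The arithmetic in the sample answer is easy to check directly:

```python
# Verify the worked example: 60 km covered in 45 minutes
distance_km = 60
time_h = 45 / 60              # 45 minutes = 0.75 hours
speed = distance_km / time_h
print(speed)                  # -> 80.0 (km/h)
```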
```
[Qwen/Qwen3.5-2B] ──── Base model (frozen)
        +
[4,451 Claude Opus/Sonnet reasoning traces]
        ↓
[SFT Training]
  - LoRA (all-linear, r=16, α=32)
  - Learning rate: 2e-4 (V8 rule: ×10 FullFT)
  - 2 epochs, bf16, 8×B200 DDP
  - Loss: 1.33 → 1.10 (-17%)
  - Token accuracy: 68% → 72% (+4 %p)
        ↓
[LoRA merge into base weights]
        ↓
[lastbrain] ← this model
```
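The loss and token-accuracy deltas quoted in the training diagram can be sanity-checked with a line of arithmetic:

```python
# Reported SFT metrics from the diagram above
loss_before, loss_after = 1.33, 1.10
acc_before, acc_after = 68, 72   # token accuracy, %

rel_loss_drop = (loss_before - loss_after) / loss_before * 100
print(round(rel_loss_drop))      # -> 17 (percent)
print(acc_after - acc_before)    # -> 4 (percentage points)
```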
| Dataset | Samples | Teacher |
|---|---|---|
| nohurry/Opus-4.6-Reasoning-3000x-filtered | 2,326 | Claude Opus 4.6 |
| TeichAI/Claude-Opus-4.6-Reasoning-887x | 887 | Claude Opus 4.6 |
| TeichAI/claude-4.5-opus-high-reasoning-250x | 250 | Claude Opus 4.5 |
| TeichAI/Claude-Sonnet-4.6-Reasoning-1100x | 1,100 | Claude Sonnet 4.6 |
| Total (after filtering) | 4,451 | - |
V8 recipe: all-linear target, high learning rate; a small rank is sufficient.

This model was created by merging the following two components:
```python
from transformers import AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the frozen base model
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.5-2B", torch_dtype=torch.bfloat16
)

# Attach the distillation LoRA adapter
model = PeftModel.from_pretrained(
    base, "FINAL-Bench/Qwen3.5-2B-Opus-Distill-v1"
)

# Fold the adapter into the base weights and save a standalone model
merged = model.merge_and_unload()
merged.save_pretrained("./lastbrain")
```
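Conceptually, `merge_and_unload` folds each adapter pair into its base weight as W' = W + (α/r)·B·A. A minimal pure-Python sketch of that update, using this model's α/r = 32/16 = 2 scaling but tiny rank-1 toy matrices for brevity (not the real model shapes):

```python
# Toy LoRA merge: W' = W + (alpha/r) * B @ A
r, alpha = 16, 32
scale = alpha / r  # = 2.0, the scaling applied at merge time

# Stand-in matrices: W is 2x2, B is 2x1, A is 1x2 (rank 1 for illustration)
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.5], [0.25]]
A = [[1.0, 2.0]]

W_merged = [
    [W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(len(A)))
     for j in range(2)]
    for i in range(2)
]
print(W_merged)  # -> [[2.0, 2.0], [0.5, 2.0]]
```

After this fold, the adapter is no longer needed at inference time, which is why `lastbrain` ships as a standalone full-weight model.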
| Type | Correct | Response length |
|---|---|---|
| Math (train speed) | ✓ 80 km/h | 771 chars |
| Logic (height comparison) | ✓ Carol | 354 chars |
| Code (prime check) | ✓ Python function | 1,712 chars |
| Korean (minimum wage) | ✓ 1,577,600 won | 142 chars |
Naturally produces structured Markdown/LaTeX/step-by-step answers.
`<think>` tags: rarely used explicitly (reasoning is integrated into the body text).

Related models:

- `FINAL-Bench/Qwen3.5-2B-Opus-Distill-v1` → the standalone LoRA adapter for this model
- `FINAL-Bench/Qwen3.5-2B-Opus-SDPO-v1` → Phase 4 SDPO self-distillation enhanced version

Darwin V8 · Part of the evolutionary model merging series by VIDRAFT_LAB