Knowledge Distillation - Students
meta-llama/Llama-3.1-8B-Instruct • Text Generation • 8B • Updated Sep 25, 2024 • 9.75M • 5.83k
Qwen/Qwen2.5-7B-Instruct • Text Generation • 8B • Updated Jan 12, 2025 • 12.4M • 1.28k
LM Eval Tasks
cais/mmlu • Viewer • Updated Mar 8, 2024 • 231k • 519k • 729
SaylorTwift/bbh • Viewer • Updated Jun 16, 2024 • 6.76k • 61.7k • 6
Rowan/hellaswag • Viewer • Updated Jul 10, 2025 • 60k • 309k • 175
allenai/ai2_arc • Viewer • Updated Dec 21, 2023 • 7.79k • 453k • 337
Knowledge Distillation - Teachers
meta-llama/Llama-3.1-70B-Instruct • Text Generation • 71B • Updated Dec 15, 2024 • 757k • 913
Qwen/Qwen2.5-72B-Instruct • Text Generation • 73B • Updated Jan 12, 2025 • 837k • 938