HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-3epoch-bias-update 14B • Updated about 10 hours ago
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-part2 Text Generation • 14B • Updated about 12 hours ago
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-part2-run1 Text Generation • 14B • Updated about 13 hours ago
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma-1epoch Text Generation • 14B • Updated about 14 hours ago
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-4-gamma-3epoch Text Generation • 14B • Updated 2 days ago • 9
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-remov-aux-only Text Generation • 14B • Updated 2 days ago • 12 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-6-gamma Text Generation • 14B • Updated 2 days ago • 9 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-2-gamma Text Generation • 14B • Updated 2 days ago • 28 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-4-gamma Text Generation • 14B • Updated 2 days ago • 24 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-1e-3-gamma Text Generation • 14B • Updated 2 days ago • 5 • 1
HectorHe/Qwen1.5-MOE-aux-free-sft-math7k-5e-5-gamma Text Generation • 14B • Updated 2 days ago • 23 • 1
HectorHe/Deepseek-V2-13B-Math7K-Expert-Enhance-Subset-Expert-MoE-32-experts Text Generation • 16B • Updated 30 days ago • 18
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-math7k Text Generation • 16B • Updated about 1 month ago • 1.64k • 2
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-math14k Text Generation • 16B • Updated about 1 month ago • 840 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-s1K Text Generation • 16B • Updated about 1 month ago • 8 • 1
HectorHe/Deepseek-Coder-V2-Lite-13B-Instruct-sft-nemotron-code Text Generation • 0.0B • Updated about 1 month ago • 30 • 1