mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_1280_rope-scal_yarn Text Generation • 8B • Updated 1 day ago
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_6400_rope-scal_yarn Text Generation • 8B • Updated 1 day ago
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_3200_rope-scal_yarn Text Generation • 8B • Updated 1 day ago
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_8000_rope-scal_yarn Text Generation • 8B • Updated 1 day ago
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_1600_rope-scal_yarn Text Generation • 8B • Updated 1 day ago
mlfoundations-dev/Qwen-7B-Inst_flas-attn_fa2_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_4000_rope-scal_yarn Text Generation • 8B • Updated 1 day ago
mlfoundations-dev/Qwen-7B-Inst_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_6400_rope-scal_yarn Updated 2 days ago
mlfoundations-dev/Qwen-0.5B-Inst_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_6400_rope-scal_yarn Text Generation • 0.6B • Updated 2 days ago
mlfoundations-dev/Qwen-0.5B-Inst_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_1280_rope-scal_yarn Updated 21 days ago
mlfoundations-dev/Qwen-1.5B-Inst_pack_Fals_clau_3_7_2025_tben_trac_shar_cuto-len_1280_rope-scal_yarn Updated 21 days ago
mlfoundations-dev/packing_False_claude_3_7_20250219_tbench_traces_sharegptv1_cutoff-len_64000_rope-scaling_yarn Updated 21 days ago
mlfoundations-dev/claude_3_7_20250219_tbench_traces_sharegptv1_cutoff-len_64000_rope-scaling_yarn Updated 21 days ago
mlfoundations-dev/claude_3_7_20250219_tbench_traces_sharegptv1 Text Generation • 8B • Updated 26 days ago • 21
mlfoundations-dev/claude_3_7_tbench_traces_sharegptv1 Text Generation • 8B • Updated 26 days ago • 10
mlfoundations-dev/Qwen2.5-7B-Instruct_qwq_mix_qwen3_science Text Generation • 8B • Updated Jun 29 • 7
mlfoundations-dev/Qwen2.5-7B-Instruct_qwq_mix_r1_science Text Generation • 8B • Updated Jun 29 • 13 • 1
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz256_lr16e5_epochs5 Text Generation • 2B • Updated Jun 25 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz256_lr2e5_epochs5 Text Generation • 2B • Updated Jun 25 • 5
mlfoundations-dev/openthoughts3_100k_qwen25_1b_bsz256_lr8e5_epochs5 Text Generation • 2B • Updated Jun 25 • 5