TinyV
TinyV: Reducing False Negatives in Verification Improves RL for LLM Reasoning Paper • 2505.14625 • Published May 20, 2025 • 13
Sleeping 1 TinyV 💬 1 Verify model answers against ground truth
zhangchenxu/TinyV-Qwen3-1.7B Text Generation • 2B • Updated Jun 22, 2025 • 4
zhangchenxu/TinyV-Qwen3-1.7B-Think Text Generation • 2B • Updated Jun 22, 2025 • 2 • 3
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step312 4B • Updated Jul 30, 2025
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step288 4B • Updated Jul 30, 2025
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step256 4B • Updated Jul 30, 2025
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step224 4B • Updated Jul 30, 2025
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step192 4B • Updated Jul 30, 2025
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step160 4B • Updated Jul 30, 2025
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step128 4B • Updated Jul 30, 2025 • 1
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step96 4B • Updated Jul 30, 2025
zhangchenxu/RB-Qwen2.5-VL-3B-Instruct-vlr_syn_filtered_10k_exp6_hybrid_nothink-GRPO_step64 4B • Updated Jul 30, 2025