Xueguang Ma PRO

MrLight

MXueguang

Recent Activity

New activity in Tevatron/OmniEmbed-v0.1 about 16 hours ago

Integrate with Sentence Transformers v5.4

#3 opened 1 day ago by tomaarsen

upvoted a paper 12 days ago

FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

Paper • 2604.07413 • Published 17 days ago • 94

upvoted 3 papers 16 days ago

ORBIT: Scalable and Verifiable Data Generation for Search Agents on a Tight Budget

Paper • 2604.01195 • Published 24 days ago • 4

Watch Before You Answer: Learning from Visually Grounded Post-Training

Paper • 2604.05117 • Published 19 days ago • 35

SWE-Next: Scalable Real-World Software Engineering Tasks for Agents

Paper • 2603.20691 • Published Mar 21 • 10

upvoted a paper 17 days ago

Learning to Retrieve from Agent Trajectories

Paper • 2604.04949 • Published 26 days ago • 70

updated a dataset 2 months ago

Tevatron/msmarco-passage

Viewer • Updated Feb 16 • 408k • 448 • 10

upvoted a paper 3 months ago

Context Forcing: Consistent Autoregressive Video Generation with Long Context

Paper • 2602.06028 • Published Feb 5 • 36

updated 2 datasets 3 months ago

Tevatron/scifact

Updated Jan 25 • 154 • 2

Tevatron/beir-corpus

Updated Jan 25 • 129

upvoted a collection 3 months ago

FinMMEval Lab @CLEF'2026

Collection

Training datasets for FinMMEval Lab @CLEF'2026 • 12 items • Updated Mar 22 • 9

upvoted 2 papers 5 months ago

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

Paper • 2512.02014 • Published Dec 1, 2025 • 74

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 55

New activity in TIGER-Lab/WebInstruct-verified 5 months ago

For data in the multi-choice category, a lot of it only gives questions without options.

#3 opened 5 months ago by zlk

updated a dataset 5 months ago

TIGER-Lab/WebInstruct-verified

Viewer • Updated Nov 27, 2025 • 462k • 325 • 67

upvoted a paper 5 months ago

General Agentic Memory Via Deep Research

Paper • 2511.18423 • Published Nov 23, 2025 • 170

published a dataset 5 months ago

MrLight/webinstruct-verified-fixmc

Viewer • Updated Nov 11, 2025 • 27.9k • 28

updated a dataset 5 months ago

MrLight/webinstruct-verified-fixmc

Viewer • Updated Nov 11, 2025 • 27.9k • 28

updated a dataset 6 months ago

SVRL2/general-reasoner-v2-data-fineweb-megamath-1014-top40

Viewer • Updated Nov 4, 2025 • 1.1M • 3

published a dataset 6 months ago