Bin Wang

wanderkid

https://wangbindl.github.io/

wangbinDL

AI & ML interests

Computer Vision, Multimodal Large Language Models

Recent Activity

liked a model 5 days ago

opendatalab/MinerU2.5-Pro-2604-1.2B

Image-Text-to-Text • 1B • Updated 4 days ago • 1.34k • 40

authored 4 papers 7 days ago

TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition

Paper • 2512.01248 • Published Dec 1, 2025 • 12

InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

Paper • 2602.08990 • Published Feb 9 • 77

MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published 22 days ago • 135

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published 19 days ago • 130

upvoted a paper 7 days ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published 8 days ago • 115

submitted a paper to Daily Papers 7 days ago

MinerU2.5-Pro: Pushing the Limits of Data-Centric Document Parsing at Scale

Paper • 2604.04771 • Published 8 days ago • 115

liked a Space 19 days ago

MinerU Diffusion V1 0320 2.5B

🦀

demo of MinerU-Diffusion

liked a model 19 days ago

opendatalab/MinerU-Diffusion-V1-0320-2.5B

Image-to-Text • 3B • Updated 20 days ago • 2.42k • 22

upvoted a paper 20 days ago

MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

Paper • 2603.22458 • Published 22 days ago • 135

upvoted a paper about 2 months ago

InternAgent-1.5: A Unified Agentic Framework for Long-Horizon Autonomous Scientific Discovery

Paper • 2602.08990 • Published Feb 9 • 77

upvoted a paper 2 months ago

Can LLMs Clean Up Your Mess? A Survey of Application-Ready Data Preparation with LLMs

Paper • 2601.17058 • Published Jan 22 • 190

upvoted 2 papers 3 months ago

ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning

Paper • 2512.05111 • Published Dec 4, 2025 • 50

DocDancer: Towards Agentic Document-Grounded Information Seeking

Paper • 2601.05163 • Published Jan 8 • 7

liked a model 4 months ago

opendatalab/TRivia-3B

Image-Text-to-Text • 4B • Updated Dec 2, 2025 • 495 • 8

upvoted a paper 4 months ago

TRivia: Self-supervised Fine-tuning of Vision-Language Models for Table Recognition

Paper • 2512.01248 • Published Dec 1, 2025 • 12

liked a Space 4 months ago

TRivia-3B

⭐

Convert table images into HTML tags with TRivia-3B

liked a dataset 4 months ago

opendatalab/AICC

Viewer • Updated Dec 25, 2025 • 4.82B • 9.44k • 106

authored a paper 7 months ago

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26, 2025 • 156

upvoted a paper 7 months ago

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26, 2025 • 156
