Agentic
GAIA: a benchmark for General AI Assistants
Paper • 2311.12983 • Published • 245
AppAgent: Multimodal Agents as Smartphone Users
Paper • 2312.13771 • Published • 54
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Paper • 2401.01614 • Published • 22
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
Paper • 2401.13919 • Published • 32
LARP: Language-Agent Role Play for Open-World Games
Paper • 2312.17653 • Published • 33
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Paper • 2402.01622 • Published • 38
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
Paper • 2402.00559 • Published • 3
TradingAgents: Multi-Agents LLM Financial Trading Framework
Paper • 2412.20138 • Published • 17
RAG-Anything: All-in-One RAG Framework
Paper • 2510.12323 • Published • 66
PaperBanana: Automating Academic Illustration for AI Scientists
Paper • 2601.23265 • Published • 194