Symbol-LLM

https://xufangzhi.github.io/symbol-llm-page/

https://github.com/xufangzhi/Symbol-LLM

AI & ML interests

Natural Language Processing, Large Language Models, Neuro-Symbolic

Recent Activity

upvoted 2 papers 2 months ago

OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions

Paper • 2602.05843 • Published Feb 5 • 61

TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents

Paper • 2602.02196 • Published Feb 2 • 35

upvoted 2 papers 5 months ago

ARISE: An Adaptive Resolution-Aware Metric for Test-Time Scaling Evaluation in Large Reasoning Models

Paper • 2510.06014 • Published Oct 7, 2025 • 10

OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows

Paper • 2510.24411 • Published Oct 28, 2025 • 72

upvoted a paper 6 months ago

JanusCoder: Towards a Foundational Visual-Programmatic Interface for Code Intelligence

Paper • 2510.23538 • Published Oct 27, 2025 • 98

upvoted a paper 7 months ago

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Paper • 2509.15221 • Published Sep 18, 2025 • 111

liked a dataset 8 months ago

Qika/xraybench

Viewer • Updated Jul 14, 2025 • 14.2k • 15 • 3

upvoted a collection 8 months ago

DeepMedix-R1

Collection

Chest X-ray foundation model with step reasoning. • 2 items • Updated Jul 14, 2025 • 4

liked a model 8 months ago

Qika/DeepMedix-R1

8B • Updated Jul 14, 2025 • 878 • 32

upvoted a paper 8 months ago

CodeEvo: Interaction-Driven Synthesis of Code-centric Data through Hybrid and Iterative Feedback

Paper • 2507.22080 • Published Jul 25, 2025 • 9

upvoted a paper 9 months ago

MUR: Momentum Uncertainty guided Reasoning for Large Language Models

Paper • 2507.14958 • Published Jul 20, 2025 • 47

upvoted a paper 10 months ago

SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning

Paper • 2506.01713 • Published Jun 2, 2025 • 48

upvoted 3 papers 11 months ago

A Controllable Examination for Long-Context Language Models

Paper • 2506.02921 • Published Jun 3, 2025 • 34

GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

Paper • 2506.03143 • Published Jun 3, 2025 • 54

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26, 2025 • 104

upvoted 5 papers about 1 year ago

Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Paper • 2504.08672 • Published Apr 11, 2025 • 55

Breaking the Data Barrier -- Building GUI Agents Through Task Generalization

Paper • 2504.10127 • Published Apr 14, 2025 • 17

FortisAVQA and MAVEN: a Benchmark Dataset and Debiasing Framework for Robust Multimodal Reasoning

Paper • 2504.00487 • Published Apr 1, 2025 • 18

UI-R1: Enhancing Action Prediction of GUI Agents by Reinforcement Learning

Paper • 2503.21620 • Published Mar 27, 2025 • 62

MAPS: A Multi-Agent Framework Based on Big Seven Personality and Socratic Guidance for Multimodal Scientific Problem Solving

Paper • 2503.16905 • Published Mar 21, 2025 • 54
