Tianqing Fang

tqfang229

https://tqfang.github.io/

AI & ML interests

LLM, Agent

Organizations

None yet

authored 4 papers about 14 hours ago

WebDevJudge: Evaluating (M)LLMs as Critiques for Web Development Quality

Paper • 2510.18560 • Published Oct 21, 2025 • 1

InComeS: Integrating Compression and Selection Mechanisms into LLMs for Efficient Model Editing

Paper • 2505.22156 • Published May 28, 2025

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 53

Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification

Paper • 2601.15808 • Published 5 days ago • 14

upvoted a paper about 18 hours ago

Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification

Paper • 2601.15808 • Published 5 days ago • 14

submitted a paper to Daily Papers about 19 hours ago

Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification

Paper • 2601.15808 • Published 5 days ago • 14

upvoted 2 papers about 2 months ago

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published Nov 26, 2025 • 120

Guided Self-Evolving LLMs with Minimal Human Supervision

Paper • 2512.02472 • Published Dec 2, 2025 • 53

upvoted 4 papers 3 months ago

The End of Manual Decoding: Towards Truly End-to-End Language Models

Paper • 2510.26697 • Published Oct 30, 2025 • 117

HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application

Paper • 2510.19631 • Published Oct 22, 2025 • 28

DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking

Paper • 2510.20168 • Published Oct 23, 2025 • 28

Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents

Paper • 2510.14438 • Published Oct 16, 2025 • 14

authored a paper 3 months ago

Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents

Paper • 2510.14438 • Published Oct 16, 2025 • 14

authored 6 papers 4 months ago

Benchmarking Commonsense Knowledge Base Population with an Effective Evaluation Dataset

Paper • 2109.07679 • Published Sep 16, 2021

AbsPyramid: Benchmarking the Abstraction Ability of Language Models with a Unified Entailment Graph

Paper • 2311.09174 • Published Nov 15, 2023

AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation

Paper • 2402.10646 • Published Feb 16, 2024

CKBP v2: Better Annotation and Reasoning for Commonsense Knowledge Base Population

Paper • 2304.10392 • Published Apr 20, 2023

UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression

Paper • 2509.15763 • Published Sep 19, 2025

NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

Paper • 2510.07172 • Published Oct 8, 2025 • 28

upvoted a paper 4 months ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9, 2025 • 272
