Ben Shi

benshi34

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 6 days ago

τ-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge

Paper • 2603.04370 • Published 11 days ago • 2

submitted a paper to Daily Papers 6 days ago

τ-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge

Paper • 2603.04370 • Published 11 days ago • 2

upvoted a paper 9 months ago

When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

Paper • 2506.05579 • Published Jun 5, 2025 • 4

commented on a paper 9 months ago

When Models Know More Than They Can Explain: Quantifying Knowledge Transfer in Human-AI Collaboration

Paper • 2506.05579 • Published Jun 5, 2025 • 4

authored 3 papers 11 months ago

Can Language Models Solve Olympiad Programming?

Paper • 2404.10952 • Published Apr 16, 2024 • 1

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval

Paper • 2407.12883 • Published Jul 16, 2024 • 13

IMPersona: Evaluating Individual Level LM Impersonation

Paper • 2504.04332 • Published Apr 6, 2025 • 2

upvoted a paper 11 months ago

IMPersona: Evaluating Individual Level LM Impersonation

Paper • 2504.04332 • Published Apr 6, 2025 • 2

upvoted an article about 1 year ago

Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs

Article • Apr 16, 2024 • 16

updated a dataset about 1 year ago

benshi34/qual-analysis-reasoning-retrieval

Viewer • Updated Jan 7, 2025 • 80 • 18