hallucinations-leaderboard

community

https://www.neuralnoise.com

pminervini

AI & ML interests

None defined yet.

Recent Activity

pminervini authored a paper about 1 month ago

OpenSIR: Open-Ended Self-Improving Reasoner

pingnieuk authored a paper about 1 month ago

VisCoder2: Building Multi-Language Visualization Coding Agents

pminervini authored a paper about 2 months ago

Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs

pminervini

authored a paper about 1 month ago

OpenSIR: Open-Ended Self-Improving Reasoner

Paper • 2511.00602 • Published Nov 1 • 20

pingnieuk

authored a paper about 1 month ago

VisCoder2: Building Multi-Language Visualization Coding Agents

Paper • 2510.23642 • Published Oct 24 • 21

pminervini

authored 9 papers about 2 months ago

An Analysis of Decoding Methods for LLM-based Agents for Faithful Multi-Hop Question Answering

Paper • 2503.23415 • Published Mar 30 • 1

MedDistant19: Towards an Accurate Benchmark for Broad-Coverage Biomedical Relation Extraction

Paper • 2204.04779 • Published Apr 10, 2022

PiCSAR: Probabilistic Confidence Selection And Ranking

Paper • 2508.21787 • Published Aug 29 • 4

Learning GUI Grounding with Spatial Reasoning from Visual Feedback

Paper • 2509.21552 • Published Sep 25 • 11

pingnieuk

authored a paper about 2 months ago

BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

Paper • 2510.10666 • Published Oct 12 • 27

pingnieuk

authored 3 papers 2 months ago

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Paper • 2510.02190 • Published Oct 2 • 18

EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing

Paper • 2509.26346 • Published Sep 30 • 18

VideoScore2: Think before You Score in Generative Video Evaluation

Paper • 2509.22799 • Published Sep 26 • 25

yuzhaouoe

authored a paper 2 months ago

Learning GUI Grounding with Spatial Reasoning from Visual Feedback

Paper • 2509.21552 • Published Sep 25 • 11

pingnieuk

authored a paper 3 months ago

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1 • 75

aryopg

authored a paper 3 months ago

PiCSAR: Probabilistic Confidence Selection And Ranking

Paper • 2508.21787 • Published Aug 29 • 4

pingnieuk

authored 2 papers 4 months ago

Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs

Paper • 2410.15438 • Published Oct 20, 2024

VisualWebInstruct: Scaling up Multimodal Instruction Data through Web Search

Paper • 2503.10582 • Published Mar 13 • 24

Team members 10
