RLHN

Welcome to RLHN

RLHN (ReLabeling Hard Negatives) uses a cascading LLM framework to identify and relabel false negatives in information retrieval (IR) training datasets.
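
As a rough illustration of the cascading idea, the sketch below screens every hard negative with a cheap judge and passes only the flagged passages to a stronger judge for confirmation. It is a minimal sketch under stated assumptions, not the RLHN implementation: the two-stage layout, the function `relabel_hard_negatives`, the `Judge` signature, and the choice to move confirmed false negatives into the positives are all hypothetical. In practice each judge would wrap an LLM call.

```python
from typing import Callable, Dict, List

# Hypothetical judge signature (an assumption, not from the RLHN code):
# (query, passage) -> True if the passage is judged relevant to the query.
Judge = Callable[[str, str], bool]


def relabel_hard_negatives(
    query: str,
    positives: List[str],
    hard_negatives: List[str],
    cheap_judge: Judge,
    strong_judge: Judge,
) -> Dict[str, object]:
    """Two-stage cascade over the hard negatives of one training example."""
    kept_negatives: List[str] = []
    relabeled: List[str] = []

    for passage in hard_negatives:
        # Stage 1: the cheap judge screens every hard negative.
        if not cheap_judge(query, passage):
            kept_negatives.append(passage)
            continue
        # Stage 2: only flagged passages reach the stronger, costlier judge.
        if strong_judge(query, passage):
            relabeled.append(passage)  # confirmed false negative
        else:
            kept_negatives.append(passage)

    # Assumption: confirmed false negatives are relabeled as positives.
    return {
        "query": query,
        "positives": positives + relabeled,
        "negatives": kept_negatives,
    }
```

The cascade keeps cost down because the stronger judge only runs on the passages the cheap judge flags. Dropping a confirmed false negative, or dropping the whole training example, are alternative policies that would change only the final return step.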

This repository contains training datasets curated by RLHN (for example, rlhn/remove-100K, rlhn/remove-250K, and rlhn/remove-400K) and models fine-tuned on these curated datasets.

List of Contributors:

  • Nandan Thakur*
  • Crystina Zhang*
  • Xueguang Ma
  • Jimmy Lin

Preprint URL: https://huggingface.co/papers/2505.16967

Citation

@misc{thakur2025rlhn,
      title={Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval}, 
      author={Nandan Thakur and Crystina Zhang and Xueguang Ma and Jimmy Lin},
      year={2025},
      eprint={2505.16967},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2505.16967}, 
}