Evaluating Reasoning LLMs for Suicide Screening with the Columbia-Suicide Severity Rating Scale Paper • 2505.13480 • Published May 11, 2025
English Please: Evaluating Machine Translation with Large Language Models for Multilingual Bug Reports Paper • 2502.14338 • Published Feb 20, 2025
Cognitive-Mental-LLM: Evaluating Reasoning in Large Language Models for Mental Health Prediction via Online Text Paper • 2503.10095 • Published Mar 13, 2025
GitBugs: Bug Reports for Duplicate Detection, Retrieval Augmented Generation, and More Paper • 2504.09651 • Published Apr 13, 2025
Advancing Software Quality: A Standards-Focused Review of LLM-Based Assurance Techniques Paper • 2505.13766 • Published May 19, 2025
When Bugs Linger: A Study of Anomalous Resolution Time Outliers and Their Themes Paper • 2509.16140 • Published Sep 19, 2025
Advancing Reasoning in Large Language Models: Promising Methods and Approaches Paper • 2502.03671 • Published Feb 5, 2025
Enhancing Domain-Specific Retrieval-Augmented Generation: Synthetic Data Generation and Evaluation using Reasoning Models Paper • 2502.15854 • Published Feb 21, 2025
A Comprehensive Survey of Evaluation Techniques for Recommendation Systems Paper • 2312.16015 • Published Dec 26, 2023
A Comparative Study of Text Embedding Models for Semantic Text Similarity in Bug Reports Paper • 2308.09193 • Published Aug 17, 2023
Auto-labelling of Bug Report using Natural Language Processing Paper • 2212.06334 • Published Dec 13, 2022
A Comprehensive Survey of Regression Based Loss Functions for Time Series Forecasting Paper • 2211.02989 • Published Nov 5, 2022