Scito2M: A 2 Million, 30-Year Cross-disciplinary Dataset for Temporal Scientometric Analysis Paper • 2410.09510 • Published Oct 12, 2024 • 1
Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models Paper • 2405.17374 • Published May 27, 2024 • 1
CompCap: Improving Multimodal Large Language Models with Composite Captions Paper • 2412.05243 • Published Dec 6, 2024 • 20
Large Reasoning Models Learn Better Alignment from Flawed Thinking Paper • 2510.00938 • Published Oct 1, 2025 • 58
Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks Paper • 2510.02286 • Published Oct 2, 2025 • 28
Any2AnyTryon: Leveraging Adaptive Position Embeddings for Versatile Virtual Clothing Tasks Paper • 2501.15891 • Published Jan 27, 2025 • 16
AgentReview: Exploring Peer Review Dynamics with LLM Agents Paper • 2406.12708 • Published Jun 18, 2024 • 8
Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis Paper • 2410.07155 • Published Oct 9, 2024 • 11
MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms Paper • 2402.14154 • Published Feb 21, 2024 • 2
MMSoc Benchmark Collection Benchmark datasets for the paper "MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms" • 7 items • Updated Aug 22, 2024 • 1
Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries Paper • 2310.13132 • Published Oct 19, 2023 • 8
HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models Paper • 2310.14566 • Published Oct 23, 2023 • 27