Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published 18 days ago • 50
Article Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models Jul 10 • 53
Article You could have designed state of the art positional encoding Nov 25, 2024 • 399
DolphinLabeled Datasets Collection Eric Hartford has added labels to help you filter datasets, for your pleasure. • 5 items • Updated Jan 6 • 15
Article 🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark Jan 2 • 41
Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey Paper • 2412.18619 • Published Dec 16, 2024 • 58
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 157
No More Adam: Learning Rate Scaling at Initialization is All You Need Paper • 2412.11768 • Published Dec 16, 2024 • 43
Article 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs Dec 4, 2024 • 80
Article Releasing the largest multilingual open pretraining dataset Nov 13, 2024 • 104
Article 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together Apr 29, 2024 • 29
Article RAG Empowerment: Cohere C4AI Command-R and Transformers Unveiled Apr 7, 2024 • 10