deepseek-ai/DeepSeek-V3.2-Speciale Text Generation • 685B • Updated 26 days ago • 18.1k downloads • 622 likes
Article: Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What's Really Changing in Transformers? • Apr 4 • 15 likes
Article: Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) • Jan 19 • 38 likes
Space: LLM Hallucination Leaderboard – View and filter LLM hallucination leaderboard • 184 likes
intfloat/multilingual-e5-large-instruct Feature Extraction • 0.6B • Updated Jul 10 • 1.39M downloads • 587 likes