deepseek-ai/DeepSeek-V3.2-Speciale Text Generation • 685B • Updated 26 days ago • 18.1k downloads • 622 likes
Article: Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What's Really Changing in Transformers? • Apr 4 • 15 likes
Article: Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) • Jan 19 • 38 likes
Space: LLM Hallucination Leaderboard – View and filter LLM hallucination leaderboard • 184 likes
intfloat/multilingual-e5-large-instruct Feature Extraction • 0.6B • Updated Jul 10 • 1.39M downloads • 587 likes