Open-source weights of the Lorsa modules introduced in "Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition".

OpenMOSS, Fudan NLP, SII
university
AI & ML interests: LLM
Organization Card
Joint OpenMOSS group from Fudan NLP, SII and MoSi Inc.
Collections
2
The MHA2MLA model published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-Based LLMs"
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs • Paper • 2502.14837 • Published • 4
- fnlp/Llama-2-7B-MLA-d_kv_16 • Text Generation • Updated • 31
- fnlp/Llama-2-7B-MLA-d_kv_32 • Text Generation • Updated • 29
- fnlp/Llama-2-7B-MLA-d_kv_64 • Text Generation • Updated • 14
Models
73
fnlp/Lorsa-Llama-3.1-8B • Updated
fnlp/Lorsa-Pythia-160M • Updated • 1
fnlp/Lorsa • Updated • 2
fnlp/Llama-2-7B-MLA-d_kv_16 • Text Generation • Updated • 31
fnlp/Llama-2-7B-MLA-d_kv_32 • Text Generation • Updated • 29
fnlp/Llama-2-7B-MLA-d_kv_64 • Text Generation • Updated • 14
fnlp/Llama-2-7B-MHA-d_kv_256 • Text Generation • Updated • 13
fnlp/SmolLM-1B7-MLA-d_kv_8 • Text Generation • Updated • 18
fnlp/SmolLM-1B7-MLA-d_kv_16 • Text Generation • Updated • 36
fnlp/SmolLM-1B7-MLA-d_kv_32 • Text Generation • Updated • 30
Datasets
17
fnlp/MHA2MLA-corpus-qwen1.5 • Updated • 26
fnlp/MHA2MLA-corpus-smollm • Updated • 213
fnlp/MHA2MLA-corpus-qwen1_5 • Updated • 9
fnlp/MHA2MLA-corpus-qwen2 • Updated • 31
fnlp/MHA2MLA-corpus-mistral-v0_1 • Updated • 20
fnlp/MHA2MLA-corpus-smollm_v1 • Updated • 31
fnlp/MHA2MLA-corpus-llama2 • Updated • 49
fnlp/Ultra-Innerthought • Viewer • Updated • 2.09M • 70 • 2
fnlp/case2code-data • Viewer • Updated • 887k • 74 • 2
fnlp/AnyInstruct-resolution-1024 • Updated • 279