XiaomiMiMo/MiMo-7B-Base
Tags: Text Generation, Transformers, Safetensors, mimo, conversational, custom_code
arxiv: 2505.07608
License: mit
Update modeling_mimo.py #7
by chengfeng17, opened May 8
base: refs/heads/main ← from: refs/pr/7
Files changed: +6 -6
chengfeng17, May 8
Fix:
Some parameter names are not aligned with those used in Qwen2.
In some versions of transformers, Qwen2Attention returns three values.
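The second point can be illustrated with a small sketch. It is not the change made in this PR (the diff is not reproduced here); `unpack_attention_outputs` is a hypothetical helper, and it assumes the attention output comes first in the returned tuple, followed by the attention weights and, in some versions, one more value.

```python
def unpack_attention_outputs(outputs):
    """Return (hidden_states, attn_weights) from an attention call's result.

    Hypothetical helper: tolerates a bare value, a 2-tuple, or a 3-tuple,
    assuming the attention output is always the first element.
    """
    if not isinstance(outputs, tuple):
        # A bare value: no attention weights were returned.
        return outputs, None
    hidden_states = outputs[0]
    # The second element, when present, is taken to be the attention weights.
    attn_weights = outputs[1] if len(outputs) > 1 else None
    # Any further elements (for example a cache object) are ignored here.
    return hidden_states, attn_weights
```

Indexing into the result instead of writing `a, b = attn(...)` avoids a `ValueError` when the number of returned values differs between library versions.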
Commit: Update modeling_mimo.py (5eb702f5)
Ready to merge
This branch is ready to get merged automatically.