Hugging Face
zhanglu
0 followers · 6 following
AI & ML interests
None yet
Recent Activity
liked a dataset 14 days ago: OpenStellarTeam/Chinese-EcomQA
published a model 28 days ago: zhanglu/Simple-VL-8B
reacted to merterbak's post with 🔥 3 months ago:
Qwen 3 models released 🔥 It offers 2 MoE and 6 dense models with the following parameter sizes: 0.6B, 1.7B, 4B, 8B, 14B, 30B (MoE), 32B, and 235B (MoE).
Models: https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2e4f653967f
Blog: https://qwenlm.github.io/blog/qwen3/
Demo: https://huggingface.co/spaces/Qwen/Qwen3-Demo
GitHub: https://github.com/QwenLM/Qwen3
✅ Pre-trained on 36 trillion tokens covering 119 languages and dialects, with strong translation and instruction-following abilities. (Qwen2.5 was pre-trained on 18 trillion tokens.)
✅ Qwen3 dense models match the performance of larger Qwen2.5 models. For example, Qwen3-1.7B/4B/8B/14B/32B perform like Qwen2.5-3B/7B/14B/32B/72B.
✅ Three pretraining stages:
• Stage 1: General language learning and knowledge building.
• Stage 2: Reasoning boost with STEM, coding, and logic skills.
• Stage 3: Long-context training.
✅ Supports MCP in the model.
✅ Strong agent skills.
✅ Supports seamless switching between thinking mode (for hard tasks like math and coding) and non-thinking mode (for fast chatting) inside the chat template.
✅ Better human alignment for creative writing, roleplay, multi-turn conversations, and following detailed instructions.
Organizations
None yet
zhanglu's models (3)
zhanglu/Simple-VL-8B • Updated 28 days ago
zhanglu/bert-base-chinese-finetuned-tnews • Updated Sep 21, 2022
zhanglu/distilbert-base-uncased-finetuned-cola • Text Classification • Updated Sep 21, 2022 • 2