Kangjie Zhang

Quagmire1

AI & ML interests

None yet

Recent Activity

upvoted an article 27 days ago

Efficient LLM Pretraining: Packed Sequences and Masked Attention

liked a dataset about 1 month ago

General-Medical-AI/Project-Imaging-X

upvoted a paper 7 months ago

LIMI: Less is More for Agency

Organizations

None yet

upvoted an article 27 days ago

Article

Efficient LLM Pretraining: Packed Sequences and Masked Attention

Oct 7, 2024

liked a dataset about 1 month ago

General-Medical-AI/Project-Imaging-X

Viewer • Updated Apr 1 • 8 • 2.44k • 22

upvoted a paper 7 months ago

LIMI: Less is More for Agency

Paper • 2509.17567 • Published Sep 22, 2025 • 104

liked a model 10 months ago

ByteDance-Seed/BAGEL-7B-MoT

Any-to-Any • 15B • Updated Jan 9 • 6.91k • 1.2k

liked a Space 12 months ago

The Ultra-Scale Playbook

🌌

3.82k

The ultimate guide to training LLM on large GPU Clusters

liked 2 models about 1 year ago

deepseek-ai/DeepSeek-V3

Text Generation • 685B • Updated Mar 27, 2025 • 1.21M • 4.06k

mistralai/Mixtral-8x7B-v0.1

47B • Updated Jul 24, 2025 • 167k • 1.81k

upvoted an article about 1 year ago

Article

Visualize and understand GPU memory in PyTorch

Dec 24, 2024 • 269

liked a model over 1 year ago

deepseek-ai/DeepSeek-R1

Text Generation • 685B • Updated Mar 27, 2025 • 4.02M • 13.3k

updated a model over 1 year ago

Quagmire1/wiki-cased

Updated Dec 24, 2024
