Sehoon Kim
kssteven
AI & ML interests
Efficient AI, AI Systems, Model Compression
Recent Activity
upvoted a paper 4 days ago
XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization
upvoted an article 6 months ago
Hugging Face and FriendliAI partner to supercharge model deployment on the Hub
authored a paper over 1 year ago
Applications and Techniques for Fast Machine Learning in Science
Organizations
None yet