KV Caching Explained: Optimizing Transformer Inference Efficiency
By not-lain • Jan 30