Zixi "Oz" Li PRO

OzTianlu

https://github.com/lizixi-0x2F

lizixi-0x2F

AI & ML interests

My research focuses on deep reasoning with small language models, Transformer architecture innovation, and knowledge distillation for efficient alignment and transfer.

Recent Activity

upvoted a collection 7 days ago

DeepSeek-V4

liked a dataset 21 days ago

TAAC2026/data_sample_1000

liked a model 25 days ago

google/gemma-4-26B-A4B-it

Organizations

Posts 11

Post

1404

https://github.com/lizixi-0x2F/March
I just released March, an open-source high-performance KV cache sharing library for LLM inference that uses Trie-based prefix deduplication.
When you run LLM services, you often see thousands of requests sharing the same system prompt and conversation history. But traditional KV cache systems store each sequence separately, duplicating the exact same data over and over again. Pure waste.
March uses a Trie structure to automatically detect and reuse identical token prefixes. Instead of storing [system_prompt + history] 1000 times, March stores it once and every request shares it.
- 80-97% memory reduction in prefix-heavy workloads (tested on SmolLM2-135M with 500 multi-turn conversations)
- Zero-copy queries: lookups return direct pointers into the memory pool, so there is no expensive memcpy on the hot path
- Predictable memory usage: a fixed-size page pool, with O(L) lookups in the prefix length L
- Trade-off: lookups are slightly slower than an O(1) dict lookup, but the memory savings are worth it in production
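
To make the prefix-sharing idea concrete, here is a minimal Python sketch of a token trie backed by a shared pool. It is an illustration only and not March's actual API: the names `TrieNode`, `PrefixKVCache`, `insert`, and `match_prefix` are hypothetical, the pool holds placeholder strings instead of real KV tensors, and the pool grows on demand rather than being a fixed-size page pool. Queries return slot indices into the pool rather than copies, which is the toy analogue of the zero-copy lookup described above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TrieNode:
    """One token position; `slot` indexes the shared pool (hypothetical layout)."""
    slot: int
    children: Dict[int, "TrieNode"] = field(default_factory=dict)


class PrefixKVCache:
    """Toy prefix-sharing cache: one pool slot per unique (prefix, token) pair."""

    def __init__(self) -> None:
        self.root = TrieNode(slot=-1)  # sentinel node, holds no KV data
        self.pool: List[str] = []      # stand-in for real KV tensors

    def insert(self, tokens: List[int]) -> List[int]:
        """Insert a sequence and return one pool slot per token, in O(L)."""
        node, slots = self.root, []
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                # First time this prefix is seen: allocate a new slot.
                child = TrieNode(slot=len(self.pool))
                self.pool.append(f"kv({tok})")
                node.children[tok] = child
            slots.append(child.slot)
            node = child
        return slots

    def match_prefix(self, tokens: List[int]) -> List[int]:
        """Return slots for the longest cached prefix of `tokens`, in O(L)."""
        node, slots = self.root, []
        for tok in tokens:
            nxt = node.children.get(tok)
            if nxt is None:
                break
            slots.append(nxt.slot)
            node = nxt
        return slots


# 100 requests share a 50-token system prompt and add 10 unique tokens each.
system_prompt = list(range(50))
requests = [
    system_prompt + [10_000 + i * 10 + j for j in range(10)]
    for i in range(100)
]

cache = PrefixKVCache()
for req in requests:
    cache.insert(req)

naive_slots = sum(len(r) for r in requests)  # every sequence stored separately
shared_slots = len(cache.pool)               # unique trie nodes only
saving = 1 - shared_slots / naive_slots
print(f"naive={naive_slots} shared={shared_slots} saving={saving:.1%}")
```

The saving in this toy depends entirely on the ratio of shared to unique tokens chosen above, so it says nothing about March's measured results. Each lookup walks one trie node per token, which is where the O(L) cost comes from.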

Articles 5

Article

Arcade-3B: SLM Optimization via Orthogonal Decoupling of Latent State Spaces

Collections 1

OzTianlu/Semigroup_Reasoning_Model_A_Scalpel

OzTianlu/A_Reasoning_Critique_of_Diffusion_Models

OzTianlu/Abstract_of_Structural_Critique_of_Reasoning

OzTianlu/Reasoning_and_Jacobian_Collapse

Papers 1

Models 0

Datasets 12

OzTianlu/A_Reasoning_Critique_of_Diffusion_Models

OzTianlu/Semigroup_Reasoning_Model_A_Scalpel

OzTianlu/Abstract_of_Structural_Critique_of_Reasoning

OzTianlu/Reasoning_and_Jacobian_Collapse

OzTianlu/From_Reasoning_Structure_to_the_Ancient_Problem_of_Primes

OzTianlu/When_Euler_Meets_Stack

OzTianlu/Reasoning_as_Fluid

OzTianlu/The_Geometric_Incompleteness_of_Reasoning

OzTianlu/The_Incompleteness_of_Reasoning

OzTianlu/Why_Reasoning_Models_Collapse_Themselves_in_Reasoning
