Jonna Matthiesen

JonnaMat

embedl

AI & ML interests

None yet

Recent Activity

updated a model 12 minutes ago

embedl/Qwen3.5-9B-FlashHead

updated a model 13 minutes ago

embedl/Qwen3.5-4B-FlashHead

updated a model 13 minutes ago

embedl/Qwen3.5-0.8B-FlashHead

Posts 10

Post

⚡ Qwen3.5, up to 1.4× faster. Same quality. Less latency.

We applied FlashHead, a novel drop-in replacement for the LM head, to the Qwen3.5 family, with measurably lower latency on edge hardware. Benchmarks and models are linked below.

📊 embedl/Edge-Inference-Benchmarks

🤗 https://huggingface.co/collections/embedl/qwen35

Articles 3

Article

How to Build a vLLM Plugin: A Guide to the general_plugins Entry Point

Models 0

None public yet

Datasets 0

None public yet