Asankhaya Sharma PRO

codelion

http://asankhaya.github.io/

AI & ML interests

Creator of OptiLLM, OpenEvolve, Adaptive Classifier, and PTS. Pioneering a new category in AI infrastructure: inference-time compute for LLMs.



Posts 21


New research: Understanding how different LLMs approach reasoning through "thought anchors"

I just published a comparative study analyzing the reasoning patterns of Qwen3-0.6B vs DeepSeek-R1-Distill-Qwen-1.5B using thought anchors: critical sentences that significantly change the probability of task success.

Key findings:
- DeepSeek-R1: Uses concentrated reasoning with fewer, high-impact steps (0.408 avg impact)
- Qwen3: Employs distributed reasoning spreading impact across multiple steps (0.278 avg impact)
- Different risk-reward profiles: DeepSeek more consistent (82.7% positive steps), Qwen3 more exploratory (71.6% positive)
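
The two summary statistics quoted above (average impact and share of positive steps) can be sketched roughly as follows. This is a minimal illustration, not the PTS implementation: the `Anchor` class, the `prob_delta` field, and the sample values are hypothetical, and it assumes "avg impact" means the mean absolute change in success probability per step.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Anchor:
    """One reasoning step (hypothetical schema, not the dataset's)."""
    sentence: str
    prob_delta: float  # assumed: signed change in task success probability


def summarize(anchors):
    """Return mean absolute impact and the fraction of steps that help."""
    if not anchors:
        raise ValueError("need at least one anchor")
    deltas = [a.prob_delta for a in anchors]
    return {
        # Assumption: impact is measured as the magnitude of the change.
        "avg_impact": mean(abs(d) for d in deltas),
        # A step counts as positive if it raises the success probability.
        "positive_share": sum(d > 0 for d in deltas) / len(deltas),
    }


# Made-up values for illustration only.
demo = [
    Anchor("Set up the equation.", 0.5),
    Anchor("Try a guess.", -0.25),
    Anchor("Check the units.", 0.25),
    Anchor("Solve for x.", 0.5),
]
print(summarize(demo))
```

Under these assumptions, a model with fewer but larger deltas would show a higher `avg_impact`, and a model whose steps rarely hurt would show a higher `positive_share`.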

This points to different reasoning strategies rather than a simple performance gap: DeepSeek-R1 favors consistency, while Qwen3 favors exploration.

Both datasets are now available on HF:
- Qwen3 thought anchors: codelion/Qwen3-0.6B-pts-thought-anchors
- DeepSeek-R1 thought anchors: codelion/DeepSeek-R1-Distill-Qwen-1.5B-pts-thought-anchors

Built using our open-source PTS library for mechanistic interpretability analysis. All methodology is fully reproducible.

Full article: https://huggingface.co/blog/codelion/understanding-model-reasoning-thought-anchors

What reasoning patterns have you noticed in your model experiments? Would love to hear about other architectures showing similar cognitive diversity!
