Post
540
New research: Understanding how different LLMs approach reasoning through "thought anchors"
I just published a comparative study analyzing the reasoning patterns of Qwen3-0.6B vs DeepSeek-R1-Distill-1.5B using thought anchors - critical sentences that significantly impact task success probability.
Key findings:
- DeepSeek-R1: Uses concentrated reasoning with fewer, high-impact steps (0.408 avg impact)
- Qwen3: Employs distributed reasoning spreading impact across multiple steps (0.278 avg impact)
- Different risk-reward profiles: DeepSeek more consistent (82.7% positive steps), Qwen3 more exploratory (71.6% positive)
This reveals different cognitive architectures rather than simple performance differences. The models optimize for different reasoning strategies - consistency vs exploration.
Both datasets are now available on HF:
- Qwen3 thought anchors: codelion/Qwen3-0.6B-pts-thought-anchors
- DeepSeek-R1 thought anchors: codelion/DeepSeek-R1-Distill-Qwen-1.5B-pts-thought-anchors
Built using our open-source PTS library for mechanistic interpretability analysis. All methodology is fully reproducible.
Full article: https://huggingface.co/blog/codelion/understanding-model-reasoning-thought-anchors
What reasoning patterns have you noticed in your model experiments? Would love to hear about other architectures showing similar cognitive diversity!
I just published a comparative study analyzing the reasoning patterns of Qwen3-0.6B vs DeepSeek-R1-Distill-1.5B using thought anchors - critical sentences that significantly impact task success probability.
Key findings:
- DeepSeek-R1: Uses concentrated reasoning with fewer, high-impact steps (0.408 avg impact)
- Qwen3: Employs distributed reasoning spreading impact across multiple steps (0.278 avg impact)
- Different risk-reward profiles: DeepSeek more consistent (82.7% positive steps), Qwen3 more exploratory (71.6% positive)
This reveals different cognitive architectures rather than simple performance differences. The models optimize for different reasoning strategies - consistency vs exploration.
Both datasets are now available on HF:
- Qwen3 thought anchors: codelion/Qwen3-0.6B-pts-thought-anchors
- DeepSeek-R1 thought anchors: codelion/DeepSeek-R1-Distill-Qwen-1.5B-pts-thought-anchors
Built using our open-source PTS library for mechanistic interpretability analysis. All methodology is fully reproducible.
Full article: https://huggingface.co/blog/codelion/understanding-model-reasoning-thought-anchors
What reasoning patterns have you noticed in your model experiments? Would love to hear about other architectures showing similar cognitive diversity!