I fine-tuned Qwen2.5 with GRPO to actually think before it answers, not just pattern-match.
Most LLMs mimic reasoning. This one builds a real cognitive path:
Plan → understand the task
Monitor → reason step by step
Evaluate → verify before answering
Every response follows a strict structured protocol:
<think> <planning> ... <monitoring> ... <evaluation> ... </think>
Then a clean, reasoning-free <output>.
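A hypothetical response in this format might look like the following (the exact tag layout, including closing tags for each section, is my assumption based on the description above):

```xml
<think>
  <planning>Restate the task and decide on an approach.</planning>
  <monitoring>Work through the steps one at a time.</monitoring>
  <evaluation>Check the intermediate result against the question.</evaluation>
</think>
<output>Final answer, with no reasoning traces.</output>
```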
The model self-checks its own structure. If a section is missing or malformed, the response is invalid.
This isn't chain-of-thought slapped on top. The reasoning protocol is baked in via RL.
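One common way to bake a format in via RL is a structural reward that GRPO maximizes alongside task reward. This is a minimal sketch, not the actual training code: the function name, reward values, and the assumption that every section has a matching closing tag are all illustrative.

```python
import re

# Hypothetical structural reward for GRPO-style training.
# A completion earns reward only if it follows the full protocol:
# <think> wrapping planning/monitoring/evaluation, then <output>.
PROTOCOL = re.compile(
    r"^\s*<think>\s*"
    r"<planning>.+?</planning>\s*"
    r"<monitoring>.+?</monitoring>\s*"
    r"<evaluation>.+?</evaluation>\s*"
    r"</think>\s*"
    r"<output>.+?</output>\s*$",
    re.DOTALL,
)

def format_reward(completion: str) -> float:
    """Return 1.0 if the completion matches the protocol, else 0.0."""
    return 1.0 if PROTOCOL.match(completion) else 0.0

good = (
    "<think><planning>Parse the task.</planning>"
    "<monitoring>Step through it.</monitoring>"
    "<evaluation>Verify the result.</evaluation></think>"
    "<output>42</output>"
)
bad = "<think>no sections here</think><output>42</output>"

print(format_reward(good))  # 1.0
print(format_reward(bad))   # 0.0
```

A reward like this is what makes a missing or malformed section "invalid": the policy gets zero structural credit for it, so the protocol is reinforced rather than prompted.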