Daniel Fox

FlameF0X

https://flamef0x.github.io

AI & ML interests

Pre-training text generators. (Brother, I'm 18.) Please don't try to contact me.

Recent Activity

reacted to anakin87's post with ❤️ about 1 hour ago

A small model that struggled against a random opponent now beats GPT-5-mini at tic-tac-toe.

I took https://huggingface.co/LiquidAI/LFM2-2.6B and trained it through play. 🧑‍🍳

Here's how:
1️⃣ Build a solid RL env with Verifiers (Prime Intellect)
2️⃣ Generate synthetic data: <200 games sampled from GPT-5-mini playing in the env
3️⃣ SFT warm-up to teach format
4️⃣ Group-based RL (CISPO) against opponents making 20-70% random moves
5️⃣ RL again with stronger opponents (0-25% random moves) + 1.25 temperature to push exploration and shake off suboptimal strategies

Done! Beats GPT-5-mini 🏆

---

🎮 Play against the model: https://huggingface.co/spaces/anakin87/LFM2-2.6B-mr-tictactoe
🤗 Model: https://huggingface.co/anakin87/LFM2-2.6B-mr-tictactoe
📚 Walkthrough/course: https://github.com/anakin87/llm-rl-environments-lil-course
🤗 Dataset and checkpoints: https://huggingface.co/collections/anakin87/lfm2-26b-mr-tic-tac-toe

liked a model about 3 hours ago

deepseek-ai/DeepSeek-V4-Pro

liked a model about 3 hours ago

deepseek-ai/DeepSeek-V4-Flash

Organizations

FlameF0X's buckets (1)

FlameF0X/test