Hugging Face
Dominick Wirzba (Chronuid)
0 followers · 20 following
dominick-wirzba-a46898115

AI & ML interests: None yet
Recent Activity
reacted to qgallouedec's post with 🔥 · 14 days ago
TRL v1.2 introduces the SSDTrainer 🚀

Simple Self-Distillation (SSD) from Apple's paper "Embarrassingly Simple Self-Distillation Improves Code Generation" is now available as an experimental trainer in TRL. The recipe is as minimal as the name suggests: sample completions from the model itself at a training-time temperature, then fine-tune on those raw, unverified samples with plain cross-entropy. No reward model. No verifier. No teacher model. No reinforcement learning. Just prompts and the model.

```python
from trl.experimental.ssd import SSDConfig, SSDTrainer

trainer = SSDTrainer(
    model="Qwen/Qwen3-4B-Instruct",
    args=SSDConfig(temperature=0.6, top_k=20, top_p=0.95),
    train_dataset=dataset,
)
trainer.train()
```

v1.2 also ships expanded tool-calling support (LLaMA 3.1 / 3.2, DeepSeek-V3), another round of KTO ↔ DPO alignment getting us closer to promoting KTO to stable, a big GRPO simplification for overlong tool results, deprecation of `use_transformers_paged`, and key fixes for VLM response parsing.

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.2.0
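The SSDTrainer handles all of this internally; purely as a toy illustration (not the TRL implementation), here is a numpy sketch of the two ingredients the recipe combines: temperature/top-k/top-p sampling from the model's own logits, followed by plain cross-entropy on the sampled token. All function names below are hypothetical.

```python
import numpy as np

def sample_top_k_top_p(logits, temperature=0.6, top_k=20, top_p=0.95, rng=None):
    """Sample one token id with temperature scaling, then top-k and top-p truncation."""
    rng = rng or np.random.default_rng(0)
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    # top-k: keep only the k most likely tokens
    order = np.argsort(probs)[::-1]
    keep = order[:top_k]
    # top-p: within that set, keep the smallest prefix whose cumulative mass >= top_p
    cum = np.cumsum(probs[keep])
    keep = keep[: np.searchsorted(cum, top_p) + 1]
    p = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=p))

def cross_entropy(logits, target):
    """Plain cross-entropy of one target token under the model's logits."""
    scaled = logits - logits.max()
    log_probs = scaled - np.log(np.exp(scaled).sum())
    return -log_probs[target]

# One SSD "step": sample a token from the model's own distribution, then the
# training loss is simply cross-entropy on that self-generated sample.
logits = np.array([2.0, 1.0, 0.5, -1.0, 0.0])
token = sample_top_k_top_p(logits, temperature=0.6, top_k=20, top_p=0.95)
loss = cross_entropy(logits, token)
```

In the real trainer the loss is backpropagated through the model; the point of the sketch is only that no reward, verifier, or teacher enters the loop.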
reacted to kelsend's post with 👍 · 14 days ago
The rebuilt Hunyuan HY3 Preview is here! I tested it on all the tricky scenarios where most LLMs usually face-plant, and guess what? It didn't flop.

295B total params, 21B active params, 256K context window. Built on an MoE architecture, it delivers trillion-parameter-level performance with a much smaller footprint, and long-context capabilities get a massive upgrade.

Agent abilities stand out this time: tool calling, workflow orchestration, and autonomous planning are far more stable in real business scenarios. AI PPT generation in Tencent Docs is also significantly smoother and more reliable. Real-world tests on WorkBuddy show first-token latency down 54%, a success rate over 99.99%, and an agent workflow that ran continuously for 495 steps. Its Coding Agent achieved top-tier results on both SWE-Bench Verified and Terminal-Bench 2.0.

Now open-sourced on GitHub, HuggingFace, and ModelScope. Available on TokenHub at just 1.2 RMB per million tokens.
reacted to sergiopaniego's post with 🔥 · 15 days ago
Earlier this month, Apple introduced Simple Self-Distillation: a fine-tuning method that improves models on coding tasks just by sampling from the model and training on its own outputs with plain cross-entropy.

And it's already supported in TRL, built by Kashif Rasul. You can really feel the pace of development in the team 🐎

Paper by Ruixiang Zhang, He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang at Apple 🍎

How it works: the model generates completions at a training-time temperature (T_train) with top_k/top_p truncation, then fine-tunes on them with plain cross-entropy. No labels or verifier needed.

You can try it right away with this ready-to-run example (Qwen3-4B on rStar-Coder): https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd.py

Or benchmark a checkpoint with the eval script: https://github.com/huggingface/trl/blob/main/trl/experimental/ssd/ssd_eval.py

One neat insight from the paper: T_train and T_eval compose into an effective T_eff = T_train × T_eval, so a broad band of configs works well. Even very noisy samples still help.

Want to dig deeper?
Paper: https://huggingface.co/papers/2604.01193
Trainer docs: https://huggingface.co/docs/trl/main/en/ssd_trainer
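The temperature-composition insight has a simple arithmetic core. Under the idealized assumption that training on T_train-sampled data shifts the model's effective logits to logits / T_train, sampling that model at T_eval divides by the temperature again, so the two scalings collapse into one effective temperature T_train × T_eval. A quick numerical check of that identity (a sketch of the intuition, not the paper's derivation):

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

logits = np.array([2.0, 0.5, -1.0, 0.0])
t_train, t_eval = 0.6, 1.5

# Tempering twice (first by t_train, then by t_eval) gives the same
# distribution as tempering once by the product t_train * t_eval.
two_step = softmax((logits / t_train) / t_eval)
one_step = softmax(logits / (t_train * t_eval))
assert np.allclose(two_step, one_step)
```

This is why a broad band of (T_train, T_eval) pairs behaves similarly: only their product matters for the effective sampling distribution.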
Organizations

Chronuid's activity
liked a model 5 months ago

google/functiongemma-270m-it
Text Generation • Updated Jan 14 • 40k • 979
liked 2 models 12 months ago

OS-Copilot/OS-Atlas-Pro-7B
Image-Text-to-Text • 8B • Updated Nov 19, 2024 • 1.89k • 28

jinaai/jina-embeddings-v3
Feature Extraction • 0.6B • Updated 30 days ago • 2.93M • 1.14k