Hugging Face
Building on HF
9.2 TFLOPS
Sergio Paniego
PRO
sergiopaniego
1,627 followers · 108 following
https://sergiopaniego.github.io/
sergiopaniego
sergiopaniego
sergio-paniego-blanco
AI & ML interests
None yet
Recent Activity
Updated a dataset about 4 hours ago: huggingface-projects/Deep-RL-Course-Certification
New activity about 5 hours ago in agents-course/notebooks: "fix: support Google Colab secrets for HF_TOKEN loading"
Reacted to qgallouedec's post about 5 hours ago:
TRL v1.3 ships day-one training support for Qwen 3.6.

The new Qwen 3.6 family (`Qwen/Qwen3.6-27B`, `Qwen/Qwen3.6-35B-A3B`) reuses the Qwen3.5-MoE architecture but ships a slightly different chat template, so we updated the stack end-to-end: new training template with `{% generation %}` markers, tool-call response schema routing, tiny test models for the VLM matrix.

SFT with assistant-only loss works out of the box:

```python
from trl import SFTConfig, SFTTrainer

trainer = SFTTrainer(
    model="Qwen/Qwen3.6-27B",
    args=SFTConfig(assistant_only_loss=True),
    train_dataset=dataset,
)
trainer.train()
```

So does GRPO tool-calling: just hand `tools=[...]` to `GRPOTrainer`.

v1.3 also brings a new experimental TPO trainer (Triple Preference Optimization), speculative decoding in `trl vllm-serve` (Qwen3 MTP / Eagle3 drafts), 12 more KTO/DPO alignment PRs (KTO promotion to stable is now in reach), three more `{% generation %}` chat templates (Gemma/Gemma 2, Phi-3, GLM-4-MoE), and a chunky SFT entropy bug fix.

Full release notes: https://github.com/huggingface/trl/releases/tag/v1.3.0
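To make the "assistant-only loss" idea in the post concrete, here is a minimal sketch of the masking step it implies: tokens outside the assistant turns get the ignore label so they do not contribute to the loss. This is an illustration under stated assumptions, not TRL's implementation. The function name `mask_non_assistant_labels`, the token ids, and the hand-written mask are all hypothetical; in practice a mask like this would come from a chat template that carries `{% generation %}` markers.

```python
# Illustrative sketch only: not TRL's code. Names and token ids are hypothetical.

IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def mask_non_assistant_labels(input_ids, assistant_mask):
    """Return labels that keep assistant token ids and ignore everything else.

    input_ids:      list of token ids for the full conversation
    assistant_mask: list of 0/1 flags, 1 where the token belongs to an assistant turn
    """
    if len(input_ids) != len(assistant_mask):
        raise ValueError("input_ids and assistant_mask must have the same length")
    return [
        token if keep else IGNORE_INDEX
        for token, keep in zip(input_ids, assistant_mask)
    ]


# Hypothetical conversation: three prompt tokens followed by two assistant tokens.
input_ids = [101, 102, 103, 201, 202]
assistant_mask = [0, 0, 0, 1, 1]

labels = mask_non_assistant_labels(input_ids, assistant_mask)
print(labels)  # [-100, -100, -100, 201, 202]
```

With labels built this way, only the assistant's tokens are scored, which is the behavior the post enables through `assistant_only_loss=True`.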
Organizations
sergiopaniego's buckets (5)
sergiopaniego/async-grpo-gsm8k-bucket (3.74 MB)
sergiopaniego/async-grpo-openr1-bucket (185 kB)
sergiopaniego/async-grpo-math500-bucket (193 kB)
sergiopaniego/async-grpo-test-bucket (422 kB)
sergiopaniego/browsergym-vlm-grpo-Qwen-Qwen3.5-2B-bucket (808 kB)