Ivan Fioravanti PRO

ivanfioravanti

AI & ML interests

None yet

Recent Activity

liked a model 6 days ago

mlx-community/VibeThinker-1.5B-mlx-4bit

liked a Space 11 days ago

damianpumar/adaptive-ui

liked a model 14 days ago

autoweeb/Qwen-Image-Edit-2509-Photo-to-Anime

upvoted a paper 15 days ago

Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

Paper • 2511.04962 • Published 18 days ago • 50

upvoted an article 25 days ago

On the Shifting Global Compute Landscape

Article • 27 days ago

upvoted 2 articles 3 months ago

Introducing Marvis TTS: Real-Time Streaming Speech Synthesis

Article • Aug 27

Uncensor any LLM with abliteration

Article • Jun 13, 2024 • 722

upvoted a paper 4 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 310

upvoted 2 articles 5 months ago

Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models

Article • Jul 10

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 730

upvoted an article 6 months ago

You could have designed state of the art positional encoding

Article • Nov 25, 2024 • 399

upvoted a collection 8 months ago

Llama 4

Collection

Llama 4 release • 13 items • Updated Apr 29 • 662

upvoted a collection 11 months ago

DolphinLabeled Datasets

Collection

Eric Hartford has added labels to help you filter datasets, for your pleasure. • 5 items • Updated Jan 6 • 15

upvoted an article 11 months ago

🐺🐦‍⬛ LLM Comparison/Test: DeepSeek-V3, QVQ-72B-Preview, Falcon3 10B, Llama 3.3 70B, Nemotron 70B in my updated MMLU-Pro CS benchmark

Article • Jan 2

upvoted 4 papers 11 months ago

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Paper • 2412.18619 • Published Dec 16, 2024 • 58

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 157

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 376

No More Adam: Learning Rate Scaling at Initialization is All You Need

Paper • 2412.11768 • Published Dec 16, 2024 • 43

upvoted an article 12 months ago

🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs

Article • Dec 4, 2024

upvoted an article about 1 year ago

Releasing the largest multilingual open pretraining dataset

Article • Nov 13, 2024 • 104

upvoted 2 articles over 1 year ago

⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together

Article • Apr 29, 2024

RAG Empowerment: Cohere C4AI Command-R and Transformers Unveiled

Article • Apr 7, 2024
