Yadnyesh Chakane

ydnysh

AI & ML interests

Diffusion Models, NLP, Reasoning LLMs, Reinforcement Learning

Organizations

upvoted an article 3 months ago

Article

Vision Language Models (Better, Faster, Stronger)

By

and 4 others •

May 12

• 500

upvoted 2 collections 4 months ago

Unsloth 4-bit Dynamic Quants

Unsloth's Dynamic 4-bit Quants selectively skip quantizing certain parameters, greatly improving accuracy while only using <10% more VRAM than BnB 4-bit • 28 items • Updated 10 days ago • 84

🧠 Reasoning datasets

Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 162

upvoted 4 papers 4 months ago

Scaling Laws for Downstream Task Performance of Large Language Models

Paper • 2402.04177 • Published Feb 6, 2024 • 19

OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models

Paper • 2402.01739 • Published Jan 29, 2024 • 29

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 126

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 416

upvoted a collection 4 months ago

The Deepseek AI Collection

Papers and Models by Deepseek AI • 7 items • Updated Apr 4 • 1

upvoted a paper over 1 year ago

Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning

Paper • 2402.06619 • Published Feb 9, 2024 • 57