If You Can't Use Them, Recycle Them: Optimizing Merging at Scale Mitigates Performance Tradeoffs • Paper arXiv:2412.04144 • Published Dec 5, 2024
Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model • Paper arXiv:2510.18855 • Published Oct 21
Ovis2.5 • Collection • Our next-generation MLLMs for native-resolution vision and advanced reasoning • 5 items • Updated Aug 19
SYNTHETIC-1 • Collection • A collection of tasks & verifiers for reasoning datasets • 9 items • Updated Oct 7
Quartet: Native FP4 Training Can Be Optimal for Large Language Models • Paper arXiv:2505.14669 • Published May 20
DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning • Paper arXiv:2504.07128 • Published Apr 2
OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens • Paper arXiv:2504.07096 • Published Apr 9
The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks • Paper arXiv:2502.08235 • Published Feb 12
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU • Paper arXiv:2502.08910 • Published Feb 13
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark For Large Language Models • Paper arXiv:2410.07985 • Published Oct 10, 2024
Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking • Paper arXiv:2502.02339 • Published Feb 4
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model • Paper arXiv:2502.02737 • Published Feb 4