taicheng guo

taicheng

AI & ML interests

None yet

Recent Activity

liked a model 11 days ago

meta-llama/Llama-3.2-3B

upvoted a paper about 1 month ago

Glyph: Scaling Context Windows via Visual-Text Compression

upvoted a paper about 1 month ago

Efficient Long-context Language Model Training by Core Attention Disaggregation

Organizations

liked a model 11 days ago

meta-llama/Llama-3.2-3B

Text Generation • 3B • Updated Oct 24, 2024 • 437k • 661

upvoted 5 papers about 1 month ago

Glyph: Scaling Context Windows via Visual-Text Compression

Paper • 2510.17800 • Published Oct 20 • 67

Efficient Long-context Language Model Training by Core Attention Disaggregation

Paper • 2510.18121 • Published Oct 20 • 119

Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning

Paper • 2510.19338 • Published Oct 22 • 112

AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders

Paper • 2510.19779 • Published Oct 22 • 59

LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts

Paper • 2510.19363 • Published Oct 22 • 60

authored 3 papers about 1 month ago

SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark

Paper • 2402.05138 • Published Feb 6, 2024 • 2

Data Interpreter: An LLM Agent For Data Science

Paper • 2402.18679 • Published Feb 28, 2024 • 1

MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training

Paper • 2510.12831 • Published Oct 12 • 4

commented on a paper about 1 month ago

MTSQL-R1: Towards Long-Horizon Multi-Turn Text-to-SQL via Agentic Training

Paper • 2510.12831 • Published Oct 12 • 4

upvoted a paper about 2 months ago

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published Oct 13 • 174

liked a model about 2 months ago

facebook/cwm

33B • Updated Oct 15 • 257k • 243

upvoted a paper about 2 months ago

LongCodeZip: Compress Long Context for Code Language Models

Paper • 2510.00446 • Published Oct 1 • 108

upvoted a paper 2 months ago

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

Paper • 2509.15194 • Published Sep 18 • 33

liked a model 3 months ago

Qwen/Qwen3-Embedding-8B

Feature Extraction • 8B • Updated Jul 7 • 663k • 462
