ML Optimization Papers - a hasanar1f Collection

updated Apr 4, 2025

FAST: Efficient Action Tokenization for Vision-Language-Action Models
Paper • 2501.09747 • Published Jan 16, 2025 • 27

Tensor Product Attention Is All You Need
Paper • 2501.06425 • Published Jan 11, 2025 • 90

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training
Paper • 2501.06842 • Published Jan 12, 2025 • 16

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
Paper • 2501.03895 • Published Jan 7, 2025 • 52

LTX-Video: Realtime Video Latent Diffusion
Paper • 2501.00103 • Published Dec 30, 2024 • 48

Efficiently Serving LLM Reasoning Programs with Certaindex
Paper • 2412.20993 • Published Dec 30, 2024 • 36

Token-Budget-Aware LLM Reasoning
Paper • 2412.18547 • Published Dec 24, 2024 • 46

TRecViT: A Recurrent Video Transformer
Paper • 2412.14294 • Published Dec 18, 2024 • 13

iFormer: Integrating ConvNet and Transformer for Mobile Application
Paper • 2501.15369 • Published Jan 26, 2025 • 13

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models
Paper • 2501.12370 • Published Jan 21, 2025 • 11

Return of the Encoder: Maximizing Parameter Efficiency for SLMs
Paper • 2501.16273 • Published Jan 27, 2025 • 5

Cost-Optimal Grouped-Query Attention for Long-Context LLMs
Paper • 2503.09579 • Published Mar 12, 2025 • 5

Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
Paper • 2503.00540 • Published Mar 1, 2025 • 3

MaxInfo: A Training-Free Key-Frame Selection Method Using Maximum Volume for Enhanced Video Understanding
Paper • 2502.03183 • Published Feb 5, 2025 • 5

OmniMamba: Efficient and Unified Multimodal Understanding and Generation via State Space Models
Paper • 2503.08686 • Published Mar 11, 2025 • 19

QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension
Paper • 2503.08689 • Published Mar 11, 2025 • 4

PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity
Paper • 2503.07677 • Published Mar 10, 2025 • 86

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization
Paper • 2503.08619 • Published Mar 11, 2025 • 20

Adaptive Layer-skipping in Pre-trained LLMs
Paper • 2503.23798 • Published Mar 31, 2025 • 5