Identifying Sensitive Weights via Post-quantization Integral Paper • 2503.01901 • Published Feb 28
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training Paper • 2409.09099 • Published Sep 13, 2024
CAST: Continuous and Differentiable Semi-Structured Sparsity-Aware Training for Large Language Models Paper • 2509.25996 • Published Sep 30
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training Paper • 2407.20584 • Published Jul 30, 2024
ParallelBench: Understanding the Trade-offs of Parallel Decoding in Diffusion LLMs Paper • 2510.04767 • Published Oct 6
AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders Paper • 2510.19779 • Published 23 days ago