- SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration (Paper • 2411.10958 • Published • 56 upvotes)
- SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference (Paper • 2502.18137 • Published • 58 upvotes)
- SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training (Paper • 2505.11594 • Published • 75 upvotes)
- SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration (Paper • 2410.02367 • Published • 50 upvotes)
Jintao Zhang (jt-zhang)
AI & ML interests: Efficient ML
Recent Activity
- upvoted a paper 7 days ago: AdaSPEC: Selective Knowledge Distillation for Efficient Speculative Decoders
- authored a paper 18 days ago: Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency
- upvoted a paper 22 days ago: Large Scale Diffusion Distillation via Score-Regularized Continuous-Time Consistency