vbdai's Collections

Inference Optimization

updated 12 days ago

  • DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models

    Paper • 2503.02175 • Published Mar 4 • 3

  • CASP: Compression of Large Multimodal Models Based on Attention Sparsity

    Paper • 2503.05936 • Published Mar 7 • 2

  • EBJR: Energy-Based Joint Reasoning for Adaptive Inference

    Paper • 2110.10343 • Published Oct 20, 2021 • 1

  • E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models

    Paper • 2203.00748 • Published Mar 1, 2022 • 1

  • GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation

    Paper • 2403.19754 • Published Mar 28, 2024

  • Efficiently Serving Large Multimodal Models Using EPD Disaggregation

    Paper • 2501.05460 • Published Dec 25, 2024 • 1

  • ElasticMoE: An Efficient Auto Scaling Method for Mixture-of-Experts Models

    Paper • 2510.02613 • Published 20 days ago • 1

  • ExpertWeave: Efficiently Serving Expert-Specialized Fine-Tuned Adapters at Scale

    Paper • 2508.17624 • Published Aug 25