
Sparse Query Attention (SQA) Research by Reactive AI

updated 10 days ago

Experimental models with Sparse Query Attention layers, reducing training time/cost by ~3-10% compared to GQA and MQA, with the same level of performance.
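The idea behind these models can be sketched as follows: standard multi-head attention uses the same number of heads for queries, keys, and values, while SQA shrinks only the number of query heads, which proportionally reduces the dominant QK^T and attention-weighted-V matmuls. This is a minimal illustrative sketch, not the Reactive AI reference implementation; the class name, parameter names, and head counts are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseQueryAttention(nn.Module):
    """Illustrative SQA layer: fewer query heads than a standard MHA layer.

    With num_q_heads < num_heads, the attention score and value matmuls
    shrink proportionally, which is where the claimed compute savings
    over GQA/MQA come from. KV heads may be fewer still and are repeated
    GQA-style to match the query heads.
    """

    def __init__(self, dim: int, num_heads: int = 8,
                 num_q_heads: int = 4, num_kv_heads: int = 2):
        super().__init__()
        assert dim % num_heads == 0
        assert num_q_heads % num_kv_heads == 0
        self.head_dim = dim // num_heads
        self.num_q_heads = num_q_heads
        self.num_kv_heads = num_kv_heads
        # Query projection is narrower than in standard MHA.
        self.q_proj = nn.Linear(dim, num_q_heads * self.head_dim)
        self.k_proj = nn.Linear(dim, num_kv_heads * self.head_dim)
        self.v_proj = nn.Linear(dim, num_kv_heads * self.head_dim)
        # Output projection maps the reduced query width back to dim.
        self.o_proj = nn.Linear(num_q_heads * self.head_dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, _ = x.shape
        q = self.q_proj(x).view(B, T, self.num_q_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(B, T, self.num_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(B, T, self.num_kv_heads, self.head_dim).transpose(1, 2)
        if self.num_kv_heads != self.num_q_heads:
            # Repeat KV heads so each query head has a matching KV head.
            rep = self.num_q_heads // self.num_kv_heads
            k = k.repeat_interleave(rep, dim=1)
            v = v.repeat_interleave(rep, dim=1)
        out = F.scaled_dot_product_attention(q, k, v)  # (B, Hq, T, head_dim)
        out = out.transpose(1, 2).reshape(B, T, self.num_q_heads * self.head_dim)
        return self.o_proj(out)
```

For comparison, GQA/MQA keep all query heads and reduce only the KV heads (saving KV-cache memory but not score-matmul FLOPs), whereas SQA reduces the query heads themselves, so the score computation itself gets cheaper.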


  • Sparse Query Attention (SQA): A Computationally Efficient Attention Mechanism with Query Heads Reduction

    Paper • 2510.01817 • Published 17 days ago • 13

  • ReactiveAI/sSQAT-mm

    Text Generation • 8.62M • Updated 16 days ago

  • ReactiveAI/SQAT-mm

    Text Generation • 8.57M • Updated 16 days ago

  • ReactiveAI/xSQAT-mm

    Text Generation • 8.52M • Updated 16 days ago

  • ReactiveAI/GQA-Ref-Micro

    Text Generation • 8.67M • Updated 16 days ago

  • ReactiveAI/MQA-Ref-Micro

    Text Generation • 8.64M • Updated 16 days ago

  • ReactiveAI/SQAT-m

    Text Generation • 10.7M • Updated 16 days ago

  • ReactiveAI/xSQAT-m

    Text Generation • 10.4M • Updated 16 days ago

  • ReactiveAI/sSQAT-m

    Text Generation • 10.9M • Updated 16 days ago

  • ReactiveAI/xSMQAT-m

    Text Generation • 10.2M • Updated 16 days ago