IBM Granite

Enterprise

company

IBM-Granite

Activity Feed

AI & ML interests

LLMs for language and code + Time series and geospatial foundation models

ibm-granite's activity

mayank-mishra

authored 2 papers 3 days ago

FlashFormer: Whole-Model Kernels for Efficient Low-Batch Inference

Paper • 2505.22758 • Published 10 days ago

PaTH Attention: Position Encoding via Accumulating Householder Transformations

Paper • 2505.16381 • Published 17 days ago

ariG23498

posted an update 3 days ago

🚨 Implement KV Cache from scratch in pure PyTorch. 🚨

We documented everything we learned while adding a KV cache to nanoVLM. Joint work with @kashif @lusxvr @andito @pcuenq

Blog: hf.co/blog/kv-cache

1 reply

gabegoodhart

in ibm-granite/granite-3.3-8b-instruct-GGUF 4 days ago

thank you for GGUF!

#1 opened about 2 months ago by jacek2024

kgreenewald

updated a collection 4 days ago

Granite Experiments

Collection

Experimental projects under consideration for the Granite family. • 17 items • Updated 4 days ago • 12

kgreenewald

published a model 4 days ago

ibm-granite/granite-3.2-8b-alora-requirement-check

Text Generation • Updated Apr 27

ibibrahim

updated 3 collections 4 days ago

abrooks9944

in ibm-granite/granite-speech-3.3-8b 4 days ago

Problems with word insertions (hallucinations) when used with vLLM (online)

#4 opened 5 days ago by entn-at

kgreenewald

updated a model 18 days ago

ibm-granite/granite-3.2-8b-alora-uncertainty

Text Generation • Updated 18 days ago • 2

ibibrahim

updated a collection 19 days ago

Granite Docling

Collection

1 item • Updated 19 days ago

reach-vb

posted an update 19 days ago

hey hey @mradermacher - VB from Hugging Face here, we'd love to onboard you over to our optimised xet backend! 💥

as you know, we're in the process of upgrading our storage backend to xet (which helps us scale and offer blazingly fast upload/download speeds too): https://huggingface.co/blog/xet-on-the-hub - and now that we are certain the backend can scale even with big models like Llama 4/Qwen 3, we're moving to the next phase of inviting impactful orgs and users on the hub over. as you are a big part of the open source ML community, we would love to onboard you next and create some excitement about it in the community too!

in terms of actual steps - it should be as simple as one of the org admins joining hf.co/join/xet - we'll take care of the rest.

p.s. you'd need the latest hf_xet version of the huggingface_hub lib, but everything else should be the same: https://huggingface.co/docs/hub/storage-backends#using-xet-storage

p.p.s. this is fully backwards compatible so everything will work as it should! 🤗

16 replies

clefourrier

posted an update 20 days ago

Always surprised that so few people actually read the FineTasks blog, on ✨how to select training evals with the highest signal✨

If you're serious about training models without wasting compute on shitty runs, you absolutely should read it!!

A high-signal eval actually tells you precisely, during training, how well & what your model is learning, allowing you to discard the bad runs/bad samplings/...!

The blog covers in depth prompt choice, metrics, and datasets, across languages/capabilities, and my fave section is "which properties should evals have"👌
(so you know how to select the best evals for your use case)

Blog: HuggingFaceFW/blogpost-fine-tasks