AI & ML interests

Contributors who are invited to beta-test our next big feature! Contact us if you want to join this team :-)

cbensimon 
posted an update about 2 months ago
🚀 ZeroGPU now supports PyTorch native quantization via torchao

While it hasn’t been battle-tested yet, Int8WeightOnlyConfig is already working flawlessly in our tests.

Let us know if you run into any issues — and we’re excited to see what the community will build!

import spaces
from diffusers import FluxPipeline
from torchao.quantization.quant_api import Int8WeightOnlyConfig, quantize_

pipeline = FluxPipeline.from_pretrained(...).to('cuda')
quantize_(pipeline.transformer, Int8WeightOnlyConfig()) # Or any other component(s)

@spaces.GPU
def generate(prompt: str):
    return pipeline(prompt).images[0]
victor 
posted an update about 2 months ago
Open Source Avengers, Assemble! Ask an expert AI agent team to solve complex problems together 🔥

Consilium brings together multiple agents that debate and use live research (web, arXiv, SEC) to reach a consensus. You set the strategy, they find the answer.

Credit to @azettl for this awesome demo: Agents-MCP-Hackathon/consilium_mcp
cbensimon 
posted an update 3 months ago
🚀 ZeroGPU medium size is now available as a power-user feature

Nothing too fancy for now—ZeroGPU Spaces still default to large (70GB VRAM)—but this paves the way for:
- 💰 size-based quotas / pricing (medium will offer significantly more usage than large)
- 🦣 the upcoming xlarge size (141GB VRAM)

You can now control GPU size via a Space variable. Accepted values:
- auto (future default)
- medium
- large (current default)

The auto mode checks total CUDA tensor size during startup:
- More than 30GB → large
- Otherwise → medium
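
The auto rule above can be sketched in a few lines. This is only an illustration of the threshold logic as described in the post, not ZeroGPU's actual implementation: the function name `pick_zerogpu_size` is hypothetical, and treating "30GB" as 30 GiB is an assumption, since the post does not say which unit is used.

```python
# Assumption: "30GB" is interpreted as 30 GiB; the post does not specify the unit.
GIB = 1024 ** 3


def pick_zerogpu_size(total_cuda_tensor_bytes: int,
                      threshold_bytes: int = 30 * GIB) -> str:
    """Hypothetical helper mirroring the described auto rule.

    More than the threshold -> "large", otherwise -> "medium".
    """
    return "large" if total_cuda_tensor_bytes > threshold_bytes else "medium"


print(pick_zerogpu_size(12 * GIB))  # a 12 GiB model falls under the threshold
print(pick_zerogpu_size(45 * GIB))  # a 45 GiB model exceeds it
```

Note that the comparison is strict, matching "more than 30GB": a model sitting exactly at the threshold would still get medium under this reading.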
victor 
posted an update 4 months ago
DIA TTS is just amazing - please share your funniest gens (here is mine) 😂
nari-labs/Dia-1.6B
victor 
posted an update 6 months ago
Hey everyone, we've given https://hf.co/spaces page a fresh update!

Smart Search: Now just type what you want to do—like "make a viral meme" or "generate music"—and our search gets it.

New Categories: Check out the cool new filter bar with icons to help you pick a category fast.

Redesigned Space Cards: Reworked a bit to really show off the app descriptions, so you know what each Space does at a glance.

Random Prompt: Need ideas? Hit the dice button for a burst of inspiration.

We’d love to hear what you think—drop us some feedback plz!
victor 
posted an update 6 months ago
Finally, an open-source AI that turns your lyrics into full songs is here—meet YuE! Unlike other tools that only create short clips, YuE can make entire songs (up to 5 minutes) with vocals, melody, and instruments all working together. Letsss go!

m-a-p/YuE-s1-7B-anneal-en-cot
rwightman 
posted an update 7 months ago
I re-worked the JupyterLab Space template recently. It's optimized for timm use, but will work great with transformers and other libs. Updated the base image, Python 3.12, Pillow-SIMD for better CPU use with image preprocessing, and made a number of other tweaks. From the Jupyter launcher you can run the terminal and set up a timm environment in moments with the setup_timm_dev or setup_timm_scripts helpers. Give it a try: timm/jupyterlab-timm
rwightman 
posted an update 7 months ago
New timm 1.0.13 and OpenCLIP 2.30.0 releases to start the year. Both modest but worthwhile updates.

timm added a number of new model weights, supporting loading of:
* PaliGemma2 encoders (ported from google/paligemma-2-release-67500e1e1dbfdd4dee27ba48)
* AIMv2 encoders (ported from apple/aimv2-6720fe1558d94c7805f7688c)

A few higher resolution 384x384 ConvNeXt-Nano ImageNet-12k pretrain & finetunes. See other changes here: https://github.com/huggingface/pytorch-image-models/releases/tag/v1.0.13

And support added in both OpenCLIP and timm for two CLIP models that were missed. The DFN L/14 is 🔥
* DFN CLIP L/14 w/ 39B samples seen - apple/DFN2B-CLIP-ViT-L-14-39B, timm/vit_large_patch14_clip_224.dfn2b_s39b
* MetaCLIP H/14 (altogether) - timm/vit_huge_patch14_clip_224.metaclip_altogether

And last, ~70-80 models that were relying on timm remapping from OpenCLIP got their own timm hub instances to allow use with the upcoming Transformers TimmWrapperModel
rwightman 
posted an update 8 months ago
There's a new timm release, v1.0.12, with a focus on optimizers. The optimizer factory has been refactored; there's now a timm.optim.list_optimizers() and a new way to register optimizers and their attributes. As always, you can use a timm optimizer like a torch one: just replace torch.optim with timm.optim

New optimizers include:
* AdafactorBigVision - adafactorbv
* ADOPT - adopt / adoptw (decoupled decay)
* MARS - mars
* LaProp - laprop
* Cautious Optimizers - a modification to all of the above, prefix with c as well as cadamw, cnadamw, csgdw, clamb, crmsproptf
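
As a rough sketch of how these names might be used: the helper `cautious_name` below is hypothetical and only encodes the "prefix with c" convention from the list, and `build_optimizer` is an illustrative wrapper around timm's `create_optimizer_v2` factory. Which names are actually registered in your installed version is best confirmed with `timm.optim.list_optimizers()`.

```python
# Base names copied from the list above; this subset is illustrative, not exhaustive.
BASE_OPTIMIZERS = ["adopt", "adoptw", "mars", "laprop"]


def cautious_name(base: str) -> str:
    """Hypothetical helper: apply the 'prefix with c' naming convention."""
    return "c" + base


def build_optimizer(model, name: str = "adopt", lr: float = 1e-3):
    """Illustrative wrapper: create a timm optimizer by its string name.

    timm is imported lazily so the naming helper above works without it.
    """
    from timm.optim import create_optimizer_v2
    return create_optimizer_v2(model, opt=name, lr=lr)


# Candidate cautious variants; verify against timm.optim.list_optimizers().
print([cautious_name(n) for n in BASE_OPTIMIZERS])
```

With timm and torch installed, something like `build_optimizer(torch.nn.Linear(4, 2), cautious_name("mars"))` should return an optimizer that can be stepped like any torch.optim one.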

I shared some caution comparisons in this model repo: rwightman/timm-optim-caution

For details, references, see the code: https://github.com/huggingface/pytorch-image-models/tree/main/timm/optim

victor 
posted an update 9 months ago
Perfect example of why Qwen/Qwen2.5-Coder-32B-Instruct is insane?

Introducing: AI Video Composer 🔥
huggingface-projects/ai-video-composer

Drag and drop your assets (images/videos/audios) to create any video you want using natural language!

It works by asking the model to output a valid FFmpeg command. This can be quite complex, but most of the time Qwen2.5-Coder-32B gets it right (that thing is a beast). It's an update of an old project made with GPT-4, and it was almost impossible to make it work with open models back then (~1.5 years ago), but not anymore. Let's go open weights 🚀.
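
When a model generates a command line, it helps to sanity-check it before running anything. The sketch below is an assumption about how one might do that, not the Space's actual code: `parse_ffmpeg_command` is a hypothetical helper that splits the model output into arguments and rejects anything that does not invoke ffmpeg.

```python
import shlex
from typing import List


def parse_ffmpeg_command(model_output: str) -> List[str]:
    """Hypothetical check: split a generated command and require it to call ffmpeg."""
    args = shlex.split(model_output.strip())
    if not args or args[0] != "ffmpeg":
        raise ValueError("expected a command starting with 'ffmpeg'")
    return args


# Example input is made up for illustration.
args = parse_ffmpeg_command("ffmpeg -i in.mp4 -vf scale=640:-1 out.mp4")
print(args)
```

The resulting list can be passed to `subprocess.run(args)` without `shell=True`, so shell metacharacters in the model output are never interpreted.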
rwightman 
posted an update 9 months ago
I'm currently on a push to expand the scope of image-based datasets on the Hub. There's certainly a lot already, but for anyone who's looked closely, there's not a whole lot of standardization. I aim to fix that: datasets under the timm and pixparse orgs will serve as canonical examples for various task / modality combinations and be usable without fuss in libraries like timm, OpenCLIP, and hopefully more.

I just uploaded the first multi-label dataset that I'll support with timm scripts soon: timm/plant-pathology-2021

Next up object detection & segmentation! I've got an annotation spec sorted out, a lot of datasets ready to rip, and yeah that means timm support for object detection, eventually segmentation, is finally under development :O
victor 
posted an update 9 months ago
Qwen2.5-72B is now the default HuggingChat model.
This model is so good that you must try it! I often get better results on rephrasing with it than Sonnet or GPT-4!!
rwightman 
posted an update 9 months ago
Want to validate some hparams or figure out what timm model to use before committing to download or training with a large dataset? Try mini-imagenet: timm/mini-imagenet

I had this sitting on my drive and forgot where I pulled it together from. It's 100 classes of ImageNet, 50k train and 10k val images (from the ImageNet-1k train set), and 5k test images (from the ImageNet-1k val set). 7.4GB instead of > 100GB for the full ImageNet-1k. This version is not reduced resolution like some other 'mini' versions. Super easy to use with timm train/val scripts; check out the dataset card.
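
For a feel of how small this is per class, here is a quick back-of-the-envelope calculation from the split sizes quoted above. It assumes the classes are evenly balanced, which the post does not state, so treat the per-class numbers as approximate.

```python
NUM_CLASSES = 100
SPLITS = {"train": 50_000, "val": 10_000, "test": 5_000}  # sizes quoted in the post


def images_per_class(split_sizes: dict, num_classes: int) -> dict:
    """Average images per class for each split, assuming balanced classes."""
    return {name: size // num_classes for name, size in split_sizes.items()}


per_class = images_per_class(SPLITS, NUM_CLASSES)
total_images = sum(SPLITS.values())
print(per_class, total_images)
```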

I often check fine-tuning with even smaller datasets like:
* timm/resisc45
* timm/oxford-iiit-pet
But those are a bit small to train any modest size model w/o starting from pretrained weights.
rwightman 
posted an update 9 months ago
New MobileNetV4 weights were uploaded a few days ago -- more ImageNet-12k training at 384x384 for the speedy 'Conv Medium' models.

There are 3 weight variants here for those who like to tinker. On my hold-out eval they are ordered as below. They are not that different, but the Adopt 180 epochs result is closer to AdamW 250 than to AdamW 180.
* AdamW for 250 epochs - timm/mobilenetv4_conv_medium.e250_r384_in12k
* Adopt for 180 epochs - timm/mobilenetv4_conv_medium.e180_ad_r384_in12k
* AdamW for 180 epochs - timm/mobilenetv4_conv_medium.e180_r384_in12k

This was by request, as a user reported impressive results using the 'Conv Large' ImageNet-12k pretrains as object detection backbones. ImageNet-1k fine-tunes are pending; the weights do behave differently with 180 vs 250 epochs and with the Adopt vs AdamW optimizer.

rwightman 
posted an update 10 months ago
A new timm release (1.0.11) is out now. I also wrote an article on one of the included models: https://huggingface.co/blog/rwightman/mambaout

Featured in the release are:
* The MambaOut model, a cheeky arch inspired by SSM but without the SSM part, a ConvNeXt with gating.
* Several timm trained MambaOut variations with arch tweaks and ImageNet-12k pretrain to verify scaling, supplement ported weights.
* The smallest MobileNetV4, a 0.5x width scaled Conv-Small.
* Two impressive MobileNetV3 Large models outperforming all previous, using MNV4 Small recipe.
* 'Zepto,' a new compact ConvNeXt variant even smaller than the previous Atto, 2.2M params, RMSNorm, and solid results for its size.
* Newly ported SigLIP SO400M/16 ViT multi-lingual weights, the largest i18n weights; the previous largest was B/16.
* Two ImageNet-1k fine-tuned SigLIP SO400M models at 378x378
* InternViT 300M weight port. A really solid ViT encoder distilled from OpenGVLab 6B VL model encoder.
* An assortment of very small, sub 1M param pretrained test models to improve library unit tests and serve low-resource applications.
victor 
posted an update 10 months ago
NEW - Inference Playground

Maybe, like me, you have always wanted a super easy way to compare llama3.2-1B vs. llama3.2-3B, or the same model with different temperatures?

Trying and comparing warm Inference API models has never been easier!
Just go to https://hf.co/playground, set your token and you're ready to go.
We'll keep improving, feedback welcome 😊