
Sylvain Filoni

fffiloni

AI & ML interests

ML for Animation • Alumni Arts Déco Paris • PSL


Organizations

Notebooks-explorers • AI FILMS • Prodia Labs • Hugging Face Fellows • Nanomenta ML • temp-org • Blog-explorers • huggingPartyParis • The Collectionists • ZeroGPU Explorers • Workshop ENSAD DG • Gradio Templates • ENSAD DO • INNOVA AI • Social Post Explorers • Top Contributors: Space Likes • Top Contributors: Profile Followers • Telescope Optique Unterlinden • Dev Mode Explorers • Nerdy Face

fffiloni's activity

reacted to merve's post with 🔥 about 20 hours ago
reacted to jbilcke-hf's post with 👍 about 20 hours ago
Hi everyone,

I've seen some unsuccessful attempts at running Wan2GP inside a Hugging Face Space, which is a shame as it is a great Gradio app!

So here is a fork that you can use, with some instructions on how to do this:

jbilcke-hf/Wan2GP_you_must_clone_this_space_to_use_it#1

Note: some things like persistent models/storage/custom LoRAs might not be fully working out of the box. If you need those, you might have to dig into the Wan2GP codebase to see how to tweak the storage folder. Happy hacking!

reacted to jbilcke-hf's post with 👀 about 20 hours ago
reacted to fantaxy's post with 👍 about 20 hours ago
🎭 AI's Nobel Prize Challenge: Novel Generator 🚀
Hello! Today I'm thrilled to introduce my AI Short Story Generator 📚✨

🌟 Project Overview
Novel Generator is an AI tool that automatically creates Nobel Prize-worthy short stories. Supporting both Korean and English, it empowers anyone to craft literary masterpieces with ease!

🎯 Key Features
1. 🎲 Story Seed Generator
Randomly generates captivating topics and opening lines
Example: "The Time Traveler's Final Choice" + "That morning, a clock fell from the sky" ⏰

2. 🌐 Multilingual Support
🇬🇧 English: Creates English fiction (Western literary style)
🇰🇷 Korean: Generates Korean novels (reflecting Korean sentiment and style)


3. 📖 Literary Excellence
7,000-10,000 words of complete short fiction
Incorporates techniques from Nobel Prize-winning authors
Advanced literary devices: foreshadowing, symbolism, metaphors

💡 How to Use
Select Language: Choose Korean/English checkbox 🔤
Generate Story Seed: Click "Random Generate SEED" button 🎰
Start Writing: Submit to AI with the Submit button 📝
Continue Story: Type "continued" or "이어서" for next chapter 📄

🛠️ Tech Stack
Friendli API: High-performance LLM serving
Gradio: Intuitive web interface
Python: Backend logic implementation
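
As a rough illustration of how such a stack can be wired together, here is a minimal sketch assuming Friendli's OpenAI-compatible endpoint; the base URL, endpoint/model ID, and prompts are placeholders, not taken from the actual Space:

import os
from openai import OpenAI
import gradio as gr

# Friendli serves an OpenAI-compatible API; the base URL here is an assumption.
client = OpenAI(
    base_url="https://api.friendli.ai/dedicated/v1",
    api_key=os.environ["FRIENDLI_TOKEN"],
)

def write_story(seed: str) -> str:
    """Ask the hosted LLM to draft a short story from a topic + opening-line seed."""
    response = client.chat.completions.create(
        model="your-endpoint-id",  # placeholder for the dedicated endpoint / model name
        messages=[
            {"role": "system", "content": "You are a literary short-story writer."},
            {"role": "user", "content": f"Write a short story based on: {seed}"},
        ],
    )
    return response.choices[0].message.content

demo = gr.Interface(fn=write_story, inputs=gr.Textbox(label="Story seed"), outputs=gr.Textbox(label="Story"))
demo.launch()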

⚡ Powered by Cutting-Edge Technology
Dedicated NVIDIA H100 GPU Server: Lightning-fast inference speeds
Uncensored LLM Model: Based on 'Gemma-3-R1984-27B' for unrestricted creative freedom
API-driven Architecture: Ensures blazing-fast response times and seamless performance

🎨 What Makes It Special
Anti-repetition Algorithm: Generates fresh, original sentences every time
Genre Diversity: Sci-fi, fantasy, realism, magical realism, and more
PDF/TXT Upload: Create stories based on reference materials
Zero Censorship: Complete creative freedom without content restrictions

🚀 Get Started
fantaxy/fantasy-novel

This project began with a simple question: "Can AI create emotionally compelling literature?"
reacted to fdaudens's post with 🤗 11 days ago
🎵 Dream come true for content creators! TIGER AI can extract voice, effects & music from ANY audio file 🤯
This lightweight model uses frequency band-split technology to separate speech like magic. Kudos to @fffiloni for the amazing demo! fffiloni/TIGER-audio-extraction
reacted to clem's post with 👀 17 days ago
Playing with Veo3 this morning. Share your prompt if you want me to create videos for you (bonus points if they funnily reference HF/open-source). These videos are "a cat on the moon rapping 'I love Hugging Face'"!
reacted to merve's post with 🔥 17 days ago
tis the year of any-to-any/omni models 🤠
ByteDance-Seed/BAGEL-7B-MoT is a 7B native multimodal model that understands and generates both image + text

it outperforms leading VLMs like Qwen 2.5-VL 👏 and has an Apache 2.0 license 😱
reacted to cbensimon's post with 👍 20 days ago
🚀 ZeroGPU medium size is now available as a power-user feature

Nothing too fancy for now—ZeroGPU Spaces still default to large (70GB VRAM)—but this paves the way for:
- 💰 size-based quotas / pricing (medium will offer significantly more usage than large)
- 🦣 the upcoming xlarge size (141GB VRAM)

As of now, you can control the GPU size via a Space variable. Accepted values:
- auto (future default)
- medium
- large (current default)

The auto mode checks total CUDA tensor size during startup:
- More than 30GB → large
- Otherwise → medium
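
A rough sketch of the kind of check described above, assuming you sum the footprint of the models you load at startup (an illustration only, not the actual ZeroGPU implementation):

import torch

def cuda_tensor_gb(*models: torch.nn.Module) -> float:
    """Sum the memory footprint (in GB) of the parameters and buffers of the given models."""
    total_bytes = sum(
        t.element_size() * t.nelement()
        for m in models
        for t in list(m.parameters()) + list(m.buffers())
    )
    return total_bytes / 1024**3

def pick_size(*models: torch.nn.Module, threshold_gb: float = 30.0) -> str:
    # Mirrors the heuristic above: more than 30 GB -> large, otherwise medium.
    return "large" if cuda_tensor_gb(*models) > threshold_gb else "medium"
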
reacted to abidlabs's post with 🔥 about 1 month ago
HOW TO ADD MCP SUPPORT TO ANY 🤗 SPACE

Gradio now supports MCP! If you want to convert an existing Space, like this one: hexgrad/Kokoro-TTS, so that you can use it with Claude Desktop / Cursor / Cline / TinyAgents, or any LLM that supports MCP, here's all you need to do:

1. Duplicate the Space (in the Settings Tab)
2. Upgrade the Gradio sdk_version to 5.28 (in the README.md)
3. Set mcp_server=True in launch()
4. (Optionally) add docstrings to the function so that the LLM knows how to use it, like this:

def generate(text, speed=1):
    """
    Convert text to speech audio.

    Parameters:
        text (str): The input text to be converted to speech.
        speed (float, optional): Playback speed of the generated speech.
    """
    ...  # function body omitted in the original post

That's it! Now your LLM will be able to talk to you 🤯
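
Putting those steps together, a minimal sketch of what the Space code could look like (the Interface inputs/outputs below are assumptions for a TTS-style app, not the exact Kokoro-TTS code):

import gradio as gr

def generate(text, speed=1):
    """Convert text to speech audio."""
    ...  # the actual TTS call goes here

demo = gr.Interface(
    fn=generate,  # the docstring is exposed as the MCP tool description
    inputs=[gr.Textbox(label="text"), gr.Number(value=1, label="speed")],
    outputs=gr.Audio(),
)
demo.launch(mcp_server=True)  # requires gradio >= 5.28
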
reacted to seawolf2357's post with ❤️ about 2 months ago
📚 Papers Leaderboard - See the Latest AI Research Trends at a Glance! ✨

Hello, AI research community! Today I'm introducing a new tool for exploring research papers. Papers Leaderboard is an open-source dashboard that makes it easy to find and filter the latest AI research papers.

Heartsync/Papers-Leaderboard

🌟 Key Features

Date Filtering: View only papers published within a specific timeframe (from May 5, 2023 to present)
Title Search: Quickly find papers containing your keywords of interest
Abstract Search: Explore paper content more deeply by searching for keywords within abstracts
Automatic Updates: The database is updated with the latest papers every hour

💡 How to Use It?

Select a start date and end date
Enter keywords you want to find in titles or abstracts
Adjust the maximum number of search results for abstract searches
Results are displayed neatly in table format
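
A minimal sketch of the filtering logic described above (the DataFrame column names and defaults are assumptions, not the actual Space's schema):

import pandas as pd

def search_papers(papers: pd.DataFrame, start: str, end: str,
                  keyword: str = "", search_abstracts: bool = False,
                  max_results: int = 50) -> pd.DataFrame:
    """Filter papers by publication date and an optional title/abstract keyword."""
    mask = papers["published"].between(pd.Timestamp(start), pd.Timestamp(end))
    if keyword:
        hits = papers["title"].str.contains(keyword, case=False, na=False)
        if search_abstracts:
            hits |= papers["abstract"].str.contains(keyword, case=False, na=False)
        mask &= hits
    return papers.loc[mask].head(max_results)
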
reacted to fdaudens's post with 👍 3 months ago
Ever wanted 45 min with one of AI’s most fascinating minds? Was with @thomwolf at HumanX Vegas. Sharing my notes of his Q&A with the press—completely changed how I think about AI’s future:

1️⃣ The next wave of successful AI companies won’t be defined by who has the best model but by who builds the most useful real-world solutions. "We all have engines in our cars, but that’s rarely the only reason we buy one. We expect it to work well, and that’s enough. LLMs will be the same."

2️⃣ Big players are pivoting: "Closed-source companies—OpenAI being the first—have largely shifted from LLM announcements to product announcements."

3️⃣ Open source is changing everything: "DeepSeek was open source AI’s ChatGPT moment. Basically, everyone outside the bubble realized you can get a model for free—and it’s just as good as the paid ones."

4️⃣ Product innovation is being democratized: Take Manus, for example—they built a product on top of Anthropic’s models that’s "actually better than Anthropic’s own product for now, in terms of agents." This proves that anyone can build great products with existing models.

We’re entering a "multi-LLM world," where models are becoming commoditized, and all the tools to build are readily available—just look at the flurry of daily new releases on Hugging Face.

Thom's comparison to the internet era is spot-on: "In the beginning you made a lot of money by making websites... but nowadays the huge internet companies are not the companies that built websites. Like Airbnb, Uber, Facebook, they just use the internet as a medium to make something for real life use cases."

Love to hear your thoughts on this shift!
reacted to fdaudens's post with 🚀 3 months ago
🤯 Gemma 3's image analysis blew me away!

Tested 2 ways to extract airplane registration numbers from photos with 12B model:

1️⃣ Gradio app w/API link (underrated feature IMO) + ZeroGPU infra on Hugging Face in Google Colab. Fast & free.

2️⃣ LMStudio + local processing (100% private). Running this powerhouse on a MacBook w/16GB RAM is wild! 🚀

Colab: https://colab.research.google.com/drive/1YmmaP0IDEu98CLDppAAK9kbQZ7lFnLZ1?usp=sharing
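
For option 1, a minimal sketch of what calling a Space through its Gradio API looks like with gradio_client (the Space name, prompt, and api_name are placeholders, not the ones used in the Colab):

from gradio_client import Client, handle_file

client = Client("some-org/gemma-3-12b-it-demo")  # placeholder Space name
result = client.predict(
    handle_file("airplane.jpg"),  # local photo to analyse
    "Read the aircraft registration number visible in this photo.",
    api_name="/predict",
)
print(result)
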
reacted to jsulz's post with 🧠👍 3 months ago
If you've been following along with the Xet Team's (xet-team) work, you know we've been working to migrate the Hugging Face Hub from Git LFS to Xet.

Recently, we launched a waitlist to join the movement to Xet (join here! https://huggingface.co/join/xet ) but getting to this point was a journey.

From the initial proof of concept in August, to launching on the Hub internally, to migrating a set of repositories and routing a small chunk of download traffic on the Hub through our infrastructure, every step of the way has been full of challenges, big and small, and well worth the effort.

Over the past few weeks, with real traffic flowing through our services we’ve tackled some truly gnarly issues (unusual upload/download patterns, memory leaks, load imbalances, and more) and resolved each without major disruptions.

If you're curious about how this sliver of Hub infrastructure looked as we routed traffic through it for the first time (and want a deep dive full of Grafana and Kibana charts 🤓), I have a post for you.

Here's an inside look into the day of our first migrations and the weeks following, where we pieced together solutions in real time.

https://huggingface.co/blog/xet-on-the-hub
reacted to m-ric's post with 🤗 3 months ago
smolagents now supports vLLM! 🥳

Since vLLM is one of the most popular local inference solutions, the community had been asking us to integrate it: after a heavy refactoring of our LLM classes, we've just released smolagents 1.11.0 with a brand new VLLMModel class.

Go try it and tell us what you think!

https://github.com/huggingface/smolagents/blob/45b2c86857b7f7657daaa74e4d17d347e9e2c4a4/src/smolagents/models.py#L497
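
A quick sketch of what using the new class could look like (the constructor argument shown here is an assumption based on the other smolagents model classes; check the linked source for the exact signature):

from smolagents import CodeAgent, VLLMModel

model = VLLMModel(model_id="Qwen/Qwen2.5-Coder-7B-Instruct")  # any vLLM-servable model
agent = CodeAgent(tools=[], model=model)                      # minimal agent with no extra tools
agent.run("How many seconds are there in a leap year?")
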
reacted to AtAndDev's post with 😔 3 months ago
There seem to be multiple paid apps shared here that are based on models on HF, with some people selling their wrappers as "products" and promoting them here. For a long time, HF was the best and only platform for OSS model stuff, but with the recent AI website builders anyone can create a product (really crappy ones btw) and try to sell it with no contribution back to OSS. Please don't do this, or at least try fine-tuning the models you use...
Sorry for filling y'all's feed with this bs but yk...
reacted to DmitryRyumin's post with 👍 4 months ago
🚀🎭🌟 New Research Alert - WACV 2025 (Avatars Collection)! 🌟🎭🚀
📄 Title: EmoVOCA: Speech-Driven Emotional 3D Talking Heads 🔝

📝 Description: EmoVOCA is a data-driven method for generating emotional 3D talking heads by combining speech-driven lip movements with expressive facial dynamics. This method has been developed to overcome the limitations of corpora and to achieve state-of-the-art animation quality.

👥 Authors: @FedeNoce , Claudio Ferrari, and Stefano Berretti

📅 Conference: WACV, 28 Feb – 4 Mar, 2025 | Arizona, USA 🇺🇸

📄 Paper: https://arxiv.org/abs/2403.12886

🌐 Github Page: https://fedenoce.github.io/emovoca/
📁 Repository: https://github.com/miccunifi/EmoVOCA

🚀 CVPR-2023-24-Papers: https://github.com/DmitryRyumin/CVPR-2023-24-Papers

🚀 WACV-2024-Papers: https://github.com/DmitryRyumin/WACV-2024-Papers

🚀 ICCV-2023-Papers: https://github.com/DmitryRyumin/ICCV-2023-Papers

📚 More Papers: more cutting-edge research presented at other conferences in DmitryRyumin/NewEraAI-Papers, curated by @DmitryRyumin

🚀 Added to the Avatars Collection: DmitryRyumin/avatars-65df37cdf81fec13d4dbac36

🔍 Keywords: #EmoVOCA #3DAnimation #TalkingHeads #SpeechDriven #FacialExpressions #MachineLearning #ComputerVision #ComputerGraphics #DeepLearning #AI #WACV2024
reacted to jsulz's post with 🚀 4 months ago
Time flies!

Six months after joining Hugging Face, the Xet team is kicking off the first migrations from LFS to our storage for a number of repositories on the Hub.

More on the nitty gritty details behind the migration soon, but here are the big takeaways:

🤖 We've successfully completed the first migrations from LFS -> Xet to test the infrastructure and prepare for a wider release

✅ No action on your part needed - you can work with a Xet-backed repo like any other repo on the Hub (for now - major improvements on their way!)

👀 Keep an eye out for the Xet logo to see if a repo you know is on our infra! See the screenshots below to spot the difference 👇

⏩ ⏩ ⏩ Blazing uploads and downloads coming soon. We're gearing up for a full integration with the Hub's Python library that will make building on the Hub faster than ever - special thanks to @celinah and @Wauplin for their assistance.

🎉 Want Early Access? If you're curious and want to test out the bleeding edge that will power the development experience on the Hub, we'd love to partner with you. Let me know!

This is the culmination of a lot of effort from the entire team. Big round of applause to @sirahd @brianronan @jgodlewski @hoytak @seanses @assafvayner @znation @saba9 @rajatarya @port8080 @yuchenglow
posted an update 4 months ago
posted an update 4 months ago
Explain like I'm 5 the latest take from @thomwolf on X about Dario's essay on DeepSeek:

→ Open-source AI is like a big cookbook that everyone can read and improve. Instead of a few chefs keeping their recipes secret, anyone can cook, test, and invent new things.

If only one company controls AI, everything stops if they have a problem—like when the internet goes down. With open-source, many people can help, making sure it keeps running smoothly.

AI isn’t just a race between two countries; it’s a team effort around the world. By sharing, we move faster and create safer technology for everyone.

🤗