Jorge Munoz Laredo
jorgemunozl
AI & ML interests
I like Vision Language Action models, AI4Science, and diffusion-based architectures (flow matching), and I love physics.
Recent Activity
liked a dataset about 21 hours ago: latam-gpt/tulu-3-sft-mixture-no-identity
liked a model about 21 hours ago: unitreerobotics/UnifoLM-VLA-Base
liked a model about 21 hours ago: moonshotai/Kimi-K2.5
replied to MonsterMMORPG's post about 21 hours ago: "Yeah, it seems pretty nice!"
reacted to MonsterMMORPG's post with 👍 about 21 hours ago
LTX 2 & Z Image Base Full Tutorial + Audio to Video Lip Sync + ComfyUI + SwarmUI + Windows + Cloud
Full tutorial link > https://www.youtube.com/watch?v=SkXrYezeEDc
Info
LTX 2 is the newest state-of-the-art (SOTA) open-source video generation model, and this tutorial shows you how to use it in the best and most performant way in ComfyUI and also in SwarmUI. Moreover, the Z Image Base model has been published, and I will show how to use Z Image Base with the most amazing preset and workflow as well. Furthermore, this tutorial shows you how to install, update, set up, and download ComfyUI and SwarmUI, along with the models, presets, and workflows, both on Windows and on RunPod, Massed Compute, and SimplePod. Linux users can use the Massed Compute scripts and installers directly. This is a masterpiece, an entire lecture-level complete tutorial. This video will kickstart your AI journey 100x, both on local Windows and in the cloud.
45 Second Raw Demo Video
This video was made with text + image + audio = a lip-synced and animated video at once
See video below
reacted to hassenhamdi's post with 🔥 10 days ago
Google published the paper. I shipped the code. 🚀
DeepMind just released PACEvolve (Progress-Aware Consistent Evolution), a massive overhaul of the AlphaEvolve framework. It solves the critical issues of "Context Pollution" and "Mode Collapse" that have historically crippled evolutionary coding agents.
But there was no public implementation. So I built one.
Introducing OpenPACEvolve: A fully open-source, production-grade implementation of the PACEvolve framework.
🛠 I engineered this framework solo, but I wasn't working alone. I orchestrated custom coding agents powered by Claude Opus 4.5 as the engineer, with Gemini Pro 3 Preview ensuring fidelity and quality.
By leveraging these SOTA models, I was able to translate complex theoretical research into functional, modular Python architecture in record time. This is what the future of AI engineering looks like: Human architectural oversight + AI velocity.
🧠 What OpenPACEvolve Solves: Unlike standard agents that get "stuck" in loops, this framework implements the paper's full recipe for long-horizon stability:
✅ Hierarchical Context Management (HCM): Bi-level pruning to keep the agent's memory clean.
✅ Momentum-Based Backtracking (MBB): Uses "power-law backtracking" to detect stagnation and force pivots.
✅ Self-Adaptive Crossover: Intelligent code-sharing between parallel "islands."
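As a rough, purely illustrative sketch of the backtracking idea (the names, thresholds, and power-law schedule below are invented for illustration and are not taken from the PACEvolve paper or the OpenPACEvolve repo), stagnation detection with a power-law-growing patience window might look like this in Python:

```python
import random


class PowerLawBacktracker:
    """Illustrative sketch of momentum-based backtracking (NOT the PACEvolve implementation).

    Idea: track the best score seen so far; if no improvement arrives within a
    patience window, revert to an earlier checkpoint. Each consecutive backtrack
    grows the patience window following a power law, so the agent pivots harder
    the longer it stays stuck.
    """

    def __init__(self, base_patience=5, exponent=1.5):
        self.base_patience = base_patience
        self.exponent = exponent
        self.best_score = float("-inf")
        self.steps_since_improvement = 0
        self.num_backtracks = 0
        self.checkpoints = []  # (score, candidate_solution)

    def update(self, score, candidate):
        """Record a new candidate; return a checkpoint to revert to, or None."""
        if score > self.best_score:
            self.best_score = score
            self.steps_since_improvement = 0
            self.checkpoints.append((score, candidate))
            return None

        self.steps_since_improvement += 1
        patience = self.base_patience * (self.num_backtracks + 1) ** self.exponent
        if self.steps_since_improvement > patience and self.checkpoints:
            self.num_backtracks += 1
            self.steps_since_improvement = 0
            # Pivot: jump back to a random earlier checkpoint instead of the latest one.
            return random.choice(self.checkpoints)
        return None
```

In an evolutionary coding loop, `update` would be called once per evaluated candidate; a non-None return value signals the agent to restart from that earlier checkpoint rather than keep mutating a stagnant lineage.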
👨💻 This project is more than a repo; it's a demonstration of rapid research-to-production cycles using next-gen AI workflows.
📎 Link to the paper: https://arxiv.org/abs/2601.10657
The code is live. The agents are ready. Check out the repository below. 👇
https://github.com/hassenhamdi/OpenPACEvolve
Star the repo 🌟.
reacted to jzhang533's post with 🔥 3 months ago
We’ve officially kicked off the ERNIE AI Developer Challenge!
We want to create something interesting with you all, so we partnered with Unsloth, LLaMA-Factory, Novita AI, D-Robotics, and CAMEL-AI to empower your creativity.
Come build with us: https://baiduernieai.devpost.com/?utm_source=ERNIE-HF&utm_medium=ERNIE-HF&utm_campaign=ERNIE+AI+Developer+Challenge
reacted to cjerzak's post with 👀 3 months ago
>>> We're writing a new book, <Planetary Causal Inference>, on how to model counterfactuals at planetary scale by combining satellite imagery + other global data with local studies and RCTs. Forthcoming in 2026+.
>>> Book info: https://planetarycausalinference.org/book-launch
>>> All datasets used in the book will be openly available on our lab’s Hugging Face hub: theaidevlab
reacted to codelion's post with 🚀 3 months ago
On this day in 2019, OpenAI released the final GPT-2 model as part of their staged release. I still remember that November well - so much was happening, but GPT-2's release felt like a watershed moment for the field. It showed us what was possible with carefully trained language models.
To recreate some of that GPT-2 magic, I recently tackled an interesting challenge: can you pretrain a language model with just 1 billion tokens - roughly 1/10th of what GPT-2 used - and still get comparable performance? After 50+ systematic experiments testing different dataset mixtures, the answer is yes.
The result is codelion/gpt-2-70m, which achieves over 90% of GPT-2's benchmark performance despite being trained on 10x less data. The key was finding the optimal dataset composition: 50% high-quality textbook PDFs, 30% filtered web content, and 20% educational resources. It even beats GPT-2 on TruthfulQA (47.31% vs 40.69%).
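The blog post linked below has the full recipe; as a minimal sketch of how a 50/30/20 mixture like this can be sampled (the dataset names here are placeholders, not the corpora actually used for gpt-2-70m), 🤗 Datasets' interleave_datasets handles the weighted interleaving:

```python
from datasets import load_dataset, interleave_datasets

# Placeholder dataset names -- substitute the actual textbook, web, and
# educational corpora; these are NOT the sources used for gpt-2-70m.
textbooks = load_dataset("my-org/textbook-pdfs", split="train", streaming=True)
web = load_dataset("my-org/filtered-web", split="train", streaming=True)
edu = load_dataset("my-org/educational", split="train", streaming=True)

# 50% textbook PDFs, 30% filtered web content, 20% educational resources.
mixture = interleave_datasets(
    [textbooks, web, edu],
    probabilities=[0.5, 0.3, 0.2],
    seed=42,
    stopping_strategy="all_exhausted",
)
```

With streaming datasets, examples are drawn on the fly according to those probabilities, so the mixture ratio holds approximately over the ~1B training tokens.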
If you're interested in the full story of how we discovered this optimal mixture and why curriculum learning catastrophically failed, check out the complete article: https://huggingface.co/blog/codelion/optimal-dataset-mixing
Sometimes less really is more - when you mix it right.
reacted to daqc's post with 😎 3 months ago
Just applied for HF Community Grant for “Hugging Research” — a lightweight CodeAgent‑based research assistant built on Hugging Face’s Open Deep Research project for the Hugging Face Hub (models, datasets, Spaces, users, collections, papers). It gathers links via dedicated tools and organizes them for easy review.
As this is for the community, comments and suggestions are appreciated: daqc/hugging-research#1
reacted to MonsterMMORPG's post with ❤️ 5 months ago
I have concluded the first 8 trainings of Qwen Image LoRA - we are not at the level of FLUX yet, and the next 8 trainings are hopefully starting - 2656x2656px image generated with the 8-step Fast Qwen LoRA + a LoRA trained on myself:
Grid test results shared here along with App installer : https://www.patreon.com/posts/137551634
reacted to mrs83's post with 👀 6 months ago
Introducing the Computer Says No Dataset:
ethicalabs/computer-says-no
An LLM can do almost anything, but should it?
This dataset provides clear examples of when LLMs should decline requests, such as:
- Counting characters (e.g., "number of 'r's in 'raspberry'" – seriously, you’ve got this)
- Solving basic equations (like *5.9 = x + 5.11* – please, show that calculator some love)
Inspired by Little Britain's iconic "Computer Says No" sketch, we address a critical issue in AI systems today: the waste of using a rocket launcher to swat flies (aka powerful models for trivial tasks).
Goals:
- Reduce waste by saving compute for tasks that actually need it
- Guide users to better tools
- Spark discussion about ethical AI
This isn’t a training set. It’s a provocation: if we don’t define AI's limits, who will?
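For anyone who wants to browse the examples programmatically, here is a minimal loading sketch with 🤗 Datasets (the split name is an assumption; check the dataset card for the actual configuration):

```python
from datasets import load_dataset

# Load the Computer Says No dataset from the Hub.
# The "train" split and column layout are assumptions; see the dataset card.
ds = load_dataset("ethicalabs/computer-says-no", split="train")
print(ds[0])
```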
reacted to pcuenq's post with 🔥 6 months ago
OpenELM in Core ML
Apple recently released a set of efficient LLMs in sizes varying between 270M and 3B parameters. Their quality, according to benchmarks, is similar to OLMo models of comparable size, but they required half the pre-training tokens because they use layer-wise scaling, where the number of attention heads increases in deeper layers.
I converted these models to Core ML, for use on Apple Silicon, using this script: https://gist.github.com/pcuenca/23cd08443460bc90854e2a6f0f575084. The converted models were uploaded to this community on the Hub for anyone who wants to integrate them inside their apps: corenet-community/openelm-core-ml-6630c6b19268a5d878cfd194
The conversion was done with the following parameters:
- Precision: float32.
- Sequence length: fixed to 128.
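The exact conversion lives in the gist linked above; as a rough sketch of what an export with those settings can look like (the wrapper, model size, and deployment target below are illustrative choices, not copied from the gist):

```python
import numpy as np
import torch
import coremltools as ct
from transformers import AutoModelForCausalLM


class LogitsOnly(torch.nn.Module):
    """Wrap the HF model so tracing sees a plain tensor output (just the logits)."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, input_ids):
        return self.model(input_ids).logits


# One of the OpenELM checkpoints; trust_remote_code is needed for the custom architecture.
model = AutoModelForCausalLM.from_pretrained("apple/OpenELM-270M", trust_remote_code=True)
wrapped = LogitsOnly(model).eval()

seq_len = 128  # fixed sequence length, matching the settings listed above
example_ids = torch.zeros((1, seq_len), dtype=torch.int64)
traced = torch.jit.trace(wrapped, example_ids)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="input_ids", shape=(1, seq_len), dtype=np.int32)],
    compute_precision=ct.precision.FLOAT32,  # float32, as noted above
    minimum_deployment_target=ct.target.macOS13,
)
mlmodel.save("OpenELM-270M.mlpackage")
```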
With swift-transformers (https://github.com/huggingface/swift-transformers), I'm getting about 56 tok/s with the 270M on my M1 Max, and 6.5 with the largest 3B model. These speeds could be improved by converting to float16. However, there's some precision loss somewhere and generation doesn't work in float16 mode yet. I'm looking into this and will keep you posted! Or take a look at this issue if you'd like to help: https://github.com/huggingface/swift-transformers/issues/95
I'm also looking at optimizing inference using an experimental kv cache in swift-transformers. It's a bit tricky because the layers have varying number of attention heads, but I'm curious to see how much this feature can accelerate performance in this model family :)
Regarding the instruct fine-tuned models, I don't know the chat template that was used. The models use the Llama 2 tokenizer, but the Llama 2 chat template, or the default Alignment Handbook one that was used to train, are not recognized. Any ideas on this welcome!