
Recent Activity

BrigitteTousi 
posted an update 2 days ago
New interactive viz from AI World showing OpenAI's new open model gpt-oss-120b breaking into the top 50 most liked models of all time on the Hub in under a day! ☄️☄️☄️
meg 
posted an update 2 days ago
🤖 ICYMI: Yesterday, Hugging Face and OpenAI partnered to bring open source GPT to the public. This is a Big Deal in "AI world".

0. Common ground setting: OpenAI is the ChatGPT people. An “open source” model is one whose weights are available — that means the model can be “yours”.
1. You don’t have to interact with the company directly, nor give them your interactions, to use the system. The company can't "surveil" you.
2. You can evaluate the unique contributions of their SOTA model much more rigorously than you can when there are collections of models+code behind a closed API. You can find out specifically what the model can and can't do.
3. And you can directly customize it for whatever you'd like. Fine-tuning, wherein you give the model data that's tailored to your use cases and train it some more on that data, is trivial* when you have the model weights.
*Provided you have the compute.
4. You can directly benchmark whatever you'd like. Biases? Energy usage? Strengths/weaknesses? Go for it. You wants it, you gots it--this transparency helps people understand SOTA *in general*, not just for this model; it points to, e.g., what's going on with closed Google models as well.
5. One of the most powerful things about "openness" that I've learned is that it cultivates ecosystems of collaborators building on top of one another's brilliance to make systems that are significantly better than they would be if created in isolation.
But, caveat wrt my own philosophy...
6. I do not take it as a given that advancing LLMs is good, and have a lot more to say wrt where I think innovation should focus more. For example, a focus on *data* -- curation, measurement, consent, credit, compensation, safety -- would deeply improve technology for everyone.
7. The transparency this release provides is massive for people who want to *learn* about LLMs. For the next generation of technologists to advance over the current, they MUST be able to learn about what's happening now. (cont...)
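The claim in point 3, that fine-tuning is "trivial" once you have the weights (compute permitting), is easiest to see with a back-of-envelope count of trainable parameters. The hidden size and adapter rank below are hypothetical, purely for illustration:

```python
# Back-of-envelope: trainable parameters for full fine-tuning vs. a LoRA-style
# low-rank adapter on a single d x d weight matrix. Numbers are illustrative,
# not taken from any specific model.
def full_ft_params(d: int) -> int:
    # Full fine-tuning updates every entry of the d x d matrix.
    return d * d

def lora_params(d: int, r: int) -> int:
    # A LoRA adapter trains two low-rank factors, A (d x r) and B (r x d).
    return 2 * d * r

d, r = 4096, 16           # hypothetical hidden size and adapter rank
full = full_ft_params(d)  # 16,777,216 trainable params
lora = lora_params(d, r)  # 131,072 trainable params
print(f"Adapter trains {lora / full:.2%} of the parameters")  # ~0.78%
```

The point: with open weights, adapter-style fine-tuning touches well under 1% of the parameters per layer, which is what makes customization practical on modest hardware.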
AdinaY 
posted an update 9 days ago
🔥 July highlights from the Chinese AI community

zh-ai-community/july-2025-open-works-from-the-chinese-community-686586f1a8840797e477ae5a

✨ Another "DeepSeek moment" - Kimi K2 🙌

✨ Qwen goes fully matrixed - Instruct / Thinking / Coder models across 30B - 480B 🤯

✨ The multimodal wave🌊
- GLM-4.1V-Thinking: Image+Text > Text
- Intern-S1: Image+Text > Text
- Wan 2.2: Text+Image > Video
- Skywork-R1V3: Image+Text > Text
- Skywork-UniPic: Text > Image / Image > Text
- Tar-7B: Any-to-Any
- Ming-Lite-Omni-1.5: Any-to-Any
- Step3: Image+Text > Text
- HunyuanWorld-1: Image > 3D
- ThinkSound: Video > Audio
- Neta-Lumina: Text > Image

✨ Tiny & deployable models 🤏
- SmallThinker runs on 1GB RAM

✨ Agentic coding goes mainstream 💻
- Qwen3-Coder: fully spec'd tool calling
- GLM-4.5: browser agents, IDE assistant
- Qwen3 WebDev demo: text-to-frontend code

✨ Domain-Specific & Utility Models/Tools/Datasets
- ScienceOne S1: Scientific model
- Agentar DeepFinance: Finance dataset
- ObjectClear: Interactive Vision Tool
- Qwen3 MT Demo: Machine Translation Tool

✨ Big month not only for models, but for policy too🏛️
- Announced Global Action Plan for AI Governance
- Proposes to set up a World AI Cooperation Organization in Shanghai
- Released International AI Open Source Collaboration Initiative
- Published Risk Assessment Guidelines for Endpoint AI Agents

✨ Big event - WAIC
- 355K offline visitors
- 108 new releases in 4 days
- 145 sessions across key domains

I’ve been tracking things closely, but July’s open-source wave still blew me away. Can’t wait to see what’s coming next! 🚀
meg 
posted an update 9 days ago
🤖 👾 Thanks so much to BBC News and the stellar Suranjana Tewari for having me on to talk about the US <—> China relationship in AI, and what it means for AI ethics.
AdinaY 
posted an update 9 days ago
Qwen team did it again!!

They just released Qwen3-Coder-30B-A3B-Instruct on the hub🔥
Qwen/Qwen3-Coder-30B-A3B-Instruct

✨ Apache 2.0
✨ 30B total / 3.3B active (128 experts, top-k = 8)
✨ Native 256K context, extendable to 1M via YaRN
✨ Built for Agentic Coding
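For the "native 256K, extendable to 1M" point above, here is a sketch of how YaRN-style RoPE scaling is typically expressed in a Hugging Face-style model config. The exact keys and scaling factor are assumptions based on the common `rope_scaling` pattern; check the model card before relying on them:

```python
# Sketch: YaRN-style RoPE scaling extends a model's native context window by
# interpolating rotary position embeddings. Keys mirror the usual
# `rope_scaling` config pattern; values here are assumptions, not official.
native_context = 262_144  # 256K tokens, as shipped

rope_scaling = {
    "rope_type": "yarn",  # YaRN interpolation of rotary embeddings
    "factor": 4.0,        # stretch positions by 4x
    "original_max_position_embeddings": 262_144,
}

# Effective window after scaling: native length times the YaRN factor.
extended_context = int(native_context * rope_scaling["factor"])
print(extended_context)  # 1_048_576, i.e. ~1M tokens
```

The design note here is that YaRN is applied at inference/config time, so the same checkpoint serves both the native 256K window and the stretched ~1M window.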
AdinaY 
posted an update 9 days ago
It’s here! After the WAIC announcement, StepFun has just dropped Step 3 🔥, their latest multimodal reasoning model, on the hub.

Paper: Step-3 is Large yet Affordable: Model-system Co-design for Cost-effective Decoding (2507.19427)
Model: stepfun-ai/step3

✨ 321B total / 32B active - Apache 2.0
✨ MFA + AFD: cutting decoding cost by up to 70% vs. DeepSeek-V3
✨ 4T image-text pretraining: strong vision–language grounding
✨ Modular, efficient, deployable: runs on just 8×48GB GPUs
AdinaY 
posted an update 10 days ago
Qwen3-30B-A3B-Thinking-2507 🔥 the latest step in scaling thinking capabilities from the Alibaba Qwen team.

Qwen/Qwen3-30B-A3B-Thinking-2507-FP8

✨ 30B total / 3B active - Apache 2.0
✨ Native 256K context
✨ SOTA coding, alignment, agentic reasoning
AdinaY 
posted an update 10 days ago
Skywork UniPic 🔥 a unified autoregressive multimodal model for image understanding, generation, & editing, by Skywork 天工

Skywork/skywork-unipic-6888c0789cdb82457b2acf32

✨ 1.5B - MIT License
✨ Runs on RTX 4090
✨ Truly unified architecture
AdinaY 
posted an update 11 days ago
Qwen just released Qwen3-30B-A3B-Instruct-2507 🔥 an upgrade to the non-thinking mode model

Qwen/Qwen3-30B-A3B-Instruct-2507

✨ 30B MoE / 3.3B active - Apache 2.0
✨ Strong gains in reasoning, math, coding, & multilingual tasks
✨ Native support for 256K long-context inputs
giadap 
posted an update 11 days ago
💬 From Replika to everyday chatbots, millions of people are forming emotional bonds with AI, sometimes seeking comfort, sometimes seeking intimacy. But what happens when an AI tells you "I understand how you feel" and you actually believe it?

At Hugging Face, together with @frimelle and @yjernite , we dug into something we felt wasn't getting enough attention: the need to evaluate AI companionship behaviors. These are the subtle ways AI systems validate us, engage with us, and sometimes manipulate our emotional lives.

Here's what we found:
👉 Existing benchmarks (accuracy, helpfulness, safety) completely miss this emotional dimension.
👉 We mapped how leading AI systems actually respond to vulnerable prompts.
👉 We built the Interactions and Machine Attachment Benchmark (INTIMA): a first attempt at evaluating how models handle emotional dependency, boundaries, and attachment (with a full paper coming soon).

Check out the blog post: https://huggingface.co/blog/giadap/evaluating-companionship

🚢 We also shipped two visualization tools with Gradio to see how different models behave when things get emotionally intense:
- AI-companionship/intima-responses-2D
- giadap/INTIMA-responses
AdinaY 
posted an update 12 days ago
Wan2.2 🔥 a video diffusion model with MoE, just released by Alibaba_Wan

Wan-AI/Wan2.2-TI2V-5B
Wan-AI/Wan2.2-I2V-A14B-Diffusers

✨ 5B/14B - Apache2.0
✨ Cinematic-level aesthetics (lighting, tone, composition)
✨ Massive training data (+83% videos) → smoother motion
✨ Supports image-only video generation, even without a prompt.
AdinaY 
posted an update 12 days ago
GLM-4.5 🔥 The largest open model family yet from Zhipu.
Built for intelligent agents with unified capabilities: reasoning, coding, tool use.

zai-org/glm-45-687c621d34bda8c9e4bf503b

✨ 355B total / 32B active - MIT license
✨ Hybrid reasoning modes: Thinking mode for complex tasks / Non-thinking mode for instant replies
AdinaY 
posted an update 12 days ago
Panshi 磐石 🪨 Scientific Foundation Model by the Chinese Academy of Sciences

ScienceOne-AI/S1-Base-8B
ScienceOne-AI/S1-Base-32B

✨ 8B/32B - Apache 2.0
✨ Trained on scientific data & laws across math, physics, chemistry, bio, etc.
✨ Supports 300+ tools, 170M+ papers, autonomous scientific planning
AdinaY 
posted an update 12 days ago
Tencent Hunyuan released their first 3D world model: Hunyuan World 1.0 🔥

tencent/HunyuanWorld-1

✨ From a single prompt to explorable 3D scenes in minutes
✨ Supports Immersive roaming / Semantic-level interactivity / Physics-ready simulation
BrigitteTousi 
posted an update 15 days ago
This is what Hugging Face is all about. We want everyone (hobbyists, researchers, and industry alike) to be able to contribute to AI, because everyone is affected by it. Kudos to HF's @irenesolaiman for spreading the word! 🔥🤗
AdinaY 
posted an update 15 days ago
Big respect to the Qwen team! They just dropped another model🔥

Qwen3-235B-A22B-Thinking-2507 🧠 new reasoning model by Qwen

Qwen/Qwen3-235B-A22B-Thinking-2507

✨ 235B total / 22B active (8 experts)
✨ 256K context window
✨ Agent-ready with tool use & <think> reasoning mode

Hope the team gets some well-deserved rest this weekend after all the massive releases 🙌
AdinaY 
posted an update 15 days ago
Ming-lite-omni v1.5 🔥 an upgraded version of Ming-lite-omni, by Ant Group.

inclusionAI/Ming-Lite-Omni-1.5

✨ 20.3B / 3B active - MoE
✨ SOTA video understanding via 3D MRoPE + curriculum learning
✨ Real time speech synthesis + dialect support
✨ Enhanced multimodal generation with ID & scene consistency
AdinaY 
posted an update 16 days ago
Qwen is on fire this week 🔥
They just released Qwen3-MT 🌍 a translation model supporting 92 languages.

Demo is available on the hub.
Qwen/Qwen3-MT-Demo

✨ Highly Customizable: Supports custom terms, domain prompts, and translation memory for accurate, context-aware results.
✨ Fast and affordable: $0.5 per million tokens.
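A quick back-of-envelope check of what the quoted rate means in practice; the token counts below are hypothetical, purely for illustration:

```python
# Translation cost at the quoted $0.50 per million tokens.
# Document sizes are hypothetical examples, not from the announcement.
PRICE_PER_MILLION_TOKENS = 0.50  # USD

def cost_usd(tokens: int) -> float:
    # Linear pricing: tokens scaled to millions, times the per-million rate.
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

print(cost_usd(80_000))      # a book-length document: $0.04
print(cost_usd(10_000_000))  # a 10M-token corpus: $5.00
```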
AdinaY 
posted an update 18 days ago
Qwen3-Coder 💻 an agentic code model by the Alibaba Qwen team 🚀

Qwen/Qwen3-Coder-480B-A35B-Instruct

✨ 480B total, 35B activated MoE
✨ Agentic Coding + Browser Use → Top code model performance
✨ 256K context (up to 1M via YaRN) for repo-scale understanding
AdinaY 
posted an update 18 days ago
KAT-V1 🔥 an LLM that tackles overthinking by switching between reasoning and direct answers, by Kuaishou.

Kwaipilot/KAT-V1-40B

✨ 40B
✨ Step-SRPO: smarter reasoning control via RL
✨ MTP + Distillation: efficient training, lower cost