Recent Activity

blog-explorers's activity

Xenova 
posted an update 3 days ago
NEW: Real-time conversational AI models can now run 100% locally in your browser! 🤯

🔐 Privacy by design (no data leaves your device)
💰 Completely free... forever
📦 Zero installation required, just visit a website
⚡️ Blazingly fast, WebGPU-accelerated inference

Try it out: webml-community/conversational-webgpu

For those interested, here's how it works:
- Silero VAD for voice activity detection
- Whisper for speech recognition
- SmolLM2-1.7B for text generation
- Kokoro for text to speech

Powered by Transformers.js and ONNX Runtime Web! 🤗 I hope you like it!
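
For those curious what the plumbing looks like, here's a rough server-side Python analogue of the same four stages (the demo itself runs in the browser with Transformers.js; the model IDs and glue code below are illustrative assumptions, not the demo's source):

# Sketch: speech in -> text -> LLM reply -> (TTS) speech out.
# Model IDs are assumptions for illustration; the browser demo uses
# Transformers.js equivalents of these stages.
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model="openai/whisper-base.en")
llm = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-1.7B-Instruct")

def respond(audio_path: str) -> str:
    # 1) In the demo, Silero VAD decides when the user has finished speaking.
    # 2) Transcribe the detected speech segment.
    user_text = asr(audio_path)["text"]
    # 3) Generate a reply with the language model.
    chat = [{"role": "user", "content": user_text}]
    out = llm(chat, max_new_tokens=128)
    reply = out[0]["generated_text"][-1]["content"]
    # 4) In the demo, Kokoro then converts the reply text back to audio.
    return reply
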
ariG23498 
posted an update 4 days ago
🚨 Implement KV Cache from scratch in pure PyTorch. 🚨

We have documented everything we learned while implementing KV Cache in nanoVLM. Joint work with @kashif, @lusxvr, @andito and @pcuenq.

Blog: hf.co/blog/kv-cache
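
The blog has the full walkthrough. As a taste, here is a minimal sketch of the core idea (illustrative only, not the nanoVLM implementation): cache the keys/values of past tokens and concatenate each new step, so earlier tokens are never re-projected.

import torch

# Minimal KV cache sketch (illustrative, not the nanoVLM code): keys/values
# grow along the sequence axis, one new token per decode step.
class KVCache:
    def __init__(self):
        self.k = None  # (batch, heads, seq, head_dim)
        self.v = None

    def update(self, k_new, v_new):
        if self.k is None:
            self.k, self.v = k_new, v_new
        else:
            self.k = torch.cat([self.k, k_new], dim=2)
            self.v = torch.cat([self.v, v_new], dim=2)
        return self.k, self.v

cache = KVCache()
for _ in range(4):  # four decode steps
    q = torch.randn(1, 8, 1, 64)  # query for the single new token
    k_all, v_all = cache.update(torch.randn(1, 8, 1, 64), torch.randn(1, 8, 1, 64))
    attn = torch.softmax(q @ k_all.transpose(-2, -1) / 64**0.5, dim=-1) @ v_all
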
KaraKaraWitch 
posted an update 4 days ago
"What's wrong with using huggingface transformers?"

Here's a quick example. Am I supposed to go in with full knowledge of the inner workings of an LLM?
import pathlib
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("<ModernBERT>")
# Triton is **required**, but nowhere in the documentation does it say that Triton is needed.
# Installing Triton on Windows isn't straightforward. Thankfully, someone has already built wheels for it:
#  - https://github.com/woct0rdho/triton-windows/releases

model = AutoModelForSequenceClassification.from_pretrained(
    "<ModernBERT>",  # reference_compile=False
)
# By default the model runs on CPU, which is slow. Move it to a CUDA device.
# Note: this will actually error out if you use "gpu" instead of "cuda".
model = model.to("cuda")


with torch.no_grad():
    # Not setting `return_tensors="pt"` causes
    #   File "C:\Program Files\Python310\lib\site-packages\transformers\modeling_utils.py", line 5311, in warn_if_padding_and_no_attention_mask
    #     if self.config.pad_token_id in input_ids[:, [-1, 0]]:
    #   TypeError: list indices must be integers or slices, not tuple
    # or...
    #  File "C:\Program Files\Python310\lib\site-packages\transformers\models\modernbert\modeling_modernbert.py", line 836, in forward
    #    batch_size, seq_len = input_ids.shape[:2]
    #  AttributeError: 'list' object has no attribute 'shape'
    block = tokenizer(
        pathlib.Path("test-fic.txt").read_text("utf-8"), return_tensors="pt"
    )
    block = block.to("cuda")
    # **block is needed to fix "AttributeError: 'NoneType' object has no attribute 'unsqueeze'" on attention_mask.unsqueeze(-1)
    logits = model(**block).logits

# Move logits to the CPU first: calling .numpy() on a CUDA tensor fails.
logits = logits.to("cpu")
# (These are per-class probabilities for the first sequence in the batch.)
predicted_class_ids = torch.softmax(logits, -1)[0].numpy()

Reality123b 
posted an update 11 days ago
Does merging models count as creating a new model myself?
celinah 
posted an update 16 days ago
✨ Today we’re releasing Tiny Agents in Python — an MCP-powered Agent in ~70 lines of code 🐍

Inspired by Tiny Agents in JS from @julien-c, we ported the idea to Python and integrated it directly into huggingface_hub, with a built-in MCP Client and a Tiny Agents CLI.

TL;DR: With MCP (Model Context Protocol), you can expose tools like web search or image generation and connect them directly to LLMs. It’s simple — and surprisingly powerful.

pip install "huggingface_hub[mcp]>=0.32.0"

We wrote a blog post where we show how to run Tiny Agents, and dive deeper into how they work and how to build your own.
👉 https://huggingface.co/blog/python-tiny-agents
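
A minimal sketch of what running one looks like in code (the exact Agent signature and MCP server spec below are assumptions based on the blog post, and may differ between huggingface_hub versions):

# Sketch only: check the blog post / huggingface_hub docs for the exact API.
import asyncio
from huggingface_hub import Agent

async def main():
    agent = Agent(
        model="Qwen/Qwen2.5-72B-Instruct",  # any tool-calling LLM
        provider="nebius",                  # inference provider (assumption)
        servers=[                           # MCP servers exposing tools (schema is an assumption)
            {"type": "stdio", "command": "npx", "args": ["@playwright/mcp@latest"]},
        ],
    )
    async with agent:
        async for chunk in agent.run("Search the web for MCP and summarize"):
            print(chunk)

asyncio.run(main())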

KaraKaraWitch 
posted an update 16 days ago
> New Model
> Looks at Model Card
> "Open-Weights"
ImranzamanML 
posted an update 27 days ago
Run an LLM locally with Docker, right inside your codebase (no GUI needed!)

In this project, I deliberately skipped supporting GUIs like Open WebUI and LM Studio. The point of using standalone LLMs with Ollama is to show how you can call them from your own project/code instead of going through a third-party app. Everything is containerized with Docker, so the setup is clean and repeatable. It's just a fun side project so my connections can learn more about running models locally in their own projects.

Tech stack used:

🐋 Docker

🦙 LLaMA via Ollama

💻 HTML/CSS/JS

🐍 Python + FastAPI

🌐 NGINX

It's still early and a fun side project, but if you're into local model deployment, or just want to see how it works, check it out at the link below! A minimal sketch of the core pattern follows after the link.

https://github.com/Imran-ml/llama-chatbot-dockerized

#LLM #Docker #OpenSource #Chatbot #LLaMA #fastapi
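
The core of such a setup is a thin API layer that forwards prompts to the local Ollama server. Here is a minimal sketch with illustrative names (not the repo's actual code):

# Sketch: FastAPI endpoint forwarding prompts to a local Ollama instance.
# Endpoint path and model name are illustrative, not the repo's code.
import httpx
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

class Prompt(BaseModel):
    text: str

@app.post("/chat")
async def chat(prompt: Prompt):
    async with httpx.AsyncClient(timeout=120) as client:
        r = await client.post(OLLAMA_URL, json={
            "model": "llama3",   # whichever model you pulled with `ollama pull`
            "prompt": prompt.text,
            "stream": False,
        })
    return {"reply": r.json()["response"]}
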
juhoinkinen 
posted an update about 1 month ago
We (@osma, @MonaLehtinen and I, i.e. the Annif team at the National Library of Finland) recently took part in the LLMs4Subjects challenge at the SemEval-2025 workshop. The task was to use large language models (LLMs) to generate good-quality subject indexing for bibliographic records, i.e. titles and abstracts.

We are glad to report that our system performed well; it was ranked

🥇 1st in the category where the full vocabulary was used
🥈 2nd in the smaller vocabulary category
🏅 4th in the qualitative evaluations.

14 participating teams developed their own solutions for generating subject headings and the output of each system was assessed using both quantitative and qualitative evaluations. Research papers about most of the systems are going to be published around the time of the workshop in late July, and many pre-prints are already available.

We applied Annif together with several LLMs, which we used to preprocess the datasets: translating the GND vocabulary terms into English, translating bibliographic records into English and German as required, and generating additional synthetic training data. After the preprocessing, we used the traditional machine learning algorithms in Annif as well as the experimental XTransformer algorithm, which is based on language models. We also combined the subject suggestions generated from the English and German records in a novel way.

More information can be found in our system description preprint: Annif at SemEval-2025 Task 5: Traditional XMTC augmented by LLMs (2504.19675)

See also the task description preprint: SemEval-2025 Task 5: LLMs4Subjects -- LLM-based Automated Subject Tagging for a National Technical Library's Open-Access Catalog (2504.07199)

The Annif models trained for this task are available here: NatLibFi/Annif-LLMs4Subjects-data
mrfakename 
posted an update about 1 month ago
Hi everyone,

I just launched TTS Arena V2 - a platform for benchmarking TTS models by blind A/B testing. The goal is to make it easy to compare quality between open-source and commercial models, including conversational ones.

What's new in V2:

- **Conversational Arena**: Evaluate models like CSM-1B, Dia 1.6B, and PlayDialog in multi-turn settings
- **Personal Leaderboard**: Optional login to see which models you tend to prefer
- **Multi-speaker TTS**: Random voices per generation to reduce speaker bias
- **Performance Upgrade**: Rebuilt from Gradio → Flask. Much faster with fewer failed generations.
- **Keyboard Shortcuts**: Vote entirely via keyboard

Also added models like MegaTTS 3, Cartesia Sonic, and ElevenLabs' full lineup.

I'd love any feedback, feature suggestions, or ideas for models to include.

TTS-AGI/TTS-Arena-V2
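
(For context on how arena-style leaderboards typically work: each blind A/B vote updates per-model ratings with an Elo-style rule. The sketch below is a generic illustration, not necessarily the exact method TTS Arena V2 uses.)

# Generic Elo update from one blind A/B vote (illustrative only).
def elo_update(r_winner: float, r_loser: float, k: float = 32.0):
    expected = 1.0 / (1.0 + 10 ** ((r_loser - r_winner) / 400.0))  # P(winner wins)
    return r_winner + k * (1.0 - expected), r_loser - k * (1.0 - expected)

ratings = {"model_a": 1500.0, "model_b": 1500.0}
# The user preferred model_a in a blind comparison:
ratings["model_a"], ratings["model_b"] = elo_update(ratings["model_a"], ratings["model_b"])
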
Reality123b 
posted an update about 1 month ago
https://huggingface.co/posts/Reality123b/379097737205276
Remember this dataset?
I'm bumping the example count to approx. 23 million prompt-response pairs.
And of course, it's going to be hybrid reasoning. Well, it isn't programmatically hybrid reasoning; rather, it will use CoT whenever necessary and skip it when it doesn't seem needed.
ImranzamanML 
posted an update about 1 month ago
🚀 New paper out: "Improving Arabic Multi-Label Emotion Classification using Stacked Embeddings and Hybrid Loss Function"
Improving Arabic Multi-Label Emotion Classification using Stacked Embeddings and Hybrid Loss Function (2410.03979)

In this work, we tackle some major challenges in Arabic multi-label emotion classification, especially class imbalance and label correlation, which often hurt model performance, particularly for minority emotions.

Our approach:

Stacked contextual embeddings from fine-tuned ArabicBERT, MarBERT, and AraBERT models.

A meta-learning strategy that builds richer representations.

A hybrid loss function combining class weighting, label correlation matrices, and contrastive learning to better handle class imbalances.

🧠 Model pipeline: stacked embeddings → meta-learner → Bi-LSTM → fully connected network → multi-label classification.
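
For a concrete feel, here is a rough PyTorch sketch of two of the three loss ingredients (class weighting and a label-correlation term; the contrastive part is omitted, and the exact formulation here is illustrative, not the paper's):

import torch
import torch.nn.functional as F

def hybrid_loss(logits, targets, class_weights, label_corr, alpha=0.1):
    # Class-weighted BCE upweights minority emotions.
    bce = F.binary_cross_entropy_with_logits(logits, targets, pos_weight=class_weights)
    # Penalize mismatch between predicted co-occurrence and the label correlation matrix.
    probs = torch.sigmoid(logits)                   # (batch, num_labels)
    pred_corr = (probs.T @ probs) / probs.shape[0]  # (num_labels, num_labels)
    return bce + alpha * F.mse_loss(pred_corr, label_corr)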

🔍 Extensive experiments show significant improvements across Precision, Recall, F1-Score, Jaccard Accuracy, and Hamming Loss.
🌟 The hybrid loss function in particular helped close the gap between majority and minority classes!

We also performed ablation studies to break down each component’s contribution and the results consistently validated our design choices.

This framework isn't just for Arabic; it offers a generalizable path for improving multi-label emotion classification in other low-resource languages and domains.

Big thanks to my co-authors: Muhammad Azeem Aslam, Wang Jun, Nisar Ahmed, Li Yanan, Hu Hongfei, Wang Shiyu, and Xin Liu!

Would love to hear your thoughts on this work! 👇
JLouisBiz 
posted an update about 2 months ago
https://www.youtube.com/watch?v=AN-iZblyZNE

Discover how to harness the power of NOMIC Embed Vision v1.5 to find similar images within GNU Emacs Dired mode. With this innovative embeddings model, you can search for images based on semantic similarities using simple keywords. This is possible because the text model of NOMIC shares the same vector space as the Embed Vision model.

In this video, we'll show you how to run the script on your computer and explore the capabilities of this groundbreaking model. You'll learn how to find similar pictures and enjoy the convenience of searching for images using just a few words.

Don't miss out on this exciting opportunity to enhance your image searching experience with NOMIC Embed Vision v1.5 in Emacs Lisp.

Script to run model:
https://gitea.com/gnusupport/LLM-Helpers/src/branch/main/bin/nomic-embed-vision-v1.5-api.py
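
The gist of the script, in Python terms: embed images and a text query into the shared NOMIC space and rank by cosine similarity. The sketch below follows the nomic-ai model cards rather than the script above, so treat the exact calls as assumptions:

# Sketch: shared-space text->image search (based on the nomic-ai model cards,
# not the script linked above; exact calls are assumptions).
import torch
import torch.nn.functional as F
from PIL import Image
from transformers import AutoImageProcessor, AutoModel
from sentence_transformers import SentenceTransformer

processor = AutoImageProcessor.from_pretrained("nomic-ai/nomic-embed-vision-v1.5")
vision = AutoModel.from_pretrained("nomic-ai/nomic-embed-vision-v1.5", trust_remote_code=True)
text_model = SentenceTransformer("nomic-ai/nomic-embed-text-v1.5", trust_remote_code=True)

def embed_image(path):
    inputs = processor(Image.open(path), return_tensors="pt")
    out = vision(**inputs).last_hidden_state
    return F.normalize(out[:, 0], p=2, dim=1)  # CLS embedding, L2-normalized

query = torch.tensor(text_model.encode(["search_query: sunset over water"]))
query = F.normalize(query, p=2, dim=1)
scores = {p: (embed_image(p) @ query.T).item() for p in ["a.jpg", "b.jpg"]}
print(sorted(scores.items(), key=lambda kv: -kv[1]))  # most similar first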
