Smoliakov

Yehor

https://t.me/doing_something

AI & ML interests

Speech-to-Text, Text-to-Speech, Voice over Internet Protocol

Recent Activity

liked a model 5 days ago

espnet/owsm_v4_medium_1B

new activity 8 days ago

Infomaniak-AI/vllm-translategemma-4b-it:How to use it with images?

liked a model 10 days ago

gliner-community/gliner_large-v2.5

Organizations

posted an update about 2 months ago

Post

326

A useful tool for anyone who works with audio datasets: https://github.com/RustedBytes/data-viewer-audio

reacted to mitkox's post with 🔥 5 months ago

Post

2834

Say hello to my little friends! I just unboxed this trio of HP Z2 G1a!

Three is always better than one!
3x AMD Ryzen AI Max+ Pro 395
384GB RAM
24TB of RAID storage
Ubuntu 24.04
ROCm 7.0.2
llama cpp, vLLM and Aibrix

Small, cheap GPUs are about to become the Raspberry Pi of edge AI inference. Sprinkle some kubectl fairy dust on top, and suddenly it's a high-availability, self-healing, cloud-native, enterprise-grade AI cluster camping in a closet.

Make sure you own your AI. AI in the cloud is not aligned with you; it’s aligned with the company that owns it.

3 replies

replied to sequelbox's post 5 months ago

where can I try it?

posted an update 5 months ago

Post

388

Added an Apptainer image to Kulyk:

Yehor/kulyk-sif

replied to their post 5 months ago

Added this tool: https://github.com/RustedBytes/audio-parquet-merger

posted an update 5 months ago

Post

327

If you work with Audio ML, look at https://github.com/RustedBytes/wav-files-toolkit

1 reply

replied to their post 5 months ago

Now with inference in Rust: https://github.com/egorsmkv/kulyk-rust

posted an update 5 months ago

Post

283

Containerized Yehor/kulyk-en-uk and Yehor/kulyk-uk-en, so you can pull an image and run the CPU version for machine translation:

docker run -p 3000:3000 --rm ghcr.io/egorsmkv/kulyk-rust:latest

reacted to MohamedRashad's post with 👍 7 months ago

Post

3290

If you are interested in trying the new rednote-hilab/dots.ocr model, I made this Space for you:

MohamedRashad/Dots-OCR

2 replies

replied to their post 8 months ago

Added a model for the reverse direction, Ukrainian to English: https://huggingface.co/spaces/Yehor/uk-en-translator

posted an update 8 months ago

Post

769

A new lightweight model for English-to-Ukrainian machine translation, built on the recently published LFM2 model. Use the Yehor/en-uk-translator demo to test it.

Facts:
- Fine-tuned for 1.4 epochs on 40M samples, selected by a quality metric from ~53.5M
- 354M params
- Requires 1 GB of RAM to run with bf16
- BLEU on FLORES-200: 27.24
- Tokens per second: 229.93 (bs=1), 1664.40 (bs=10), 8392.48 (bs=64)
- License: lfm1.0
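
As a rough sanity check on the facts above, the sketch below works out the bf16 weight footprint from the parameter count and compares throughput across batch sizes. It assumes 2 bytes per bf16 parameter and that the reported tokens-per-second figures are aggregate across the whole batch (the post does not say). The function names are hypothetical, not part of any Kulyk tooling.

```python
# Back-of-the-envelope numbers derived from the facts listed in the post.
# Assumption: tokens/s values are totals across the batch, not per sequence.

PARAMS = 354_000_000      # parameter count from the post
BYTES_PER_BF16 = 2        # bf16 stores each value in 16 bits

# Reported throughput: batch size -> tokens per second
THROUGHPUT = {1: 229.93, 10: 1664.40, 64: 8392.48}


def weight_memory_gib(params: int, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Memory taken by the weights alone, in GiB (no activations or KV cache)."""
    return params * bytes_per_param / 2**30


def per_sequence_tps(throughput: dict[int, float]) -> dict[int, float]:
    """Tokens per second seen by a single sequence at each batch size."""
    return {bs: tps / bs for bs, tps in throughput.items()}


def speedup_over_single(throughput: dict[int, float]) -> dict[int, float]:
    """Aggregate throughput relative to batch size 1."""
    base = throughput[1]
    return {bs: tps / base for bs, tps in throughput.items()}


if __name__ == "__main__":
    print(f"weights in bf16: {weight_memory_gib(PARAMS):.2f} GiB")
    for bs, tps in per_sequence_tps(THROUGHPUT).items():
        gain = speedup_over_single(THROUGHPUT)[bs]
        print(f"bs={bs}: {tps:.2f} tok/s per sequence, {gain:.1f}x total")
```

The weights alone come to roughly two thirds of a GiB, which fits the stated 1 GB requirement once runtime overhead is added. Under the aggregate-throughput assumption, larger batches raise total throughput while each individual sequence decodes more slowly.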

Model page: Yehor/kulyk-en-uk

5 replies

posted an update 11 months ago

Post

965

Esoteric practices: running model inference in PHP!

Repository: https://github.com/egorsmkv/speech-to-text-using-php

posted an update 11 months ago

Post

2510

Made a working Rust program that uses the IREE runtime to run inference with a wav2vec2-bert model for Automatic Speech Recognition.

1 reply

reacted to leonardlin's post with 👍 11 months ago

Post

2704

Happy to announce the release of Shisa V2, our latest generation of our bilingual Japanese-English language models. After hundreds of ablations and months of work, we're releasing some of the strongest open Japanese models at 7B, 8B, 12B, 14B, 32B and 70B! Full announcement here https://shisa.ai/posts/shisa-v2/ or visit the Shisa V2 HF collection: shisa-ai/shisa-v2-67fc98ecaf940ad6c49f5689

replied to their post 11 months ago

Also tested it on an A100 with TensorRT:

https://colab.research.google.com/drive/1-agoo5ll-hWEecWQAtO1FM39sqavJxph?usp=sharing

The results are not clear-cut, but it works with the base_rfdetr_fp16.onnx model and gives ~10 ms/img.

posted an update 11 months ago

Post

2706

I have made a Rust project that integrates the latest state-of-the-art object detection model. It outperforms YOLO!

Check it out: https://github.com/egorsmkv/rf-detr-usls

2 replies

replied to their post 11 months ago

This program does what datasets does. When you push a dataset created by the audiofolder script, datasets creates Parquet data and shards it internally.

So you can use audios-to-dataset instead if you need more speed than datasets provides.

posted an update 11 months ago

Post

2137

Convert your audio data to Parquet/DuckDB files with blazingly fast speeds!

Repository with pre-built binaries: https://github.com/crs-org/audios-to-dataset

2 replies
