metadata
title: Polish-English Translation (ByT5)
emoji: 🇵🇱↔️🇬🇧
colorFrom: red
colorTo: blue
sdk: gradio
app_file: app.py
license: cc-by-nc-sa-4.0
tags:
- translation
- text2text-generation
- text generation
- language translation
- polish
- english
- byt5
- t5
- tokenizer-free
- nlp
- gradio
sdk_version: 5.34.2
Two-Way Polish 🇵🇱↔️🇬🇧 English Translator with ByT5
This Space provides two-way translation between Polish and English using a single model for both directions: a fine-tuned version of Google's byt5-300m.
The key feature of this model is that it is tokenizer-free. It operates directly on raw text bytes (UTF-8) instead of relying on a learned subword vocabulary. This makes it robust for translation, because any character can be represented as a sequence of bytes, including:
- Polish diacritics (ą, ć, ę, ł, ń, ó, ś, ź, ż)
- Emojis and special symbols
- Typos or unusual spellings
How to Use
- Enter your text: Type or paste the text you want to translate.
- Select the direction: Choose either English to Polish or Polish to English. The application adds a special prefix to the text to tell the model which way to translate.
- Click Submit: The translated text will appear in the output box.
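The three steps above could be wired together with Gradio roughly as follows. This is a sketch under stated assumptions, not the Space's actual app.py: the component labels, the build_demo helper, and the echo_translate stand-in are all hypothetical, and Gradio is imported lazily so the rest of the block runs without it.

```python
# Hypothetical UI wiring; the real app.py may be organised differently.

DIRECTIONS = ["English to Polish", "Polish to English"]


def echo_translate(text: str, direction: str) -> str:
    """Stand-in for the real model call, so the UI can be tried without weights."""
    return f"[{direction}] {text}"


def build_demo(translate_fn):
    """Build a text box, a direction selector, and an output box around translate_fn."""
    import gradio as gr  # imported here so the module loads even without Gradio

    return gr.Interface(
        fn=translate_fn,
        inputs=[
            gr.Textbox(label="Text to translate", lines=4),
            gr.Radio(choices=DIRECTIONS, value=DIRECTIONS[0], label="Direction"),
        ],
        outputs=gr.Textbox(label="Translation"),
        title="Polish-English Translation (ByT5)",
    )
```

Calling build_demo(echo_translate).launch() would start a local demo. Swapping echo_translate for a function that runs the model gives the full flow.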
Model and Technical Details
This application is powered by a fine-tuned version of the google/byt5-300m model.
- Model: google/byt5-300m
- Architecture: ByT5 (Byte-level T5) is a "tokenizer-free" model that processes text as a sequence of bytes. This eliminates "unknown token" errors and allows a single model to handle multiple tasks and languages flexibly.
- Method: Two-way translation is achieved by prepending a task-specific prefix to the input before feeding it to the model (e.g., translate English to Polish: Hello world!).
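The prefix method can be sketched with the Hugging Face transformers library as shown below. Several details are assumptions: the Polish to English prefix is inferred by symmetry from the English to Polish example, the function names are hypothetical, and model_id must be supplied by the caller because the fine-tuned checkpoint's repository name is not stated here.

```python
# Minimal sketch, assuming the checkpoint loads as a standard T5-style seq2seq model.

def build_prompt(text: str, direction: str) -> str:
    """Prepend the task prefix, e.g. 'translate English to Polish: Hello world!'."""
    # Assumption: the reverse direction uses the same pattern with the languages swapped.
    return f"translate {direction}: {text.strip()}"


def translate(text: str, direction: str, model_id: str, max_new_tokens: int = 256) -> str:
    """Run one translation; model_id is a placeholder for the fine-tuned checkpoint."""
    import torch
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    # Loaded on every call for simplicity; a real app would load these once at startup.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)

    inputs = tokenizer(build_prompt(text, direction), return_tensors="pt")
    with torch.no_grad():
        # For a byte-level model, max_new_tokens limits output bytes, not words.
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


print(build_prompt("Hello world!", "English to Polish"))
```

Because the direction is carried entirely by the prefix, a single set of weights serves both translation directions.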
Created by gregniuki