metadata
title: Polish-English Translation (ByT5)
emoji: 🇵🇱↔️🇬🇧
colorFrom: red
colorTo: blue
sdk: gradio
app_file: app.py
license: cc-by-nc-sa-4.0
tags:
  - translation
  - text2text-generation
  - text generation
  - language translation
  - polish
  - english
  - byt5
  - t5
  - tokenizer-free
  - nlp
  - gradio
sdk_version: 5.34.2

Two-Way Polish 🇵🇱↔️🇬🇧 English Translator with ByT5

This Space provides two-way translation between Polish and English using a single model, a fine-tuned version of Google's byt5-300m.

The key feature of this model is that it is tokenizer-free: it operates directly on the raw UTF-8 bytes of the text instead of relying on a fixed vocabulary. Because any character can be expressed as a sequence of bytes, the model accepts any input without falling back to an "unknown" token, including:

  • Polish diacritics (ą, ć, ę, ł, ń, ó, ś, ź, ż)
  • Emojis and special symbols
  • Typos or unusual spellings
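
To make the byte-level idea concrete, here is a minimal sketch in plain Python of how text can be turned into byte-based IDs. It is an illustration, not this Space's code: the function names are hypothetical, and the offset of 3 follows the published ByT5 convention of reserving the first three IDs for special tokens (the real tokenizer also appends an end-of-sequence ID, which is omitted here).

```python
from typing import List

# Assumption: IDs 0-2 are reserved for special tokens, so byte values are shifted by 3.
SPECIAL_ID_OFFSET = 3


def text_to_byte_ids(text: str, offset: int = SPECIAL_ID_OFFSET) -> List[int]:
    """Encode text as UTF-8 and shift every byte value past the reserved IDs."""
    return [byte + offset for byte in text.encode("utf-8")]


def byte_ids_to_text(ids: List[int], offset: int = SPECIAL_ID_OFFSET) -> str:
    """Reverse the mapping: shift IDs back to byte values and decode as UTF-8."""
    return bytes(i - offset for i in ids).decode("utf-8", errors="ignore")


sample = "żółć 🙂"
ids = text_to_byte_ids(sample)

# Each Polish diacritic takes 2 bytes and the emoji takes 4,
# so 6 characters become 13 byte-level IDs.
print(len(sample), len(ids))
print(byte_ids_to_text(ids) == sample)
```

Every ID falls in a fixed range of 256 byte values plus the reserved ones, which is why no input character can be out of vocabulary. The trade-off is longer sequences for non-ASCII text.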

How to Use

  1. Enter your text: Type or paste the text you want to translate.
  2. Select the direction: Choose either English to Polish or Polish to English. The application adds a special prefix to the text to tell the model which way to translate.
  3. Click Submit: The translated text will appear in the output box.
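
The direction choice in step 2 can be sketched as a small lookup from the selected option to a prefix. This is an assumption about how the app might be wired, not its actual code: `PREFIXES` and `build_model_input` are hypothetical names, and only the English-to-Polish prefix is confirmed by this README; the reverse prefix is assumed by symmetry.

```python
# Hypothetical mapping from the direction shown in the UI to the task prefix.
PREFIXES = {
    "English to Polish": "translate English to Polish: ",
    # Assumed by symmetry; the README only shows the English-to-Polish prefix.
    "Polish to English": "translate Polish to English: ",
}


def build_model_input(text: str, direction: str) -> str:
    """Return the string that would be passed to the model for a given direction."""
    if direction not in PREFIXES:
        raise ValueError(f"Unknown direction: {direction!r}")
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("Input text is empty")
    return PREFIXES[direction] + cleaned


print(build_model_input("Hello world!", "English to Polish"))
```

Rejecting unknown directions and empty input before the model is called keeps malformed prompts away from the model.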

Model and Technical Details

This application is powered by a fine-tuned version of the google/byt5-300m model.

  • Model: google/byt5-300m
  • Architecture: ByT5 (Byte-level T5) is a "tokenizer-free" model that processes text as a sequence of UTF-8 bytes. Because every input maps to known byte values, there are no "unknown token" errors, and a single model can handle multiple tasks and languages.
  • Method: Two-way translation is achieved by prepending a task-specific prefix to the input before feeding it to the model (e.g., translate English to Polish: Hello world!).
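
A minimal sketch of how prefix-based inference could look with the Hugging Face `transformers` library is shown below. It is not taken from this Space's `app.py`: the function names, the `max_new_tokens` default, and the checkpoint path in the usage note are assumptions, and the model is loaded lazily so the translation logic can be exercised with a stand-in generator.

```python
from typing import Callable


def load_generator(model_id: str, max_new_tokens: int = 256) -> Callable[[str], str]:
    """Load a T5-style checkpoint and return a function mapping a prompt to output text."""
    # Imported here so the rest of the module works without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    model.eval()

    def generate(prompt: str) -> str:
        # For ByT5, the tokenizer converts the prompt to byte-level IDs.
        inputs = tokenizer(prompt, return_tensors="pt")
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
        return tokenizer.decode(output_ids[0], skip_special_tokens=True)

    return generate


def translate(text: str, prefix: str, generate: Callable[[str], str]) -> str:
    """Prepend the task prefix and hand the full prompt to the generator."""
    return generate(prefix + text)
```

Usage would look roughly like `generate = load_generator("path/to/fine-tuned-checkpoint")` followed by `translate("Hello world!", "translate English to Polish: ", generate)`, where the checkpoint path is a placeholder for the fine-tuned model.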

Created by gregniuki