---
license: apache-2.0
title: Easy-OCR
sdk: gradio
emoji: 📚
colorFrom: blue
colorTo: purple
thumbnail: >-
  https://cdn-uploads.huggingface.co/production/uploads/6495d5a915d8ef6f01bc75eb/TSmoqoWGoatq_GLsau_La.png
short_description: GPU-Accelerated OCR
---

# Easy‑OCR · ZeroGPU Multilingual PDF Text Extractor

**Why this Space?**

All the power of GPU‑accelerated OCR, yet you **only pay for the GPU seconds you actually use**, thanks to the Hugging Face **ZeroGPU** backend.

## 🔑 Key features

| Category | Details |
|----------|---------|
| 💡 On‑demand GPU | `@spaces.GPU` wraps only the OCR phase – choose **native** mode and the app never even touches a GPU. |
| 📝 Hybrid extraction | First pulls native PDF text with **pdfplumber**, then OCRs any remaining images with **EasyOCR**. |
| 🌍 Multilingual | Pick one or more language codes in the dropdown – the app loads only those EasyOCR models for sharper accuracy and faster warm‑up. |
| ⚡ Streaming UX | Text appears page‑by‑page with a live progress bar. |
| 📥 Download | One‑click `.txt` export of the full extraction. |
| 🛡️ Robust | File‑size guard, CUDA OOM fallback, unsupported‑language warnings. |
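
The hybrid flow and the GPU‑free **native** mode listed in the table can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Space's actual code: `extract_pages`, the `"hybrid"` mode label, and the `ocr_images` callable are hypothetical names. On a ZeroGPU Space, `ocr_images` would be the one function decorated with `@spaces.GPU`, and the per‑page inputs would come from pdfplumber.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# One page = (native text or "", list of embedded images). In a real app these
# would come from pdfplumber; plain values keep the sketch self-contained.
Page = Tuple[str, List[object]]


def extract_pages(
    pages: Iterable[Page],
    mode: str,
    ocr_images: Callable[[List[object]], List[str]],
) -> Iterator[Tuple[int, str]]:
    """Yield (page_number, text) one page at a time so a UI can stream results.

    `ocr_images` is assumed to be the only GPU-bound step (hypothetical name).
    It is never called when mode is "native", so no GPU is requested.
    """
    for number, (native_text, images) in enumerate(pages, start=1):
        # Native text always comes first; OCR output is appended after it.
        parts = [native_text] if native_text else []
        if mode != "native" and images:
            parts.extend(ocr_images(images))
        yield number, "\n".join(parts)
```

Because the function is a generator, a Gradio handler could forward each `(page_number, text)` pair as it arrives, which is one way to get the page‑by‑page streaming behaviour.
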

## 🚀 Deploy your own

**Note**: The first OCR call downloads EasyOCR model weights (~200 MB per language group).
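
Since the weights are fetched and loaded on first use, one common way to avoid repeating that cost is to cache one reader per language selection. The sketch below assumes this approach; `normalize_langs` and `get_reader` are hypothetical helper names and the cache size is arbitrary, while `easyocr.Reader(lang_list, gpu=...)` is EasyOCR's own constructor.

```python
from functools import lru_cache
from typing import Iterable, Tuple


def normalize_langs(codes: Iterable[str]) -> Tuple[str, ...]:
    """Strip, de-duplicate, and sort codes so equal selections share a cache key."""
    return tuple(sorted({c.strip() for c in codes if c.strip()}))


@lru_cache(maxsize=4)  # assumption: only a few language sets are active at once
def get_reader(langs: Tuple[str, ...], use_gpu: bool = True):
    """Build an EasyOCR reader once per language set; later calls reuse it."""
    import easyocr  # imported lazily so native-only runs never load it

    return easyocr.Reader(list(langs), gpu=use_gpu)
```

A call such as `get_reader(normalize_langs(["fr", "en"]))` would then return the same reader object as `get_reader(("en", "fr"))`, so the download and model load happen once per language set within a running process.
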

## 💡 Usage tips

* Large PDFs can take several minutes; the GPU reservation duration is `60 s`.
* When you know your PDF is **text only**, selecting **native** mode skips the GPU altogether for near‑instant results.
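
If you are unsure whether a PDF is text only, one possible check is to confirm that every page yields text and carries no embedded images. This is a minimal sketch, assuming pdfplumber page objects (which expose `extract_text()` and an `images` list); the helper names `is_text_only` and `pdf_is_text_only` are hypothetical and not part of this Space.

```python
from typing import Iterable


def is_text_only(pages: Iterable) -> bool:
    """True if every page has extractable text and no embedded images.

    Conservative by design: a blank page counts as "not text only".
    """
    for page in pages:
        text = page.extract_text() or ""  # may be None or "" on image-only pages
        if page.images or not text.strip():
            return False
    return True


def pdf_is_text_only(path: str) -> bool:
    """Convenience wrapper that opens the file with pdfplumber."""
    import pdfplumber  # lazy import keeps the pure check above dependency-free

    with pdfplumber.open(path) as pdf:
        return is_text_only(pdf.pages)
```

When `pdf_is_text_only("report.pdf")` returns `True` (the file name is only a placeholder), **native** mode should be a safe choice; otherwise the OCR path is the better default.
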