PaddleOCR-VL Training example
It's complicated to prepare a dataset that includes both text detection and recognition annotations. I'm looking for a training method that takes an input image and outputs the text in the exact format I want; this should be possible, since the final output is generated by the ERNIE decoder.
A good use case is when I want to recognize certain text in an image that also contains a lot of irrelevant text, so fine-tuning the model to output exactly what should be recognized, with a simple text label per image as training data, would be very helpful.
I was already able to achieve all of this using Florence-2.
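For reference, this is roughly the kind of per-image label data I have in mind; it's just a minimal sketch, and the file names and JSONL layout are placeholders rather than anything required by PaddleOCR-VL or the Florence-2 notebook below:

```python
import json

# Hypothetical examples: each sample pairs an image with the exact
# target string I want the model to emit (ignoring all other text).
samples = [
    {"image": "images/invoice_001.jpg", "text": "INV-2024-0137"},
    {"image": "images/invoice_002.jpg", "text": "INV-2024-0138"},
]

# Write one JSON object per line (JSONL), a common layout for
# image-to-text fine-tuning datasets.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for sample in samples:
        f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```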
Github: https://github.com/algofly-oss/vllm-ocr-training/blob/main/notebooks/2_model_training.ipynb
YT: https://www.youtube.com/watch?v=E8lWUjRNMQQ
I'm looking for an example or guide to achieve similar results with PaddleOCR-VL.
Hello, I'm very pleased to discuss the technology with you. Our technical report provides a detailed description of our data construction methods, model architecture, and training process. We will also be open-sourcing the fine-tuning code for PaddleOCR-VL soon, so please stay tuned for updates.