The kernel crashed. Can anyone please help?
Here is what I did on a Windows 10 PC with CPU only:
pip install paddlepaddle
pip install "paddleocr[doc-parser]"
Both install successfully.
Then I use VS Code to create an .ipynb file with the following code:
from paddleocr import PaddleOCRVL
pipeline = PaddleOCRVL()
output = pipeline.predict("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/paddleocr_vl_demo.png")
for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")
Then it gives me the following output (it also downloads a lot of files to my disk, about 2 GB):
Creating model: ('PP-DocLayoutV2', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: C:\Users\Administrator.DESKTOP-6EEI2PG\.paddlex\official_models\PP-DocLayoutV2.
Creating model: ('PaddleOCR-VL-0.9B', None)
Model files already exist. Using cached files. To redownload, please delete the directory manually: C:\Users\Administrator.DESKTOP-6EEI2PG\.paddlex\official_models\PaddleOCR-VL.
But at the end, it gives me this and stops running:
The Kernel crashed while executing code in the current cell or a previous cell.
Please review the code in the cell(s) to identify a possible cause of the failure.
Click here for more info.
View Jupyter log for further details.
Okay, so what’s happening is actually pretty normal when you try to run these huge PaddleOCRVL models on a CPU-only machine. The models you’re loading (PP-DocLayoutV2 and PaddleOCR-VL-0.9B) are big, roughly 1–2 GB each, and Jupyter inside VS Code just freaks out when Python tries to load all of that into memory at once. That’s why your kernel crashes.
Basically, your computer is saying:
“Bro… too much RAM for me, I’m out 😵💫”
Here’s what you can do:
- Use a smaller model.
The VL model is massive. PaddleOCR has “lite” or smaller CPU-friendly versions. They do almost the same job for a demo and only take ~200–300MB.
from paddleocr import PaddleOCR
ocr = PaddleOCR(use_angle_cls=True, lang='en') # much smaller
result = ocr.ocr("demo_image.png")
- Don’t run it in Jupyter for now.
Just make a Python script (demo.py) and run it with python demo.py. Jupyter eats extra RAM and sometimes causes crashes even if your CPU could handle it. (There’s a small demo.py sketch after this list.)
- Check your downloads.
If the models only partially downloaded, your system can crash trying to load them. Go to:
C:\Users\Administrator.DESKTOP-6EEI2PG\.paddlex\official_models\
and delete the folders. Then run with the smaller model; it’ll download fresh files that won’t fry your RAM.
- Optional: Cloud or GPU.
If you really need the big VL model, then Colab or a GPU machine is way easier. Otherwise, stick to smaller stuff.
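If it helps, here is a minimal demo.py sketch based on the smaller-model snippet above. Treat it as a rough example: the file name demo_image.png is just a placeholder, and the result format follows the classic [box, (text, confidence)] layout that the code later in this thread also assumes.
# demo.py - run this outside Jupyter with: python demo.py
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='en')   # lighter detection + recognition models
result = ocr.ocr("demo_image.png")               # replace with the path to your own image

# result[0] holds [box, (text, confidence)] entries for the first image
for line in result[0]:
    box, (text, score) = line
    print(f"{score:.2f}  {text}")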
Honestly, you’re not doing anything wrong — your PC just hit its limits. 😅
Thank you so much, I can get the result now. How can I convert the results into a human-readable doc?
Hey bro, awesome that it’s working now! 😄
So yeah, the output you get from ocr.ocr() is basically a list of detected text boxes with their coordinates and confidence scores. To make it more “human-readable,” you can just extract the text part and write it into a simple .txt or .docx file.
Here’s a quick example that’ll do the job 👇
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_angle_cls=True, lang='en')
result = ocr.ocr("demo_image.png")

# Extract text lines
lines = []
for line in result[0]:
    text = line[1][0]
    lines.append(text)

# Save as a readable file
with open("output.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(lines))

print("✅ Done! Check output.txt for the extracted text.")
If you want a more formatted doc (like Word), you can install python-docx:
pip install python-docx
and then:
from docx import Document

doc = Document()
for text in lines:
    doc.add_paragraph(text)
doc.save("output.docx")
That’ll give you a clean, readable Word file version.
I hope this helps 😅🐱
Thank you, bro. I tried your code but I did not get the correct output in the docx file. It only gives me a line of letters. Here are the image and the output in the docx file:
image: 
docx:
Nice 🐱
Thank you, but the code does not work well, as I mentioned above. Can you kindly help me check what is wrong?
Ahhh, got it. Sorry, I mistakenly thought the image was the output file 😅
Cause: math/science PDFs are tricky because they’re not just normal text; they have formulas, tables, and special symbols. PaddleOCR’s simple text extraction just pulls lines of text, so it doesn’t parse math expressions into LaTeX or a structured format.
You can use specialized OCR for it: MathPix API or im2latex models can convert images to LaTeX.
For MathPix: If you are a student, you can sign up with an educational email to get 20 free Snips per month. This should be sufficient for occasional use. For more frequent usage, you might consider the Pro plans.
For im2latex: you can clone the im2latex repository and run it locally. This is completely free and doesn't require any API keys.
Note: derivatives can still be too much even for this specialized OCR, so manual checking is essential.
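To give you an idea of the MathPix route, here is a rough sketch of calling it with the requests library. I'm writing the request details from memory (the v3/text endpoint, app_id/app_key headers, and the src field as a base64 data URI), so please double-check them against the MathPix API docs before relying on this:
import base64
import requests

# Read the image and encode it as a data URI (MathPix accepts base64 image input).
with open("demo_image.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf-8")

resp = requests.post(
    "https://api.mathpix.com/v3/text",
    headers={"app_id": "YOUR_APP_ID", "app_key": "YOUR_APP_KEY"},  # placeholders
    json={
        "src": "data:image/png;base64," + img_b64,
        "formats": ["text", "latex_styled"],   # plain text plus LaTeX output
    },
)
print(resp.json())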
If your computer has limited resources but you still want to recognize complex formulas, I suggest using PaddleOCR’s PP-StructureV3, which is a pipeline solution that combines multiple models. Here is the introduction and user guide. You can choose models with different parameter sizes according to your computer’s capabilities.
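As a rough sketch, PP-StructureV3 follows the same pipeline pattern as the PaddleOCRVL code you posted at the top, so usage should look roughly like this (the input file name and output directory are placeholders; check the user guide for the exact options):
from paddleocr import PPStructureV3

# PP-StructureV3 chains layout analysis, table recognition and formula recognition.
pipeline = PPStructureV3()

# Use your own image or PDF path here.
output = pipeline.predict("demo_image.png")
for res in output:
    res.print()                               # inspect the structured result
    res.save_to_json(save_path="output")      # structured JSON
    res.save_to_markdown(save_path="output")  # human-readable Markdown (formulas as LaTeX)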
Ahh, I really didn't know about this model before.
Well done team 👏
