# PaddleOCR v4 (PP-OCRv4)

## Model Description

**PP-OCRv4** is the fourth-generation end-to-end optical character recognition system from the PaddlePaddle team.

It combines a lightweight **text detection → angle classification → text recognition** pipeline with improved training techniques and data augmentation, delivering higher accuracy and robustness while staying efficient for real-time use.

PP-OCRv4 supports multilingual OCR (Latin and non-Latin scripts), irregular layouts (rotated/curved text), and challenging inputs such as noisy or low-resolution images often found in mobile and document-scan scenarios.
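
For orientation, the sketch below shows how this detection → classification → recognition pipeline is typically driven from the upstream `paddleocr` Python package. It is illustrative only and separate from the Nexa NPU deployment described under "How to use"; the exact result layout can vary between PaddleOCR releases, and `sample.jpg` is a placeholder path.

```python
# Minimal sketch using the upstream paddleocr Python package
# (pip install paddleocr paddlepaddle). Illustrative only; this is not the
# Nexa SDK / Qualcomm NPU interface described later in this card.
from paddleocr import PaddleOCR

# In PaddleOCR 2.7 the default ocr_version is PP-OCRv4; use_angle_cls enables
# the optional text-angle classifier between detection and recognition.
ocr = PaddleOCR(use_angle_cls=True, lang="en")

# "sample.jpg" is a placeholder path; any photo, scan, or screenshot works.
result = ocr.ocr("sample.jpg", cls=True)

# Results are nested per image/page; each line carries a polygon box,
# the recognized text, and a confidence score (exact layout may differ
# slightly across PaddleOCR versions).
for box, (text, score) in (result[0] or []):
    print(f"{text!r} (score={score:.2f}) at {box}")
```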

## Features

- **End-to-end OCR**: text detection, optional angle classification, and text recognition in one pipeline.
- **Multilingual support**: pretrained models for English, Chinese, and dozens of other languages; easy finetuning for domain text.
- **Robust in real-world conditions**: handles rotation, perspective distortion, blur, low light, and complex backgrounds.
- **Lightweight & fast**: practical for both mobile apps and large-scale server deployments.
- **Flexible I/O**: works with photos, scans, screenshots, receipts, invoices, ID cards, dashboards, and UI text.
- **Extensible**: swap components (detector/recognizer), add language packs, or finetune on domain datasets.

## Use Cases

- Document digitization (invoices, receipts, forms, contracts)
- RPA and back-office automation (screen-capture OCR flows)
- Mobile scanning apps and camera-based translation/read-aloud
- Industrial and retail analytics (labels, price tags, shelf tags)
- Accessibility (screen readers and read-aloud applications)

## Inputs and Outputs

**Input**: Image (photo, scan, or screenshot).

**Output**: A list of detected text regions, each with:

- bounding box (rectangular or polygonal)
- recognized text string
- optional confidence score and orientation
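
As a purely illustrative example of that output shape, one detected region might look like the structure below. The field names are assumptions for readability, not a documented schema from Nexa-SDK or PaddleOCR.

```python
# Hypothetical representation of detected text regions; field names and values
# are illustrative only and do not reflect a documented output schema.
regions = [
    {
        "box": [[62, 18], [410, 18], [410, 54], [62, 54]],  # polygon corner points (x, y)
        "text": "TOTAL $23.90",                             # recognized string
        "score": 0.97,                                       # optional confidence
        "angle": 0,                                          # optional orientation, degrees
    },
]
```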

---

## How to use

> ⚠️ **Hardware requirement:** the model currently runs **only on Qualcomm NPUs** (e.g., Snapdragon-powered AI PCs).
> Apple NPU support is planned next.

### 1) Install Nexa-SDK

- Download the SDK and follow the steps under the "Deploy" section of Nexa's model page: [Download Windows arm64 SDK](https://sdk.nexa.ai/model/PaddleOCR%20v4)
- (Other platforms coming soon)

### 2) Get an access token

Create a token in the Model Hub, then log in:

```bash
nexa config set license '<access_token>'
```

### 3) Run the model

Running:

```bash
nexa infer NexaAI/paddleocr-npu
```

---

## License

- Licensed under [Apache-2.0](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/LICENSE)

## References

- GitHub repo: [https://github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
- Model zoo & documentation: [Models list](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/models_list_en.md)