oi-OCR

oi-OCR is Open Innovation AI's document-parsing tool. It extracts structured Markdown, layout, tables, and chart data from PDFs for downstream RAG ingestion, agentic workflows, and document understanding tasks.

ParseBench Results (April 2026)

Dimension	Score	Rank on the public leaderboard
Charts	78.48	#1 of 47
Tables	87.06	#9
Content Faithfulness	87.24	#18
Semantic Formatting	65.65	#6
Visual Grounding	68.71	#6 (tied with Reducto)
Overall (mean of 5)	77.43	#2 of 47
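
The Overall row can be checked directly from the five dimension scores above. The snippet below is a minimal sketch that assumes "mean of 5" is a plain unweighted arithmetic mean. The variable names are illustrative and are not taken from the parse-bench tooling.

```python
from statistics import mean

# Per-dimension scores copied from the results table above.
scores = {
    "Charts": 78.48,
    "Tables": 87.06,
    "Content Faithfulness": 87.24,
    "Semantic Formatting": 65.65,
    "Visual Grounding": 68.71,
}

# Assumption: "Overall (mean of 5)" is an unweighted arithmetic mean.
overall = mean(scores.values())
print(f"Overall: {overall:.2f}")  # prints "Overall: 77.43"
```

The five scores sum to 387.14, so the mean is 77.428, which rounds to the reported 77.43.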

Evaluated on the ParseBench-Full suite: 2,037 single-page PDFs across chart, layout, table, and text groups.

oi-OCR ranks #1 on the Charts dimension, ahead of LlamaParse Agentic (78.11), Reducto Agentic (73.40), OpenAI GPT-5.5 Reasoning Medium (65.53), Google Gemini 3 Flash Thinking High (64.79), and Anthropic Opus 4.7 (55.84).

On Overall, only LlamaParse Agentic ranks higher.

Structured eval data: .eval_results/parsebench.yaml.

Evaluation methodology

Benchmark: ParseBench-Full — 2,037 single-page PDFs from real enterprise documents (insurance, finance, government, scientific, etc.)
Evaluator: official parse-bench CLI
Scoring mode: rule-only (LLAMACLOUD_BENCH_LLM_NORMALIZATION=off) — stricter than the leaderboard's default judge mode.
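
As a rough illustration of the rule-only setting, the sketch below runs a command with `LLAMACLOUD_BENCH_LLM_NORMALIZATION=off` in its environment. The helper names `rule_only_env` and `run_rule_only` are hypothetical. The parse-bench command line itself is not documented here, so the example uses a stand-in child process that only echoes the variable back; substitute the real evaluator invocation.

```python
import os
import subprocess
import sys


def rule_only_env(base=None):
    """Return a copy of the environment with LLM normalization switched off."""
    env = dict(os.environ if base is None else base)
    env["LLAMACLOUD_BENCH_LLM_NORMALIZATION"] = "off"
    return env


def run_rule_only(command):
    """Run `command` (a list of arguments) with the rule-only variable set."""
    return subprocess.run(
        command,
        env=rule_only_env(),
        check=True,            # raise if the command exits non-zero
        capture_output=True,
        text=True,
    )


if __name__ == "__main__":
    # Stand-in command (assumption): a child Python process that prints the
    # variable. Replace it with the actual parse-bench invocation.
    echo = [
        sys.executable,
        "-c",
        "import os; print(os.environ['LLAMACLOUD_BENCH_LLM_NORMALIZATION'])",
    ]
    print(run_rule_only(echo).stdout.strip())
```

Passing the variable through `env` rather than mutating `os.environ` keeps the setting scoped to the one child process.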

Public leaderboard

Full benchmark comparison across all 47 entries: parsebench.ai

About

Open Innovation AI builds enterprise AI tools for the GCC and beyond, with first-class English and Arabic document support.

