📦 OCR Time Capsule

About OCR Time Capsule

OCR Time Capsule helps researchers and digital humanities professionals compare the original OCR text of historical documents with AI-improved versions. The tool is built for browsing pre-processed OCR improvements stored in HuggingFace datasets; it does not run OCR itself.

🎯 Use Cases

  • Review OCR corrections from historical newspapers
  • Assess the quality of digitization projects
  • Validate training data for OCR models
  • Improve the accessibility of scanned texts

⚡ Key Features

  • Side-by-side text comparison
  • Character, word, and line-level diffs
  • Keyboard navigation (J/K or arrows)
  • Direct HuggingFace dataset integration
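
The character, word, and line-level diffs listed above can be illustrated with a minimal sketch. It uses Python's standard difflib module and is not necessarily how OCR Time Capsule computes its diffs; the function names diff_ops and similarity are hypothetical and chosen only for this example.

```python
import difflib


def diff_ops(original, improved, level="word"):
    """Return (tag, old, new) tuples for every changed span.

    `level` selects the granularity: "char", "word", or "line".
    This is an illustrative helper, not the tool's actual implementation.
    """
    if level == "char":
        a, b, joiner = list(original), list(improved), ""
    elif level == "word":
        a, b, joiner = original.split(), improved.split(), " "
    elif level == "line":
        a, b, joiner = original.splitlines(), improved.splitlines(), "\n"
    else:
        raise ValueError(f"unknown level: {level!r}")

    # autojunk=False keeps frequent tokens from being ignored on long inputs.
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, joiner.join(a[i1:i2]), joiner.join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def similarity(original, improved):
    """Character-level similarity in [0, 1], as defined by difflib's ratio()."""
    return difflib.SequenceMatcher(None, original, improved, autojunk=False).ratio()


if __name__ == "__main__":
    original = "Tbe quick brown fox"
    improved = "The quick brown fox"
    for level in ("char", "word", "line"):
        print(level, diff_ops(original, improved, level))
    print("similarity:", round(similarity(original, improved), 3))
```

Finer granularity isolates individual OCR misreads (a single wrong letter), while word and line levels are easier to scan when whole passages were rewritten.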

💡 Tip: For live OCR processing with vision-language models, check out OCR Time Machine. OCR Time Capsule focuses on exploring already-processed datasets for faster navigation and analysis.

Model Information

  • Model
  • Processed
  • Batch Size
  • Max Tokens

OCR Quality Metrics

  • Similarity
  • Characters
  • Added
  • Removed
  • Words
  • Markdown Detected
  • Reasoning Trace

Original OCR

Improved OCR Markdown

Final Output