==================================
Installation
==================================

This section describes how to install PDF-Extract-Kit.

Best Practices
==============

We recommend installing PDF-Extract-Kit in a dedicated conda virtual environment running Python 3.10.

**Step 1.** Create a Python 3.10 virtual environment using conda.

.. code-block:: console

    $ conda create -n pdf-extract-kit-1.0 python=3.10 -y
    $ conda activate pdf-extract-kit-1.0
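
Before moving on, it can help to confirm that the activated environment really runs Python 3.10. The following is a minimal sketch, not part of PDF-Extract-Kit; the helper name ``is_recommended_python`` is hypothetical and only illustrates the check.

.. code-block:: python

    import sys


    def is_recommended_python(version_info=sys.version_info):
        """Return True if the interpreter is Python 3.10.x.

        Hypothetical helper for illustration; it only compares the
        major and minor version numbers.
        """
        return tuple(version_info[:2]) == (3, 10)


    if __name__ == "__main__":
        found = ".".join(str(part) for part in sys.version_info[:3])
        if is_recommended_python():
            print(f"Python {found}: matches the recommended 3.10 series.")
        else:
            print(f"Python {found}: the guide recommends Python 3.10.")

Run it with the ``python`` executable of the activated environment, so that the version reported is the one conda installed.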

**Step 2.** Install the dependencies for PDF-Extract-Kit.

.. code-block:: console

    $ # For GPU devices
    $ pip install -r requirements.txt
    $ # For CPU-only devices
    $ pip install -r requirements-cpu.txt
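
If you script the installation, the choice between the two requirements files can be automated. The sketch below assumes that the presence of ``nvidia-smi`` on ``PATH`` indicates a usable NVIDIA GPU; this is only a heuristic, and both function names are hypothetical rather than part of PDF-Extract-Kit.

.. code-block:: python

    import shutil


    def pick_requirements(has_gpu):
        """Map GPU availability to the requirements file named in this guide."""
        return "requirements.txt" if has_gpu else "requirements-cpu.txt"


    def nvidia_gpu_visible():
        """Heuristic (assumption): treat nvidia-smi on PATH as 'GPU present'.

        This does not verify that the driver or CUDA runtime actually works.
        """
        return shutil.which("nvidia-smi") is not None


    if __name__ == "__main__":
        requirements_file = pick_requirements(nvidia_gpu_visible())
        # Print the command instead of running it, so the user stays in control.
        print(f"pip install -r {requirements_file}")

The script only prints the suggested command. Review it before running, since a machine may have ``nvidia-smi`` installed without a working GPU setup.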

.. note::

    To keep environment setup simple, ``requirements.txt`` covers only the dependencies of the current best models, which are:
   
    - Layout Detection: YOLO series (YOLOv10, DocLayout-YOLO)  
    - Formula Detection: YOLO series (YOLOv8)  
    - Formula Recognition: UniMERNet  
    - OCR: PaddleOCR  

    Other models, such as LayoutLMv3, require additional environment setup. For details, see :ref:`Layout Detection Algorithms <algorithm_layout_detection>`.
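
After installation, one way to check that the packages listed in a requirements file are present is to look each one up with the standard library's ``importlib.metadata``. The sketch below is an illustration under simplifying assumptions: it handles only plain ``name`` or ``name==version`` style lines, skips option lines and URL-based requirements, and does not compare version constraints. The function names are hypothetical.

.. code-block:: python

    import re
    from importlib import metadata
    from pathlib import Path


    def requirement_names(lines):
        """Extract bare distribution names from requirements-style lines.

        Simplification: option lines (starting with '-') and URL-based
        requirements (containing '://') are skipped rather than parsed.
        """
        names = []
        for raw in lines:
            if "://" in raw:
                continue
            line = raw.split("#", 1)[0].strip()  # drop trailing comments
            if not line or line.startswith("-"):
                continue
            match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
            if match:
                names.append(match.group(0))
        return names


    def missing_distributions(names):
        """Return the names that importlib.metadata cannot find."""
        missing = []
        for name in names:
            try:
                metadata.version(name)
            except metadata.PackageNotFoundError:
                missing.append(name)
        return missing


    if __name__ == "__main__":
        requirements_path = Path("requirements.txt")
        if requirements_path.exists():
            text = requirements_path.read_text(encoding="utf-8")
            missing = missing_distributions(requirement_names(text.splitlines()))
            print("Missing:", ", ".join(missing) if missing else "none")
        else:
            print("requirements.txt not found in the current directory.")

Run it from the repository root inside the activated environment. For CPU-only installs, point it at ``requirements-cpu.txt`` instead.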