# document_translator

Project to translate files using BSC's models while keeping the formatting and style of the original file.

## Requirements
### Python 3.12

### fast_align 

Clone https://github.com/clab/fast_align and build it with the compilation commands given in that project's README. Then place the resulting fast_align and atools binaries (.exe if using Windows) in this project's root.
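
A quick way to confirm the binaries ended up in the right place is a small check like the one below. This is only a sketch: the helper name `missing_aligner_binaries` is hypothetical and not part of this project, and it assumes the binaries keep their default names from the fast_align build.

```python
import platform
from pathlib import Path

# Binary names produced by the fast_align build (assumed to be unchanged).
ALIGNER_TOOLS = ("fast_align", "atools")


def missing_aligner_binaries(project_root, system=None):
    """Return the expected binary paths that are absent from project_root.

    Hypothetical helper, not part of this project.
    """
    system = system or platform.system()
    # On Windows the binaries carry an .exe suffix.
    suffix = ".exe" if system == "Windows" else ""
    root = Path(project_root)
    expected = [root / f"{tool}{suffix}" for tool in ALIGNER_TOOLS]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Run from the project's root; prints nothing when both binaries exist.
    for path in missing_aligner_binaries("."):
        print(f"missing: {path}")
```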

### fast_align fine-tuning files

I took the four files (ca-en.params, ca-en.err, en-ca.params and en-ca.err) from https://huggingface.co/projecte-aina/aina-translator-ca-en/tree/main. We could automate the download of these files in the future. For now, place them in config_folder (defined in main.py).
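
As a rough idea of how the download could be automated, the sketch below fetches whichever of the four files are missing. It makes several assumptions: the `huggingface_hub` package is installed (it may not be listed in requirements.txt), the files sit at the root of that repository, and the helper name `download_alignment_files` and its `fetch` parameter are hypothetical, introduced only for illustration.

```python
from pathlib import Path

# Repository and file names taken from the URL and list above.
REPO_ID = "projecte-aina/aina-translator-ca-en"
ALIGNMENT_FILES = ("ca-en.params", "ca-en.err", "en-ca.params", "en-ca.err")


def download_alignment_files(config_folder, fetch=None):
    """Download any alignment file missing from config_folder.

    Hypothetical helper, not part of this project. Returns the paths of
    all four files. `fetch` defaults to huggingface_hub.hf_hub_download
    and can be replaced, for example in tests.
    """
    if fetch is None:
        # Imported lazily so this module loads without huggingface_hub.
        from huggingface_hub import hf_hub_download
        fetch = hf_hub_download
    target = Path(config_folder)
    target.mkdir(parents=True, exist_ok=True)
    for name in ALIGNMENT_FILES:
        if not (target / name).is_file():
            # Assumes the files are stored at the repository root.
            fetch(repo_id=REPO_ID, filename=name, local_dir=str(target))
    return [target / name for name in ALIGNMENT_FILES]
```

Calling `download_alignment_files(config_folder)` with the same folder that main.py defines would skip files that are already present, so it is safe to run more than once.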

### Python requirements

    pip install -r requirements.txt

### mtuoc_aina_translator

To use this class, you also need to be running MTUOC's translation server with the proper translation models. There is no need to run fast_align on the server side, since this project already runs it.

### salamandrata7b_translator

Class that uses Hugging Face's demo.

## Docker

    docker build -t document-translator .
    docker run -p 7860:7860 -e HF_TOKEN=your_token_here --rm -it document-translator