Model Card for the Lingvanex English-to-Corsican Translation Model
Open-source Lingvanex translation models.
Model Description
Open-source Lingvanex translation models for use with CTranslate2.
- Developed by: Lingvanex
- Model type: CTranslate2
- Language(s) (NLP): English, Corsican
- License: MIT
Model Sources
- GitHub Repository: https://github.com/lingvanex-mt/models
- HF Repository: https://huggingface.co/lingvanex
- Demo: https://huggingface.co/spaces/lingvanex/language_translator
Uses
- Translation
Direct Use
import sentencepiece as spm
from ctranslate2 import Translator

# Directory holding the converted model and both SentencePiece models.
path_to_model = 'en_co'
source = 'en'
target = 'co'

# Load the model with 8-bit integer computation.
translator = Translator(path_to_model, compute_type='int8')
source_tokenizer = spm.SentencePieceProcessor(f'{path_to_model}/{source}.spm.model')
target_tokenizer = spm.SentencePieceProcessor(f'{path_to_model}/{target}.spm.model')

text = [
    'I need to make a phone call.',
    'Can I help you prepare food?',
    'We want to go for a walk.'
]

# Split each sentence into SentencePiece pieces.
input_tokens = source_tokenizer.EncodeAsPieces(text)

translator_output = translator.translate_batch(
    input_tokens,
    batch_type='tokens',
    beam_size=2,
    max_input_length=0,  # 0 disables input truncation
    max_decoding_length=256
)

# Keep the best hypothesis for each sentence and join the pieces back into text.
output_tokens = [item.hypotheses[0] for item in translator_output]
translation = target_tokenizer.DecodePieces(output_tokens)
print('\n'.join(translation))
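The steps above can also be wrapped in a reusable function. The following is a minimal sketch, not code from the Lingvanex repository: the names `translate_texts` and `load_en_co` are hypothetical, and the directory layout is assumed to match the snippet above. The sentences are encoded one at a time, which should also work with SentencePiece versions that do not accept a list in `EncodeAsPieces`.

```python
def translate_texts(texts, translator, source_tokenizer, target_tokenizer,
                    beam_size=2, max_decoding_length=256):
    """Tokenize, translate, and detokenize a list of sentences.

    Hypothetical helper for illustration only.
    """
    if not texts:
        return []
    # One list of SentencePiece pieces per input sentence.
    input_tokens = [source_tokenizer.EncodeAsPieces(t) for t in texts]
    results = translator.translate_batch(
        input_tokens,
        batch_type='tokens',
        beam_size=beam_size,
        max_input_length=0,  # 0 disables input truncation
        max_decoding_length=max_decoding_length,
    )
    # hypotheses[0] is the first (best) hypothesis for each sentence.
    return [target_tokenizer.DecodePieces(r.hypotheses[0]) for r in results]


def load_en_co(path_to_model='en_co', source='en', target='co'):
    """Load the translator and both tokenizers (assumed directory layout)."""
    # Imports are deferred so translate_texts can be used without them.
    import sentencepiece as spm
    from ctranslate2 import Translator

    translator = Translator(path_to_model, compute_type='int8')
    src = spm.SentencePieceProcessor(
        model_file=f'{path_to_model}/{source}.spm.model')
    tgt = spm.SentencePieceProcessor(
        model_file=f'{path_to_model}/{target}.spm.model')
    return translator, src, tgt
```

Usage, assuming the model directory is available locally: `translator, src, tgt = load_en_co()` followed by `translate_texts(['We want to go for a walk.'], translator, src, tgt)`.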
Evaluation
- Unknown
Metrics
- Unknown
Technical Specifications
Model Architecture and Objective, Software
CTranslate2