Model Card for the Lingvanex English-to-Corsican Translation Model

Open-source Lingvanex translation models.

Model Description

Open-source Lingvanex translation models in CTranslate2 format. This model translates from English to Corsican.

Developed by: Lingvanex
Model type: CTranslate2
Language(s) (NLP): English, Corsican
License: MIT

Model Sources

GitHub Repository: https://github.com/lingvanex-mt/models
HF Repository: https://huggingface.co/lingvanex
Demo: https://huggingface.co/spaces/lingvanex/language_translator

Uses

Machine translation of text from English to Corsican.

Direct Use

import sentencepiece as spm
from ctranslate2 import Translator

# Directory containing the CTranslate2 model and the SentencePiece models.
path_to_model = 'en_co'
source = 'en'
target = 'co'

# Load the model with 8-bit integer computation.
translator = Translator(path_to_model, compute_type='int8')
source_tokenizer = spm.SentencePieceProcessor(f'{path_to_model}/{source}.spm.model')
target_tokenizer = spm.SentencePieceProcessor(f'{path_to_model}/{target}.spm.model')

text = [
  'I need to make a phone call.',
  'Can I help you prepare food?',
  'We want to go for a walk.'
]

# Split each sentence into subword pieces (one list of pieces per sentence).
input_tokens = source_tokenizer.EncodeAsPieces(text)
translator_output = translator.translate_batch(
  input_tokens,
  batch_type='tokens',
  beam_size=2,
  max_input_length=0,       # 0 disables truncation of the input
  max_decoding_length=256   # upper bound on generated tokens per sentence
)

# Keep the best hypothesis for each sentence and join the pieces back into text.
output_tokens = [item.hypotheses[0] for item in translator_output]
translation = target_tokenizer.DecodePieces(output_tokens)
print('\n'.join(translation))
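For longer inputs it may be convenient to wrap the steps above in a reusable function that processes the sentences in chunks. The following is a minimal sketch, not part of the Lingvanex code: the names `chunked`, `translate_texts` and `chunk_size` are hypothetical, and the function only assumes that the objects passed in expose the same `EncodeAsPieces`, `translate_batch` and `DecodePieces` calls used in the example above.

```python
from typing import Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_texts(translator, source_tokenizer, target_tokenizer,
                    texts: Sequence[str], chunk_size: int = 64,
                    beam_size: int = 2) -> List[str]:
    """Translate `texts` chunk by chunk and return one string per input.

    `translator` is expected to behave like ctranslate2.Translator and the
    tokenizers like sentencepiece.SentencePieceProcessor (assumption: only
    the three methods used in the model card example are called).
    """
    translations: List[str] = []
    for chunk in chunked(list(texts), chunk_size):
        # Tokenize the chunk into subword pieces.
        input_tokens = source_tokenizer.EncodeAsPieces(list(chunk))
        # Same decoding options as in the example above.
        results = translator.translate_batch(
            input_tokens,
            batch_type='tokens',
            beam_size=beam_size,
            max_input_length=0,
            max_decoding_length=256,
        )
        # Keep the best hypothesis per sentence and detokenize.
        output_tokens = [item.hypotheses[0] for item in results]
        translations.extend(target_tokenizer.DecodePieces(output_tokens))
    return translations
```

With the objects created in the example above, a call would look like `translate_texts(translator, source_tokenizer, target_tokenizer, text)`. Chunking here only limits how many sentences are tokenized and held in memory at once; it does not change the translations themselves.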

Evaluation

Not reported in this model card.

Metrics

Not reported in this model card.

Technical Specifications

Model Architecture and Objective, Software

CTranslate2

Model Card Authors

Lingvanex
