PULI-LlumiX-Llama-3.1 8B base (8.03 billion parameters)

Trained with LLaMA-Factory (GitHub).
The Llama 3.1 8B Instruct model was continually pretrained on a Hungarian dataset.

Dataset for continued pretraining

Hungarian (8.08 billion words): 763K documents longer than 5,000 words each, plus Hungarian Wikipedia
English: Long Context QA (2 billion words) and BookSum (78 million words)
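
The length criterion for the Hungarian documents can be illustrated with a short sketch. It assumes words are counted by splitting on whitespace, which may differ from how the corpus was actually built; `MIN_WORDS`, `word_count`, and `keep_long_documents` are hypothetical names used only for this example.

```python
# Illustrative only: the real corpus-building pipeline is not described in this card.
MIN_WORDS = 5000  # threshold taken from the dataset description


def word_count(text: str) -> int:
    """Count whitespace-separated words (an assumed definition of 'word')."""
    return len(text.split())


def keep_long_documents(docs, min_words=MIN_WORDS):
    """Return only the documents whose word count exceeds min_words."""
    return [doc for doc in docs if word_count(doc) > min_words]
```

A document with exactly 5,000 words is dropped under this reading, since the description says documents must exceed that length.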

Limitations

max_seq_length = 16384
bfloat16
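
Inputs longer than the maximum sequence length have to be truncated or split before they reach the model. The following is a minimal sketch of one way to do that, assuming the limit is measured in tokens and the input is already a list of token IDs (for example, produced by the tokenizer); `split_into_windows` is a hypothetical helper, not part of the model's tooling.

```python
MAX_SEQ_LENGTH = 16_384  # limit stated in this card, assumed to be in tokens


def split_into_windows(token_ids, max_len=MAX_SEQ_LENGTH):
    """Split a list of token IDs into consecutive chunks of at most max_len."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]
```

Non-overlapping windows are the simplest choice; whether overlap or plain truncation is preferable depends on the task.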

Usage with pipeline

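from transformers import pipeline, LlamaForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub.
model = LlamaForCausalLM.from_pretrained("NYTK/PULI-LlumiX-Llama-3.1")
tokenizer = AutoTokenizer.from_pretrained("NYTK/PULI-LlumiX-Llama-3.1")
# Hungarian prompt: "I'll tell a story about language technology."
prompt = "Elmesélek egy történetet a nyelvtechnológiáról."
# device=0 runs generation on the first GPU.
generator = pipeline(task="text-generation", model=model, tokenizer=tokenizer, device=0)

# Print the prompt followed by up to 30 newly generated tokens.
print(generator(prompt, max_new_tokens=30)[0]["generated_text"])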

Since the model was continually pretrained from Llama 3.1 8B Instruct, it can also be used as a chat model.

import torch
from transformers import pipeline

model_id = "NYTK/PULI-LlumiX-Llama-3.1"
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Mit gondolsz a nyelvtechnológiáról?"},
]
outputs = pipe(
    messages,
    max_new_tokens=256,
)
print(outputs[0]["generated_text"][-1])
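
With chat-style input, the pipeline returns the whole conversation as a list of message dictionaries, so the final `print` above shows a dictionary with `role` and `content` keys rather than plain text. A small sketch of pulling out just the reply text follows; `last_assistant_reply` is a hypothetical helper, and `mock_outputs` is a hand-written stand-in for real pipeline output, not an actual model response.

```python
def last_assistant_reply(outputs):
    """Return the text of the final assistant message from a chat pipeline result."""
    conversation = outputs[0]["generated_text"]  # list of {"role", "content"} dicts
    last_message = conversation[-1]
    if last_message.get("role") != "assistant":
        raise ValueError("last message is not from the assistant")
    return last_message["content"]


# Hand-written stand-in with the same shape as the pipeline's chat output.
mock_outputs = [{
    "generated_text": [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Mit gondolsz a nyelvtechnológiáról?"},
        {"role": "assistant", "content": "Szerintem izgalmas terület."},
    ]
}]
print(last_assistant_reply(mock_outputs))
```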

Citation

If you use this model, please cite the following paper:

@inproceedings {yang-llumix-llama,
    title = {PULI Chat: Our First Hungarian Conversational Model},
    booktitle = {International Conference on Formal Methods and Foundations of Artificial Intelligence},
    year = {2025},
    publisher = {Eszterházy Károly Catholic University},
    address = {Eger, Hungary},
    author = {Yang, Zijian Győző and Bánfi, Ágnes and Dodé, Réka and Ferenczi, Gergő and Földesi, Flóra and Hatvani, Péter and Héja, Enikő and Lengyel, Mariann and Madarász, Gábor and Osváth, Mátyás and Sárossy, Bence and Varga, Kristóf and Váradi, Tamás and Prószéky, Gábor and Ligeti-Nagy, Noémi},
    pages = {1--3},
    pubstate = {accepted abstract},
    url = {https://uni-eszterhazy.hu/api/media/file/7f9158bd443acc29dbd2a211971fe8677768257c}
}
