---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: text-generation
model-index:
- name: BrtGPT 124M-Base
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 1
      name: strict accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-3.2-instruct-78b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 1
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-3.2-instruct-78b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 1
      name: exact match
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-3.2-instruct-78b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 1
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-3.2-instruct-78b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 1
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-3.2-instruct-78b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 1
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-3.2-instruct-78b
      name: Open LLM Leaderboard
---
# BrtGPT-1-124M-Base

- This model was trained on about 306K (306,000) tokens of English sentences.
- The model is not for question answering (unlike ChatGPT or LLaMA); it is only pre-trained on a raw text corpus, so it is a base model.
## Model Details

### Model Description

- Developed by: Bertug Gunel (Bertuğ Günel)
- Funded by: Nobody
- Shared by: Nobody
- Model type: Decoder-Only Transformer
- Language(s) (NLP): English
- License: CC-BY-NC-4.0
- Finetuned from model: None (pre-trained from scratch)
### Model Sources

- Repository: Coming soon!
- Paper: "Attention Is All You Need", arXiv:1706.03762
- Demo: The model itself is already a demo model.
## Uses

The following code loads the model (BrtGPT-124M-Base) so you can use it:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_name = "Bertug1911/BrtGPT-124m-Base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

input_text = "They're hiding"
inputs = tokenizer(input_text, return_tensors="pt")

# repetition_penalty values you want to test (example values)
penalty_values = [1.0, 1.2, 1.5]

print(f"🧪 Input: {input_text}\n")

for penalty in penalty_values:
    outputs = model.generate(
        inputs["input_ids"],
        max_new_tokens=30,
        do_sample=True,
        temperature=1.2,
        top_p=0.95,
        repetition_penalty=penalty,
    )
    generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Undo the byte-level BPE formatting: drop the literal spaces between
    # tokens, then turn the "Ġ" word-boundary marker back into a real space.
    generated_text = generated_text.replace(" ", "")
    generated_text = generated_text.replace("Ġ", " ")
    print(f"📌 repetition_penalty={penalty}:\n{generated_text}\n{'-'*50}")
```
NEW: Repetition Penalty Added!
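For intuition, here is a minimal sketch of what `repetition_penalty` does during generation. It mirrors the behaviour of `transformers`' `RepetitionPenaltyLogitsProcessor`, simplified to a single sequence for illustration: tokens that have already been generated get their logits pushed down, so sampling them again becomes less likely.

```python
import torch

def apply_repetition_penalty(
    logits: torch.Tensor,        # shape: (vocab_size,), next-token logits
    generated_ids: torch.Tensor, # shape: (seq_len,), tokens generated so far
    penalty: float,              # > 1.0 discourages repetition
) -> torch.Tensor:
    """Simplified single-sequence sketch of repetition penalty."""
    score = logits[generated_ids]
    # Positive logits are divided, negative ones multiplied, so already-seen
    # tokens always move toward lower probability.
    score = torch.where(score < 0, score * penalty, score / penalty)
    logits[generated_ids] = score
    return logits
```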
### Direct Use

There is no hosted demo yet, so the model cannot be used directly in the browser; run the code above locally instead. (Web use is coming soon!)
### Out-of-Scope Use

The model only generates (completes) English sentences with English tokens; it does not answer questions or follow instructions. Given a question, it will continue the text rather than reply, as sketched below.
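A quick way to see this, reusing the `model` and `tokenizer` loaded in the snippet above (the prompt is a hypothetical example; actual output varies with sampling):

```python
# A base model continues the prompt instead of answering it the way an
# instruction-tuned model would. Reuses `model` and `tokenizer` from above.
question = "What is the capital of France?"
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_new_tokens=30, do_sample=True, top_p=0.95)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text.replace(" ", "").replace("Ġ", " "))  # expect a continuation, not "Paris"
```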
## Bias, Risks, and Limitations

The model can generate political or otherwise sensitive content. Use at your own risk.

### Recommendations

Aside from the note above, no major risks or biases are known.
## How to Get Started with the Model

You can use the model to generate English text, either with the full snippet above or with the shorter pipeline sketch below.
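A minimal sketch using the high-level `pipeline` API; the generation parameters are the same ones used earlier, and the `Ġ` clean-up mirrors the decoding step shown above:

```python
from transformers import pipeline

generator = pipeline("text-generation", model="Bertug1911/BrtGPT-124m-Base")

result = generator(
    "They're hiding",
    max_new_tokens=30,
    do_sample=True,
    temperature=1.2,
    top_p=0.95,
)

# Same byte-level BPE clean-up as in the loading example above.
text = result[0]["generated_text"]
print(text.replace(" ", "").replace("Ġ", " "))
```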
## Training Details

### Training Data

The model was trained on `Train.csv` (about 306K tokens, 100 lines).

| Data Type | Training Type | Tokens (Total) |
|---|---|---|
| Raw (sentences) | Pre-training | About 306K |
| Instruction (coming soon!) | Coming soon! | Coming soon! |
### Training Procedure

The model was trained on a single NVIDIA A100 GPU for 15 minutes.

#### Training Hyperparameters

- Training regime: FP32 precision, sparsity enabled
## Evaluation
## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in Lacoste et al. (2019).

- Hardware Type: GPU (NVIDIA A100)
- Hours used: 0.25 (15 minutes)
- Cloud Provider: [Runpod](https://www.runpod.io/)
- Compute Region: EU
- Carbon Emitted: 0.024 kg (24 g) CO₂eq
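As a sanity check, the reported figure is consistent with the calculator's simple formula (power draw × time × grid carbon intensity). The A100 power draw and EU grid intensity below are assumed round numbers for illustration, not measured values:

```python
# Back-of-the-envelope reproduction of the reported estimate.
# ASSUMPTIONS: 0.4 kW average A100 board power and 0.24 kg CO2eq/kWh
# EU grid intensity are illustrative round numbers, not measurements.
gpu_power_kw = 0.4       # assumed average A100 power draw
hours = 0.25             # 15 minutes, as reported above
grid_kg_per_kwh = 0.24   # assumed EU grid carbon intensity

emissions_kg = gpu_power_kw * hours * grid_kg_per_kwh
print(f"{emissions_kg:.3f} kg CO2eq")  # -> 0.024 kg, matching the card
```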
## Model Card Authors

- Bertug Gunel
- Turkey/Eskisehir