# Artha-1.1B
*Artha* (Sanskrit: अर्थ) means "essence", "purpose", "meaning".
A fine-tuned TinyLlama 1.1B model trained on Artha, a token-efficient, math-based language for human-AI communication.
## The Idea
Why are we talking to AI in English?
There are 7,000 languages in the world. We picked English almost by accident, because that's what the training data was in. But what's truly universal across every language, every culture, every human mind?
Mathematics and Logic.
Artha is a compressed language built on mathematical notation, where every token carries maximum meaning and redundancy is stripped away.
## What it does
- "Please summarise this article in 3 bullet points, focus on key facts" → `sum[article](#3, fmt:bullets) +facts`
- "Fix the bug in this Python code and explain what was wrong" → `fixcode → {diff+explain}`
- "Write a formal email to a client, under 150 words" → `gen[eml](@client, tone:formal, ~150w)`
- "Compare React and Vue for a beginner, format as table" → `cmp[React, Vue] → {table}`
## Results
| Prompt Type | English Tokens | Artha Tokens | Saving |
|---|---|---|---|
| Summarisation | 12 | 3 | 75% |
| Code fix | 12 | 3 | 75% |
| Email generation | 14 | 3 | 79% |
| Comparison | 10 | 4 | 60% |
| Explanation | 11 | 3 | 73% |
| Average | 12 | 3 | 73% |
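As a quick sanity check, the savings column can be recomputed directly from the token counts above. Note that the Average row follows from the column totals (59 English tokens vs. 16 Artha tokens), which rounds to 73%:

```python
# Recompute the Results table's savings from its raw token counts.
rows = {
    "Summarisation": (12, 3),
    "Code fix": (12, 3),
    "Email generation": (14, 3),
    "Comparison": (10, 4),
    "Explanation": (11, 3),
}

def saving(english, artha):
    """Percentage of tokens removed, rounded to the nearest integer."""
    return round(100 * (english - artha) / english)

for name, (en, ar) in rows.items():
    print(f"{name}: {saving(en, ar)}%")

# The Average row uses column totals: 59 English tokens vs 16 Artha tokens.
total_en = sum(en for en, _ in rows.values())
total_ar = sum(ar for _, ar in rows.values())
print(f"Average: {saving(total_en, total_ar)}%")  # → Average: 73%
```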
## Key Insight
Compression only works when the model is natively trained on the language.
A standard English tokenizer splits `fmt:bullets` into 4 tokens.
An Artha tokenizer sees it as 1.
The model doesn't translate; it thinks natively in Artha.
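The effect can be illustrated with a toy greedy longest-match tokenizer. The vocabularies below are invented for illustration, not the real TinyLlama or Artha vocabularies; the point is only that adding `fmt:bullets` as a dedicated vocabulary entry collapses four pieces into one:

```python
# Toy longest-match tokenizer: adding a dedicated vocab entry for a
# frequent operator string collapses it into a single token.
def tokenize(text, vocab):
    """Greedy longest-match segmentation; unknown chars become 1 token each."""
    tokens = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab or j == i + 1:
                tokens.append(piece)
                i = j
                break
    return tokens

english_vocab = {"fmt", ":", "bull", "ets"}       # hypothetical subword pieces
artha_vocab = english_vocab | {"fmt:bullets"}     # one dedicated token

print(tokenize("fmt:bullets", english_vocab))  # ['fmt', ':', 'bull', 'ets']
print(tokenize("fmt:bullets", artha_vocab))    # ['fmt:bullets']
```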
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model
base = AutoModelForCausalLM.from_pretrained(
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Load Artha tokenizer and resize embeddings
tokenizer = AutoTokenizer.from_pretrained("siddsukh/artha-1.1b")
base.resize_token_embeddings(len(tokenizer), mean_resizing=False)

# Load LoRA adapter
model = PeftModel.from_pretrained(base, "siddsukh/artha-1.1b")
model.eval()

def compress(prompt):
    input_text = f"<|artha|>\n{prompt}\n<|compress|>\n"
    inputs = tokenizer(input_text, return_tensors="pt").to(
        next(model.parameters()).device
    )
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=64,
            do_sample=False,
            pad_token_id=tokenizer.eos_token_id,
        )
    decoded = tokenizer.decode(outputs[0], skip_special_tokens=False)
    return decoded.split("<|compress|>")[-1].split("<|end|>")[0].strip()

# Example
print(compress("Please summarise this in 3 bullet points, focus on facts"))
# → sum[this](#3, fmt:bullets) +facts
```
## Training Details
- Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Method: LoRA (r=16, alpha=32, target: q_proj, v_proj)
- Training data: 50,000 EnglishβArtha pairs
- Hardware: Google Colab TPU v5e
- Epochs: 3
- Compression achieved: 73% average token reduction
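The adapter setup implied by the hyperparameters above can be sketched with the `peft` library. This is a minimal configuration fragment, not the actual training script; the dropout value is assumed (it is not stated in the card), and the dataset and trainer wiring are omitted:

```python
# Sketch of a LoRA configuration matching the listed hyperparameters.
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=16,                                  # rank, as listed above
    lora_alpha=32,                         # scaling alpha, as listed above
    target_modules=["q_proj", "v_proj"],   # attention projections targeted
    lora_dropout=0.05,                     # assumed; not stated in the card
    task_type="CAUSAL_LM",
)
# model = get_peft_model(base_model, lora_config)  # base_model loaded as in Usage
```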
## Links
- GitHub: https://github.com/sidvsukhi/artha
- Language Spec: SPEC.md in the repo
- Training notebook: examples/artha_colab.ipynb
## Citation
```bibtex
@misc{artha2025,
  author    = {Siddhantsukhatankar},
  title     = {Artha: A Token-Efficient Math-Based Language for Human-AI Communication},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/siddsukh/artha-1.1b}
}
```