MTP Mini - 20x Improved Model

Transformer model with an advanced architecture, trained on a T4 GPU.

Architecture

  • Parameters: ~310.7M (310,708,225)
  • Vocabulary: 8000 tokens
  • Layers: 24
  • Dimension: 1024
  • Context: 2048 tokens
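The reported parameter count is roughly what a standard decoder-only layout of these dimensions would give. A minimal sanity check, assuming tied input/output embeddings and a SwiGLU hidden size of about 8/3 × dimension (both assumptions, not stated in this card):

```python
# Rough parameter estimate for a 24-layer, d=1024, vocab=8000 decoder.
# The FFN hidden size and weight tying below are assumptions.
d, layers, vocab = 1024, 24, 8000
ffn_hidden = int(8 * d / 3)            # common SwiGLU sizing, ~2730

attn = 4 * d * d                       # Q, K, V, and output projections
ffn = 3 * d * ffn_hidden               # gate, up, and down projections
norms = 2 * d                          # two RMSNorm weights per layer
per_layer = attn + ffn + norms

embeddings = vocab * d                 # tied input/output embedding
total = layers * per_layer + embeddings + d  # + final RMSNorm weight

print(f"{total:,}")                    # ~310M, close to the reported count
```

Exact totals depend on rounding of the FFN hidden size and on whether embeddings are tied, which likely accounts for the small gap from 310,708,225.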

Improvements

  • ✅ RoPE, RMSNorm, SwiGLU
  • ✅ Flash Attention
  • ✅ Gradient Checkpointing
  • ✅ Mixed Precision FP16
  • ✅ Anti-hallucination
  • ✅ Confidence Scoring
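The card does not describe how the confidence scoring works. One common approach, shown here as an illustrative sketch rather than this model's actual method, scores a generation by the geometric mean probability of its sampled tokens and flags low-confidence outputs:

```python
import math

def confidence_score(token_logprobs):
    """Mean log-probability of the generated tokens, mapped to (0, 1]."""
    avg = sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg)  # geometric mean of per-token probabilities

def is_confident(token_logprobs, threshold=0.3):
    # Below the threshold, the answer could be withheld or flagged as a
    # possible hallucination (the threshold value is illustrative).
    return confidence_score(token_logprobs) >= threshold

print(is_confident([-0.1, -0.2, -0.05]))  # high-probability tokens -> True
print(is_confident([-2.5, -3.0, -1.8]))   # low-probability tokens -> False
```

Per-token log-probabilities would come from the model's logits during generation; the anti-hallucination behavior could then refuse or caveat answers that score below the threshold.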

Usage

```python
import pickle

import torch
from tokenizer import MTPTokenizer
from model import MTPMiniModel

# Load the checkpoint (config + weights) from the pickle file
with open('mtp_mini.pkl', 'rb') as f:
    data = pickle.load(f)

tokenizer = MTPTokenizer('mtp_tokenizer.model')
model = MTPMiniModel(**data['config']['model'])
model.load_state_dict(data['model_state_dict'])
model.eval()

prompt = "¿Qué es la IA?"
# encode() returns a list of token ids; wrapping it in a list adds
# the batch dimension, so no extra unsqueeze is needed
ids = torch.tensor([tokenizer.encode(prompt)])  # shape: (1, seq_len)
output = model.generate(ids, max_new_tokens=150)
print(tokenizer.decode(output[0].tolist()))
```

Trained on Google Colab with a T4 GPU.
