
Nyx: Core-Outline Transformer Model

Nyx is a transformer-based language model designed for efficient text generation and understanding. It is part of the Core-Outline project and is geared toward high-quality text generation over financial, SaaS, social media, customer, and customer feedback analytics data.

Model Architecture

Nyx is built on a transformer decoder-only architecture with the following key components (a short sketch of the normalization and feed-forward blocks follows the list):

  • Rotary Position Embeddings (RoPE): For better handling of sequence positions
  • Multi-head Self-Attention: With grouped-query attention for efficient inference
  • SwiGLU Activation: For the feed-forward networks
  • RMSNorm: For layer normalization
  • Sliding Window Attention: For handling longer sequences efficiently
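
As a rough illustration of how the RMSNorm and SwiGLU pieces fit together, here is a minimal PyTorch sketch. The class names and default sizes (taken from the specifications below) are illustrative only; the actual implementation ships with the model's remote code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescale by 1/RMS(x), then apply a learned weight."""
    def __init__(self, hidden_size=1024, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x):
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return self.weight * (x * rms)

class SwiGLUFeedForward(nn.Module):
    """SwiGLU MLP: down_proj(SiLU(gate_proj(x)) * up_proj(x))."""
    def __init__(self, hidden_size=1024, intermediate_size=2816):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x):
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))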

Model Specifications

  • Hidden Size: 1024
  • Number of Layers: 24
  • Number of Attention Heads: 16
  • Number of Key-Value Heads: 16
  • Intermediate Size: 2816
  • Max Sequence Length: 32,768 tokens
  • Vocabulary Size: 151,936
  • Activation: SwiGLU (SiLU)

Usage

Prerequisites

  • Python 3.11+
  • PyTorch 2.0+
  • Transformers library
  • FastAPI (for API server)

Loading the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "core-outline/nyx"
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)  # Nyx reuses the Qwen tokenizer
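
Optionally, move the model to a GPU if one is available and switch to eval mode; these are standard PyTorch steps rather than anything Nyx-specific:

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()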

Text Generation

def generate_text(prompt, max_length=100, temperature=0.7):
    # Tokenize the prompt and move the tensors to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,  # passes input_ids and attention_mask
        max_length=max_length,  # counts prompt tokens plus generated tokens
        temperature=temperature,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
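
For example (the prompt below is only an illustration):

print(generate_text("Summarize last quarter's churn drivers in two sentences.", max_length=120))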

Model Configuration

The model uses the following key configuration parameters (from config.json):

{
  "hidden_size": 1024,
  "intermediate_size": 2816,
  "num_hidden_layers": 24,
  "num_attention_heads": 16,
  "num_key_value_heads": 16,
  "max_position_embeddings": 32768,
  "rms_norm_eps": 1e-6,
  "rope_theta": 1000000.0
}
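
The same values can be inspected programmatically. Here is a minimal sketch using the Transformers AutoConfig API, assuming the checkpoint ships its configuration in the usual way:

from transformers import AutoConfig

config = AutoConfig.from_pretrained("core-outline/nyx", trust_remote_code=True)
print(config.hidden_size, config.num_hidden_layers, config.num_attention_heads)
# The per-head dimension follows from the values above: 1024 / 16 = 64
print(config.hidden_size // config.num_attention_heads)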

Tokenizer

The model uses the Qwen tokenizer, which is a BPE-based tokenizer with a vocabulary size of 151,936 tokens.
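
A quick round trip shows how text maps to token IDs and back; the sample sentence is arbitrary and the snippet assumes the tokenizer loaded above:

ids = tokenizer("Monthly recurring revenue grew 12% quarter over quarter.")["input_ids"]
print(ids)                    # token IDs drawn from the 151,936-entry vocabulary
print(tokenizer.decode(ids))  # reconstructs the original text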

Training Data

The model has been trained on a diverse dataset including:

  • Financial analytics
  • SaaS metrics
  • Social media data
  • Customer data
  • Customer feedback

License

[Specify your license here]

Acknowledgements

  • The model architecture follows the Qwen/Llama family of decoder-only designs
  • Uses Rotary Position Embeddings (RoPE) for position encoding
  • Implements grouped-query attention for efficient inference