|
# Nyx: Core-Outline Transformer Model |
|
|
|
Nyx is a transformer-based language model designed for efficient text generation and understanding. It is part of the Core-Outline project and is aimed at high-quality text generation over financial, SaaS, social media, customer, and customer feedback analytics data.
|
|
|
## Model Architecture |
|
|
|
Nyx is built on a decoder-only transformer architecture with the following key components:
|
|
|
- **Rotary Position Embeddings (RoPE)**: For better handling of sequence positions |
|
- **Multi-head Self-Attention**: With grouped-query attention for efficient inference |
|
- **SwiGLU Activation**: For the feed-forward networks |
|
- **RMSNorm**: For layer normalization (a minimal sketch of RMSNorm and the SwiGLU feed-forward block follows this list)
|
- **Sliding Window Attention**: For handling longer sequences efficiently |
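
To make two of these components concrete, here is a minimal PyTorch sketch of RMSNorm and a SwiGLU feed-forward block. This is an illustrative reimplementation in the Llama/Qwen style, not Nyx's actual code; the default sizes match the specification table below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescales by 1/RMS(x), with no mean subtraction."""
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        variance = x.pow(2).mean(-1, keepdim=True)
        return self.weight * x * torch.rsqrt(variance + self.eps)

class SwiGLUFeedForward(nn.Module):
    """SwiGLU MLP: down(SiLU(gate(x)) * up(x)), the Llama/Qwen-style feed-forward block."""
    def __init__(self, hidden_size: int = 1024, intermediate_size: int = 2816):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))
```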
|
|
|
### Model Specifications |
|
|
|
| Parameter | Value |
|-----------|-------|
| Hidden Size | 1024 |
| Number of Layers | 24 |
| Number of Attention Heads | 16 |
| Number of Key-Value Heads | 16 |
| Intermediate Size | 2816 |
| Max Sequence Length | 32,768 tokens |
| Vocabulary Size | 151,936 |
| Activation | SwiGLU (SiLU) |
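
As a rough sanity check, these hyperparameters imply a parameter count in the few-hundred-million range. The back-of-the-envelope estimate below assumes a Llama/Qwen-style projection layout, ignores norm weights and biases, and shows both totals because whether the output head is tied to the input embedding is an assumption, not something stated above.

```python
hidden, layers, inter, vocab = 1024, 24, 2816, 151_936

embed = vocab * hidden                       # token embedding matrix
attn_per_layer = 4 * hidden * hidden         # Q, K, V, O projections (16 heads == 16 KV heads)
mlp_per_layer = 3 * hidden * inter           # gate, up, down projections (SwiGLU)
per_layer = attn_per_layer + mlp_per_layer

total_tied = embed + layers * per_layer      # output head tied to the embedding
total_untied = 2 * embed + layers * per_layer  # separate LM head

print(f"per layer: {per_layer / 1e6:.1f}M")              # ~12.8M
print(f"total (tied head): {total_tied / 1e6:.0f}M")     # ~464M
print(f"total (untied head): {total_untied / 1e6:.0f}M") # ~619M
```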
|
|
|
## Usage |
|
|
|
### Prerequisites |
|
|
|
- Python 3.11+ |
|
- PyTorch 2.0+ |
|
- Transformers library |
|
- FastAPI (for API server) |
|
|
|
### Loading the Model |
|
|
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "core-outline/nyx"
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)  # Using the Qwen tokenizer
|
``` |
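
Optionally, move the model to a GPU and switch it to inference mode. This is standard PyTorch usage rather than anything specific to Nyx:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()
```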
|
|
|
### Text Generation |
|
|
|
```python |
|
def generate_text(prompt, max_length=100, temperature=0.7):
    # Tokenize the prompt and move it to the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,                      # passes input_ids and attention_mask
        max_length=max_length,         # counts prompt tokens plus generated tokens
        temperature=temperature,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
|
``` |
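
A quick usage example (the prompt below is only an illustration):

```python
print(generate_text("Summarize the key drivers of monthly churn for a SaaS product:"))
```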
|
|
|
|
|
## Model Configuration |
|
|
|
The model uses the following key configuration parameters (from `config.json`): |
|
|
|
```json |
|
{
  "hidden_size": 1024,
  "intermediate_size": 2816,
  "num_hidden_layers": 24,
  "num_attention_heads": 16,
  "num_key_value_heads": 16,
  "max_position_embeddings": 32768,
  "rms_norm_eps": 1e-6,
  "rope_theta": 1000000.0
}
|
``` |
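
These values can also be inspected programmatically once the model files are available, for example via `AutoConfig` (this assumes the same model path as above and that `config.json` ships with the weights):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("core-outline/nyx", trust_remote_code=True)
head_dim = config.hidden_size // config.num_attention_heads  # 1024 / 16 = 64
print(config.num_hidden_layers, config.max_position_embeddings, head_dim)
```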
|
|
|
## Tokenizer |
|
|
|
The model uses the Qwen tokenizer, which is a BPE-based tokenizer with a vocabulary size of 151,936 tokens. |
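
A quick round-trip with the tokenizer loaded earlier illustrates the BPE encoding (the exact token IDs depend on the vocabulary, so the printed values here are not guaranteed):

```python
ids = tokenizer("Monthly recurring revenue grew 12% quarter over quarter.")["input_ids"]
print(len(ids), ids[:5])      # number of BPE tokens and a preview of their IDs
print(tokenizer.decode(ids))  # decodes back to the original text
```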
|
|
|
## Training Data |
|
|
|
The model has been trained on a diverse dataset including: |
|
- Financial analytics |
|
- SaaS metrics |
|
- Social media data |
|
- Customer data |
|
- Customer feedback |
|
|
|
## License |
|
|
|
[Specify your license here] |
|
|
|
## Acknowledgements |
|
|
|
- The model architecture is based on the Qwen/Llama architecture |
|
- Uses Rotary Position Embeddings (RoPE) for position encoding |
|
- Implements grouped-query attention for efficient inference |