|
# Nyx: Core-Outline Transformer Model |
|
|
|
Nyx is a transformer-based language model designed for efficient text generation and understanding. It is part of the Core-Outline project and is aimed at high-quality text generation over financial, SaaS, social media, customer, and customer feedback analytics data.
|
|
|
## Model Architecture |
|
|
|
Nyx is built on a decoder-only transformer architecture with the following key components:
|
|
|
- **Rotary Position Embeddings (RoPE)**: For better handling of sequence positions |
|
- **Multi-head Self-Attention**: With grouped-query attention for efficient inference |
|
- **SwiGLU Activation**: For the feed-forward networks |
|
- **RMSNorm**: For layer normalization (a minimal sketch of RMSNorm and the SwiGLU feed-forward block follows this list)
|
- **Sliding Window Attention**: For handling longer sequences efficiently |
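
To make two of these components concrete, here is a minimal PyTorch sketch of RMSNorm and a SwiGLU feed-forward block. This is an illustrative reimplementation in the Llama/Qwen style, not Nyx's actual code; the default sizes match the specification table below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RMSNorm(nn.Module):
    """Root-mean-square layer norm: rescales by 1/RMS(x), with no mean subtraction."""
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        variance = x.pow(2).mean(-1, keepdim=True)
        return self.weight * x * torch.rsqrt(variance + self.eps)

class SwiGLUFeedForward(nn.Module):
    """SwiGLU MLP: down(SiLU(gate(x)) * up(x)), the Llama/Qwen-style feed-forward block."""
    def __init__(self, hidden_size: int = 1024, intermediate_size: int = 2816):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))
```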
|
|
|
### Model Specifications |
|
|
|
| Parameter | Value |
|-----------|-------|
| Hidden Size | 1024 |
| Number of Layers | 24 |
| Number of Attention Heads | 16 |
| Number of Key-Value Heads | 16 |
| Intermediate Size | 2816 |
| Max Sequence Length | 32,768 tokens |
| Vocabulary Size | 151,936 |
| Activation | SwiGLU (SiLU) |
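
As a rough sanity check, these hyperparameters imply a parameter count in the few-hundred-million range. The back-of-the-envelope estimate below assumes a Llama/Qwen-style projection layout, ignores norm weights and biases, and shows both totals because whether the output head is tied to the input embedding is an assumption, not something stated above.

```python
hidden, layers, inter, vocab = 1024, 24, 2816, 151_936

embed = vocab * hidden                       # token embedding matrix
attn_per_layer = 4 * hidden * hidden         # Q, K, V, O projections (16 heads == 16 KV heads)
mlp_per_layer = 3 * hidden * inter           # gate, up, down projections (SwiGLU)
per_layer = attn_per_layer + mlp_per_layer

total_tied = embed + layers * per_layer      # output head tied to the embedding
total_untied = 2 * embed + layers * per_layer  # separate LM head

print(f"per layer: {per_layer / 1e6:.1f}M")              # ~12.8M
print(f"total (tied head): {total_tied / 1e6:.0f}M")     # ~464M
print(f"total (untied head): {total_untied / 1e6:.0f}M") # ~619M
```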
|
|
|
## Usage |
|
|
|
### Prerequisites |
|
|
|
- Python 3.11+ |
|
- PyTorch 2.0+ |
|
- Transformers library |
|
- FastAPI (for API server) |
|
|
|
### Loading the Model |
|
|
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "core-outline/nyx"
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path)  # Using the Qwen tokenizer
|
``` |
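
Optionally, move the model to a GPU and switch it to inference mode. This is standard PyTorch usage rather than anything specific to Nyx:

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()
```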
|
|
|
### Text Generation |
|
|
|
```python |
|
def generate_text(prompt, max_length=100, temperature=0.7):
    # Tokenize the prompt and move it to the same device as the model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,                      # passes input_ids and attention_mask
        max_length=max_length,         # counts prompt tokens plus generated tokens
        temperature=temperature,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
|
``` |
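
A quick usage example (the prompt below is only an illustration):

```python
print(generate_text("Summarize the key drivers of monthly churn for a SaaS product:"))
```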
|
|
|
|
|
## Model Configuration |
|
|
|
The model uses the following key configuration parameters (from `config.json`): |
|
|
|
```json |
|
{
  "hidden_size": 1024,
  "intermediate_size": 2816,
  "num_hidden_layers": 24,
  "num_attention_heads": 16,
  "num_key_value_heads": 16,
  "max_position_embeddings": 32768,
  "rms_norm_eps": 1e-6,
  "rope_theta": 1000000.0
}
|
``` |
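
These values can also be inspected programmatically once the model files are available, for example via `AutoConfig` (this assumes the same model path as above and that `config.json` ships with the weights):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("core-outline/nyx", trust_remote_code=True)
head_dim = config.hidden_size // config.num_attention_heads  # 1024 / 16 = 64
print(config.num_hidden_layers, config.max_position_embeddings, head_dim)
```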
|
|
|
## Tokenizer |
|
|
|
The model uses the Qwen tokenizer, which is a BPE-based tokenizer with a vocabulary size of 151,936 tokens. |
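
A quick round-trip with the tokenizer loaded earlier illustrates the BPE encoding (the exact token IDs depend on the vocabulary, so the printed values here are not guaranteed):

```python
ids = tokenizer("Monthly recurring revenue grew 12% quarter over quarter.")["input_ids"]
print(len(ids), ids[:5])      # number of BPE tokens and a preview of their IDs
print(tokenizer.decode(ids))  # decodes back to the original text
```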
|
|
|
## Training Data |
|
|
|
The model has been trained on a diverse dataset including: |
|
- Financial analytics |
|
- SaaS metrics |
|
- Social media data |
|
- Customer data |
|
- Customer feedback |
|
|
|
## License |
|
|
|
[Specify your license here] |
|
|
|
## Acknowledgements |
|
|
|
- The model architecture is based on the Qwen/Llama architecture |
|
- Uses Rotary Position Embeddings (RoPE) for position encoding |
|
- Implements grouped-query attention for efficient inference |