---
license: apache-2.0
language: en
tags:
- text-generation
- causal-lm
- fine-tuning
- unsupervised
---
# Model Name: olabs-ai/reflection_model
## Model Description
The `olabs-ai/reflection_model` is a fine-tuned version of [Meta-Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct), adapted with LoRA (Low-Rank Adaptation) to improve performance on its target tasks. The model is intended for text generation and can be used in applications such as conversational agents and content creation.
## Model Details
- **Base Model**: Meta-Llama-3.1-8B-Instruct
- **Fine-Tuning Method**: LoRA
- **Architecture**: LlamaForCausalLM
- **Number of Parameters**: 8 Billion (Base Model)
- **Training Data**: Not documented in this model card.
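
The exact fine-tuning recipe for this model is not published here. As a rough orientation only, the sketch below shows how a LoRA adapter is typically attached to a base model with Unsloth before training; the base repo id, rank, alpha, dropout, and target modules are illustrative assumptions, not the values used to produce `olabs-ai/reflection_model`.

```python
from unsloth import FastLanguageModel

# Load the base model; 4-bit loading keeps memory usage manageable during training.
# NOTE: the repo id and every hyperparameter below are assumptions for illustration.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters to the attention and MLP projection layers.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,           # LoRA rank (illustrative)
    lora_alpha=16,  # LoRA scaling factor (illustrative)
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing=True,
)
# From here, the adapted model can be trained with a standard supervised
# fine-tuning loop (e.g. TRL's SFTTrainer) on the task data of interest.
```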
## Usage
To use this model, install the `transformers` and `unsloth` libraries. Unsloth's `FastLanguageModel` loads the base weights, applies the LoRA adapter published in this repository, and returns the model together with its tokenizer:
```python
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the fine-tuned model: FastLanguageModel resolves the base
# Meta-Llama-3.1-8B-Instruct weights and applies the LoRA adapter,
# returning both the model and its tokenizer.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="olabs-ai/reflection_model",
    max_seq_length=2048,
    dtype=None,         # auto-detect (bfloat16 on supported GPUs)
    load_in_4bit=True,  # optional 4-bit quantization to reduce memory use
)

# Enable Unsloth's faster inference path
FastLanguageModel.for_inference(model)

# Prepare inputs
custom_prompt = "What is a famous tall tower in Paris?"
inputs = tokenizer([custom_prompt], return_tensors="pt").to("cuda")

# Stream generated tokens to stdout as they are produced
text_streamer = TextStreamer(tokenizer)

# Generate outputs
outputs = model.generate(**inputs, streamer=text_streamer, max_new_tokens=1000)
```
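
The streamer prints tokens as they are generated. If you also want the completed text as a string, you can decode the returned ids with the standard `transformers` tokenizer API, as in the small follow-up sketch below.

```python
# `outputs` includes the prompt tokens; decode everything and strip special tokens.
generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(generated_text)
```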