---
base_model: microsoft/Phi-4-mini-instruct
library_name: peft
tags:
- text-generation
- instruction-tuning
- lora
- fine-tuned
- phi-4
- pytorch
- transformers
license: mit
language:
- en
pipeline_tag: text-generation
inference: true
---
# Model Card for Phi-4 LoRA Fine-tuned Model
This model is a LoRA fine-tuned version of Microsoft's Phi-4-mini-instruct, adapted to improve code review using GitHub data.
## Model Details
### Model Description
This is a fine-tuned version of Microsoft's Phi-4-mini-instruct model using the LoRA (Low-Rank Adaptation) technique. The model was trained on approximately 10,000 instruction-response pairs to enhance its ability to follow instructions and generate high-quality responses across various tasks.
The model uses 4-bit NF4 quantization for efficient inference while maintaining output quality. It is designed to be a lightweight yet capable language model suitable for a variety of text generation tasks.
- **Developed by:** Milos Kotlar
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** microsoft/Phi-4-mini-instruct
### Model Sources
- **Repository:** https://github.com/kotlarmilos/phi4-finetuned
- **Demo:** https://huggingface.co/spaces/kotlarmilos/dotnet-runtime
## Uses
### Direct Use
The model is designed for:
- **Instruction Following**: Generate responses to user instructions and queries
- **Conversational AI**: Engage in multi-turn conversations
- **Task Completion**: Help with various text-based tasks like summarization, explanation, and creative writing
- **Educational Support**: Provide explanations and assistance for learning
### Downstream Use
The model can be integrated into:
- **Chatbot Applications**: Web applications, mobile apps, and customer service systems
- **Content Generation Tools**: Writing assistants and creative content platforms
- **Educational Platforms**: Tutoring systems and interactive learning environments
- **API Services**: Text generation services and intelligent automation workflows
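As an illustration of the API-service case, here is a minimal sketch that puts the model behind an HTTP endpoint. FastAPI and the endpoint shape are illustrative choices, not part of this repository, and the `generate` helper is the one defined in the "How to Get Started with the Model" section below:
```python
# Sketch of an API wrapper (FastAPI is an illustrative choice, not part of this repository).
# Assumes the model, tokenizer, and generate() helper from the section below are already loaded.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    prompt: str

@app.post("/generate")
def generate_text(req: GenerateRequest):
    # Delegate to the generate() helper shown in "How to Get Started with the Model"
    return {"response": generate(req.prompt)}
```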
### Out-of-Scope Use
The model is **not intended for**:
- **Factual Information Retrieval**: May generate plausible but incorrect information
- **Professional Medical/Legal Advice**: Not qualified for specialized professional guidance
- **Real-time Critical Systems**: Not suitable for safety-critical applications
- **Harmful Content Generation**: Should not be used to create misleading, harmful, or malicious content
## How to Get Started with the Model
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Base model and LoRA adapter locations
base_model = "microsoft/Phi-4-mini-instruct"
lora_path = "kotlarmilos/dotnet-runtime"  # this adapter repo on the Hub, or a local path such as "artifacts/phi4-finetuned"

# 4-bit NF4 quantization for memory-efficient inference
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load base model with quantization
tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)
base = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Load LoRA adapter on top of the quantized base model
model = PeftModel.from_pretrained(base, lora_path)

# Generate text
def generate(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = "Review the following code changes:"
response = generate(prompt)
print(response)
```
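Since Phi-4-mini-instruct is an instruction-tuned chat model, you will generally get better results by formatting requests with the tokenizer's chat template rather than passing a raw prompt. A minimal sketch building on the `generate` helper above; the system prompt and diff placeholder are illustrative, not taken from the training data:
```python
# Sketch: format a code-review request with the chat template before generation.
# The system prompt and the diff placeholder below are illustrative assumptions.
messages = [
    {"role": "system", "content": "You are a helpful code reviewer."},
    {"role": "user", "content": "Review the following code changes:\n<your diff here>"},
]
chat_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(generate(chat_prompt))
```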
## Training Details
### Training Data
The model was fine-tuned on approximately 10,000 high-quality instruction-response pairs designed to improve the model's ability to follow instructions and generate helpful, accurate responses across various domains.
**Data Characteristics**:
- **Size**: ~10,000 instruction-response pairs
- **Format**: Structured instruction-following conversations
- **Coverage**: Diverse topics and instruction types
### Training Procedure
#### Preprocessing
1. **Data Preparation**: Instruction-response pairs formatted for causal language modeling
2. **Tokenization**: Text processed using Phi-4's tokenizer with appropriate special tokens
3. **Sequence Formatting**: Proper formatting for instruction-following tasks
4. **Quality Filtering**: Removal of low-quality or potentially harmful content
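A hedged sketch of steps 1–3, assuming each pair is rendered with the base model's chat template; the exact formatting, record fields, and sequence length used for training are not specified here and are assumptions:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-4-mini-instruct", use_fast=True)

# A single instruction-response pair (content and field names are illustrative)
record = {
    "instruction": "Review the following code changes:\n<unified diff>",
    "response": "Consider adding a null check before dereferencing the pointer.",
}

# Render the pair with the base model's chat template so special tokens match Phi-4
text = tokenizer.apply_chat_template(
    [
        {"role": "user", "content": record["instruction"]},
        {"role": "assistant", "content": record["response"]},
    ],
    tokenize=False,
)

# Tokenize for causal language modeling (max_length is an assumed value)
tokens = tokenizer(text, truncation=True, max_length=2048, return_tensors="pt")
```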
#### Training Hyperparameters
**LoRA Configuration**:
- **LoRA Rank (r)**: 8
- **LoRA Alpha**: 16
- **LoRA Dropout**: 0.05
- **Target Modules**: ["qkv_proj", "gate_up_proj"]
- **Task Type**: CAUSAL_LM
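These values map directly onto a `peft` `LoraConfig`; a minimal sketch:
```python
from peft import LoraConfig

# LoRA hyperparameters exactly as listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "gate_up_proj"],
    task_type="CAUSAL_LM",
)
```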
**Training Setup**:
- **Base Model**: microsoft/Phi-4-mini-instruct
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Quantization**: 4-bit NF4 with BitsAndBytes
- **Training regime**: Mixed precision training with appropriate optimization
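Putting the pieces together, here is a hedged sketch of how such a QLoRA-style setup can be wired up with `transformers` and `peft`. It mirrors the configuration listed above but is not the exact training script used for this model:
```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the base model, as described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# Prepare the quantized model for training and attach the LoRA adapters
base = prepare_model_for_kbit_training(base)
lora_config = LoraConfig(  # same values as the sketch in the previous section
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["qkv_proj", "gate_up_proj"], task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```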
## Usage Examples
For additional usage examples, see the project repository: https://github.com/kotlarmilos/phi4-finetuned