---
base_model: microsoft/Phi-4-mini-instruct
library_name: peft
tags:
  - text-generation
  - instruction-tuning
  - lora
  - fine-tuned
  - phi-4
  - pytorch
  - transformers
license: mit
language:
  - en
pipeline_tag: text-generation
inference: true
---

# Model Card for Phi-4 LoRA Fine-tuned Model

This model is a LoRA fine-tuned version of Microsoft's Phi-4-mini-instruct, adapted for improved code review using GitHub data.

## Model Details

### Model Description

This is a fine-tuned version of Microsoft's Phi-4-mini-instruct model using the LoRA (Low-Rank Adaptation) technique. The model was trained on approximately 10,000 instruction-response pairs to enhance its ability to follow instructions and generate high-quality responses across various tasks.

The model uses 4-bit quantization with NF4 for efficient inference while maintaining performance quality. It's designed to be a lightweight yet capable language model suitable for various text generation tasks.

- **Developed by:** Milos Kotlar
- **Model type:** Causal Language Model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** microsoft/Phi-4-mini-instruct

### Model Sources

- **Repository:** https://github.com/kotlarmilos/phi4-finetuned
- **Demo:** https://huggingface.co/spaces/kotlarmilos/dotnet-runtime

## Uses

### Direct Use

The model is designed for:

- **Instruction Following**: Generate responses to user instructions and queries
- **Conversational AI**: Engage in multi-turn conversations
- **Task Completion**: Help with various text-based tasks like summarization, explanation, and creative writing
- **Educational Support**: Provide explanations and assistance for learning

### Downstream Use

The model can be integrated into:

- **Chatbot Applications**: Web applications, mobile apps, and customer service systems
- **Content Generation Tools**: Writing assistants and creative content platforms
- **Educational Platforms**: Tutoring systems and interactive learning environments
- **API Services**: Text generation services and intelligent automation workflows

### Out-of-Scope Use

The model is **not intended for**:

- **Factual Information Retrieval**: May generate plausible but incorrect information
- **Professional Medical/Legal Advice**: Not qualified for specialized professional guidance
- **Real-time Critical Systems**: Not suitable for safety-critical applications
- **Harmful Content Generation**: Should not be used to create misleading, harmful, or malicious content

## How to Get Started with the Model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Load base model with quantization
base_model = "microsoft/Phi-4-mini-instruct"
lora_path = "artifacts/phi4-finetuned"  # path to the LoRA adapter weights

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model, use_fast=True)

base = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base, lora_path)

# Generate text
def generate(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs, 
        max_new_tokens=256, 
        do_sample=True, 
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = "Review the following code changes:"
response = generate(prompt)
print(response)
```
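
If you prefer to distribute a single checkpoint rather than the base model plus adapter, the LoRA weights can also be merged into the base model. A minimal sketch, reusing the adapter path from above; the merge is done on an unquantized (bfloat16) copy of the base model, and the output directory name is illustrative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model unquantized; merging LoRA weights into a 4-bit model is not supported cleanly
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

model = PeftModel.from_pretrained(base, "artifacts/phi4-finetuned")

# Fold the adapter weights into the base weights and save a standalone checkpoint
merged = model.merge_and_unload()
merged.save_pretrained("phi4-finetuned-merged")  # illustrative output directory

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-4-mini-instruct", use_fast=True)
tokenizer.save_pretrained("phi4-finetuned-merged")
```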

## Training Details

### Training Data

The model was fine-tuned on approximately 10,000 high-quality instruction-response pairs designed to improve the model's ability to follow instructions and generate helpful, accurate responses across various domains.

**Data Characteristics**:
- **Size**: ~10,000 instruction-response pairs
- **Format**: Structured instruction-following conversations
- **Coverage**: Diverse topics and instruction types

### Training Procedure

#### Preprocessing

1. **Data Preparation**: Instruction-response pairs formatted for causal language modeling
2. **Tokenization**: Text processed using Phi-4's tokenizer with the appropriate special tokens (see the sketch after this list)
3. **Sequence Formatting**: Proper formatting for instruction-following tasks
4. **Quality Filtering**: Removal of low-quality or potentially harmful content
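
As a rough illustration of steps 1-3, the sketch below renders a single hypothetical instruction-response pair with the tokenizer's chat template and tokenizes it for causal language modeling; the example content and the `max_length` value are illustrative, not taken from the actual training set:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-4-mini-instruct", use_fast=True)

# One hypothetical instruction-response pair (not from the actual training data)
pair = [
    {"role": "user", "content": "Review the following code change: `if (x = 1)` was replaced with `if (x == 1)`."},
    {"role": "assistant", "content": "The fix is correct: the original condition used assignment (=) where a comparison (==) was intended."},
]

# Render the conversation with the model's chat template, then tokenize for training
text = tokenizer.apply_chat_template(pair, tokenize=False)
tokens = tokenizer(text, truncation=True, max_length=2048)  # max_length is illustrative
print(text)
```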

#### Training Hyperparameters

**LoRA Configuration**:
- **LoRA Rank (r)**: 8
- **LoRA Alpha**: 16
- **LoRA Dropout**: 0.05
- **Target Modules**: ["qkv_proj", "gate_up_proj"]
- **Task Type**: CAUSAL_LM

**Training Setup**:
- **Base Model**: microsoft/Phi-4-mini-instruct
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Quantization**: 4-bit NF4 with BitsAndBytes
- **Training regime**: Mixed precision training (see the configuration sketch below)
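
A minimal sketch of how these settings map onto a `peft.LoraConfig` and a QLoRA-style model setup; the training loop itself (e.g. with `transformers.Trainer` or an SFT trainer) is omitted, and the quantization details mirror the inference example above:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization of the base model (same settings as the inference example)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-4-mini-instruct",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
base = prepare_model_for_kbit_training(base)

# LoRA hyperparameters listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["qkv_proj", "gate_up_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA parameters are trainable
```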


## Usage Examples

For additional usage examples, see the project repository: https://github.com/kotlarmilos/phi4-finetuned