---
license: mit
language:
- ru
base_model:
- unsloth/DeepSeek-R1-Distill-Llama-8B-bnb-4bit
---

# DeepSeek-Meeting-Summary

## 📌 Model Overview

This is a fine-tuned version of `DeepSeek-R1-Distill-Llama-8B`, trained with `Unsloth` and `LoRA` for meeting summarization and structured insight extraction. The model analyzes meeting transcripts and generates structured summaries in **JSON format**, extracting key elements such as the **summary, topics, actions, problems, and decisions**.

### 🚀 Features

- **100% valid JSON generation**
- **Trained for long-sequence summarization (16K tokens)**
- **Optimized for structured meeting insight extraction**
- **Fine-tuned with LoRA for efficient training**

## 🔥 Performance Metrics

| Metric | Value |
|---------------|--------|
| **ROUGE-L** | `0.5217` |
| **BERT-F1** | `0.7112` |
| **JSON Validity** | `1.0` (100% valid JSON responses) |
| **Validation Loss** | `1.6732` |

## 🚀 Usage

### 1️⃣ **Install Dependencies**

```bash
pip install transformers torch
```

### 2️⃣ **Load the Model**

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "UDZH/deepseek-meeting-summary"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```

### 3️⃣ **Run Inference**

Note that the braces in the JSON template below are doubled (`{{ }}`) so that `str.format` treats them as literal braces rather than as placeholders; only the two bare `{}` fields are filled in.

```python
import torch

prompt = """
Analyze the following meeting transcript and extract the key points:

1. **Summarization** – a brief summary of the meeting.
2. **Topics** – a list of topics discussed.
3. **Decisions** – key decisions made.
4. **Problems** – challenges or issues identified.
5. **Actions** – planned or taken actions.

Return the output **STRICTLY in the following JSON format**:

{{
  "Summarization": "Brief meeting summary...",
  "Topics": ["Topic 1", "Topic 2"],
  "Actions": ["Action 1", "Action 2"],
  "Problems": ["Problem 1", "Problem 2"],
  "Decisions": ["Decision 1", "Decision 2"]
}}

Meeting transcript (in Russian):
{}

**Return only a valid JSON response in Russian language.**
**Do not include explanations, introductions, or extra text.**
**If a category is missing, return an empty array [].**

### Response:
{}
"""

input_text = "Your meeting transcript here"

inputs = tokenizer(
    prompt.format(input_text, ""),
    return_tensors="pt",
    truncation=True,
    max_length=16384,
)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=500)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print("Generated Summary:", response)
```

## 📌 License & Citation

This model is released under the MIT license and is suitable for research and production use. If you use it in your projects, please cite this repository.
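
## 🧩 Tip: Parsing the JSON Output

Since the model is instructed to return strict JSON, the decoded `response` from the inference example above can be parsed directly. The sketch below is illustrative rather than part of the model itself (the helper `parse_meeting_summary` is an assumed name): it isolates the outermost JSON object in case stray text slips through and backfills any missing category with an empty list, matching the prompt's contract.

```python
import json

def parse_meeting_summary(response: str) -> dict:
    """Parse the model's JSON response, tolerating stray text around it."""
    # Isolate the outermost JSON object in case extra text surrounds it
    start = response.find("{")
    end = response.rfind("}")
    if start == -1 or end == -1:
        raise ValueError("No JSON object found in model response")
    data = json.loads(response[start:end + 1])

    # Backfill missing categories with empty defaults, per the prompt contract
    for key in ("Topics", "Actions", "Problems", "Decisions"):
        data.setdefault(key, [])
    data.setdefault("Summarization", "")
    return data

summary = parse_meeting_summary(response)
print("Summary:", summary["Summarization"])
print("Decisions:", summary["Decisions"])
```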