---
license: apache-2.0
language:
  - en
tags:
  - code
  - cobol
  - code-documentation
  - qwen
  - qwen2.5
  - instruction-tuning
  - llm
  - generative-model
library_name: transformers
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-Coder-3B-Instruct
model_name: qwen-code-doc-ft
---

# Qwen2.5-Coder-3B-Instruct – Fine-tuned for COBOL Code Documentation

This model is a fine-tuned version of [Qwen/Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct), optimized for generating natural language documentation from COBOL source code. Fine-tuning used **freeze fine-tuning** on the **last transformer layer only**, preserving the rest of the model's pretrained weights.

## 🔧 Model Description

- **Architecture**: Qwen2.5-Coder-3B (decoder-only transformer)
- **Base Model**: [Qwen/Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)
- **Fine-tuning Method**: Freeze fine-tuning (only the last transformer block's parameters were updated; see the sketch below)
- **Training Objective**: Instruction-following text generation for COBOL code documentation
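
The exact training script is not included in this card; the snippet below is only a minimal sketch of the freezing step, assuming the standard Qwen2 module layout in `transformers` (decoder blocks under `model.model.layers`).

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-3B-Instruct")

# Freeze every parameter, then re-enable gradients for the
# last decoder block only.
for param in model.parameters():
    param.requires_grad = False
for param in model.model.layers[-1].parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable parameters: {trainable:,} of {total:,}")
```

Because only a single decoder block receives gradients, optimizer state and gradient memory stay small compared to full fine-tuning, at the cost of less capacity to adapt.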

## 🧠 Use Cases

This model is specialized in generating descriptive documentation for legacy COBOL code, especially useful for:

- **Legacy system maintenance**
- **Automated codebase documentation**
- **Migration planning**
- **COBOL code understanding and onboarding**

## ✍️ Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model_name = "V7W3D/qwen-code-doc-ft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

doc_gen = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Prompts follow the "### Document this COBOL code: ... ### Documentation:" format.
prompt = (
    "### Document this COBOL code:\n\n"
    "       IDENTIFICATION DIVISION.\n"
    "       PROGRAM-ID. HELLO-WORLD.\n"
    "       PROCEDURE DIVISION.\n"
    "           DISPLAY 'HELLO, WORLD!'\n"
    "           STOP RUN.\n\n"
    "### Documentation:"
)

# Greedy decoding for deterministic documentation output.
response = doc_gen(prompt, max_new_tokens=200, do_sample=False)

print(response[0]["generated_text"])
```
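
By default the pipeline echoes the prompt followed by the completion. If you only want the generated documentation, `return_full_text=False` (a standard option of the `text-generation` pipeline) strips the prompt:

```python
# Return only the newly generated documentation, without the prompt.
response = doc_gen(prompt, max_new_tokens=200, do_sample=False, return_full_text=False)
print(response[0]["generated_text"])
```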