---
license: apache-2.0
language:
- en
tags:
- code
- cobol
- code-documentation
- qwen
- qwen2.5
- instruction-tuning
- llm
- generative-model
library_name: transformers
pipeline_tag: text-generation
base_model: Qwen/Qwen2.5-Coder-3B-Instruct
model_name: qwen-code-doc-ft
---
# Qwen2.5-Coder-3B-Instruct – Fine-tuned for COBOL Code Documentation
This model is a fine-tuned version of [Qwen/Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct), optimized for generating natural-language documentation from COBOL source code. It was trained with **freeze fine-tuning**: only the **last transformer layer** was updated, while the rest of the model's pretrained weights were left untouched.
## 🔧 Model Description
- **Architecture**: Qwen2.5-Coder-3B (decoder-only transformer)
- **Base Model**: [Qwen/Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)
- **Fine-tuning Method**: Freeze fine-tuning (only the last transformer block's parameters were updated; see the sketch below)
- **Training Objective**: Instruction-following text generation for COBOL code documentation
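
For reference, the freezing step can be sketched in a few lines of `transformers` code: freeze every parameter, then re-enable gradients for the final block. This is a minimal illustration assuming the standard Qwen2-style module layout (`model.model.layers`), not the actual training script used for this checkpoint.

```python
# Minimal sketch of freeze fine-tuning (assumes the standard Qwen2-style
# layout in transformers, i.e. model.model.layers; not the exact script
# used to train this checkpoint).
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-3B-Instruct")

# Freeze everything...
for param in base.parameters():
    param.requires_grad = False

# ...then unfreeze only the last transformer block.
for param in base.model.layers[-1].parameters():
    param.requires_grad = True

trainable = sum(p.numel() for p in base.parameters() if p.requires_grad)
total = sum(p.numel() for p in base.parameters())
print(f"trainable: {trainable:,} / {total:,}")
```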
## 🧠 Use Cases
This model specializes in generating descriptive documentation for legacy COBOL code, and is especially useful for:
- **Legacy system maintenance**
- **Automated codebase documentation**
- **Migration planning**
- **COBOL code understanding and onboarding**
## ✍️ Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

# Load the fine-tuned model and its tokenizer from the Hub.
model_name = "V7W3D/qwen-code-doc-ft"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
doc_gen = pipeline("text-generation", model=model, tokenizer=tokenizer)

# Prompt format: the COBOL source followed by a "### Documentation:" cue.
prompt = "### Document this COBOL code:\n\n IDENTIFICATION DIVISION.\n PROGRAM-ID. HELLO-WORLD.\n PROCEDURE DIVISION.\n DISPLAY 'HELLO, WORLD!'\n STOP RUN.\n\n### Documentation:"

# Greedy decoding (do_sample=False) keeps the output deterministic.
response = doc_gen(prompt, max_new_tokens=200, do_sample=False)
print(response[0]["generated_text"])
```
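
Since the base model is instruction-tuned, wrapping the request in the model's chat template may also work. Whether the fine-tune preserved that chat format is an assumption, so compare results against the raw prompt above:

```python
# Alternative prompt construction via the chat template (assumption:
# the fine-tune still follows the instruct chat format; the raw prompt
# above may match the training data more closely).
messages = [{
    "role": "user",
    "content": "Document this COBOL code:\n\n IDENTIFICATION DIVISION.\n PROGRAM-ID. HELLO-WORLD.\n PROCEDURE DIVISION.\n DISPLAY 'HELLO, WORLD!'\n STOP RUN.",
}]
chat_prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
response = doc_gen(chat_prompt, max_new_tokens=200, do_sample=False)
print(response[0]["generated_text"])
```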