---
license: apache-2.0
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
tags:
- generated_from_trainer
- x3d
- 3d-generation
- lora
- code-generation
datasets:
- stratplans/savage-x3d-generation
language:
- en
library_name: transformers
pipeline_tag: text-generation
---

# X3D Generation Model - Qwen2.5-Coder-7B LoRA

This model is a fine-tuned version of [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) for generating X3D (Extensible 3D) scene descriptions from natural language prompts.

## Model Description

This model generates syntactically valid and semantically meaningful X3D scene descriptions from natural language prompts. X3D is an ISO-standard XML-based format for representing 3D graphics, widely used in simulation, scientific visualization, and web-based 3D applications.

### Key Features
- Generates valid X3D XML code from natural language descriptions
- Trained on 19,712 instruction-response pairs derived from the Naval Postgraduate School Savage X3D Archive
- Uses LoRA (Low-Rank Adaptation) for efficient fine-tuning
- Compatible with 4-bit quantization for reduced memory usage

## Training Details

### Dataset
- **Source**: Naval Postgraduate School (NPS) Savage X3D Archive
- **Source files**: 1,232 unique X3D models
- **Augmented dataset**: 19,712 instruction-response pairs
- **Categories**: Military equipment, vehicles, buildings, terrain, humanoids, and abstract geometries

### Model Architecture
- **Base Model**: Qwen2.5-Coder-7B-Instruct (7.7B parameters)
- **Fine-tuning Method**: LoRA with 4-bit quantization
- **LoRA Configuration**:
  - Rank: 32
  - Alpha: 64
  - Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  - Trainable parameters: 80.7M (1.05% of total)
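The reported trainable-parameter count can be sanity-checked from the LoRA configuration. The sketch below assumes the layer dimensions from Qwen2.5-7B's public config (hidden size 3584, intermediate size 18944, 28 layers, 4 KV heads of head dim 128); these dimensions are not stated in this card.

```python
# Back-of-envelope check of the reported 80.7M trainable LoRA parameters.
# Dimensions below are assumed from Qwen2.5-7B's public config.
RANK = 32
HIDDEN = 3584
INTERMEDIATE = 18944
KV_DIM = 4 * 128  # num_key_value_heads * head_dim
N_LAYERS = 28

def lora_params(d_in: int, d_out: int, r: int = RANK) -> int:
    """A LoRA adapter adds two low-rank matrices: A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)

per_layer = (
    lora_params(HIDDEN, HIDDEN)          # q_proj
    + lora_params(HIDDEN, KV_DIM)        # k_proj
    + lora_params(HIDDEN, KV_DIM)        # v_proj
    + lora_params(HIDDEN, HIDDEN)        # o_proj
    + lora_params(HIDDEN, INTERMEDIATE)  # gate_proj
    + lora_params(HIDDEN, INTERMEDIATE)  # up_proj
    + lora_params(INTERMEDIATE, HIDDEN)  # down_proj
)
total = per_layer * N_LAYERS
print(f"{total / 1e6:.1f}M trainable LoRA parameters")  # ~80.7M
```

Under these assumptions the count comes out to roughly 80.7M, matching the figure above.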

### Training Configuration
- **Hardware**: 5x NVIDIA RTX 4090 GPUs (24GB VRAM each)
- **Training time**: 11.5 hours
- **Epochs**: 3
- **Effective batch size**: 80
- **Learning rate**: 2e-4 with cosine decay
- **Final training loss**: 0.0086
- **Final validation loss**: 0.0112

## Usage

### Installation

```bash
pip install transformers peft accelerate bitsandbytes
```

### Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load base model with 4-bit quantization
# (quantization_config replaces the deprecated load_in_4bit argument)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "stratplans/x3d-qwen2.5-coder-7b-lora")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("stratplans/x3d-qwen2.5-coder-7b-lora")

# Generate X3D
prompt = """<|im_start|>system
You are an X3D 3D model generator. Generate valid X3D XML code based on the user's description.
<|im_end|>
<|im_start|>user
Create an X3D model of a red sphere with radius 2 units
<|im_end|>
<|im_start|>assistant
"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_new_tokens bounds the generated X3D itself (rather than prompt + output),
# and temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
x3d_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(x3d_code)
```
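The hand-written prompt above follows Qwen's ChatML template. In practice `tokenizer.apply_chat_template` builds this for you, but a small helper (illustrative only) makes the structure explicit:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a Qwen-style ChatML prompt, leaving the assistant turn open."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are an X3D 3D model generator. Generate valid X3D XML code "
    "based on the user's description.",
    "Create an X3D model of a red sphere with radius 2 units",
)
print(prompt)
```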

### Example Prompts

1. "Create an X3D model of a blue cube with metallic surface"
2. "Generate an X3D scene with a rotating pyramid"
3. "Build an X3D model of a simple robot with movable joints"
4. "Design an X3D terrain with hills and valleys"

## Performance

- **Generation speed**: ~50 tokens/second on single RTX 4090
- **Memory requirement**: 8GB VRAM for inference with 4-bit quantization
- **Validity rate**: Estimated 85% syntactically valid X3D on first generation
- **Semantic accuracy**: Follows input specifications in 70% of test cases
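The ~8GB figure is consistent with a back-of-envelope estimate: 7.7B parameters at 4 bits each give roughly 3.9 GB of weights, and the remainder is runtime overhead (the doubling factor in the comment below is an assumption, not a measurement):

```python
params = 7.7e9
weight_gb = params * 0.5 / 1e9  # 4 bits = 0.5 bytes per parameter
print(f"quantized weights alone: ~{weight_gb:.1f} GB")
# KV cache, activations, dequantization buffers, and the LoRA adapter
# roughly double this in practice, landing near the stated 8 GB.
```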

## Limitations

1. Context length was limited to 2048 tokens during training, so very large scenes may be truncated
2. Complex scenes may require multiple generation attempts
3. Animation and interaction features have limited support
4. Best performance on object types similar to training data

## Citation

If you use this model, please cite:

```bibtex
@misc{x3d-qwen-2024,
  title={X3D Generation with Fine-tuned Qwen2.5-Coder},
  author={stratplans},
  year={2024},
  publisher={HuggingFace}
}
```

## License

This model inherits the Apache 2.0 license from the base Qwen2.5-Coder model.

## Acknowledgments

- Naval Postgraduate School for the Savage X3D Archive
- Qwen team for the base model
- The X3D and Web3D Consortium community