# Q-Bridge

## Model Details
- Model type: LoRA fine-tuned causal language model for instruction following
- Base model: `Qwen/Qwen3-1.7B`
- Parameter-efficient method: Low-Rank Adaptation (LoRA) applied to transformer projection layers.
- Libraries: Transformers, PEFT, Datasets, Weights & Biases logging.
## Intended Use

The adapter specializes a base LLM to translate classical machine learning (CML) module descriptions into quantum machine learning (QML) implementations. Use it by loading the base model (`Qwen/Qwen3-1.7B`) and applying the LoRA weights through PEFT before prompting with CML descriptions, as shown in the Usage section below.
## Training Data

- Source dataset: `runjiazeng/CML-2-QML` (train split only).
- Filtering: examples whose reported average length exceeds half of the `max_length` argument are removed to stay within the tokenizer context window.
- Prompt template:

      You are an expert quantum machine learning researcher. Translate the provided classical machine learning (CML) description into its quantum machine learning (QML) counterpart.
      CML Description:
      <cml_text>
      QML Solution:

- Targets: Ground-truth QML solutions appended after the prompt.
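The prompt construction and length filter above can be approximated with a short preprocessing step. The snippet below is a minimal sketch, not the script's actual code: the dataset column names (`cml`, `qml`) and the character-based length check are assumptions.

```python
from datasets import load_dataset

MAX_LENGTH = 2048  # stand-in for the script's max_length argument

PROMPT_TEMPLATE = (
    "You are an expert quantum machine learning researcher. Translate the provided "
    "classical machine learning (CML) description into its quantum machine learning "
    "(QML) counterpart.\nCML Description:\n{cml}\nQML Solution:\n"
)

# Load the training split of the CML-to-QML pair dataset.
dataset = load_dataset("runjiazeng/CML-2-QML", split="train")

def build_example(example):
    # "cml" and "qml" are assumed column names for the paired descriptions.
    return {
        "prompt": PROMPT_TEMPLATE.format(cml=example["cml"]),
        "target": example["qml"],
    }

dataset = dataset.map(build_example)

# Approximate the filter described above: drop examples whose average
# prompt/target length exceeds half of the context budget.
dataset = dataset.filter(
    lambda ex: (len(ex["prompt"]) + len(ex["target"])) / 2 <= MAX_LENGTH / 2
)
```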
## Training Procedure

- Tokenization: uses the base model tokenizer with right padding, falling back to EOS padding when no explicit pad token exists. Labels for prompt tokens are masked with `-100` so that the loss is computed only on generated answers (see the tokenization sketch after this list).
- Batching: a custom data collator pads inputs dynamically and aligns the masked labels.
- Hardware setup: the script detects distributed settings via `LOCAL_RANK`/`WORLD_SIZE` and optionally enables DeepSpeed ZeRO-3.
- Optimization:
  - Learning rate defaults to `2e-5` with a cosine schedule and a `0.03` warmup ratio.
  - AdamW optimizer with `weight_decay=0.1` and `max_grad_norm=1.0`.
  - Gradient accumulation steps default to `8`, per-device batch size `1`.
  - Training runs for `3` epochs with gradient checkpointing enabled.
- LoRA configuration: rank `64`, alpha `128`, dropout `0.05`, bias disabled. Target modules default to `gate_proj`, `down_proj`, `up_proj` if present; otherwise all linear layers except `lm_head` are targeted (see the configuration sketch after this list).
- Logging & checkpoints: a Weights & Biases run is configured via CLI arguments; checkpoints are saved every `500` steps with a cap of `2`.
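The label masking referenced above can be illustrated as follows. This is a minimal sketch under assumed defaults (`max_length=2048`, illustrative function name), not the tokenization code from `q-bridge-lora.py`.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-1.7B", padding_side="right")
if tokenizer.pad_token is None:
    # Fall back to EOS padding when the tokenizer defines no pad token.
    tokenizer.pad_token = tokenizer.eos_token

def tokenize_example(prompt: str, target: str, max_length: int = 2048):
    # Tokenize prompt and answer separately so prompt positions can be masked.
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    target_ids = tokenizer(target, add_special_tokens=False)["input_ids"]

    input_ids = (prompt_ids + target_ids + [tokenizer.eos_token_id])[:max_length]
    # -100 is the ignore index of the cross-entropy loss, so only answer
    # tokens (and the closing EOS) contribute to the training signal.
    labels = ([-100] * len(prompt_ids) + target_ids + [tokenizer.eos_token_id])[:max_length]
    return {"input_ids": input_ids, "labels": labels}
```

A matching collator would pad `input_ids` with the pad token and `labels` with `-100`, so padded positions are ignored by the loss as well.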
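The adapter and optimizer settings above correspond roughly to the PEFT and `TrainingArguments` configuration sketched below. This is a reconstruction from the values in this card under assumptions (output directory name, W&B wiring), not a copy of the training script.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-1.7B", torch_dtype="auto")

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    # Default targets; the script falls back to all linear layers except
    # lm_head when these projections are not present in the architecture.
    target_modules=["gate_proj", "down_proj", "up_proj"],
)
model = get_peft_model(base_model, lora_config)

training_args = TrainingArguments(
    output_dir="q-bridge-lora",          # illustrative output path
    learning_rate=2e-5,
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.1,                    # AdamW is the Trainer's default optimizer
    max_grad_norm=1.0,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    gradient_checkpointing=True,
    save_steps=500,
    save_total_limit=2,
    report_to="wandb",
)
```

For multi-GPU runs, DeepSpeed ZeRO-3 can additionally be enabled by passing a `deepspeed` config to `TrainingArguments`.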
## Evaluation

No automatic evaluation metrics are computed by the training script. Users should validate generations on held-out CML descriptions, for example by running the Usage snippet below on a handful of descriptions and reviewing the outputs.
## Usage

    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model, then attach the Q-Bridge LoRA adapter via PEFT.
    base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-1.7B", torch_dtype="auto")
    model = PeftModel.from_pretrained(base_model, "runjiazeng/Q-Bridge")
    tokenizer = AutoTokenizer.from_pretrained("runjiazeng/Q-Bridge", use_fast=False)

    prompt = (
        "You are an expert quantum machine learning researcher. Translate the provided "
        "classical machine learning (CML) description into its quantum machine learning "
        "(QML) counterpart.\n"
        "CML Description:\n"
        "<your description>\n"
        "QML Solution:"
    )

    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=1024)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
## Environmental Impact
The script supports DeepSpeed ZeRO-3 and gradient checkpointing to reduce memory consumption. Exact training footprint depends on the user's hardware and run duration.
## Risks and Limitations

- The model inherits biases from the base `Qwen3-1.7B` model.
- Generated QML code may be unverified or non-executable. Users must review outputs before deployment.
- The dataset focuses on paired CML→QML translations; performance on unrelated tasks is likely poor.
## Training Script

The full training procedure, CLI, and data-processing logic are provided in `q-bridge-lora.py` in this repository.