Obfuscated Variable Renaming with Qwen-Code
This repository hosts a Qwen-Code-based model fine-tuned to rename obfuscated variables in source code, improving readability while preserving program semantics.
The model is designed for use cases such as malware analysis, reverse engineering, digital forensics, and general program comprehension.
Task Overview
Task: Code Deobfuscation / Variable Renaming
Base Model: Qwen-Code
Input: Source code with obfuscated variable names
Output: Semantically equivalent source code with readable variable names
Example
Input
function _0x12af(a, b) {
  let _0x9c3e = a * b;
  return _0x9c3e + 10;
}
Output
function multiplyAndAdd(a, b) {
  let product = a * b;
  return product + 10;
}
Model Description
- Architecture: Qwen-Code (Transformer-based)
- Fine-tuning Objective: Context-aware variable renaming
- Approach: AST-guided identifier alignment + sequence generation
- Languages: JavaScript (primary), extendable to others
The model is trained to infer meaningful variable names from how each variable is used in its surrounding code, rather than from surface patterns in the obfuscated names themselves.
Training Details
Dataset
- Paired samples of:
  - Obfuscated code
  - Original / readable code
- Variable mappings extracted using AST-based analysis
- Realistic obfuscation patterns (minifiers, packers, name mangling)
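The variable mappings in such a pair can be illustrated with a small sketch. It is not the pipeline used to build the dataset: a regex tokenizer stands in for AST-based analysis, and the code assumes the obfuscator only renamed identifiers, so the two token streams line up one-to-one. The names `tokenize`, `is_identifier`, and `extract_mapping` are hypothetical, and string literals, comments, and scoping are deliberately ignored.

```python
import re

# Assumption: a regex tokenizer is a rough stand-in for a real JavaScript AST parser.
TOKEN_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*|\d+(?:\.\d+)?|\S")
IDENT_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
# Partial keyword list, enough for small examples.
JS_KEYWORDS = frozenset({
    "function", "let", "const", "var", "return", "if", "else",
    "for", "while", "new", "this", "true", "false", "null",
})


def tokenize(code):
    """Split source text into identifier, number, and single-character tokens."""
    return TOKEN_RE.findall(code)


def is_identifier(token):
    """True for identifier-shaped tokens that are not keywords."""
    return IDENT_RE.fullmatch(token) is not None and token not in JS_KEYWORDS


def extract_mapping(obfuscated, original):
    """Align two token streams and collect obfuscated -> original names."""
    obf_tokens = tokenize(obfuscated)
    orig_tokens = tokenize(original)
    if len(obf_tokens) != len(orig_tokens):
        raise ValueError("token streams differ in length")

    mapping = {}
    for obf, orig in zip(obf_tokens, orig_tokens):
        if is_identifier(obf) and is_identifier(orig):
            if obf == orig:
                continue  # name was left untouched by the obfuscator
            if mapping.setdefault(obf, orig) != orig:
                raise ValueError(f"conflicting names for {obf!r}")
        elif obf != orig:
            raise ValueError(f"non-identifier tokens differ: {obf!r} vs {orig!r}")
    return mapping


obfuscated = "function _0x12af(a, b) {\n  let _0x9c3e = a * b;\n  return _0x9c3e + 10;\n}"
original = "function multiplyAndAdd(a, b) {\n  let product = a * b;\n  return product + 10;\n}"
print(extract_mapping(obfuscated, original))
# {'_0x12af': 'multiplyAndAdd', '_0x9c3e': 'product'}
```

Because the sketch ignores scope, the same obfuscated name reused with different meanings in two functions is reported as a conflict; a real AST-based extractor would resolve each binding separately.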
Training Objectives
- Identifier-aware sequence-to-sequence learning
- Contextual name prediction
- Syntax preservation
Installation
pip install transformers torch accelerate
Usage
Inference Example
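from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Neo111x/Variables-Renaming"

# trust_remote_code=True executes Python code shipped in the model repository
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True
)

code = '''
function _0x12af(a, b) {
  let _0x9c3e = a * b;
  return _0x9c3e + 10;
}
'''

# Move the input tensors to the device the model was placed on
inputs = tokenizer(code, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=False  # greedy decoding, so the output is deterministic
)

# For a causal LM, the decoded sequence begins with the prompt tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))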
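Since generation is unconstrained, it can be worth checking that the decoded code differs from the input only in identifier names before trusting it. The sketch below is one possible guard, not part of this repository: it replaces every identifier with a placeholder numbered by first occurrence and compares the results. The function names are hypothetical, the regex tokenizer is an assumption standing in for a real JavaScript parser, and string literals, comments, and shadowed names are not handled.

```python
import re

# Assumption: regex tokenization approximates a real JavaScript parser.
TOKEN_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*|\d+(?:\.\d+)?|\S")
IDENT_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")
# Partial keyword list, enough for small examples.
JS_KEYWORDS = frozenset({
    "function", "let", "const", "var", "return", "if", "else",
    "for", "while", "new", "this", "true", "false", "null",
})


def canonicalize(code):
    """Replace each distinct identifier with id0, id1, ... in order of first use."""
    placeholders = {}
    canonical = []
    for token in TOKEN_RE.findall(code):
        if IDENT_RE.fullmatch(token) and token not in JS_KEYWORDS:
            token = placeholders.setdefault(token, f"id{len(placeholders)}")
        canonical.append(token)
    return canonical


def is_pure_renaming(before, after):
    """True if the two snippets are identical up to a one-to-one renaming."""
    return canonicalize(before) == canonicalize(after)


before = "function _0x12af(a, b) {\n  let _0x9c3e = a * b;\n  return _0x9c3e + 10;\n}"
after = "function multiplyAndAdd(a, b) {\n  let product = a * b;\n  return product + 10;\n}"
print(is_pure_renaming(before, after))  # True
```

The check also rejects outputs where two distinct variables were merged into one name, because the placeholder sequences then diverge. To use it on model output, the prompt portion of the decoded text would need to be removed first.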
Evaluation
- Identifier exact-match accuracy
- AST equivalence checks
- Manual readability assessment
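The first metric can be sketched as follows. The README does not define it precisely, so this version makes an assumption: accuracy is computed per distinct identifier, case-sensitively, over a reference mapping from obfuscated names to original names. The function name `identifier_exact_match` is hypothetical.

```python
def identifier_exact_match(predicted, reference):
    """Fraction of reference identifiers whose predicted name equals the original.

    Both arguments map obfuscated names to readable names. Identifiers missing
    from `predicted` count as misses.
    """
    if not reference:
        raise ValueError("reference mapping is empty")
    hits = sum(1 for obf, name in reference.items() if predicted.get(obf) == name)
    return hits / len(reference)


reference = {"_0x12af": "multiplyAndAdd", "_0x9c3e": "product"}
predicted = {"_0x12af": "multiplyAndAdd", "_0x9c3e": "result"}
print(identifier_exact_match(predicted, reference))  # 0.5
```

Exact match is strict: `result` is arguably an acceptable name for `product` here, which is presumably why the list above pairs this metric with a manual readability assessment.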
Limitations
- Generated names are semantic approximations, not original identifiers
- Performance degrades on:
  - Extremely short contexts
  - Heavy control-flow flattening
- Single-file scope only
Ethical Considerations
This model is intended for:
- Malware and binary analysis
- Digital forensics and incident response (DFIR)
- Code maintenance and auditing
It should not be used to violate software licenses or intellectual property rights.
Future Work
- Multi-language support (C/C++, Python)
- Function and class renaming
- Control-flow-aware modeling
- Integration with decompilers and IR tools
License
Specify the license here (e.g., Apache-2.0, MIT).
Citation
@misc{qwen_code_variable_renamer,
  title={Context-Aware Variable Renaming for Obfuscated Code using Qwen-Code},
  author={Your Name},
  year={2026},
  url={https://huggingface.co/Neo111x/Variables-Renaming}
}