---
language:
- en
- code
language_bcp47:
- en
- javascript
license: apache-2.0
tags:
- text-generation
- code
- javascript
- coding-assistant
- fine-tuning
- merged
- unsloth
- gpt-oss
- vllm
base_model: openai/gpt-oss-20b
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: gpt-oss-coder-v0.1-javascript
  results: []
---

# gpt-oss-coder-v0.1-javascript

A **language-specialized coding model for JavaScript**, fine-tuned from OpenAI's open-weight **gpt-oss** base on a **very small, curated JS dataset** using **Unsloth**. This release prioritizes **practical code generation quality** over benchmark scores. The model weights have been **merged** and are ready for deployment.

> **Status**: Experimental preview (`v0.1-javascript`)
> **Focus**: JS coding tasks (function-level completion, small refactors, idiomatic patterns)
> **Testing**: Currently undergoing validation with vLLM deployment
> **Note**: This repository contains merged weights, not LoRA adapters

---

## Model Details

- **Model type**: Causal LM (decoder-only), JS-specialized fine-tune
- **Base model**: `openai/gpt-oss-20b` (open-weight, Apache-2.0)
- **Fine-tuning**: LoRA via **Unsloth**, weights merged post-training
- **License**: Apache-2.0 (derivative weights released under Apache-2.0)
- **Author / Maintainer**: `hokar3361`
- **Intended Languages**: JavaScript (ES6+); English prompts recommended
- **Weight Format**: Merged (full model weights)

---

## Intended Use & Limitations

### Intended Use

- Code completion and synthesis for **JavaScript**
- Small refactors, idiomatic rewrites, test scaffolding, JSDoc/docstrings
- Snippet-level reasoning and bug fixes

### Out of Scope / Limitations

- Not a substitute for static analysis, linters, or security review
- May hallucinate APIs or types; verify before production use
- Trained on **small** domain data → expect gaps on rare frameworks or edge APIs

---

## Quickstart

### 1. Start vLLM Server

Since this repository contains **merged weights**, you can serve it directly with vLLM:

```bash
vllm serve hokar3361/gpt-oss-coderjs-v0.1 \
  --async-scheduling \
  --max-model-len 16000 \
  --gpu-memory-utilization 0.90
```

**Recommended**: Use `--max-model-len 16000` for optimal context handling.

### 2. Client Usage (Recommended)

Use the **OpenAI Python client** to call the vLLM server:

```python
from openai import OpenAI

# Point the client at your vLLM server
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy"  # vLLM doesn't require auth by default
)

response = client.completions.create(
    model="hokar3361/gpt-oss-coderjs-v0.1",
    prompt="// JavaScript function to validate email addresses\nfunction validateEmail(email) {",
    # DO NOT specify temperature or max_tokens - let the model use defaults
)

print(response.choices[0].text)
```

**Important**:
- **Do not specify** `temperature` or `max_tokens` - the model performs best with the default values
- Use the OpenAI Python client for best compatibility and stability
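### 3. Chat Completions (Optional)

gpt-oss is a chat-style model, so vLLM's chat completions endpoint, which applies the model's chat template to your messages server-side, is often a better fit for instruction-style requests than raw text completion. A minimal sketch against the same server as above (the prompt is only an illustration):

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy"
)

# vLLM also exposes /v1/chat/completions; the chat template is applied server-side
response = client.chat.completions.create(
    model="hokar3361/gpt-oss-coderjs-v0.1",
    messages=[
        {"role": "user", "content": "Write a JavaScript debounce function with JSDoc comments."}
    ],
    # As above, leave temperature and max_tokens unset to use the defaults
)

print(response.choices[0].message.content)
```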
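### 4. Direct Loading with Transformers (Untested Sketch)

If you prefer not to run a server, the merged weights can in principle be loaded directly with `transformers` (the card metadata lists `library_name: transformers`). The snippet below is an untested sketch: it assumes a recent `transformers` release with gpt-oss support, `accelerate` installed for `device_map="auto"`, and enough GPU memory for a 20B model. Unlike the server setup above, `max_new_tokens` is set explicitly here because `generate` otherwise defaults to a very short output.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hokar3361/gpt-oss-coderjs-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the dtype stored in the checkpoint
    device_map="auto",   # shard across available GPUs (requires accelerate)
)

messages = [
    {"role": "user", "content": "Write a JavaScript function to validate email addresses."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```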
---

## Testing & Validation

### Current Status

The model is currently being validated on a vLLM deployment. Initial testing shows **improved performance** compared to the pre-fine-tuning baseline.

### Evaluation Methodology

- **Test Set**: 50 programming questions sourced from GitHub and Stack Overflow
- **Judges**: GPT-5 and Claude Opus for response quality assessment
- **Preliminary Results**: The fine-tuned model demonstrates better code generation quality on JavaScript-specific tasks than the base model
- **Note**: Full benchmark validation is still in progress

---

## Acknowledgements

This work was made possible by the open-weight release of **gpt-oss** from OpenAI, which provides a strong foundation under the Apache-2.0 license. Special thanks to the open-source community around **Unsloth** for enabling memory-efficient, rapid LoRA fine-tuning on limited hardware. We also thank the **Hugging Face** and **vLLM** ecosystems for lowering the barrier to experimentation.

---

## Disclaimer & Experimental Status

This model (`v0.1-javascript`) is highly experimental:

- **Small data**: Fine-tuned on a very small JavaScript-focused dataset, mainly to validate the workflow and the feasibility of language specialization.
- **Not production-ready**: The model may generate incomplete, insecure, or non-idiomatic code; do not rely on it in production without careful review.
- **Testing in progress**: While initial results from the GPT-5 and Claude Opus evaluation show improvements, comprehensive benchmarking is ongoing.
- **Early stage**: This is only an initial exploration; future versions with larger, more diverse training corpora are expected to improve stability and coverage.

We share this release to contribute to the community and gather early feedback. **Use responsibly, validate outputs, and treat this as a proof-of-concept.**