|
--- |
|
license: apache-2.0 |
|
base_model: |
|
- openai/gpt-oss-20b |
|
library_name: transformers |
|
--- |
|
# gpt-oss-20b ONNX model (dequantized to BF16)
|
|
|
This repository contains an ONNX export of the [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) model from Hugging Face, generated with the official [onnxruntime-genai](https://github.com/microsoft/onnxruntime-genai) model builder. The precision was set to BF16 mainly because of the limited resources of my M4 Mac mini, and also because of my limited experience with the GenAI engineering ecosystem.
|
|
|
## Model Overview |
|
|
|
- **Source Model:** [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) from 🤗 |
|
- **Exported Format:** ONNX |
|
- **Precision:** BF16 (dequantized from MXFP4 for GPU compatibility) |
|
- **Layers:** 24 decoder layers, embedding layer, final normalization, and language modeling (LM) head |
|
|
|
This repository includes all supporting assets: tokenizer files, the chat template, and configuration files.
|
|
|
## Generation Details |
|
|
|
The ONNX model was generated with the `builder.py` script from the onnxruntime-genai toolkit (an illustrative invocation is shown after the list). The process involved:
|
|
|
- Loading the original gpt-oss-20b checkpoint from 🤗

- Reading and converting all model layers (embedding, 24 decoder layers, final norm, LM head)

- Dequantizing the MXFP4-quantized weights to BF16

- Saving the ONNX model together with its external data file

- Exporting the tokenizer and configuration files needed for the GenAI runtime and Hugging Face integration
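For reference, an export along these lines can be reproduced with the builder module. The command below is a sketch, assuming the `onnxruntime-genai` package is installed; the exact flags and supported precision values (in particular `bf16`) depend on the installed version, so check the builder's help output first.

```bash
# Illustrative only: flags and supported precision values vary by
# onnxruntime-genai version; check the builder's --help output first.
python -m onnxruntime_genai.models.builder --help

# -m: source checkpoint, -o: output directory, -p: target precision, -e: execution provider
python -m onnxruntime_genai.models.builder \
  -m openai/gpt-oss-20b \
  -o ./gpt-oss-20b-onnx \
  -p bf16 \
  -e cpu
```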
|
|
|
## Usage |
|
|
|
To use this ONNX model: |
|
|
|
1. Download the model files and tokenizer assets from this repository. |
|
2. Load the ONNX model using [onnxruntime](https://onnxruntime.ai/) or a compatible inference engine such as onnxruntime-genai; a minimal generation sketch follows below.
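Because the model was exported with the GenAI builder, the most direct consumer is the [onnxruntime-genai](https://github.com/microsoft/onnxruntime-genai) runtime. The snippet below is a minimal sketch, assuming the `onnxruntime-genai` Python package is installed and this repository has been downloaded to `./gpt-oss-20b-onnx`; the generator API has changed between releases (older versions set `params.input_ids` instead of calling `append_tokens`), so adapt it to your installed version.

```python
import onnxruntime_genai as og

# Directory containing the ONNX graph, external data, tokenizer, and config files.
model = og.Model("./gpt-oss-20b-onnx")
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode("Explain ONNX in one sentence."))

# Tokens are generated one at a time and printed as they arrive.
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()
```

For chat-style prompts, apply the included chat template to the conversation before encoding it, as the source model expects its chat format.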
|
|
|
## Acknowledgements |
|
|
|
- Original model: [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) from 🤗 |
|
- ONNX export: [onnxruntime-genai](https://github.com/microsoft/onnxruntime-genai) by Microsoft |
|
|
|
--- |