Introduction
This model is a quantized version of microsoft/Phi-3.5-mini-instruct, prepared with:
- Quantization Tool: Quark 0.6.0
- OGA Model Builder: v0.5.1
Quantization Strategy
- AWQ / Group 128 / Asymmetric / UINT4 Weights / FP16 activations
- Excluded Layers: None
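The `w_uint4_per_group_asym` scheme above (group size 128, asymmetric, UINT4 weights) can be illustrated with a short NumPy sketch. The helper names and the exact rounding details are assumptions for illustration, not Quark's internal implementation:

```python
import numpy as np

GROUP_SIZE = 128  # matches the g128 grouping used in this model

def quantize_group_asym_uint4(w):
    """Asymmetric UINT4 quantization of one group of FP16 weights.

    Each group gets its own scale and zero-point so that the group's
    [min, max] range maps onto the unsigned 4-bit range 0..15.
    """
    w = w.astype(np.float32)
    w_min, w_max = w.min(), w.max()
    scale = (w_max - w_min) / 15.0
    scale = scale if scale > 0 else 1.0  # guard against constant groups
    zero_point = np.clip(np.round(-w_min / scale), 0, 15)
    q = np.clip(np.round(w / scale + zero_point), 0, 15).astype(np.uint8)
    return q, scale, zero_point

def dequantize_group(q, scale, zero_point):
    """Map UINT4 codes back to approximate FP32 weights."""
    return (q.astype(np.float32) - zero_point) * scale

# One 128-element group round-trips with error bounded by ~scale/2.
rng = np.random.default_rng(0)
w = rng.normal(size=GROUP_SIZE).astype(np.float16)
q, s, z = quantize_group_asym_uint4(w)
w_hat = dequantize_group(q, s, z)
```

In the real model this runs per group of 128 weights along each row of a weight matrix, while activations stay in FP16; AWQ additionally rescales channels before quantization to protect salient weights, which this sketch omits.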
python3 quantize_quark.py \
    --model_dir "$model" \
    --output_dir "$output_dir" \
    --quant_scheme w_uint4_per_group_asym \
    --num_calib_data 128 \
    --quant_algo awq \
    --dataset pileval_for_awq_benchmark \
    --seq_len 512 \
    --model_export quark_safetensors \
    --data_type float16 \
    --exclude_layers [] \
    --custom_mode awq
OGA Model Builder
python builder.py \
    -i <quantized safetensor model dir> \
    -o <oga model output dir> \
    -p int4 \
    -e dml
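The `-p int4` precision flag means the exported weights are stored as 4-bit values, conceptually two UINT4 codes per byte. A minimal NumPy sketch of such nibble packing (illustrative only; the builder's actual on-disk layout may differ):

```python
import numpy as np

def pack_uint4(q):
    """Pack an even-length array of 0..15 codes, two nibbles per byte.

    The even-indexed code goes in the low nibble, the odd-indexed
    code in the high nibble.
    """
    q = np.asarray(q, dtype=np.uint8)
    assert q.size % 2 == 0 and q.max() <= 15
    return (q[0::2] | (q[1::2] << 4)).astype(np.uint8)

def unpack_uint4(packed):
    """Inverse of pack_uint4: split each byte back into two nibbles."""
    out = np.empty(packed.size * 2, dtype=np.uint8)
    out[0::2] = packed & 0x0F
    out[1::2] = packed >> 4
    return out

q = np.array([1, 15, 0, 7], dtype=np.uint8)
packed = pack_uint4(q)        # 2 bytes hold 4 weight codes
restored = unpack_uint4(packed)
```

This halving of storage relative to INT8 is where the int4 model's memory savings come from; scales and zero-points are kept separately per group of 128 weights.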
Model tree for amd/Phi-3.5-mini-instruct-awq-g128-int4-asym-fp16-onnx-dml
Base model
microsoft/Phi-3.5-mini-instruct