Error when trying to load model

#2
by ashikns - opened

When trying to load this model, ORT returns just a number as the error: 11136048. I also tried with onnxruntime-web directly, and it returned the same error from ORT. The error is thrown when trying to create the InferenceSession.

ONNX Community org

This is most likely an out-of-memory error with WASM, which you should be able to fix by setting device: "webgpu".

Pretty sure device is already set to "webgpu". I am basically just changing the model_id in this sample: https://github.com/huggingface/transformers.js-examples/blob/main/phi-3.5-webgpu/src/worker.js. Could it be a worker-specific thing?
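For context, the loading code in my worker looks roughly like this, a sketch based on that example with only the model_id swapped; the dtype and other options are just what the sample ships with:

```js
import { AutoTokenizer, AutoModelForCausalLM } from "@huggingface/transformers";

// Sketch of the phi-3.5-webgpu example's loading code with the model_id changed.
// The option values are assumptions carried over from that example.
const model_id = "onnx-community/Phi-4-mini-instruct-web-q4f16";

const tokenizer = await AutoTokenizer.from_pretrained(model_id);
const model = await AutoModelForCausalLM.from_pretrained(model_id, {
  dtype: "q4f16",
  device: "webgpu", // WebGPU backend is already selected here
  use_external_data_format: true,
});
```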

ONNX Community org

Ah I see. In that case, could you remove use_external_data_format: true? It should already be set in the config: https://huggingface.co/onnx-community/Phi-4-mini-instruct-web-q4f16/blob/main/config.json#L145

Also, make sure you're on the latest version of Transformers.js :)

I commented out use_external_data_format here: https://github.com/huggingface/transformers.js-examples/blob/7109a0a6e23873eca1ea7003ed98591e6aaed27d/phi-3.5-webgpu/src/worker.js#L23. Unfortunately it still returns the same error number. In the model loading progress I can see that it downloads both the .onnx_data and .onnx_data_1 files.
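With that change, the call looks roughly like this (again just a sketch of the relevant lines from the worker, same assumptions as above):

```js
// Same loading call as the sketch above, with the flag commented out;
// the external-data setting is expected to come from the Hub config instead.
const model = await AutoModelForCausalLM.from_pretrained(model_id, {
  dtype: "q4f16",
  device: "webgpu",
  // use_external_data_format: true,
});
```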

Oh and yes, I'm on the latest version of Transformers.js, since the example project has already been updated to the latest release :)

ONNX Community org

Oh yes, I remember what the problem is now.


We were running into some issues when loading the embedding layer into browser memory, due to its large size. We're still working on a solution, most likely involving additional quantization of these layers.

Got it. Thank you for the response (y)

I'm getting the same error when loading the model.
