
[Question] failed to call OrtRun(). error code = 1. When I try to load Xenova/pygmalion-350m #334

Open
sebinthomas opened this issue Sep 28, 2023 · 16 comments
Labels
question Further information is requested

Comments

@sebinthomas

I'm getting the error failed to call OrtRun(). error code = 1. when I try to load Xenova/pygmalion-350m. The full error is as follows:

wasm-core-impl.ts:392 Uncaught Error: failed to call OrtRun(). error code = 1.
    at e.run (wasm-core-impl.ts:392:19)
    at e.run (proxy-wrapper.ts:215:17)
    at e.OnnxruntimeWebAssemblySessionHandler.run (session-handler.ts:100:15)
    at InferenceSession.run (inference-session-impl.ts:108:40)
    at sessionRun (models.js:191:36)
    at async Function.decoderForward [as _forward] (models.js:478:26)
    at async Function.forward (models.js:743:16)
    at async Function.decoderRunBeam [as _runBeam] (models.js:564:18)
    at async Function.runBeam (models.js:1284:16)
    at async Function.generate (models.js:1009:30)

And my code for running it is this:


let text = 'Once upon a time, there was';
let generator = await pipeline('text-generation', 'Xenova/pygmalion-350m');
let output = await generator(text, {
  temperature: 2,
  max_new_tokens: 10,
  repetition_penalty: 1.5,
  no_repeat_ngram_size: 2,
  num_beams: 2,
  num_return_sequences: 2,
});

console.log(output);

I see that OrtRun is something ONNX Runtime reports on a failure, but have you had any success running the pygmalion-350m model?

@sebinthomas sebinthomas added the question Further information is requested label Sep 28, 2023
@sebinthomas
Author

I'm using a Mac with the Brave browser and Chrome 117.

@kungfooman
Contributor

kungfooman commented Sep 30, 2023

Same on Linux / Chrome 117, so it's not browser-related. We've had these errors before, and sometimes they're "easy" to fix:

#140 (comment)

Sometimes they crash the entire browser 🙈

error code = 1 means ORT_FAIL; it couldn't be more descriptive than that 😅 Sometimes I question the WASM overhead: it makes everything so much harder to debug, and I've had a project where WASM was actually slower than V8-JITted/optimized JS (which would at least still be easy to debug).

Here is the full error:

ort-wasm.js:25 2023-09-30 15:13:51.698699 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running DynamicQuantizeMatMul node. Name:'/decoder/layers.0/fc1/Gemm_MatMul_quant' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
ort-wasm.js:25 2023-09-30 15:13:51.699100 [E:onnxruntime:, sequential_executor.cc:494 ExecuteKernel] Non-zero status code returned while running If node. Name:'optimum::if' Status Message: Non-zero status code returned while running DynamicQuantizeMatMul node. Name:'/decoder/layers.0/fc1/Gemm_MatMul_quant' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch
ort-wasm.js:25 Non-zero status code returned while running If node. Name:'optimum::if' Status Message: Non-zero status code returned while running DynamicQuantizeMatMul node. Name:'/decoder/layers.0/fc1/Gemm_MatMul_quant' Status Message: matmul_helper.h:61 Compute MatMul dimension mismatch

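If it helps with debugging, here is a rough sketch for surfacing that fuller message from transformers.js instead of the bare error code = 1, assuming env.backends.onnx exposes the onnxruntime-web env object (logLevel and wasm.proxy are onnxruntime-web settings; untested with this exact model):

import { pipeline, env } from '@xenova/transformers';

// Assumption: env.backends.onnx is the onnxruntime-web env object.
env.backends.onnx.logLevel = 'verbose';  // log the underlying kernel error, not just "error code = 1"
env.backends.onnx.wasm.proxy = false;    // keep session execution (and its logs) on the main thread

const generator = await pipeline('text-generation', 'Xenova/pygmalion-350m');
const output = await generator('Once upon a time, there was', { max_new_tokens: 10 });
console.log(output);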

@xenova
Collaborator

xenova commented Oct 2, 2023

Hi there 👋 thanks for making the report. Indeed, this is a known issue (see here for the PR which introduced the OPT models): #276

For now, the only way I've found to get it working is by using the unquantized versions of the models. Example code:

const generator = await pipeline('text-generation', 'Xenova/opt-125m', {
    quantized: false, // NOTE: quantized models are currently broken
});
const prompt = 'Once upon a';
const output = await generator(prompt);
// [{ generated_text: 'Once upon a time, I was a student at the University of California, Berkeley. I was' }]
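
Untested, but the same flag applied to the snippet from the original post would look like this (assuming unquantized weights are actually available for Xenova/pygmalion-350m):

const generator = await pipeline('text-generation', 'Xenova/pygmalion-350m', {
    quantized: false, // skip the quantized weights that hit the MatMul dimension mismatch
});
const output = await generator('Once upon a time, there was', {
    max_new_tokens: 10,
    num_beams: 2,
    num_return_sequences: 2,
});
console.log(output);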

@kungfooman do you maybe know if this still happens in onnxruntime-web 1.16? cc @fs-eire @fxmarty too

@kungfooman
Contributor

I haven't spent the time yet to figure out how to use 1.16; for some reason it just doesn't find any backends.


Whereas using this works like a charm:

https://cdnjs.cloudflare.com/ajax/libs/onnxruntime-web/1.14.0/ort.es6.min.js

@xenova
Collaborator

xenova commented Oct 2, 2023

Ah yes, it defaults to using WASM files served via jsdelivr, which is what the errors indicate. No worries then; I can do further testing for an upcoming release.
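
For reference, the WASM file location can be overridden so the binaries are served locally instead of from the default CDN. A minimal sketch, assuming env.backends.onnx is the onnxruntime-web env object and that the ort-wasm*.wasm files from node_modules/onnxruntime-web/dist/ have been copied to a locally served folder ('/dist/' is just a placeholder path):

import { env } from '@xenova/transformers';

// Fetch the ort-wasm*.wasm binaries from a locally served folder
// instead of the default CDN. '/dist/' is a placeholder path.
env.backends.onnx.wasm.wasmPaths = '/dist/';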

@fs-eire
Contributor

fs-eire commented Oct 2, 2023

For the DynamicQuantizeMatMul part, sorry, I don't know much of the details. Whoever wrote this kernel may need to take a look.

https://cdnjs.cloudflare.com/ajax/libs/onnxruntime-web/1.16.0/ort.es6.min.js <-- this seems to work in my HTML when I use <script src="https://cdnjs.cloudflare.com/ajax/libs/onnxruntime-web/1.16.0/ort.es6.min.js"></script>. Can I get reproduce steps?

@sebinthomas
Author

@xenova I haven't tried 1.16.0, but when I try the unquantized version, it gives an error saying Error: Can't create a session.
I'll try with 1.16.0 and see.

@fxmarty

fxmarty commented Oct 3, 2023

Let me know if you think the issue is related to the export.

@kungfooman
Contributor

Can I get reproduce steps?

Yep, I have a simple example project here: https://github.com/kungfooman/transformers-object-detection/

It's a bit hacky because there doesn't seem to be any published ESM build yet.

Because I like to use repos as a "source of truth", I also converted what I needed once from TS to ESM here: microsoft/onnxruntime@main...kungfooman:onnxruntime:main

(but I don't have the time to maintain it right now)

In general, the future is ESM + import maps; every browser is heading that way nowadays, and npm packages are following, for example:

https://babeljs.io/docs/v8-migration (only ships as ESM now)

It would be nice if ONNX shipped browser-compatible/working ESM files too (i.e., don't drop the file extensions).

Thank you for looking into this @fs-eire 👍

@uahmad235

Getting the same error when trying to run a gpt2 model on the latest 2.11.0.
Any leads on this issue?

@xenova
Collaborator

xenova commented Dec 16, 2023

@uahmad235 The repo you linked to does not include ONNX weights in a subfolder called "onnx". You can do the conversion yourself by installing these requirements and then following the tutorial here. Please ensure the repo structure is the same as this one.
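
For reference, a typical conversion invocation (run from a clone of the transformers.js repository, per its README; the exact flags may vary between versions) looks like:

python -m scripts.convert --quantize --model_id <your_model_id>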

@uahmad235

Apologies @xenova, my bad. I forgot to mention that I have already converted the weights using the given instructions. The converted weights do include a file called onnx/decoder_model_merged.onnx, along with others.

@xenova
Collaborator

xenova commented Dec 16, 2023

@uahmad235 No worries :) In that case, could you try running the unquantized version?

const generator = await pipeline('text-generation', '<your_model_id>', {
    quantized: false, // <-- HERE
});

This may be due to missing op support in v1.14.0 of onnxruntime-web.

@uahmad235

Thank you @xenova. I have already tried that, following the thread, but unfortunately it does not work. It is strange that gpt2 works but not the Danish variant that I am trying to use.

@xenova
Collaborator

xenova commented Dec 16, 2023

@uahmad235 Feel free to open up a new issue with code that allows me to reproduce this. It might just be a configuration issue. 😇

@uahmad235

Sure. Let me give it another try, and I'll open a separate issue in case it does not work. Thanks for the prompt response :)
