[Question] Why is running a transformer in JS faster than Python? #125
Comments
Oh wow, that's very interesting 👀. Thanks for putting that repo together! My first guess would be quantization: because Transformers.js is designed to be used in a browser, we default to using quantized (8-bit) weights. However, since you're running server-side, you may not need to worry about that, so you can use the unquantized weights with:

`let pipe = await pipeline("task", "model", { quantized: false })`

Let me know what results you get in that case!
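A minimal sketch of how one could compare the two settings server-side; the model name and test image come from this thread, while the timing scaffold (and the load/inference split) is an editor's addition, not code from the issue:

```js
import { pipeline } from '@xenova/transformers';

(async () => {
  const url = 'https://huggingface.co/datasets/mishig/sample_images/resolve/main/savanna.jpg';

  // Default: quantized (8-bit) weights. The first run also includes the
  // model download, so time loading and inference separately.
  console.time('load (quantized)');
  const q = await pipeline('image-to-text', 'Xenova/vit-gpt2-image-captioning');
  console.timeEnd('load (quantized)');

  console.time('inference (quantized)');
  console.log(await q(url));
  console.timeEnd('inference (quantized)');

  // Unquantized (full-precision) weights, as suggested above.
  console.time('load (unquantized)');
  const fp = await pipeline('image-to-text', 'Xenova/vit-gpt2-image-captioning', { quantized: false });
  console.timeEnd('load (unquantized)');

  console.time('inference (unquantized)');
  console.log(await fp(url));
  console.timeEnd('inference (unquantized)');
})();
```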
I tried using `{ quantized: false }`. It gets stuck in the downloading step. According to ChatGPT: …

I do see ONNX files in Xenova/vit-gpt2-image-captioning, but maybe it picks the wrong filename and keeps trying to load it. I don't have an issue here; I'm just pointing this out in case there's a bug somewhere.
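To rule out a wrong-filename problem, one quick check is to list the files the Hub actually hosts for the model. A sketch using the public Hub API; the endpoint and response shape are assumptions based on the current API, not something from this thread:

```js
// List the ONNX files hosted in the Xenova/vit-gpt2-image-captioning repo.
// Requires a fetch implementation (built into Node 18+ and browsers).
(async () => {
  const res = await fetch('https://huggingface.co/api/models/Xenova/vit-gpt2-image-captioning');
  const { siblings } = await res.json();
  const onnxFiles = siblings
    .map((s) => s.rfilename)
    .filter((f) => f.endsWith('.onnx'));
  console.log(onnxFiles); // encoder/decoder files, quantized and not
})();
```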
I've run it locally (both quantized and unquantized) and both seem to work.

```js
import { pipeline } from '@xenova/transformers';

(async () => {
  let url = 'https://huggingface.co/datasets/mishig/sample_images/resolve/main/savanna.jpg';

  let pipe1 = await pipeline('image-to-text', 'Xenova/vit-gpt2-image-captioning');
  console.log(await pipe1(url));
  // [ { generated_text: 'a herd of giraffes and zebras grazing in a field' } ]

  let pipe2 = await pipeline('image-to-text', 'Xenova/vit-gpt2-image-captioning', { quantized: false });
  console.log(await pipe2(url));
  // [ { generated_text: 'a herd of giraffes and zebras grazing in a field' } ]
})();
```

Times:
It is worth noting that the unquantized files are ~4x larger, so given that it hangs at the downloading portion, you may just have to wait a bit longer. Check your network tab to make sure it is in fact downloading; there may also have been connectivity issues with the HF Hub when you were downloading. We are also planning to add proper progress bars for file downloads (#117).
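As a follow-up to that point: newer releases of Transformers.js expose a `progress_callback` pipeline option that reports download progress, which makes a long unquantized download visible rather than looking like a hang. The exact callback payload below is an assumption, so treat this as an illustrative sketch:

```js
import { pipeline } from '@xenova/transformers';

(async () => {
  const pipe = await pipeline('image-to-text', 'Xenova/vit-gpt2-image-captioning', {
    quantized: false,
    // Log download/load events as they happen.
    progress_callback: (info) => {
      // info typically includes a status ('progress', 'done', ...), the file
      // being fetched, and a percentage while downloading (assumed shape).
      console.log(info.status, info.file ?? '', info.progress ?? '');
    },
  });
})();
```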
Closing the issue since there's nothing for me to do 😅... if you'd like to open a new issue at https://github.com/huggingface/transformers, you're more than welcome to! 😃
I created a repo to test how to use transformers:
https://github.com/pitieu/huggingface-transformers

I was wondering why running the same models in JavaScript is faster than running them in Python. Is Xenova/vit-gpt2-image-captioning optimized somehow compared to nlpconnect/vit-gpt2-image-captioning? I ran it on my Mac M1.