[Question] Model type for tt/ee not found, assuming encoder-only architecture #283

josephrocca · 2023-09-07T05:01:34Z

Reporting this as requested by the warning message, but as a question because I'm not entirely sure if it's a bug:

Here's the code I ran:

let quantized = false; // change to `true` for a much smaller model (e.g. 87 MB vs 345 MB for the image model), at the cost of lower accuracy
let { AutoProcessor, CLIPVisionModelWithProjection, RawImage, AutoTokenizer, CLIPTextModelWithProjection } = await import('https://cdn.jsdelivr.net/npm/@xenova/[email protected]/dist/transformers.min.js');
let imageProcessor = await AutoProcessor.from_pretrained('Xenova/clip-vit-base-patch16');
let visionModel = await CLIPVisionModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16', {quantized});
let tokenizer = await AutoTokenizer.from_pretrained('Xenova/clip-vit-base-patch16');
let textModel = await CLIPTextModelWithProjection.from_pretrained('Xenova/clip-vit-base-patch16', {quantized});

function cosineSimilarity(A, B) {
  if(A.length !== B.length) throw new Error("A.length !== B.length");
  let dotProduct = 0, mA = 0, mB = 0;
  for(let i = 0; i < A.length; i++){
    dotProduct += A[i] * B[i];
    mA += A[i] * A[i];
    mB += B[i] * B[i];
  }
  mA = Math.sqrt(mA);
  mB = Math.sqrt(mB);
  let similarity = dotProduct / (mA * mB);
  return similarity;
}

// get image embedding:
let image = await RawImage.read('https://i.imgur.com/RKsLoNB.png');
let imageInputs = await imageProcessor(image);
let { image_embeds } = await visionModel(imageInputs);
console.log(image_embeds.data);

// get text embedding:
let texts = ['a photo of an astronaut'];
let textInputs = tokenizer(texts, { padding: true, truncation: true });
let { text_embeds } = await textModel(textInputs);
console.log(text_embeds.data);

let similarity = cosineSimilarity(image_embeds.data, text_embeds.data);
console.log(similarity);

xenova · 2023-09-07T21:38:04Z

Thanks for the report - will fix :) It appears to be using the name of the minified class, which is why it is tt or ee.

As a workaround in the meantime, you can load the unminified build, https://cdn.jsdelivr.net/npm/@xenova/[email protected]/dist/transformers.js, instead of https://cdn.jsdelivr.net/npm/@xenova/[email protected]/dist/transformers.min.js
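To illustrate the explanation above, here is a minimal sketch of how a lookup keyed on a class name can break once a minifier renames the class. The registry `MODEL_TYPE_BY_NAME` and the helper `lookupByName` are hypothetical names made up for this example and are not taken from the library's source; only the class name and the short name `tt` come from the report.

```javascript
// Hypothetical registry keyed by class *name* (a string), for illustration only.
const MODEL_TYPE_BY_NAME = new Map([
  ['CLIPTextModelWithProjection', 'EncoderOnly'],
]);

// Hypothetical helper: resolves the model type from the runtime class name.
function lookupByName(instance) {
  return MODEL_TYPE_BY_NAME.get(instance.constructor.name);
}

// Unminified build: the class keeps its original name, so the lookup succeeds.
class CLIPTextModelWithProjection {}
const original = lookupByName(new CLIPTextModelWithProjection());

// Simulated minified build: the minifier has shortened the class name to `tt`,
// but the string keys in the registry are left untouched.
const tt = class tt {};
const minified = lookupByName(new tt());

console.log(original); // 'EncoderOnly'
console.log(minified); // undefined, so a caller would fall back to a default
```

Because the string key and the renamed class no longer match, the lookup misses and any fallback path (such as assuming an encoder-only architecture) is taken, with the minified name showing up in the warning.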

Fixes #283

* Add `CodeLlamaTokenizer`
* Add `codellama` for testing
* Update default quantization settings
* Refactor `PretrainedModel`
* Remove unnecessary error message
* Update llama-code-tokenizer test
* Add support for `GPTNeoX` models
* Fix `GPTNeoXPreTrainedModel` config
* Add support for `GPTJ` models
* Add support for `WavLM` models
* Update list of supported models
  - CodeLlama
  - GPT NeoX
  - GPT-J
  - WavLM
* Add support for XLM models
* Add support for `ResNet` models
* Add support for `BeiT` models
* Fix casing of `BeitModel`
* Remove duplicate code
* Update variable name
* Remove `ts-ignore`
* Remove unnecessary duplication
* Update demo model sizes
* [demo] Update default summarization parameters
* Update default quantization parameters for new models
* Remove duplication in mapping
* Update list of supported marian models
* Add support for `CamemBERT` models
* Add support for `MBart` models
* Add support for `OPT` models
* Add `MBartTokenizer` and `MBart50Tokenizer`
* Add example of multilingual translation with MBart models
* Add `CamembertTokenizer`
* Add support for `HerBERT` models
* Add support for `XLMTokenizer`
* Fix `fuse_unk` config
* Do not remove duplicate keys for `Unigram` models
  See https://huggingface.co/camembert-base for an example of a Unigram tokenizer that has two tokens with the same value (`<unk>`)
* Update HerBERT supported model text
* Update generate_tests.py
* Update list of supported models
* Use enum object instead of classes for model types
  Fixes #283
* Add link to issue
* Update dependencies for unit tests
* Add `sentencepiece` as a testing requirement
* Add `protobuf` to test dependency
* Remove duplicated models to test

josephrocca added the question label Sep 7, 2023

xenova added a commit that referenced this issue Sep 8, 2023

Use enum object instead of classes for model types

cdb4814

Fixes #283
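The commit title describes replacing classes with an enum object for model types. A rough sketch of that general idea follows; it is not the code from the commit. The names `MODEL_TYPES`, `MODEL_TYPE_BY_CLASS`, `modelTypeOf`, `DemoEncoderModel` and `DemoDecoderModel` are hypothetical, and keying the registry by the class object itself (rather than by its name) is an assumption of this sketch that the thread does not confirm.

```javascript
// Enum-like object: the values are plain constants, so comparisons do not
// depend on any class name that a minifier could rewrite.
const MODEL_TYPES = Object.freeze({
  EncoderOnly: 0,
  EncoderDecoder: 1,
  DecoderOnly: 2,
});

// Hypothetical model classes used only for this demonstration.
class DemoEncoderModel {}
class DemoDecoderModel {}

// Assumption: the registry is keyed by the class reference, which stays valid
// no matter what name the class ends up with after minification.
const MODEL_TYPE_BY_CLASS = new Map([
  [DemoEncoderModel, MODEL_TYPES.EncoderOnly],
  [DemoDecoderModel, MODEL_TYPES.DecoderOnly],
]);

// Resolve the model type for an instance, failing loudly for unknown classes.
function modelTypeOf(instance) {
  const type = MODEL_TYPE_BY_CLASS.get(instance.constructor);
  if (type === undefined) {
    throw new Error('Model class is not registered');
  }
  return type;
}

console.log(modelTypeOf(new DemoEncoderModel()) === MODEL_TYPES.EncoderOnly); // true
console.log(modelTypeOf(new DemoDecoderModel()) === MODEL_TYPES.DecoderOnly); // true
```

Throwing on an unregistered class is a design choice of this sketch; a library could instead warn and fall back to a default type, as the warning in the issue title suggests.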

xenova linked a pull request Sep 8, 2023 that will close this issue

Optimizations #276

Merged

xenova closed this as completed in #276 Sep 8, 2023

This was referenced Sep 12, 2023

[Bug] Error: Unsupported model type: whisper when running in a app bundled with vite #300

Closed

Fix issues with minification (Closes #300) #307

Merged
