
AttributeError: 'Qwen2TokenizerFast' object has no attribute 'tokenizer'. Did you mean: '_tokenizer'? #135

Open
jrp2014 opened this issue Nov 28, 2024 · 7 comments

jrp2014 commented Nov 28, 2024

Using the latest versions of mlx and mlx_vlm, with the following script:

import mlx.core as mx
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

from PIL import Image

import os
from pathlib import Path

# model_path = "mlx-community/llava-1.5-7b-4bit"
# model_path = "mlx-community/llava-v1.6-mistral-7b-8bit"
# model_path = "mlx-community/pixtral-12b-8bit" # To the point
# model_path = "Qwen/Qwen2-VL-7B-Instruct"  # libc++abi: terminating due to uncaught exception of type std::runtime_error: Attempting to allocate 269535412224 bytes which is greater than the maximum allowed buffer size of 28991029248 bytes.###
# model_path = "mlx-community/llava-v1.6-34b-8bit" # Slower but more precise
# model_path = "mlx-community/Phi-3.5-vision-instruct-bf16" # OK, but doesn't provide keywords
# model_path = "mistral-community/pixtral-12b"
# model_path = "meta-llama/Llama-3.2-11B-Vision-Instruct"  # needs about 95Gb, but is slow
# model_path ="mlx-community/Qwen2-VL-72B-Instruct-8bit" # libc++abi: terminating due to uncaught exception of type std::runtime_error: Attempting to allocate 135383101952 bytes which is greater than the maximum allowed buffer size of 77309411328 bytes.
model_path ="mlx-community/dolphin-vision-72b-4bit"

print("Model: ", model_path)

# Load the model
model, processor = load(model_path)
config = load_config(model_path)

prompt = "Provide a factual caption, description and comma-separated keywords or tags for this image so that it can be searched for easily"

picpath = "/Users/xxx/Pictures/Processed"
pics = sorted(Path(picpath).iterdir(), key=os.path.getmtime, reverse=True)
pic = str(pics[0])
print("Image: ", pic)

# Apply chat template
formatted_prompt = apply_chat_template(processor, config, prompt, num_images=1)

# Generate output
output = generate(model, processor, pic, formatted_prompt, max_tokens=500, verbose=True)
print(output)

I get:

(mlx) ~/Documents/AI/mlx/scripts/vlm % python mytest.py
Model:  mlx-community/dolphin-vision-72b-4bit
Fetching 19 files: 100%|█████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 23570.48it/s]
The repository for /Users/xxx/.cache/huggingface/hub/models--mlx-community--dolphin-vision-72b-4bit/snapshots/82156979ae25603e5d1bbec346559fe27d279f22 contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//Users/xxx/.cache/huggingface/hub/models--mlx-community--dolphin-vision-72b-4bit/snapshots/82156979ae25603e5d1bbec346559fe27d279f22.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y
Fetching 19 files: 100%|█████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 13077.09it/s]
Image:  /Users/xxx/Pictures/Processed/20241123-231118_DSC02850_DxO.jpg
==========
Image: /Users/xxx/Pictures/Processed/20241123-231118_DSC02850_DxO.jpg 

Prompt: <|im_start|>system
Answer the questions.<|im_end|><|im_start|>user
<image>
Provide a factual caption, description and comma-separated keywords or tags for this image so that it can be searched for easily<|im_end|><|im_start|>assistant

Traceback (most recent call last):
  File "/Users/xxx/Documents/AI/mlx/scripts/vlm/mytest.py", line 39, in <module>
    output = generate(model, processor, pic, formatted_prompt, max_tokens=500, verbose=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/utils.py", line 1181, in generate
    prompt_tokens = mx.array(processor.tokenizer.encode(prompt))
                             ^^^^^^^^^^^^^^^^^^^
AttributeError: 'Qwen2TokenizerFast' object has no attribute 'tokenizer'. Did you mean: '_tokenizer'?
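For context, the failing line in mlx_vlm/utils.py expects processor to expose a .tokenizer attribute, but for this model load() returns the tokenizer itself. A minimal illustration of the mismatch (the model id below is just a stand-in Qwen2-family checkpoint, not the dolphin model):

from transformers import AutoProcessor, AutoTokenizer

# A bare fast tokenizer has no .tokenizer attribute -- it *is* the tokenizer,
# so processor.tokenizer.encode(...) raises the AttributeError shown above.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
print(type(tok).__name__, hasattr(tok, "tokenizer"))    # Qwen2TokenizerFast False

# A full Processor wraps a tokenizer plus an image processor, which is the
# shape that generate() assumes it is given.
proc = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-7B-Instruct")
print(type(proc).__name__, hasattr(proc, "tokenizer"))  # Qwen2VLProcessor True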
Blaizzy (Owner) commented Nov 28, 2024

Try updating your transformers to the latest as well

jrp2014 (Author) commented Nov 28, 2024

Yes, I think I have all the latest via pip install -U -r requirements.txt (transformers 4.46.3).

The full list is:

Package            Version
------------------ ----------------------------
accelerate         1.1.1
aiofiles           23.2.1
aiohappyeyeballs   2.4.3
aiohttp            3.11.2
aiosignal          1.3.1
annotated-types    0.7.0
anyio              4.6.2.post1
attrs              24.2.0
certifi            2024.8.30
charset-normalizer 3.4.0
click              8.1.7
cmake              3.31.1
datasets           3.1.0
dill               0.3.8
fastapi            0.115.5
ffmpy              0.4.0
filelock           3.16.1
frozenlist         1.5.0
fsspec             2024.9.0
gradio             5.7.1
gradio_client      1.5.0
h11                0.14.0
hf_transfer        0.1.8
httpcore           1.0.7
httpx              0.27.2
huggingface-hub    0.26.2
idna               3.10
inquirerpy         0.3.4
Jinja2             3.1.4
llvmlite           0.43.0
markdown-it-py     3.0.0
MarkupSafe         2.1.5
mdurl              0.1.2
mlx                0.21.0.dev20241128+974bb54ab
mlx-lm             0.20.1
mlx-vlm            0.1.3
mlx-whisper        0.4.1
more-itertools     10.5.0
mpmath             1.3.0
multidict          6.1.0
multiprocess       0.70.16
nanobind           2.2.0
networkx           3.4.2
numba              0.60.0
numpy              1.26.4
orjson             3.10.11
packaging          24.2
pandas             2.2.3
pfzy               0.3.4
pillow             11.0.0
pip                24.3.1
prompt_toolkit     3.0.48
propcache          0.2.0
protobuf           5.28.3
psutil             6.1.0
pyarrow            18.0.0
pydantic           2.9.2
pydantic_core      2.23.4
pydub              0.25.1
Pygments           2.18.0
python-dateutil    2.9.0.post0
python-multipart   0.0.12
pytz               2024.2
PyYAML             6.0.2
regex              2024.11.6
requests           2.32.3
rich               13.9.4
ruff               0.7.4
safehttpx          0.1.1
safetensors        0.4.5
scipy              1.13.1
semantic-version   2.10.0
sentencepiece      0.2.0
setuptools         75.6.0
shellingham        1.5.4
six                1.16.0
sniffio            1.3.1
starlette          0.41.2
sympy              1.13.1
tiktoken           0.8.0
tokenizers         0.20.3
tomlkit            0.12.0
torch              2.5.1
torchaudio         2.5.1
torchvision        0.20.1
tqdm               4.67.1
tqdn               0.2.1
transformers       4.46.3
typer              0.13.0
typing_extensions  4.12.2
tzdata             2024.2
urllib3            2.2.3
uvicorn            0.32.0
wcwidth            0.2.13
websockets         12.0
wheel              0.44.0
xxhash             3.5.0
yarl               1.17.1

Blaizzy (Owner) commented Nov 28, 2024

I see,

Dolphin, like NanoLLaVA, uses an image_processor :)

Here is an example:

image_processor = load_image_processor(model_path)
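For the script above, a minimal sketch would look like this (assuming load_image_processor is importable from mlx_vlm.utils, alongside load_config):

from mlx_vlm import load
from mlx_vlm.utils import load_config, load_image_processor  # assumption: exported here, like load_config

model_path = "mlx-community/dolphin-vision-72b-4bit"

model, processor = load(model_path)                  # for this model, processor is the tokenizer
image_processor = load_image_processor(model_path)   # separate image pre-processing pipeline
config = load_config(model_path)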

Blaizzy (Owner) commented Nov 28, 2024

I will probably simplify this in a future release by adding it as an attribute on the processor at load time 👌🏽

That way we can avoid this type of issue.
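A rough, purely illustrative sketch of that idea (the wrapper name is hypothetical, not part of mlx_vlm):

from mlx_vlm import load
from mlx_vlm.utils import load_image_processor  # assumption: importable from mlx_vlm.utils

def load_with_image_processor(model_path):
    # Hypothetical helper: attach the image processor to whatever load() returns,
    # so downstream code can always find it on the processor object.
    model, processor = load(model_path)
    processor.image_processor = load_image_processor(model_path)
    return model, processor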

jrp2014 (Author) commented Nov 28, 2024

Thanks. I'm not sure how I was supposed to know that, and I'm still not sure what it is I was supposed to know. I don't use NanoLLaVA when I can use the full-fat version.

I was kind of hoping that whatever needed to be done was done under the hood... particularly in the absence of documentation to the contrary.

Blaizzy (Owner) commented Nov 29, 2024

My bad!

I promise, I'm working on it :)

jrp2014 (Author) commented Nov 29, 2024

I realise that you are busy, but it can take an hour or two to download a different model...

In this case, even when setting processor = load_image_processor(model_path), I get:

    formatted_prompt = apply_chat_template(processor, config, prompt, num_images=1)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Caskroom/miniconda/base/envs/mlx/lib/python3.12/site-packages/mlx_vlm/prompt_utils.py", line 170, in apply_chat_template
    raise ValueError(
ValueError: Error: processor does not have 'chat_template' or 'tokenizer' attribute.
