Post-build tests related to WebP images fail #1171

Open
2 of 4 tasks
8ar10der opened this issue Nov 12, 2024 · 1 comment
Labels
bug Something isn't working

8ar10der commented Nov 12, 2024

  • This is actually a bug report.
  • I am not getting good LLM Results
  • I have tried asking for help in the community on discord or discussions and have not received a response.
  • I have tried searching the documentation and have not found an answer.

Environment
Arch Linux, kernel 6.6.60-1-lts
Python 3.12.7

Installed pip packages

Package                   Version
------------------------- ------------------------------
aiohappyeyeballs          2.4.3
aiohttp                   3.10.5
aiosignal                 1.3.1
annotated-types           0.7.0
anthropic                 0.39.0
anyio                     4.4.0
arch-signoff              0.5.2
argon2-cffi               23.1.0
argon2-cffi-bindings      21.2.0
asgi-lifespan             2.1.0
asgiref                   3.8.1
asn1crypto                1.5.1
attrs                     23.2.1.dev0
autocommand               2.2.2
backoff                   2.2.1
bcrypt                    4.1.3
black                     24.10.0
boto3                     1.35.36
botocore                  1.35.36
btrfs                     14.1
btrfsutil                 6.10
build                     1.2.2
CacheControl              0.14.1
cachy                     0.3.0
certifi                   2024.8.30
cffi                      1.17.1
charset-normalizer        3.4.0
cleo                      2.1.0
click                     8.1.7
cockpit                   328
cohere                    5.11.3
colorama                  0.4.6
configobj                 5.0.8
contourpy                 1.3.0
coverage                  7.6.3
cramjam                   2.9.0
crashtest                 0.4.1
crccheck                  1.3.0
crit                      4.0
cryptography              42.0.8
cuda-python               12.4.0
cycler                    0.12.1
Cython                    3.0.11
dbus-python               1.3.2
dirty-equals              0.8.0
distlib                   0.3.8
distro                    1.9.0
dnspython                 2.6.1
docstring_parser          0.16
dulwich                   0.22.5
editables                 0.5
fastapi                   0.115.4
fastavro                  1.9.7
fastjsonschema            2.20.0
filelock                  3.13.3
fonttools                 4.54.1
frozenlist                1.4.1
future                    1.0.0
gunicorn                  23.0.0
h11                       0.14.0
h2                        4.1.0
hatch-fancy-pypi-readme   24.1.0
hatchling                 1.25.0
hpack                     4.0.0
html5lib                  1.1
httpcore                  1.0.5
httptools                 0.6.1
httpx                     0.27.2
httpx-sse                 0.4.0
hyperframe                6.0.1
hypothesis                6.118.7
idna                      3.10
importlib_metadata        7.2.1
inflect                   7.4.0
iniconfig                 2.0.0
installer                 0.7.0
instructor                1.3.7
jaraco.classes            3.4.0
jaraco.collections        5.0.1
jaraco.context            5.3.0
jaraco.functools          4.0.2
jaraco.text               4.0.0
jeepney                   0.8.0
Jinja2                    3.1.4
jiter                     0.6.1
jmespath                  1.0.1
jsonschema                4.23.0
jsonschema-specifications 2023.12.1
keyring                   25.2.1
kiwisolver                1.4.5
lark                      1.2.2
libfdt                    1.7.1
libvirt-python            10.9.0
litellm                   1.51.3
lockfile                  0.12.2
lz4                       4.3.3
Markdown                  3.7
markdown-it-py            3.0.0
MarkupSafe                2.1.5
matplotlib                3.9.2
maturin                   1.7.4
mdurl                     0.1.2
meson                     1.6.0
more-itertools            10.3.0
msgpack                   1.0.5
multidict                 6.0.5
mypy_extensions           1.0.0
nftables                  0.1
numpy                     2.1.3
openai                    1.52.2
ordered-set               4.1.0
orjson                    3.10.11
packaging                 24.1
PacVis                    0.2.7
pathspec                  0.12.1
pcp                       5.0
pdm-backend               2.4.3
perf                      0.1
pexpect                   4.9.0
pillow                    11.0.0
pip                       24.3.1
pkginfo                   1.10.0
platformdirs              4.3.6
PLTable                   1.1.0
pluggy                    1.5.0
podman-compose            1.2.0
poetry                    1.8.4
poetry-core               1.9.0
poetry-plugin-export      1.8.0
protobuf                  5.28.3
psutil                    6.1.0
ptyprocess                0.7.0
pwquality                 1.4.5
pyalpm                    0.10.6
pyasn1                    0.6.0
pyasn1_modules            0.4.0
pyclibrary                0.2.1
pycparser                 2.22
pycriu                    4.0
pydantic                  2.9.2
pydantic_core             2.23.4
pydantic-extra-types      2.10.0
pydantic-settings         2.6.1
Pygments                  2.18.0
PyGObject                 3.50.0
PyJWT                     2.9.0
pyparsing                 3.1.2
pyperf                    2.6.3
pyproject_hooks           1.2.0
pyproject-metadata        0.9.0
pyrsistent                0.20.0
pytest                    8.3.3
pytest-asyncio            0.24.0
pytest-cov                5.0.0
pytest-examples           0.0.13
python-dateutil           2.9.0
python-dotenv             1.0.1
python-linux-procfs       0.7.3
python-multipart          0.0.16
python-snappy             0.7.2
pytz                      2024.2
pyudev                    0.24.3
PyYAML                    6.0.2
rapidfuzz                 3.6.2
referencing               0.35.1
regex                     2024.9.11
requests                  2.32.3
requests-toolbelt         1.0.0
responses                 0.25.3
respx                     0.21.1
rich                      13.9.4
rpds-py                   0.19.0
ruff                      0.7.3
s3transfer                0.10.3
SecretStorage             3.3.3
setuptools                75.2.0
setuptools-scm            8.1.0
shell_gpt                 1.4.4
shellingham               1.5.4
six                       1.16.0
smbus                     1.1
sniffio                   1.3.1
sortedcontainers          2.4.0
speedtest-cli             2.1.3
sse-starlette             2.1.3
starlette                 0.41.2
TBB                       0.2
tenacity                  9.0.0
tiktoken                  0.7.0
tokenizers                0.20.3
tomli                     2.0.1
tomli_w                   1.0.0
tomlkit                   0.13.2
tornado                   6.4.1
tpm2-pkcs11-tools         1.33.7
tpm2-pytss                2.2.1
tqdm                      4.67.0
trove-classifiers         2024.10.21.16
typeguard                 4.3.0
typer                     0.12.3
typing_extensions         4.12.2
uc-micro-py               1.0.3
urllib3                   1.26.20
uvicorn                   0.31.0
uvloop                    0.20.0
validate                  5.0.8
validate-pyproject        0.22
virtualenv                20.27.1
webencodings              0.5.1
websockets                12.0
wheel                     0.44.0
yarl                      1.9.4
zipp                      3.19.3.dev0+gc6a3339.d20240728
zstandard                 0.22.0

Testing Log

=================================================================================== FAILURES ===================================================================================
__________________________________________________________________ test_image_from_url_with_unusual_extension __________________________________________________________________

    def test_image_from_url_with_unusual_extension():
        url = "https://example.com/image.webp"
>       image = Image.from_url(url)

tests/test_multimodal.py:226: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'instructor.multimodal.Image'>, url = 'https://example.com/image.webp'

    @classmethod
    @lru_cache
    def from_url(cls, url: str) -> Image:
        if cls.is_base64(url):
            return cls.from_base64(url)
    
        parsed_url = urlparse(url)
        media_type, _ = mimetypes.guess_type(parsed_url.path)
    
        if not media_type:
            try:
                response = requests.head(url, allow_redirects=True)
                media_type = response.headers.get("Content-Type")
            except requests.RequestException as e:
                raise ValueError(f"Failed to fetch image from URL") from e
    
        if media_type not in VALID_MIME_TYPES:
>           raise ValueError(f"Unsupported image format: {media_type}")
E           ValueError: Unsupported image format: text/html

instructor/multimodal.py:145: ValueError
_________________________________________________________ test_image_from_various_urls[https://example.com/image.webp] _________________________________________________________

url = 'https://example.com/image.webp', request = <FixtureRequest for <Function test_image_from_various_urls[https://example.com/image.webp]>>

    @pytest.mark.parametrize(
        "url",
        [
            "http://example.com/image.jpg",
            "https://example.com/image.png",
            "https://example.com/image.webp",
            "https://example.com/image.jpg?param=value",
            "base64_png",
        ],
    )
    def test_image_from_various_urls(url, request):
        if url.startswith("base64"):
            url = request.getfixturevalue(url)
>       image = Image.from_url(url)

tests/test_multimodal.py:271: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'instructor.multimodal.Image'>, url = 'https://example.com/image.webp'

    @classmethod
    @lru_cache
    def from_url(cls, url: str) -> Image:
        if cls.is_base64(url):
            return cls.from_base64(url)
    
        parsed_url = urlparse(url)
        media_type, _ = mimetypes.guess_type(parsed_url.path)
    
        if not media_type:
            try:
                response = requests.head(url, allow_redirects=True)
                media_type = response.headers.get("Content-Type")
            except requests.RequestException as e:
                raise ValueError(f"Failed to fetch image from URL") from e
    
        if media_type not in VALID_MIME_TYPES:
>           raise ValueError(f"Unsupported image format: {media_type}")
E           ValueError: Unsupported image format: text/html

instructor/multimodal.py:145: ValueError
__________________________________________________________ test_image_autodetect[/path/to/image.webp-file-image/webp] __________________________________________________________

input_data = '/path/to/image.webp', expected_type = 'file', expected_media_type = 'image/webp'
request = <FixtureRequest for <Function test_image_autodetect[/path/to/image.webp-file-image/webp]>>

    @pytest.mark.parametrize(
        "input_data, expected_type, expected_media_type",
        [
            # URL tests
            ("http://example.com/image.jpg", "url", "image/jpeg"),
            ("https://example.com/image.png", "url", "image/png"),
            ("https://example.com/image.webp", "url", "image/webp"),
            ("https://example.com/image.jpg?param=value", "url", "image/jpeg"),
            (
                "https://example.com/image",
                "url",
                "image/jpeg",
            ),  # Default to JPEG if no extension
            # Base64 data URI tests
            (
                "base64_png",
                "base64",
                "image/png",
            ),
            (
                "base64_jpeg",
                "base64",
                "image/jpeg",
            ),
            # File path tests (mocked)
            ("/path/to/image.jpg", "file", "image/jpeg"),
            ("/path/to/image.png", "file", "image/png"),
            ("/path/to/image.webp", "file", "image/webp"),
        ],
    )
    def test_image_autodetect(input_data, expected_type, expected_media_type, request):
        with (
            patch("pathlib.Path.is_file", return_value=True),
            patch("pathlib.Path.stat", return_value=MagicMock(st_size=1000)),
            patch("pathlib.Path.read_bytes", return_value=b"fake image data"),
            patch("requests.head") as mock_head,
        ):
            mock_head.return_value = MagicMock(
                headers={"Content-Type": expected_media_type}
            )
            if input_data.startswith("base64"):
                input_data = request.getfixturevalue(input_data)
    
>           image = Image.autodetect(input_data)

tests/test_multimodal.py:331: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
instructor/multimodal.py:70: in autodetect
    return cls.from_path(source)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'instructor.multimodal.Image'>, path = PosixPath('/path/to/image.webp')

    @classmethod
    @lru_cache
    def from_path(cls, path: Union[str, Path]) -> Image:  # noqa: UP007
        path = Path(path)
        if not path.is_file():
            raise FileNotFoundError(f"Image file not found: {path}")
    
        if path.stat().st_size == 0:
            raise ValueError("Image file is empty")
    
        media_type, _ = mimetypes.guess_type(str(path))
        if media_type not in VALID_MIME_TYPES:
>           raise ValueError(f"Unsupported image format: {media_type}")
E           ValueError: Unsupported image format: None

instructor/multimodal.py:160: ValueError
=============================================================================== warnings summary ===============================================================================
../../../../../../../../usr/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py:312: 11 warnings
  /usr/lib/python3.12/site-packages/pydantic/_internal/_generate_schema.py:312: PydanticDeprecatedSince20: `json_encoders` is deprecated. See https://docs.pydantic.dev/2.9/concepts/serialization/#custom-serializers for alternatives. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.9/migration/
    warnings.warn(

../../../../../../../../usr/lib/python3.12/site-packages/pydantic/_internal/_config.py:291
  /usr/lib/python3.12/site-packages/pydantic/_internal/_config.py:291: PydanticDeprecatedSince20: Support for class-based `config` is deprecated, use ConfigDict instead. Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.9/migration/
    warnings.warn(DEPRECATION_MESSAGE, DeprecationWarning)

../../../../../../../../usr/lib/python3.12/site-packages/litellm/utils.py:140
  /usr/lib/python3.12/site-packages/litellm/utils.py:140: DeprecationWarning: open_text is deprecated. Use files() instead. Refer to https://importlib-resources.readthedocs.io/en/latest/using.html#migrating-from-legacy for migration advice.
    with resources.open_text("litellm.llms.tokenizers", "anthropic_tokenizer.json") as f:

tests/test_function_calls.py::test_complete_output_no_exception
  /home/amao/.cache/paru/clone/python-instructor/src/instructor-1.6.3/instructor/function_calls.py:143: DeprecationWarning: The FUNCTIONS mode is deprecated and will be removed in future versions
    Mode.warn_mode_functions_deprecation()

tests/test_function_calls.py::test_incomplete_output_exception_raise[mock_completion0]
  tests/test_function_calls.py:138: PytestWarning: The test <Function test_incomplete_output_exception_raise[mock_completion0]> is marked with '@pytest.mark.asyncio' but it is not an async function. Please remove the asyncio mark. If the test is not marked explicitly, check for global marks applied via 'pytestmark'.
    @pytest.mark.asyncio  # type: ignore[misc]

tests/test_multimodal.py::test_raw_base64_autodetect_jpeg
  /home/amao/.cache/paru/clone/python-instructor/src/instructor-1.6.3/instructor/multimodal.py:113: DeprecationWarning: 'imghdr' is deprecated and slated for removal in Python 3.13
    import imghdr

tests/test_patch.py::test_apatch_completes_successfully
  /home/amao/.cache/paru/clone/python-instructor/src/instructor-1.6.3/tests/test_patch.py:14: DeprecationWarning: apatch is deprecated, use patch instead
    instructor.apatch(AsyncOpenAI())

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
=========================================================================== short test summary info ============================================================================
FAILED tests/test_multimodal.py::test_image_from_url_with_unusual_extension - ValueError: Unsupported image format: text/html
FAILED tests/test_multimodal.py::test_image_from_various_urls[https://example.com/image.webp] - ValueError: Unsupported image format: text/html
FAILED tests/test_multimodal.py::test_image_autodetect[/path/to/image.webp-file-image/webp] - ValueError: Unsupported image format: None
===================================================== 3 failed, 119 passed, 2 skipped, 69 deselected, 17 warnings in 3.90s =====================================================
github-actions bot added the bug label on Nov 12, 2024

8ar10der commented Nov 12, 2024

https://example.com/image.webp does not serve any WebP content, so it should be replaced with another link.

For example, we can use Google's WebP sample link: https://www.gstatic.com/webp/gallery3/1_webp_ll.webp

I can try to fix it and open a PR if I find the time. If any developer can do a quick fix, please go ahead and let me know :)
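
As an alternative to pointing the tests at a live WebP file, the header lookup could be stubbed so the tests never touch the network. The sketch below mirrors the fallback logic visible in the traceback (extension first, then `Content-Type`), but it is a stand-in rather than instructor's code: `resolve_media_type`, `fake_head`, and the `ALLOWED_TYPES` set are assumptions made for illustration.

```python
import mimetypes
from typing import Callable, Mapping
from urllib.parse import urlparse

# Assumed allow-list; the real VALID_MIME_TYPES in instructor may differ.
ALLOWED_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}

# A "head" callable takes a URL and returns the response headers.
HeadFn = Callable[[str], Mapping[str, str]]


def resolve_media_type(url: str, head: HeadFn) -> str:
    """Guess by extension first, then fall back to the Content-Type header."""
    media_type, _ = mimetypes.guess_type(urlparse(url).path)
    if not media_type:
        media_type = head(url).get("Content-Type")
    if media_type not in ALLOWED_TYPES:
        raise ValueError(f"Unsupported image format: {media_type}")
    return media_type


def fake_head(content_type: str) -> HeadFn:
    """Build a stub that answers every URL with a fixed Content-Type."""
    return lambda url: {"Content-Type": content_type}


# A stub that answers image/webp passes whether or not the local MIME
# table knows the .webp extension.
print(resolve_media_type("https://example.com/image.webp", fake_head("image/webp")))

# A URL with no extension always falls through to the header; a text/html
# answer reproduces the error message from the report.
try:
    resolve_media_type("https://example.com/image", fake_head("text/html"))
except ValueError as exc:
    print(exc)
```

Passing the lookup in as a parameter keeps the sketch free of third-party imports. In the real test suite the same effect would more likely come from patching `requests.head`, as `test_image_autodetect` already does.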
