I'm testing a pipeline that utilizes the Unstructured Converter component for processing PDFs. The pipeline works locally, but fails with Hayhooks.
The error is as follows:
2024-06-07 14:41:27
Converting files to Haystack Documents: 0it [00:00, ?it/s]Unstructured could not process file /data/file.pdf. Error: Traceback (most recent call last):
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/haystack_integrations/components/converters/unstructured/converter.py", line 198, in _partition_file_into_elements
2024-06-07 14:41:27 elements = partition_via_api(
2024-06-07 14:41:27 ^^^^^^^^^^^^^^^^^^
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/unstructured/partition/api.py", line 70, in partition_via_api
2024-06-07 14:41:27 sdk = UnstructuredClient(api_key_auth=api_key, server_url=base_url)
2024-06-07 14:41:27 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/unstructured_client/sdk.py", line 54, in __init__
2024-06-07 14:41:27 self.sdk_configuration = SDKConfiguration(
2024-06-07 14:41:27 ^^^^^^^^^^^^^^^^^
2024-06-07 14:41:27 File "<string>", line 13, in __init__
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/unstructured_client/sdkconfiguration.py", line 38, in __post_init__
2024-06-07 14:41:27 self._hooks = SDKHooks()
2024-06-07 14:41:27 ^^^^^^^^^^
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/unstructured_client/_hooks/sdkhooks.py", line 15, in __init__
2024-06-07 14:41:27 init_hooks(self)
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/unstructured_client/_hooks/registration.py", line 28, in init_hooks
2024-06-07 14:41:27 split_pdf_hook = SplitPdfHook()
2024-06-07 14:41:27 ^^^^^^^^^^^^^^
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/unstructured_client/_hooks/custom/split_pdf_hook.py", line 73, in __init__
2024-06-07 14:41:27 nest_asyncio.apply()
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/nest_asyncio.py", line 19, in apply
2024-06-07 14:41:27 _patch_loop(loop)
2024-06-07 14:41:27 File "/opt/venv/lib/python3.12/site-packages/nest_asyncio.py", line 193, in _patch_loop
2024-06-07 14:41:27 raise ValueError('Can\'t patch loop of type %s' % type(loop))
2024-06-07 14:41:27 ValueError: Can't patch loop of type <class 'uvloop.Loop'>
The line of code that causes this to fail is nest_asyncio.apply().
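To illustrate why that call fails, here is a minimal sketch using only the standard library. It assumes, based on the traceback above, that nest_asyncio refuses any loop that is not an instance of asyncio.BaseEventLoop; the names FakeUvloopLoop, check_patchable and try_patch are hypothetical stand-ins and not part of nest_asyncio, uvloop or Hayhooks.

```python
import asyncio


class FakeUvloopLoop(asyncio.AbstractEventLoop):
    """Hypothetical stand-in for uvloop.Loop: an event loop type that
    does not derive from asyncio.BaseEventLoop."""


def check_patchable(loop):
    # Assumption: this mirrors the kind of type check behind the
    # "Can't patch loop of type ..." error shown in the traceback.
    if not isinstance(loop, asyncio.BaseEventLoop):
        raise ValueError("Can't patch loop of type %s" % type(loop))
    return True


def try_patch(loop):
    # Return "ok" if the loop passes the check, else the error message.
    try:
        check_patchable(loop)
        return "ok"
    except ValueError as exc:
        return str(exc)


# A standard asyncio loop passes the check.
std_loop = asyncio.new_event_loop()
result_std = try_patch(std_loop)
std_loop.close()

# A loop of a foreign type is rejected, as in the traceback.
result_fake = try_patch(FakeUvloopLoop())

print(result_std)
print(result_fake)
```

If the server is started through uvicorn, its loop setting (for example loop="asyncio") selects the standard asyncio loop instead of uvloop. Whether that corresponds to the change described in this issue is an assumption.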
After some research, I fixed the issue for myself by updating the CLI code to the following. I'm not an expert here and would like to know whether this approach is fine.