openai.APIConnectionError when using batch mode not caught #279

Open
RyanMarten opened this issue Dec 20, 2024 · 5 comments

@RyanMarten
Contributor

-12-19_07-40-49_467973_1/runtime_resources/conda/01b738786d663797e9dc09585bc0cbdc81b2bf5f/lib/python3.10/site-packages/openai/_base_client.py", line 1666, in _retry_request
    return await self._request(
  File "/tmp/ray/session_2024-12-19_07-40-49_467973_1/runtime_resources/conda/01b738786d663797e9dc09585bc0cbdc81b2bf5f/lib/python3.10/site-packages/openai/_base_client.py", line 1606, in _request
    raise APIConnectionError(request=request) from err
openai.APIConnectionError: Connection error.

From the docs
https://help.openai.com/en/articles/6897191-apiconnectionerror

An APIConnectionError indicates that your request could not reach our servers or establish a secure connection. This could be due to a network issue, a proxy configuration, an SSL certificate, or a firewall rule.

If you encounter an APIConnectionError, please try the following steps:

  1. Check your network settings and make sure you have a stable and fast internet connection. You may need to switch to a different network, use a wired connection, or reduce the number of devices or applications using your bandwidth.
  2. Check your proxy configuration and make sure it is compatible with our services. You may need to update your proxy settings, use a different proxy, or bypass the proxy altogether.
  3. Check your SSL certificates and make sure they are valid and up-to-date. You may need to install or renew your certificates, use a different certificate authority, or disable SSL verification.
  4. Check your firewall rules and make sure they are not blocking or filtering our services. You may need to modify your firewall settings.

If the issue persists, contact our support team and provide them with the following information:

  • The model you were using
  • The error message and code you received
  • The request data and headers you sent
  • The timestamp and timezone of your request
  • Any other relevant details that may help us diagnose the issue

Is this something we can catch and then handle more robustly?

@RyanMarten
Contributor Author

How can we replicate this? Maybe by turning off the internet in the middle of batch submission?

@RyanMarten
Contributor Author

Yes, you can replicate this error by disabling the internet on your computer and then running the following.

python tests/batch/simple_batch.py --log-level DEBUG --n-requests 1 --batch-size 1 --batch-check-interval 10 --model gpt-4o-mini

@RyanMarten
Contributor Author

I think the solution here is a wrapper function for openai client actions that does the following:

  • catch the error
  • log an ERROR message with potential reasons the API Connection error is happening (per the docs above)
  • retry self.config.retry times
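
The three steps above could be sketched roughly as follows. This is only an illustration, not the project's implementation: `call_with_retry`, `CONNECTION_HINT`, `retry_on`, `max_retries`, and `delay_seconds` are hypothetical names, and the exception types are passed in so the sketch does not depend on a particular SDK being installed.

```python
import logging
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)

# Hypothetical hint text, summarising the causes listed in the docs quoted above.
CONNECTION_HINT = (
    "Could not reach the API. Possible causes: network issue, proxy "
    "configuration, SSL certificate, or firewall rule."
)


def call_with_retry(
    action: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 3,
    delay_seconds: float = 1.0,
) -> T:
    """Run action(); on an exception in retry_on, log an ERROR and retry.

    The call is attempted once plus up to max_retries more times. The last
    failure is re-raised so the caller still sees the original exception.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):
        try:
            return action()
        except retry_on as err:
            logger.error(
                "%s (attempt %d of %d): %r",
                CONNECTION_HINT, attempt + 1, max_retries + 1, err,
            )
            if attempt == max_retries:
                raise
            time.sleep(delay_seconds)
    raise AssertionError("unreachable")
```

In the codebase this would presumably be called with `retry_on=(openai.APIConnectionError,)` and `max_retries=self.config.retry`, wrapping each client call in a zero-argument callable.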

@RyanMarten
Contributor Author

We should also do the same thing for anthropic client actions.

python tests/batch/simple_batch.py --log-level DEBUG --n-requests 3 --batch-size 1 --batch-check-interval 10 --model claude-3-5-haiku-20241022

With the internet disabled, this gives:

  File "/Users/ryan/curator/.venv/lib/python3.12/site-packages/anthropic/_base_client.py", line 1609, in _request
    raise APIConnectionError(request=request) from err
anthropic.APIConnectionError: Connection error.

We can just have a single wrapper that catches both of these exception types.
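
A minimal sketch of what catching both types in one place might look like. The helper names `connection_error_types` and `run_logged` are hypothetical; the only SDK attributes relied on are `openai.APIConnectionError` and `anthropic.APIConnectionError`, which both appear in the tracebacks above. The imports are guarded so the sketch still runs when only one (or neither) SDK is installed.

```python
import importlib
import logging
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
logger = logging.getLogger(__name__)


def connection_error_types() -> Tuple[Type[BaseException], ...]:
    """Collect APIConnectionError from whichever of the two SDKs is installed."""
    found = []
    for module_name in ("openai", "anthropic"):
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # SDK not installed; skip it
        err_type = getattr(module, "APIConnectionError", None)
        if err_type is not None:
            found.append(err_type)
    return tuple(found)


def run_logged(action: Callable[[], T], catch: Tuple[Type[BaseException], ...]) -> T:
    """Run action(); log any exception listed in `catch` at ERROR level, then re-raise."""
    try:
        return action()
    except catch as err:
        # One except clause handles every type in the tuple.
        logger.error("Connection error from %s: %s", type(err).__module__, err)
        raise
```

Passing `connection_error_types()` as `catch` would cover both clients with the same code path; retry logic could be layered on top in the same wrapper.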

@RyanMarten
Contributor Author

Also we want to catch and log these errors because they are difficult to see in the ray setup (requires going into the task table).

E.g. this is the dump in the logs

 File "/tmp/ray/session_2024-12-26_11-34-26_122960_1/runtime_resources/working_dir_files/_ray_pkg_b96f23d2264d172b/engine/operators/function_operator.py", line 391, in <dictcomp>
 processed_mapped_inputs = {k: ray.get(v) for k, v in mapped_inputs.items()}
ray.exceptions.RaySystemError: System error: Failed to unpickle serialized exception
traceback: Traceback (most recent call last):
 File "/tmp/ray/session_2024-12-26_11-34-26_122960_1/runtime_resources/conda/924d633951bc4260001fa77b4f1add9b7e9f4d68/lib/python3.10/site-packages/ray/exceptions.py", line 50, in from_ray_exception
 return pickle.loads(ray_exception.serialized_exception)
TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'

The above exception was the direct cause of the following exception:

ray::convert_instruction_response_to_sharegpt() (pid=758455, ip=10.120.5.14)
 File "/tmp/ray/session_2024-12-26_11-34-26_122960_1/runtime_resources/conda/924d633951bc4260001fa77b4f1add9b7e9f4d68/lib/python3.10/site-packages/ray/exceptions.py", line 44, in from_bytes
 return RayError.from_ray_exception(ray_exception)
 File "/tmp/ray/session_2024-12-26_11-34-26_122960_1/runtime_resources/conda/924d633951bc4260001fa77b4f1add9b7e9f4d68/lib/python3.10/site-packages/ray/exceptions.py", line 53, in from_ray_exception
 raise RuntimeError(msg) from e
RuntimeError: Failed to unpickle serialized exception

This is what the failure is reported as

Job entrypoint command failed with exit code 1, last available logs (truncated to 20,000 chars):
TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'

The above exception was the direct cause of the following exception:

ray::convert_instruction_response_to_sharegpt() (pid=758455, ip=10.120.5.14)
  File "/tmp/ray/session_2024-12-26_11-34-26_122960_1/runtime_resources/conda/924d633951bc4260001fa77b4f1add9b7e9f4d68/lib/python3.10/site-packages/ray/exceptions.py", line 44, in from_bytes
    return RayError.from_ray_exception(ray_exception)
  File "/tmp/ray/session_2024-12-26_11-34-26_122960_1/runtime_resources/conda/924d633951bc4260001fa77b4f1add9b7e9f4d68/lib/python3.10/site-packages/ray/exceptions.py", line 53, in from_ray_exception
    raise RuntimeError(msg) from e
RuntimeError: Failed to unpickle serialized exception

And this is what is in the task table

anthropic.BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits.'}}
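
One plausible reading of the `Failed to unpickle serialized exception` dump above is that an exception whose `__init__` requires keyword-only arguments cannot be rebuilt by `pickle`, since by default only the positional `args` are passed back to the constructor. The sketch below reproduces that behaviour with a stand-in class (`StatusError` is hypothetical, not the SDK's class) and shows one possible workaround: converting the error to a plain `RuntimeError` that carries the message before it crosses the process boundary. `to_picklable` and `survives_pickle` are illustrative names.

```python
import pickle


class StatusError(Exception):
    """Stand-in for an SDK error with required keyword-only arguments."""

    def __init__(self, message, *, response, body):
        super().__init__(message)
        self.response = response
        self.body = body


def to_picklable(err: BaseException) -> RuntimeError:
    """Wrap any exception in a RuntimeError that keeps its type name and message."""
    return RuntimeError(f"{type(err).__name__}: {err}")


def survives_pickle(err: BaseException) -> bool:
    """Return True if the exception can be pickled and unpickled again."""
    try:
        pickle.loads(pickle.dumps(err))
        return True
    except Exception:
        return False


original = StatusError("Error code: 400", response=None, body=None)
print(survives_pickle(original))                # unpickling calls StatusError("Error code: 400") and fails
print(survives_pickle(to_picklable(original)))  # the plain RuntimeError round-trips
```

If this reading is right, catching the SDK error inside the task and re-raising (or logging) a plain exception would make the real message, such as the credit balance error from the task table, visible in the job logs.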
