Support cross-region inference for Amazon Bedrock #1114

Open
dlqqq opened this issue Nov 20, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@dlqqq
Member

dlqqq commented Nov 20, 2024

Description

Cross-region inference (CRI) allows requests to be automatically routed across a set of regions, which helps mitigate limits imposed by service quotas or peak usage times.

CRI is also required to use some models on Amazon Bedrock, notably Llama 3.2. A previous attempt at implementing Llama 3.2 support in Amazon Bedrock was stalled due to lack of existing support for CRI: #1014

Proposed solution

Jupyter AI needs to provide some user interface for supporting CRI. Tentatively, our proposal is to:

  • Implement a new dropdown field type that lets the user select one option from several.
  • Use this dropdown field in the Amazon Bedrock provider to let users specify a region area. Region areas include: us, us-gov, eu, apac.
  • Prepend the region area to the model ID to produce an inference profile ID in the format <region-area>.<model-id>. Passing this ID to Bedrock APIs enables CRI and makes Llama 3.2 models usable on Amazon Bedrock.
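
The last step can be sketched roughly as below. This is only an illustration of the <region-area>.<model-id> format, not the Jupyter AI implementation: the helper name `build_inference_profile_id` and the `REGION_AREAS` constant are hypothetical, the list of region areas is just the one given above, and the Llama 3.2 model ID is used purely as an example value.

```python
# Region areas as listed in this issue. Assumption: this list is not
# guaranteed to be exhaustive or current.
REGION_AREAS = ("us", "us-gov", "eu", "apac")


def build_inference_profile_id(region_area: str, model_id: str) -> str:
    """Return an ID of the form '<region-area>.<model-id>'.

    Hypothetical helper, named here only for illustration.
    """
    if region_area not in REGION_AREAS:
        raise ValueError(f"unknown region area: {region_area!r}")
    if not model_id:
        raise ValueError("model_id must be a non-empty string")
    return f"{region_area}.{model_id}"


if __name__ == "__main__":
    # Example model ID, used only to show the resulting format.
    print(build_inference_profile_id("us", "meta.llama3-2-3b-instruct-v1:0"))
```

The value selected in the dropdown would be passed as `region_area`, and the resulting string would be used in place of the plain model ID.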
@dlqqq
Member Author

dlqqq commented Nov 25, 2024

After some discussion with @ellisonbg, it seems to make more sense to always default to using CRI in the "us" region area if it is available. This removes the need for additional user input in specifying the region area, and removes the need to handle edge cases of a model supporting CRI in some region areas but not others.

This change will allow models available through CRI to be used from any region. I'll update #1113 accordingly.

@dlqqq
Member Author

dlqqq commented Nov 25, 2024

We received some valuable feedback from other stakeholders. We concluded that we can't default to the "us" region area, as doing so may violate data residency requirements under GDPR in the EU. Furthermore, a single global dropdown for the region area is a poor user experience, as not all models support CRI, and models that support CRI are not necessarily available in all CRI region areas.
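
To make the per-model problem concrete, here is a rough sketch of what a model-aware lookup might look like. Everything in it is an assumption for illustration: the `CRI_SUPPORT` table, the placeholder model IDs, and both function names are hypothetical, and the availability data is made up rather than taken from Bedrock.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical availability table: model ID -> region areas with CRI.
# The model IDs and their region areas are placeholders, not real data.
CRI_SUPPORT: Dict[str, FrozenSet[str]] = {
    "example.model-a": frozenset({"us", "eu"}),
    "example.model-b": frozenset({"us"}),
    "example.model-c": frozenset(),  # no CRI support at all
}


def region_area_options(model_id: str) -> List[str]:
    """List the region areas a dropdown could offer for one model."""
    return sorted(CRI_SUPPORT.get(model_id, frozenset()))


def resolve_model_id(model_id: str, region_area: Optional[str] = None) -> str:
    """Return the ID to send: the plain model ID, or an inference profile ID.

    No region area is chosen by default, so requests are never routed
    to another region area without an explicit user choice.
    """
    if region_area is None:
        return model_id
    if region_area not in CRI_SUPPORT.get(model_id, frozenset()):
        raise ValueError(
            f"{model_id!r} does not support CRI in region area {region_area!r}"
        )
    return f"{region_area}.{model_id}"
```

With a table like this, the dropdown options would differ per model, and a model with no CRI support would offer none.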

Given that this effort will take longer than we had originally estimated, and that v3 development shouldn't be delayed any longer, I will move this issue to the v3 milestone for future work.

As a short-term fix, we will recommend that users select the "Bedrock (custom/provisioned)" provider and enter the inference profile ID manually to use CRI. I will open a new issue for this.
