Bedrock Embeddings ThrottlingException #2905
Comments
🤖 Hello,

I understand you're encountering a ThrottlingException when trying to generate a vector store. A similar issue was encountered and resolved in the past with OpenAIEmbeddings. The solution was to add concurrency settings that limit the number of promises that can be pending at the same time. This was implemented in version 0.0.130 of langchainjs. You can find more details about this solution in issue #2140 and the update #2294.

To apply a similar solution to your case, first install the 'bluebird' library:

    npm install bluebird

Then, modify the 'fromDocuments' method in the 'HNSWLib' class:

    import * as Bluebird from 'bluebird';
    // ...
    static async fromDocuments(docs: Document[], embeddings: Embeddings, concurrency: number = 1000): Promise<HNSWLib> {
      const vectorStore = new HNSWLib();
      // Embed the documents in parallel, with at most `concurrency` calls pending at once.
      await Bluebird.map(docs, async (doc) => {
        const vector = await embeddings.embedDocument(doc);
        vectorStore.add(doc.id, vector);
      }, {concurrency});
      return vectorStore;
    }

In this code, 'Bluebird.map' processes the documents in parallel but caps the number of concurrent operations. The 'concurrency' parameter controls how many promises can be pending at the same time. You can adjust this value based on your rate limit.

Finally, use the modified 'fromDocuments' method:

    const vectorStore = await HNSWLib.fromDocuments(docs, new OpenAIEmbeddings(), 500);

In this example, the concurrency level is set to 500, so at most 500 documents are processed at the same time.

As for the current rate limit for the 'BedrockEmbeddings.embedDocuments' method, I wasn't able to find this information in the langchainjs repository. You might want to check the Bedrock Embeddings documentation or contact their support for this information.

I hope this helps! Let me know if you have any other questions.
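For readers who would rather not add a dependency, the same idea of capping the number of pending promises can be sketched with a small worker pool. This is only an illustration: `mapWithConcurrency` is a hypothetical helper name, not part of langchainjs or Bluebird, and it does not retry throttled calls.

```typescript
// Hypothetical helper (an assumption for illustration, not a langchainjs API).
// Maps `fn` over `items` while keeping at most `limit` calls pending at once.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array<R>(items.length);
  let next = 0; // index of the next item that no worker has claimed yet

  // Each worker repeatedly claims the next index until none remain.
  // Claiming is safe because JavaScript runs this code on a single thread.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i], i);
    }
  }

  // Start at most `limit` workers; results keep the input order.
  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

A call such as `await mapWithConcurrency(texts, 5, (t) => embedOne(t))`, where `embedOne` stands in for whatever single-text embedding call is available, would keep at most five requests in flight. If any call rejects, the returned promise rejects with that error.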
This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.
I'll look into fixing this to better use concurrency, thank you for opening this!
When attempting to generate a vector store using the `fromDocuments` method on approximately 10,000 documents (specifically on the `HNSWLib` class), it reaches the rate limit quickly. This is most likely because the `BedrockEmbeddings.embedDocuments` method is just a simple `Promise.all` on every document at once.
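To make the suspected cause concrete, here is a small sketch that contrasts a single `Promise.all` over every document with sequential fixed-size batches. It is not the actual `BedrockEmbeddings` implementation: `fakeEmbed`, `embedAllAtOnce`, `embedInBatches`, and `measurePeak` are hypothetical names, and the fake call only counts how many requests overlap.

```typescript
// Stand-in for a remote embedding call (an assumption for illustration).
// It records how many calls are pending at the same time.
let inFlight = 0;
let peakInFlight = 0;

async function fakeEmbed(text: string): Promise<number[]> {
  inFlight++;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 1));
  inFlight--;
  return [text.length];
}

// Pattern suspected in the issue: every request is started immediately.
async function embedAllAtOnce(texts: string[]): Promise<number[][]> {
  return Promise.all(texts.map((t) => fakeEmbed(t)));
}

// Alternative: finish one fixed-size batch before starting the next.
async function embedInBatches(
  texts: string[],
  batchSize: number
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < texts.length; i += batchSize) {
    const batch = texts.slice(i, i + batchSize);
    vectors.push(...(await Promise.all(batch.map((t) => fakeEmbed(t)))));
  }
  return vectors;
}

// Runs `task` and reports the highest number of overlapping fake calls.
async function measurePeak(task: () => Promise<unknown>): Promise<number> {
  inFlight = 0;
  peakInFlight = 0;
  await task();
  return peakInFlight;
}
```

With 10,000 input texts, `measurePeak(() => embedAllAtOnce(texts))` would report all 10,000 calls pending together, while `embedInBatches(texts, 5)` never has more than five pending. Batching is simpler than a worker pool but lets each batch wait on its slowest call.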