Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] bulk ignoring more than one operation #593

Open
sireliah opened this issue Aug 22, 2023 · 5 comments
Open

[BUG] bulk ignoring more than one operation #593

sireliah opened this issue Aug 22, 2023 · 5 comments
Labels
🐛 bug Something isn't working

Comments

@sireliah
Copy link

What is the bug?

Firstly thanks for maintaining the library folks!

The bug I found is related to the bulk call that performs multiple operations, possibly on different indices.

How can one reproduce the bug?

Implement your onDocument callback to return more than one operation. Below example with upserting data to two different indices.

client.helpers.bulk({
  datasource: [{ id: '1234', someField: 1, maybeAnotherField: 0 }],
  onDocument (document) {
    return [
      {
        update: { _id: document.id, _index: 'index_a' }
      },
      {
        doc: { someField: document.someField },
        doc_as_upsert: true
      },
      {
        update: { _id: document.id, _index: 'index_b' }
      },
      {
        doc: { maybeAnotherField: document.maybeAnotherField },
        doc_as_upsert: true
      }
    ]
  }
})

What is the expected behavior?

Expected behaviour is that both index_a and index_b are updated, however the library is making bulk requests to only the first index.

The OpenSearch API allows you to specify different operations to different indices. I've tested this manually through curl to confirm and can attach example request if needed.

What is your host/environment?

OS: Ubuntu 22.04 x86_64
node: v18.13.0
opensearch-js: 2.3.1

Do you have any screenshots?

n/a

Do you have any additional context?

The bug is caused by assumption that you will perform just one operation, see this line.

Please take a look and let me know what do you think.

@dblock
Copy link
Member

dblock commented Aug 22, 2023

Looks legit. Want to try to write a unit test that reproduces this? Maybe a fix? :)

@sireliah
Copy link
Author

Sure, I can do this in spare time. For my personal patch fix I've just added iteration over all { action, payload } pairs and it does the job. I just need to cover edge cases, for example: what if user uses odd number of elements in the array.

@wbeckler wbeckler added the 🐛 bug Something isn't working label Oct 3, 2023
@AbhinavGarg90
Copy link

Is this issue something we are still looking to fix? I can work on creating the unit test for this.

@sireliah
Copy link
Author

Hi @AbhinavGarg90! Sure, unit tests would be super helpful! Here's the patch I mentioned before.

diff --git a/node_modules/@opensearch-project/opensearch/lib/Helpers.js b/node_modules/@opensearch-project/opensearch/lib/Helpers.js
index a648aea..09e689b 100644
--- a/node_modules/@opensearch-project/opensearch/lib/Helpers.js
+++ b/node_modules/@opensearch-project/opensearch/lib/Helpers.js
@@ -530,28 +530,36 @@ class Helpers {
         if (shouldAbort === true) break;
         timeoutRef.refresh();
         const result = onDocument(chunk);
-        const [action, payload] = Array.isArray(result) ? result : [result, chunk];
-        const operation = Object.keys(action)[0];
-        if (operation === 'index' || operation === 'create') {
-          actionBody = serializer.serialize(action);
-          payloadBody = typeof payload === 'string' ? payload : serializer.serialize(payload);
-          chunkBytes += Buffer.byteLength(actionBody) + Buffer.byteLength(payloadBody);
-          bulkBody.push(actionBody, payloadBody);
-        } else if (operation === 'update') {
-          actionBody = serializer.serialize(action);
-          payloadBody =
-            typeof chunk === 'string'
-              ? `{"doc":${chunk}}`
-              : serializer.serialize({ doc: chunk, ...payload });
-          chunkBytes += Buffer.byteLength(actionBody) + Buffer.byteLength(payloadBody);
-          bulkBody.push(actionBody, payloadBody);
-        } else if (operation === 'delete') {
-          actionBody = serializer.serialize(action);
-          chunkBytes += Buffer.byteLength(actionBody);
-          bulkBody.push(actionBody);
-        } else {
-          clearTimeout(timeoutRef);
-          throw new ConfigurationError(`Bulk helper invalid action: '${operation}'`);
+
+        // This is temporary patch to support multiple operations, instead of just one.
+        // https://github.com/opensearch-project/opensearch-js/issues/593
+        for (let i = 0; i < result.length - 1; i = i + 2) {
+          const action = result[i]
+          const payload = result[i + 1]
+
+          const operation = Object.keys(action)[0];
+          if (operation === 'index' || operation === 'create') {
+            actionBody = serializer.serialize(action);
+            payloadBody = typeof payload === 'string' ? payload : serializer.serialize(payload);
+            chunkBytes += Buffer.byteLength(actionBody) + Buffer.byteLength(payloadBody);
+            bulkBody.push(actionBody, payloadBody);
+          } else if (operation === 'update') {
+            actionBody = serializer.serialize(action);
+            payloadBody =
+              typeof chunk === 'string'
+                ? `{"doc":${chunk}}`
+                : serializer.serialize({ doc: chunk, ...payload });
+            chunkBytes += Buffer.byteLength(actionBody) + Buffer.byteLength(payloadBody);
+            bulkBody.push(actionBody, payloadBody);
+          } else if (operation === 'delete') {
+            actionBody = serializer.serialize(action);
+            chunkBytes += Buffer.byteLength(actionBody);
+            bulkBody.push(actionBody);
+          } else {
+            clearTimeout(timeoutRef);
+            throw new ConfigurationError(`Bulk helper invalid action: '${operation}'`);
+          }
+
         }
 
         if (chunkBytes >= flushBytes) {

@AbhinavGarg90
Copy link

Thanks, I'll look into this and make the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🐛 bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants