Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using the types TransferDescriptor in typescript #348

Closed
CMCDragonkai opened this issue Mar 26, 2021 · 21 comments · Fixed by #352
Closed

Using the types TransferDescriptor in typescript #348

CMCDragonkai opened this issue Mar 26, 2021 · 21 comments · Fixed by #352
Labels

Comments

@CMCDragonkai
Copy link

CMCDragonkai commented Mar 26, 2021

I have a worker function like:

  encryptWithPublicKeyAsn1(
    publicKeyAsn1: PublicKeyAsn1,
    plainText: ArrayBuffer
  ): TransferDescriptor<ArrayBuffer> {
    const plainText_ = Buffer.from(plainText);
    const publicKey = keysUtils.publicKeyFromAsn1(publicKeyAsn1);
    const cipherText = keysUtils.encryptWithPublicKey(publicKey, plainText_);
    return Transfer(cipherText.buffer);
  },

However when I call this function from the main side:

      cipherText = await this.workerManager.call(
        async w => {
          const publicKeyAsn1 = keysUtils.publicKeyToAsn1(publicKey);
          return Buffer.from(
            await w.encryptWithPublicKeyAsn1(
              publicKeyAsn1,
              Transfer(plainText.buffer)
            )
          );
        }
      );

There's a type error:

No overload matches this call.
  The last overload gave the following error.
    Argument of type 'TransferDescriptor<ArrayBuffer>' is not assignable to parameter of type 'WithImplicitCoercion<string> | { [Symbol.toPrimitive](hint: "string"): string; }'.
      Property '[Symbol.toPrimitive]' is missing in type 'TransferDescriptor<ArrayBuffer>' but required in type '{ [Symbol.toPrimitive](hint: "string"): string; }'.

The problem appears that the return type is a TransferDescriptor<ArrayBuffer>.

Then when I call it, it must also be ArrayBuffer.

I have to typecast like this:

      cipherText = await this.workerManager.call(
        async w => {
          const publicKeyAsn1 = keysUtils.publicKeyToAsn1(publicKey);
          return Buffer.from(
            await w.encryptWithPublicKeyAsn1(
              publicKeyAsn1,
              Transfer(plainText.buffer) as unknown as ArrayBuffer
            ) as unknown as ArrayBuffer
          );
        }
      );

Is this the right way to do this?

@andywer andywer added the bug label Mar 27, 2021
@andywer
Copy link
Owner

andywer commented Mar 27, 2021

Hey @CMCDragonkai, thanks for reporting! I do think it's an issue with the types.

Workers can return transferables and the main thread will see no difference. The code normalizing the worker's result can be found here.

Now the worker function type defs in src/types/master.ts need to reflect that by applying something like this: type StripTransferDescriptor<T> = T extends TransferDescriptor<infer BaseT> ? BaseT : T

Give me a little bit of time to prepare a PR :)

@CMCDragonkai
Copy link
Author

Hey @andywer I got around to trying this again and it still doesn't work.

This is the function I have inside my worker:

  encryptWithKey(
    key: ArrayBuffer,
    plainText: ArrayBuffer
  ): TransferDescriptor<ArrayBuffer> {
    const cipherText = utils.encryptWithKey(
      Buffer.from(key),
      Buffer.from(plainText)
    );
    return Transfer(cipherText.buffer);
  },

Am I supposed to have TransferDescriptor on my parameter types as well?

I'm not able to use Transfer on my node buffers nor my node buffer array buffers.

@CMCDragonkai
Copy link
Author

Do you have an example that uses Node buffers, and not ArrayBuffer?

@CMCDragonkai
Copy link
Author

This is how I call it:

  cipherText = await workerManager.call(
    async w => {
      const buf = await w.encryptWithKey(
        Transfer(key.buffer),
        Transfer(plainText.buffer)
      );
      return Buffer.from(buf);
    }
  );

There is still a type error on Transfer(plainText.buffer). It says:

Argument of type 'TransferDescriptor<any>' is not assignable to parameter of type 'ArrayBuffer'.
  Type 'TransferDescriptor<any>' is missing the following properties from type 'ArrayBuffer': byteLength, slice, [Symbol.toStringTag]

It seems that either there's some corruption sending a Node buffer's array buffer over and there's also a type error, in that encryptWithkey has 2 ArrayBuffer parameters, but it only makes the first one a TransferDescriptor in typescript inference.

@CMCDragonkai
Copy link
Author

Ok I think I understood what happened. Node buffers use a large ArrayBuffer. To properly use them I have to also transfer the offset and length information.

However there is in fact another type error. This happens when there are multiple ArrayBuffer parameters in the worker function.

Typescript complains that using Transfer on the second time doesn't match.

For example:

oneForOne(args_0: ArrayBuffer | TransferDescriptor<ArrayBuffer>, args_1: ArrayBuffer): ObservablePromise<ArrayBuffer>

That's the type inferred in vscode. See how the second parameter is just ArrayBuffer instead of TransferDescriptor?

Now it is possible for me to specify the array buffer variable without using Transfer on the call site. But does this mean it's not zero-copy? I am wary of not using Transfer since the docs say it is needed to ensure that it is in fact zero copied.

For now I have to use //@ts-ignore.

@CMCDragonkai
Copy link
Author

Ok I have got it working now. Is this right way to do all of this:

  encryptWithKey(
    key: ArrayBuffer,
    keyOffset: number,
    keyLength: number,
    plainText: ArrayBuffer,
    plainTextOffset: number,
    plainTextLength: number
  ): TransferDescriptor<[ArrayBuffer, number, number]> {
    const key_ = Buffer.from(key, keyOffset, keyLength);
    const plainText_ = Buffer.from(plainText, plainTextOffset, plainTextLength);
    const cipherText = utils.encryptWithKey(
      key_,
      plainText_
    );
    return Transfer(
      [
        cipherText.buffer,
        cipherText.byteOffset,
        cipherText.byteLength
      ],
      [
        cipherText.buffer
      ]
    );
  },

And

  cipherText = await workerManager.call(
    async w => {
      const [arrayBuf, arrayBufOffset, arrayBufLength]= await w.encryptWithKey(
        Transfer(key.buffer),
        key.byteOffset,
        key.byteLength,
        // @ts-ignore
        Transfer(plainText.buffer),
        plainText.byteOffset,
        plainText.byteLength
      );
      return Buffer.from(arrayBuf, arrayBufOffset, arrayBufLength);
    }
  );

I'm concerned about the part where I have to use @ts-ignore.

Furthermore the comments say that, if I use Transfer the thread cannot longer use it. But I find that I am able to use key and plainText. Is this a concern? Since I will need to share that key with other threads later.

@andywer
Copy link
Owner

andywer commented Jul 16, 2021

Hey @CMCDragonkai!

Good to see you figured most of it out already. Yes, so the way to invoke functions that use transferable objects is very closely related to the way you would use transferable objects without threads.js – mainly for performance reasons.

There are still a few misconceptions, I think, and the code can also be improved. How about this (haven't tried to run it, but it should convey the idea):

encryptWithKey(
    key: {
        data: ArrayBuffer,
        offset: number,
        length: number,
    },
    plainText: {
        data: ArrayBuffer,
        offset: number,
        length: number,
    }
  ): TransferDescriptor<[ArrayBuffer, number, number]>

and

  cipherText = await workerManager.call(
    async w => {
      const [arrayBuf, arrayBufOffset, arrayBufLength]= await w.encryptWithKey(
        Transfer({ data: key.buffer, offset: key.byteOffset, length: key.byteLength }, key.buffer),
        Transfer({ data: plainText.buffer, offset: plainText.byteOffset, length: plainText.byteLength }, plainText.buffer),
      );
      return Buffer.from(arrayBuf, arrayBufOffset, arrayBufLength);
    }
  );

You could then simplify it further by not crafting a new kind of object that resembles the node.js Buffer, but actually passing the Buffer. To make that work you will probably need to write a serializer/deserializer for Buffers, though, I guess:

import { registerSerializer, SerializerImplementation } from "threads"

interface SerializedBuffer {
  __type: "$$Buffer"
  buffer: ArrayBuffer
  byteOffset: number
  byteLength: number
}

const BufferSerializer: SerializerImplementation = {
  deserialize(thing, defaultHandler) {
    if (thing && thing.__type === "$$Buffer") {
      return Buffer.from((thing as any).buffer, (thing as any).byteOffset, (thing as any).byteLength)
    } else {
      return defaultHandler(thing)
    }
  },
  serialize(thing, defaultHandler) {
    if (thing instanceof Buffer) {
      return {
        __type: "$$Buffer",
        buffer: thing.buffer,
        byteOffset: thing.byteOffset,
        byteLength: thing.byteLength
      }
    } else {
      return defaultHandler(thing)
    }
  }
}

registerSerializer(BufferSerializer)

See https://threads.js.org/usage-advanced#custom-message-serializers for details. I have to admit, the documentation is not so easy to understand for these advanced features…

@CMCDragonkai
Copy link
Author

@andywer thanks for the advice. However I'm curious as to how to deal with the detachment of the array buffer. If the array buffer from the main thread is detached, how can I re-use the key and plaintext buffers for a subsequent operation on threadsjs?

Furthermore the Node Buffer's ArrayBuffer did not get detached, so I found that it was still a copy.

@andywer
Copy link
Owner

andywer commented Jul 17, 2021

If the array buffer from the main thread is detached, how can I re-use the key and plaintext buffers for a subsequent operation on threadsjs?

That's why it's called transferable objects: You transfer the data from one thread to another instead of copying it. Of course you cannot use it in the source thread anymore, unless the second thread transfers it back after it's done.

Your use case sounds as if you really aim to be able to use the keys in both threads independently. If that's the case then you might actually want to not transfer the data, but have it copied.

Furthermore the Node Buffer's ArrayBuffer did not get detached, so I found that it was still a copy.

Using my code sample?

@CMCDragonkai
Copy link
Author

CMCDragonkai commented Jul 18, 2021 via email

@andywer
Copy link
Owner

andywer commented Jul 18, 2021

So that's why I think it's a good idea for it to do zero copy for each call using the same key.

I get that, but it really comes down to copying vs. moving between threads. If you need it in both threads simultaneously you should copy it.

What would make sense, though, is to copy it once only. So you might want to assign each key an ID or calculate a hash, copy the keys to the other threads once only and then pass their ID/hash on each call instead of passing the whole key.

Can't help but think that this would be much easier to do if #273 was merged, so other threads can call the main thread (to request a key by it ID/hash if the key has not been cached in the thread yet).

@andywer
Copy link
Owner

andywer commented Jul 18, 2021

The other question is: What's the size of those keys? If it's insignificant compared to the size of the data that you want to encrypt, it might not even be worth optimizing… 😉

PS: #273 could also make it possible to pass streams to workers. Might be really valuable if you need to encrypt large amounts of data.

@CMCDragonkai
Copy link
Author

CMCDragonkai commented Jul 18, 2021 via email

@andywer
Copy link
Owner

andywer commented Jul 18, 2021

I think node.js Buffers should work in principle. However, have you considered just passing the array buffers? They are transferable as a whole and the node.js Buffers are basically just a fancy wrapper around them.

Might be easier to pass the array buffers themselves and if you really need some of the Buffer functionality, you could Buffer.from(arrayBuffer) on the other side.

The Transfer() really transfers ownership. It's gone from the source thread afterwards. There is, however, a shared memory array buffer: See SharedArrayBuffer. Should work with node.

@CMCDragonkai
Copy link
Author

CMCDragonkai commented Jul 18, 2021 via email

@CMCDragonkai
Copy link
Author

If I used SharedArrayBuffer, would Transfer still be required?

@andywer
Copy link
Owner

andywer commented Sep 7, 2021

I didn't try yet myself, but according to this answer you don't need Transfer(), according to the spec you apparently must not, in fact (that's 5 years old information, though).

@CMCDragonkai
Copy link
Author

CMCDragonkai commented Sep 17, 2021

Hi @andywer I think there's still a problem with Transfer types. I remember you added StripTransfer, however I believe this only applies to the first parameter. The second parameter still requires a @ts-ignore.

w.transferBuffers(
  Transfer(x),
  // @ts-ignore: threadsjs type is wrong
  Transfer(y)
);

The second Transfer isn't allowed by typescript inference.

Also I have a question about this situation...

In the actual method signature for the worker modules, we don't use TransferDescriptor on the argument types, why do we have to put it on the return type. Either they should be on both input and output, or neither. I'd argue for neither... but it doesn't work atm.

@andywer
Copy link
Owner

andywer commented Sep 17, 2021

I am not a 100% sure about the specifics anymore, but I will give it a shot…

In the actual method signature for the worker modules, we don't use TransferDescriptor on the argument types, why do we have to put it on the return type.

I don't think you have to type the worker function return type as TransferDescriptor at all. It just ends up there if you return some Transfer(x). That's why the StripTransfer<> util type is then needed in the main thread. The worker function does return a Transfer(x), telling threads.js that x shall be transferred, not copied, but the main thread eventually receives just the plain x, either way.

We would have to do the same thing with the worker function parameters, but there is one good reason why we don't need to: The worker function signature used in the main thread is derived from the worker's types, but we never use any main thread types in the worker, so we only need to do StripTransfer<> one way, not in the other direction.

The second Transfer isn't allowed by typescript inference.

What's the error?

@CMCDragonkai
Copy link
Author

The error is:

image

Argument of type 'TransferDescriptor<any>' is not assignable to parameter of type 'ArrayBuffer'.
  Type 'TransferDescriptor<any>' is missing the following properties from type 'ArrayBuffer': byteLength, slice, [Symbol.toStringTag]ts(2345)

The inferred type from w.encrypt is:

(method) encrypt(args_0: ArrayBuffer | TransferDescriptor<ArrayBuffer>, args_1: ArrayBuffer): ObservablePromise<ArrayBuffer>

Do you see how the second parameter doesn't get the TransferDescriptor option?

This is my "worker module"

const dbWorker = {
  async encrypt(
    key: ArrayBuffer,
    plainText: ArrayBuffer,
  ): Promise<TransferDescriptor<ArrayBuffer>> {
    const cipherText = await utils.encrypt(key, plainText);
    return Transfer(cipherText);
  },
  async decrypt(
    key: ArrayBuffer,
    cipherText: ArrayBuffer,
  ): Promise<TransferDescriptor<ArrayBuffer> | undefined> {
    const plainText = await utils.decrypt(key, cipherText);
    if (plainText != null) {
      return Transfer(plainText);
    } else {
      return;
    }
  },
};

@CMCDragonkai
Copy link
Author

This is still a problem btw, the StripTransfer only solves the problem for the first parameter. Not the second or subsequent parameters. The StripTransfer should be applied to all potential parameters to the worker function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants