refactor(experimental): add getUnionCodec to @solana/codecs-data-structures (#2398)

This PR adds a new `getUnionCodec` helper that can be used to encode/decode any TypeScript union.

It accepts the following arguments:

-   An array of codecs, each defining a variant of the union.
-   A `getIndexFromValue` function which, given a value of the union, returns the index of the codec that should be used to encode that value.
-   A `getIndexFromBytes` function which, given the byte array to decode at a given offset, returns the index of the codec that should be used to decode the next bytes.

```ts
const codec: Codec<number | boolean> = getUnionCodec(
    [getU16Codec(), getBooleanCodec()],
    value => (typeof value === 'number' ? 0 : 1),
    (bytes, offset) => (bytes.slice(offset).length > 1 ? 0 : 1),
);

codec.encode(42); // 0x2a00
codec.encode(true); // 0x01
```
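
The dispatch described above can be sketched without the library. This is a minimal illustration, not the actual implementation: `MiniCodec`, `miniU16`, `miniBoolean` and `miniUnion` are hypothetical names, and the real `Codec` type in `@solana/codecs-core` has more members than the two shown here.

```ts
// Hypothetical, simplified codec shape used only for this sketch.
type MiniCodec<T> = {
    encode: (value: T) => Uint8Array;
    read: (bytes: Uint8Array, offset: number) => [T, number];
};

// Little-endian u16, mirroring the `0x2a00` output shown for `42`.
const miniU16: MiniCodec<number> = {
    encode: value => {
        const bytes = new Uint8Array(2);
        new DataView(bytes.buffer).setUint16(0, value, true);
        return bytes;
    },
    read: (bytes, offset) => [new DataView(bytes.buffer, bytes.byteOffset).getUint16(offset, true), offset + 2],
};

// One byte: 1 for true, 0 for false.
const miniBoolean: MiniCodec<boolean> = {
    encode: value => Uint8Array.of(value ? 1 : 0),
    read: (bytes, offset) => [new DataView(bytes.buffer, bytes.byteOffset).getUint8(offset) === 1, offset + 1],
};

// Picks a variant by index and delegates to it, in both directions.
// `any` keeps the sketch short; the library infers the union type instead.
function miniUnion<T>(
    variants: MiniCodec<any>[],
    getIndexFromValue: (value: T) => number,
    getIndexFromBytes: (bytes: Uint8Array, offset: number) => number,
): MiniCodec<T> {
    const pick = (index: number): MiniCodec<any> => {
        const variant = variants[index];
        if (!variant) throw new RangeError(`No variant at index ${index}`);
        return variant;
    };
    return {
        encode: value => pick(getIndexFromValue(value)).encode(value),
        read: (bytes, offset) => pick(getIndexFromBytes(bytes, offset)).read(bytes, offset),
    };
}

const sketchCodec = miniUnion<number | boolean>(
    [miniU16, miniBoolean],
    value => (typeof value === 'number' ? 0 : 1),
    (bytes, offset) => (bytes.length - offset > 1 ? 0 : 1),
);

const encoded42 = sketchCodec.encode(42); // bytes 2a 00
const [decoded42, offsetAfter42] = sketchCodec.read(encoded42, 0); // [42, 2]
```

Note that the union itself writes no discriminator: the two callbacks are solely responsible for choosing a variant, so `getIndexFromBytes` must be able to tell the variants apart from the raw bytes alone.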
lorisleiva authored Apr 2, 2024
1 parent a548de2 commit bef9604
Showing 10 changed files with 402 additions and 23 deletions.
17 changes: 17 additions & 0 deletions .changeset/nervous-deers-roll.md
@@ -0,0 +1,17 @@
---
'@solana/codecs-data-structures': patch
'@solana/errors': patch
---

Added a new `getUnionCodec` helper that can be used to encode/decode any TypeScript union.

```ts
const codec: Codec<number | boolean> = getUnionCodec(
    [getU16Codec(), getBooleanCodec()],
    value => (typeof value === 'number' ? 0 : 1),
    (bytes, offset) => (bytes.slice(offset).length > 1 ? 0 : 1),
);

codec.encode(42); // 0x2a00
codec.encode(true); // 0x01
```
72 changes: 50 additions & 22 deletions packages/codecs-data-structures/README.md
@@ -243,6 +243,40 @@ const bytes = getEnumEncoder(Direction).encode(Direction.Left);
const direction = getEnumDecoder(Direction).decode(bytes);
```

## Literal union codec

The `getLiteralUnionCodec` function works similarly to the `getEnumCodec` function but does not require a JavaScript `enum` to exist.

It accepts an array of literal values (such as `string`, `number`, `boolean`, etc.) and returns a codec that encodes and decodes such values by using their index in the array. It uses TypeScript unions to represent all the possible values.

```ts
const codec = getLiteralUnionCodec(['left', 'right', 'up', 'down']);
// ^? FixedSizeCodec<"left" | "right" | "up" | "down">

const bytes = codec.encode('left'); // 0x00
const value = codec.decode(bytes); // 'left'
```

As you can see, it uses a `u8` number by default to store the index of the value. However, you may provide a number codec as the `size` option of the `getLiteralUnionCodec` function to customise that behaviour.

```ts
const codec = getLiteralUnionCodec(['left', 'right', 'up', 'down'], {
    size: getU32Codec(),
});

codec.encode('left'); // 0x00000000
codec.encode('right'); // 0x01000000
codec.encode('up'); // 0x02000000
codec.encode('down'); // 0x03000000
```

Separate `getLiteralUnionEncoder` and `getLiteralUnionDecoder` functions are also available.

```ts
const bytes = getLiteralUnionEncoder(['left', 'right']).encode('left'); // 0x00
const value = getLiteralUnionDecoder(['left', 'right']).decode(bytes); // 'left'
```
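
The index-based encoding described in this section can be sketched in a few lines. This is an illustration under stated assumptions rather than the library's code: `encodeLiteralAsU8` and `decodeLiteralFromU8` are hypothetical helpers, and only the default one-byte (`u8`) index is covered.

```ts
// The variants, in the order that defines their encoded index.
const directions = ['left', 'right', 'up', 'down'] as const;
type Direction = (typeof directions)[number];

// Hypothetical helper: stores the position of `value` in `variants` as one byte.
function encodeLiteralAsU8<T>(variants: readonly T[], value: T): Uint8Array {
    const index = variants.indexOf(value);
    if (index < 0) throw new RangeError(`Unknown variant: ${String(value)}`);
    return Uint8Array.of(index);
}

// Hypothetical helper: reads one byte and maps it back to the variant at that position.
function decodeLiteralFromU8<T>(variants: readonly T[], bytes: Uint8Array, offset = 0): T {
    const index = new DataView(bytes.buffer, bytes.byteOffset).getUint8(offset);
    if (index >= variants.length) throw new RangeError(`Variant index out of range: ${index}`);
    return variants[index] as T;
}

const leftBytes = encodeLiteralAsU8<Direction>(directions, 'left'); // byte 00
const leftValue = decodeLiteralFromU8<Direction>(directions, leftBytes); // 'left'
```

Because the encoded form is just a position, reordering the array changes the meaning of previously encoded bytes.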

## Discriminated union codec

In Rust, enums are powerful data types whose variants can be one of the following:
@@ -365,38 +399,32 @@ const bytes = getDiscriminatedUnionEncoder(variantEncoders).encode({ __kind: 'Qu
const message = getDiscriminatedUnionDecoder(variantDecoders).decode(bytes);
```

## Union codec

The `getUnionCodec` is a lower-level codec helper that can be used to encode/decode any TypeScript union.

It accepts the following arguments:

- An array of codecs, each defining a variant of the union.
- A `getIndexFromValue` function which, given a value of the union, returns the index of the codec that should be used to encode that value.
- A `getIndexFromBytes` function which, given the byte array to decode at a given offset, returns the index of the codec that should be used to decode the next bytes.

```ts
const codec: Codec<number | boolean> = getUnionCodec(
    [getU16Codec(), getBooleanCodec()],
    value => (typeof value === 'number' ? 0 : 1),
    (bytes, offset) => (bytes.slice(offset).length > 1 ? 0 : 1),
);

codec.encode(42); // 0x2a00
codec.encode(true); // 0x01
```

As usual, separate `getUnionEncoder` and `getUnionDecoder` functions are also available.

```ts
const bytes = getUnionEncoder(encoders, getIndexFromValue).encode(42);
const value = getUnionDecoder(decoders, getIndexFromBytes).decode(bytes);
```
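
The tests in this commit expect an index outside the list of variants to fail with `SOLANA_ERROR__CODECS__UNION_VARIANT_OUT_OF_RANGE`, carrying `minRange`, `maxRange` and `variant` in its context. A rough sketch of such a bounds check follows. `assertValidVariantIndex` is a hypothetical name, and a plain `RangeError` stands in for the library's `SolanaError`.

```ts
// Hypothetical helper: rejects an index returned by `getIndexFromValue` or
// `getIndexFromBytes` when it does not point at one of the variants.
function assertValidVariantIndex(variantCount: number, index: number): void {
    if (index < 0 || index >= variantCount) {
        // Attach the same context fields the tests check for.
        throw Object.assign(new RangeError(`Variant index ${index} is out of range`), {
            maxRange: variantCount - 1,
            minRange: 0,
            variant: index,
        });
    }
}

assertValidVariantIndex(4, 3); // passes: indices 0 to 3 are valid for four variants
```

With four variants, an index of `999` would be rejected with `minRange: 0` and `maxRange: 3`, matching the context asserted in the test file.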

## Boolean codec
147 changes: 147 additions & 0 deletions packages/codecs-data-structures/src/__tests__/union-test.ts
@@ -0,0 +1,147 @@
import { assertIsFixedSize, assertIsVariableSize, fixCodec, mapCodec } from '@solana/codecs-core';
import { getU8Codec, getU16Codec } from '@solana/codecs-numbers';
import { getUtf8Codec } from '@solana/codecs-strings';
import { SOLANA_ERROR__CODECS__UNION_VARIANT_OUT_OF_RANGE, SolanaError } from '@solana/errors';

import { getBooleanCodec } from '../boolean';
import { getStructCodec } from '../struct';
import { getUnionCodec } from '../union';
import { b } from './__setup__';

describe('getUnionCodec', () => {
    const codec = getUnionCodec(
        [
            fixCodec(getUtf8Codec(), 8), // 8 bytes.
            getU16Codec(), // 2 bytes.
            getBooleanCodec(), // 1 byte.
            getStructCodec([
                // 4 bytes.
                ['x', getU16Codec()],
                ['y', getU16Codec()],
            ]),
        ],
        value => {
            if (value === 999) return 999;
            if (typeof value === 'string') return 0;
            if (typeof value === 'number') return 1;
            if (typeof value === 'boolean') return 2;
            return 3;
        },
        bytes => {
            if (bytes.length === 3 && [...bytes].every(byte => byte === 255)) return 999;
            if (bytes.length === 8) return 0;
            if (bytes.length === 2) return 1;
            if (bytes.length === 1) return 2;
            return 3;
        },
    );

    it('encodes any valid union variant', () => {
        expect(codec.encode('hello')).toStrictEqual(b('68656c6c6f000000'));
        expect(codec.encode(42)).toStrictEqual(b('2a00'));
        expect(codec.encode(true)).toStrictEqual(b('01'));
        expect(codec.encode({ x: 1, y: 2 })).toStrictEqual(b('01000200'));
    });

    it('decodes any valid union variant', () => {
        expect(codec.decode(b('68656c6c6f000000'))).toBe('hello');
        expect(codec.decode(b('2a00'))).toBe(42);
        expect(codec.decode(b('01'))).toBe(true);
        expect(codec.decode(b('01000200'))).toStrictEqual({ x: 1, y: 2 });
    });

    it('pushes the offset forward when writing', () => {
        expect(codec.write(42, new Uint8Array(10), 6)).toBe(8);
    });

    it('pushes the offset forward when reading', () => {
        expect(codec.read(b('00'), 0)).toStrictEqual([false, 1]);
    });

    it('throws when encoding an invalid variant', () => {
        expect(() => codec.encode(999)).toThrow(
            new SolanaError(SOLANA_ERROR__CODECS__UNION_VARIANT_OUT_OF_RANGE, {
                maxRange: 3,
                minRange: 0,
                variant: 999,
            }),
        );
    });

    it('throws when decoding an invalid variant', () => {
        expect(() => codec.decode(b('ffffff'))).toThrow(
            new SolanaError(SOLANA_ERROR__CODECS__UNION_VARIANT_OUT_OF_RANGE, {
                maxRange: 3,
                minRange: 0,
                variant: 999,
            }),
        );
    });

    it('returns a variable size codec', () => {
        assertIsVariableSize(codec);
        expect(codec.getSizeFromValue('hello')).toBe(8);
        expect(codec.getSizeFromValue(42)).toBe(2);
        expect(codec.getSizeFromValue(true)).toBe(1);
        expect(codec.getSizeFromValue({ x: 1, y: 2 })).toBe(4);
        expect(codec.maxSize).toBe(8);
    });

    it('returns a fixed size codec when all variants have the same fixed size', () => {
        const sameSizeCodec = getUnionCodec(
            [getU8Codec(), getBooleanCodec()],
            () => 0,
            () => 0,
        );
        assertIsFixedSize(sameSizeCodec);
        expect(sameSizeCodec.fixedSize).toBe(1);
    });

    it('can be used to create a zeroable nullable codec', () => {
        const nullCodec = mapCodec(
            getU8Codec(),
            (_value: null) => 0xff,
            () => null,
        );
        const zeroableCodec = getUnionCodec(
            [nullCodec, getU8Codec()],
            value => Number(value !== null),
            (bytes, offset) => Number(bytes[offset] !== 0xff),
        );
        expect(zeroableCodec.encode(null)).toStrictEqual(b('ff'));
        expect(zeroableCodec.encode(42)).toStrictEqual(b('2a'));
        expect(zeroableCodec.decode(b('ff'))).toBeNull();
        expect(zeroableCodec.decode(b('2a'))).toBe(42);
    });

    it('can be used to create a discriminated union codec', () => {
        const staticU16One = mapCodec(
            getU16Codec(),
            (_value: 1) => 1,
            () => 1 as const,
        );
        const staticU16Two = mapCodec(
            getU16Codec(),
            (_value: 2) => 2,
            () => 2 as const,
        );
        const discriminatedUnionCodec = getUnionCodec(
            [
                getStructCodec([
                    ['header', getU16Codec()],
                    ['type', staticU16One],
                ]),
                getStructCodec([
                    ['size', getU16Codec()],
                    ['type', staticU16Two],
                ]),
            ],
            value => value.type - 1,
            (bytes, offset) => bytes[offset + 2] - 1,
        );
        expect(discriminatedUnionCodec.encode({ header: 42, type: 1 })).toStrictEqual(b('2a000100'));
        expect(discriminatedUnionCodec.encode({ size: 9, type: 2 })).toStrictEqual(b('09000200'));
        expect(discriminatedUnionCodec.decode(b('2a000100'))).toStrictEqual({ header: 42, type: 1 });
        expect(discriminatedUnionCodec.decode(b('09000200'))).toStrictEqual({ size: 9, type: 2 });
    });
});
@@ -0,0 +1,45 @@
import { Codec, Decoder, Encoder } from '@solana/codecs-core';

import { getUnionCodec, getUnionDecoder, getUnionEncoder } from '../union';

const getIndex = () => 0;

// [getUnionEncoder] It constructs unions from a list of encoder variants.
{
    getUnionEncoder(
        [
            {} as Encoder<null>,
            {} as Encoder<bigint | number>,
            {} as Encoder<{ value: string }>,
            {} as Encoder<{ x: number; y: number }>,
        ],
        getIndex,
    ) satisfies Encoder<bigint | number | { value: string } | { x: number; y: number } | null>;
}

// [getUnionDecoder] It constructs unions from a list of decoder variants.
{
    getUnionDecoder(
        [
            {} as Decoder<null>,
            {} as Decoder<bigint | number>,
            {} as Decoder<{ value: string }>,
            {} as Decoder<{ x: number; y: number }>,
        ],
        getIndex,
    ) satisfies Decoder<bigint | number | { value: string } | { x: number; y: number } | null>;
}

// [getUnionCodec] It constructs unions from a list of codec variants.
{
    getUnionCodec(
        [
            {} as Codec<null>,
            {} as Codec<bigint | number>,
            {} as Codec<{ value: string }>,
            {} as Codec<{ x: number; y: number }>,
        ],
        getIndex,
        getIndex,
    ) satisfies Codec<bigint | number | { value: string } | { x: number; y: number } | null>;
}
1 change: 1 addition & 0 deletions packages/codecs-data-structures/src/index.ts
@@ -10,4 +10,5 @@ export * from './nullable';
export * from './set';
export * from './struct';
export * from './tuple';
export * from './union';
export * from './unit';