Speed up decoding of large enums #2

Merged
merged 1 commit into from
Sep 16, 2024

@zyla zyla commented Sep 8, 2024

Generic `to` is quite slow on datatypes with many constructors, because it runs O(log^2(n)) `instanceof` operations, where n is the number of constructors.

This change moves that cost from the decoding loop to initialization time. The lookup table now stores values of the target type instead of their Generic representations. Initialization becomes slower, but when many enum values are decoded, the time saved per decode outweighs it.

Before:

Enum3 Unscramble                        : 0.000027ms/op
Enum10 Unscramble                       : 0.000169ms/op
Enum30 Unscramble                       : 0.002258ms/op

After:

Enum3 Unscramble                        : 0.000016ms/op
Enum10 Unscramble                       : 0.000047ms/op
Enum30 Unscramble                       : 0.000050ms/op

As the benchmark results above show, enum decoding is now almost independent of enum size. That makes sense, because decoding is now only a hashtable lookup, as it should be.
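
The idea can be sketched in plain JavaScript. This is only an illustration, not the library's code: `names`, `values`, `slowTo`, `makeDecoderBefore` and `makeDecoderAfter` are hypothetical names, and `slowTo` merely stands in for an expensive conversion such as the generic `to`.

```javascript
// Hypothetical stand-ins, not taken from the library.
const names = ["Red", "Green", "Blue"];
const values = names.map((name) => ({ tag: name }));

// Stand-in for the expensive conversion from a representation to the target value.
// The counter lets us observe how often the conversion runs.
let conversions = 0;
function slowTo(index) {
  conversions += 1;
  return values[index];
}

// Before: the table holds representations, so every decode pays for a conversion.
function makeDecoderBefore() {
  const table = new Map(names.map((name, i) => [name, i]));
  return (tag) => {
    if (!table.has(tag)) throw new Error("unknown constructor: " + tag);
    return slowTo(table.get(tag));
  };
}

// After: convert once per constructor at initialization; decoding is only a lookup.
function makeDecoderAfter() {
  const table = new Map(names.map((name, i) => [name, slowTo(i)]));
  return (tag) => {
    if (!table.has(tag)) throw new Error("unknown constructor: " + tag);
    return table.get(tag);
  };
}
```

With the second variant, `conversions` grows by one per constructor when the decoder is created and stays constant afterwards, no matter how many values are decoded.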

@zyla zyla changed the title from "Speed up Enum decoding by calling Generic ahead of time" to "Speed up decoding of large enums" Sep 16, 2024
     genericUnsafeDecodeEnum opts =
    -  let constructors = Object.fromFoldable (enumConstructors opts :: Array (Tuple String rep))
    +  let constructors = Object.fromFoldable (enumConstructors to opts :: Array (Tuple String a))
       in \value ->
         let tag = decodeString value in
         case Object.lookup tag constructors of

@kozak kozak Sep 16, 2024

So we are returning a lambda here, but the lambda no longer needs to call `to`, because `constructors` already holds values of the final type and not the rep.

@zyla zyla (Collaborator, Author)

Yes (assuming JS Object is a hashtable, which it will probably be in this case in most implementations)

zyla commented Sep 16, 2024

To clarify: the `instanceof` checks come from the generated Generic instances. For example, here is `to` from the instance for Enum10:

    to: function (x) {
        if (x instanceof Data_Generic_Rep.Inl) {
            return E10_1.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && x.value0 instanceof Data_Generic_Rep.Inl) {
            return E10_2.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && (x.value0 instanceof Data_Generic_Rep.Inr && x.value0.value0 instanceof Data_Generic_Rep.Inl)) {
            return E10_3.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && (x.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0 instanceof Data_Generic_Rep.Inr && x.value0.value0.value0 instanceof Data_Generic_Rep.Inl))) {
            return E10_4.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && (x.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0 instanceof Data_Generic_Rep.Inr && x.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inl)))) {
            return E10_5.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && (x.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && x.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inl))))) {
            return E10_6.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && (x.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && x.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inl)))))) {
            return E10_7.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && (x.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && x.value0.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inl))))))) {
            return E10_8.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && (x.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && x.value0.value0.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inl)))))))) {
            return E10_9.value;
        };
        if (x instanceof Data_Generic_Rep.Inr && (x.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && (x.value0.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr && x.value0.value0.value0.value0.value0.value0.value0.value0 instanceof Data_Generic_Rep.Inr)))))))) {
            return E10_10.value;
        };
        throw new Error("Failed pattern match at Bench.Micro (line 144, column 1 - line 144, column 33): " + [ x.constructor.name ]);
    },

Also, the complexity I gave in the PR description is wrong. The generic tree seems to be unbalanced, so it's O(n^2), not O(log^2(n)).
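
To make the quadratic growth concrete, here is a rough sketch that mimics the shape of the generated if-chain and counts `instanceof` evaluations. It assumes the right-nested layout visible in the Enum10 code above; the `Inl` and `Inr` classes are minimal stand-ins rather than the real `Data_Generic_Rep` ones, and `rep` and `countChecks` are hypothetical helpers.

```javascript
// Minimal stand-ins for Data.Generic.Rep's Inl and Inr (not the real classes).
class Inl { constructor(value0) { this.value0 = value0; } }
class Inr { constructor(value0) { this.value0 = value0; } }

// Representation of the k-th constructor (1-based) of an n-constructor enum, n >= 2,
// in a right-nested sum: k-1 Inr wrappers around an Inl, except for the last
// constructor, which is n-1 Inr wrappers with no Inl.
function rep(k, n) {
  let x = k < n ? new Inl(null) : null;
  const wrappers = k < n ? k - 1 : n - 1;
  for (let i = 0; i < wrappers; i++) x = new Inr(x);
  return x;
}

// Walks the branches in order, short-circuiting like `&&`, and counts every
// instanceof evaluated until a branch matches.
function countChecks(k, n) {
  const x = rep(k, n);
  let checks = 0;
  for (let j = 1; j <= n; j++) {
    let node = x;
    let matched = true;
    // Branch j first requires j-1 nested Inr wrappers.
    for (let d = 0; d < j - 1 && matched; d++) {
      checks++;
      matched = node instanceof Inr;
      if (matched) node = node.value0;
    }
    // Every branch except the last then requires an Inl.
    if (matched && j < n) {
      checks++;
      matched = node instanceof Inl;
    }
    if (matched) return checks;
  }
  throw new Error("Failed pattern match");
}
```

Under these assumptions, matching the k-th constructor (for k < n) costs k(k+1)/2 checks, and the last constructor costs (n-1)(n+2)/2: `countChecks(10, 10)` gives 54 and `countChecks(30, 30)` gives 464.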

@kozak kozak left a comment

If my understanding (in the comment above) is correct, then this looks good to me :)

@zyla zyla merged commit 0eec379 into master Sep 16, 2024
2 checks passed