Provide try-with-resources-like function to SHA256 #831

Open. Wants to merge 1 commit into base: next
Conversation

venfernand (Contributor):

Replace small getDigest-digest-release sequences with wrapper method from SHA256



Signed-off-by: Veniamin Fernandes <[email protected]>
} catch (Throwable t) {
Logger.error(this, "Caught "+t, t);
} finally {
SHA256.returnMessageDigest(md);
Contributor:

md seems to be used only once here; why not replace this occurrence with the consumer-style method as well?

md.update(pwd);
md.update(salt);
byte[] outerKey = md.digest();
byte[] pwd = password.getBytes(StandardCharsets.UTF_8);
Contributor:

Is pwd used outside of the lambda? If not, move it into the lambda, please.

return digest(md -> md.update(data));
}

public static byte[] digest(Consumer<MessageDigest> updater) {
Contributor:

If we’re doing this (and we should!) I would also love to deprecate the getMessageDigest() and returnMessageDigest() methods (hash(), too, probably), with a note on how to digest stuff in 2024! 😁
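For illustration, here is a minimal, self-contained sketch of the consumer-style wrapper this thread discusses. The class and method names are assumptions based on the diff context; the actual SHA256 class in this repository also maintains a digest pool, which is omitted here:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.function.Consumer;

public class Sha256Sketch {
    /**
     * Runs the given updater against a fresh SHA-256 MessageDigest and
     * returns the resulting hash, so callers never manage the digest's
     * lifecycle themselves (the try-with-resources-like idea of this PR).
     */
    public static byte[] digest(Consumer<MessageDigest> updater) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            updater.accept(md);
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this cannot happen.
            throw new AssertionError(e);
        }
    }

    /** Convenience overload for the common single-buffer case. */
    public static byte[] digest(byte[] data) {
        return digest(md -> md.update(data));
    }

    public static void main(String[] args) {
        byte[] hash = digest("abc".getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) hex.append(String.format("%02x", b));
        System.out.println(hex);
    }
}
```

Digesting `"abc"` prints the well-known SHA-256 test vector `ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad`.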

Contributor:

Funnily enough, on modern Java this get/return dance is actually slower* than just creating a fresh SHA256 digest and using it once 🙃
Maybe we should get rid of getMessageDigest/returnMessageDigest altogether (or make them return a fresh instance / no-op, respectively), but that's probably for later.

* I measured 30% performance decrease for get/return over fresh digests when used by 4 threads simultaneously in a microbenchmark, when digesting a moderate 272 bytes. Most of the time is spent in updating the queue & resetting the digest rather than doing the actual computations.
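The "fresh instance / no-op" fallback mentioned above could look roughly like this. This is a hypothetical shim, not the actual SHA256 class; method names are taken from the diff context:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha256PoolShim {
    /** Formerly borrowed a digest from a pool; now simply creates a fresh one. */
    @Deprecated
    public static MessageDigest getMessageDigest() {
        try {
            return MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // SHA-256 is guaranteed to exist
        }
    }

    /** Formerly returned the digest to the pool; now a deliberate no-op. */
    @Deprecated
    public static void returnMessageDigest(MessageDigest md) {
        // no-op: fresh instances are cheap on modern JVMs, nothing to recycle
    }

    public static void main(String[] args) {
        // Existing get/return call sites keep working unchanged.
        MessageDigest md = getMessageDigest();
        md.update(new byte[] {1, 2, 3});
        byte[] hash = md.digest();
        returnMessageDigest(md);
        System.out.println(hash.length);
    }
}
```

This keeps existing callers source-compatible while removing the pool contention.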

Contributor:

Yes, it’s quite possible that this, too, is a remnant of times where certain things (like MessageDigest creation) were incredibly expensive, in which case we should get rid of them.

Can you set up a small benchmark (maybe using JMH or something similar) so we can run a couple of comparisons over various platforms to see if we can get rid of this? 🙂
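Not a substitute for a proper JMH run (JMH handles warmup, forking, and dead-code elimination), but the two strategies being compared can be sketched in plain Java. The pool below is a simplified stand-in for whatever SHA256 actually uses internally:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Queue;

public class DigestStrategies {
    // Simplified stand-in for the pool inside SHA256.
    private static final Queue<MessageDigest> POOL = new ArrayDeque<>();

    /** "fresh" strategy: allocate a new digest every time. */
    static MessageDigest fresh() throws NoSuchAlgorithmException {
        return MessageDigest.getInstance("SHA-256");
    }

    /** "pooled" strategy: borrow a recycled digest if one is available. */
    static synchronized MessageDigest borrow() throws NoSuchAlgorithmException {
        MessageDigest md = POOL.poll();
        return md != null ? md : fresh();
    }

    static synchronized void release(MessageDigest md) {
        md.reset();
        POOL.add(md);
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        byte[] payload = new byte[16];

        // fresh: allocate, use once, discard
        byte[] a = fresh().digest(payload);

        // pooled: borrow, use, return
        MessageDigest md = borrow();
        byte[] b = md.digest(payload);
        release(md);

        // Both strategies must of course produce identical hashes.
        System.out.println(Arrays.equals(a, b));
    }
}
```

A real benchmark would time many iterations of each path across several threads; this sketch only shows the two code shapes under measurement.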

bertm (Contributor), Mar 19, 2024:

I already had this set up in JMH so sharing it was just a cleanup away. Here you go!
https://gist.github.com/bertm/94f3c97fd138fe075a1f54076c6bd3cf

My output (on a laptop with 4c/8t Intel i7-10510U and OpenJDK 21):

Benchmark               (payloadLength)   Mode  Cnt         Score         Error  Units
DigestBenchmark.fresh                16  thrpt    5  10283055,648 ± 1009465,568  ops/s
DigestBenchmark.fresh               256  thrpt    5   2513375,885 ±  208951,386  ops/s
DigestBenchmark.fresh              2048  thrpt    5    468703,126 ±   98899,698  ops/s
DigestBenchmark.pooled               16  thrpt    5   3888433,541 ±  456434,379  ops/s
DigestBenchmark.pooled              256  thrpt    5   2052883,423 ±  269884,782  ops/s
DigestBenchmark.pooled             2048  thrpt    5    473900,467 ±   16243,795  ops/s

Same laptop with OpenJDK 8 is substantially slower, but using fresh digests still performs faster than pooling them:

Benchmark               (payloadLength)   Mode  Cnt        Score         Error  Units
DigestBenchmark.fresh                16  thrpt    5  4846875,123 ± 1454094,677  ops/s
DigestBenchmark.fresh               256  thrpt    5  1343279,656 ±   56002,017  ops/s
DigestBenchmark.fresh              2048  thrpt    5   232546,571 ±    2944,786  ops/s
DigestBenchmark.pooled               16  thrpt    5  3538942,103 ±   47069,863  ops/s
DigestBenchmark.pooled              256  thrpt    5  1302934,389 ±   27901,449  ops/s
DigestBenchmark.pooled             2048  thrpt    5   234428,038 ±    5611,622  ops/s

(Of course for large payloads the difference is negligible, as can be seen for the 2048 byte payload testcase.)

Contributor:

My results from a 2020 MacBook Pro with Intel i7, running macOS 11.7.9.

Java 8.0.402-tem:

Benchmark               (payloadLength)   Mode  Cnt        Score         Error  Units
DigestBenchmark.fresh                16  thrpt    5  4590534.845 ± 1743148.084  ops/s
DigestBenchmark.fresh               256  thrpt    5   789453.566 ±  211538.845  ops/s
DigestBenchmark.fresh              2048  thrpt    5   121031.497 ±   54753.415  ops/s
DigestBenchmark.pooled               16  thrpt    5  2299581.955 ±  321764.398  ops/s
DigestBenchmark.pooled              256  thrpt    5   928764.410 ±  331213.950  ops/s
DigestBenchmark.pooled             2048  thrpt    5   186062.632 ±   33492.467  ops/s

Java 17.0.8-tem:

Benchmark               (payloadLength)   Mode  Cnt        Score         Error  Units
DigestBenchmark.fresh                16  thrpt    5  6849782.112 ± 3876040.782  ops/s
DigestBenchmark.fresh               256  thrpt    5  1424004.927 ±  670666.188  ops/s
DigestBenchmark.fresh              2048  thrpt    5   263265.018 ±   69039.721  ops/s
DigestBenchmark.pooled               16  thrpt    5  2749789.198 ±  351308.606  ops/s
DigestBenchmark.pooled              256  thrpt    5  1358076.282 ±  661234.574  ops/s
DigestBenchmark.pooled             2048  thrpt    5   291232.774 ±   99110.753  ops/s

Java 21.0.1-tem:

Benchmark               (payloadLength)   Mode  Cnt        Score         Error  Units
DigestBenchmark.fresh                16  thrpt    5  7157264.140 ± 3133981.665  ops/s
DigestBenchmark.fresh               256  thrpt    5  1540694.555 ±  414277.743  ops/s
DigestBenchmark.fresh              2048  thrpt    5   268549.257 ±   79066.360  ops/s
DigestBenchmark.pooled               16  thrpt    5  2778790.638 ±  210145.178  ops/s
DigestBenchmark.pooled              256  thrpt    5  1262859.045 ±  341595.845  ops/s
DigestBenchmark.pooled             2048  thrpt    5   323385.319 ±   74968.957  ops/s

I have to admit that I am slightly confused by these results. When using larger payloads the pooled digests are faster, indicating that creating a small number of digests is somehow slower than creating a large number of digests? I have the feeling we’re missing something here…

bertm (Contributor), Mar 20, 2024:

Your measurement error is sky-high (for 16 bytes, approximately 50% of your measurement). The confidence intervals of your pooled and fresh measurements overlap substantially for 2048 bytes, rendering those results inconclusive. Have you tried your MacBook in performance mode? If that does not help reduce the jitter, try increasing the number of iterations or the duration of each iteration.

Here's a re-run on OpenJDK 21 with 20 seconds per iteration on my machine, note the errors are now in the realm of 1-2% of the score where they were previously >10%:

Benchmark               (payloadLength)   Mode  Cnt        Score        Error  Units
DigestBenchmark.fresh                16  thrpt    5  9585988,644 ± 112557,463  ops/s
DigestBenchmark.fresh               256  thrpt    5  2707033,466 ±  36179,251  ops/s
DigestBenchmark.fresh              2048  thrpt    5   499138,086 ±   1028,970  ops/s
DigestBenchmark.pooled               16  thrpt    5  4160434,838 ±  31751,576  ops/s
DigestBenchmark.pooled              256  thrpt    5  2193776,379 ±  12036,799  ops/s
DigestBenchmark.pooled             2048  thrpt    5   484132,530 ±   1403,232  ops/s

In any case, the 16 byte measurement is the most relevant: it best highlights the overhead of borrow/return versus object allocation. It is reasonable to assume that, once obtained, both digests perform equally well; at that point, both variants are just instances of the exact same class with the exact same state.

Contributor:

I’ll probably repeat some of the measurements tomorrow; that’s my work machine and sometimes I actually do have to work, even while benchmarking. 😀

Anyway, most relevant would be the most common size we’re hashing, and I guess that would be the size of our routing keys. Are those 16 bytes? (I have the feeling that I really should know these things… 😁) Either way, most common size will probably indeed be a lot closer to 16 than to 2048 so getting rid of the pool will probably get a 👍 from me.

return digest(md -> md.update(data));
}

public static byte[] digest(Consumer<MessageDigest> updater) {
Contributor:

Even though this is definitely an improvement over the existing code, none of the places where this method is used actually seem to be interested in "consuming a MessageDigest". Instead, they just want to digest multiple sources, together.

Can we just go for the byte[] digest(byte[]... data) that is even less of an eyesore?
For example, this code:

byte[] outerKey = SHA256.digest(md -> {
    md.update(pwd);
    md.update(salt);
});

could simply become:

byte[] outerKey = SHA256.digest(pwd, salt);

(and yes, there are a few places that would like to update(byte) - well, those edge cases can just construct a new byte[])
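The suggested varargs form could be sketched as follows (hypothetical; the actual method would live in the SHA256 class):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

public class VarargsDigest {
    /** Digests all given buffers together, in order, with SHA-256. */
    public static byte[] digest(byte[]... data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] chunk : data) {
                md.update(chunk);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // SHA-256 is always available
        }
    }

    public static void main(String[] args) {
        byte[] pwd = "hunter2".getBytes(StandardCharsets.UTF_8);
        byte[] salt = "salt".getBytes(StandardCharsets.UTF_8);
        // digest(pwd, salt) must equal digesting the concatenation pwd || salt
        byte[] viaVarargs = digest(pwd, salt);
        byte[] viaConcat = digest("hunter2salt".getBytes(StandardCharsets.UTF_8));
        System.out.println(Arrays.equals(viaVarargs, viaConcat));
    }
}
```

With this in place, the `SHA256.digest(pwd, salt)` call from the comment above works directly, and single-byte callers can pass `new byte[] { b }`.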

Contributor:

True, the MessageDigest here is a means to an end. Using a Consumer to remove the creation (and subsequent release) of a temporary object like the message digest is a common pattern, though, so I believe that the irritation you are hinting at does not actually exist. 🙂

But yeah, by adding the vararg byte[] we could probably get rid of all other methods… 😁

md.update(plainKey);
md.update(salt);
});
assert hashedRoutingKey.length == 0x20;
Contributor:

Yes, SHA-256 will return a 32-byte hash; that's implied by the name.

There's no point in asserting this here (if it is asserted at all, it should be in the SHA256 class, as it is a SHA-256 invariant). Let's just remove the assertion altogether.

4 participants