
perf: give a major rework on how our deleting works #54

Merged

Conversation


@Davidson-Souza Davidson-Souza commented Jul 29, 2024

This PR significantly improves accumulator deletion in a few ways.

Our deleting code had some major bottlenecks, like calling calculate_hashes
twice (once for the old roots and once for the updated ones) and
filtering out the nodes that are in the proof, which caused a linear
search over proof_positions for each element in nodes.
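
For context, the hot pattern was roughly the following (a hedged reconstruction, not the actual rustreexo code; the function name and types are assumptions). A linear `contains` inside the filter makes the whole pass O(nodes × proof_positions):

```rust
// Hypothetical reconstruction of the old bottleneck: for every node we
// scan proof_positions linearly, so the filter is O(n * m) overall.
fn filter_out_proof_nodes(
    nodes: Vec<(u64, [u8; 32])>,
    proof_positions: &[u64],
) -> Vec<(u64, [u8; 32])> {
    nodes
        .into_iter()
        .filter(|(pos, _)| !proof_positions.contains(pos)) // linear search per node
        .collect()
}
```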

This commit fixes that by:

  • Creating a calculate_nodes_delete function that computes both the
    current and updated roots in a single pass, so we don't need to loop
    twice (see the sketch after this list). In the future, we could even
    use vectorized computation of sha512_256 to further speed things up.
  • Not filtering proof positions out of the nodes vector in
    calculate_nodes_delete. This wastes a little memory, as nodes in
    the proof are returned too, but avoids a new allocation because we
    return nodes as-is.
  • Making calculate_hashes no longer delete, simplifying it to only
    checking or updating proofs.
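
Here is a minimal sketch of the single-pass idea, assuming a toy Hash type and a stand-in parent_hash (neither is rustreexo's real API): each level of the forest is walked once, producing the pre-deletion parent for verifying the proof and the post-deletion parent for the new roots at the same time.

```rust
type Hash = [u8; 32];

// Stand-in for the accumulator's real parent hash (sha512_256 in utreexo);
// this toy mix only keeps the sketch self-contained.
fn parent_hash(left: &Hash, right: &Hash) -> Hash {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = left[i] ^ right[i].rotate_left(1);
    }
    out
}

/// Compute one level of the forest in a single pass, returning both the
/// current parents (old-root math) and the updated parents (post-delete
/// math). `updated` holds `None` where a leaf was deleted.
fn hash_level(
    current: &[Hash],
    updated: &[Option<Hash>],
) -> (Vec<Hash>, Vec<Option<Hash>>) {
    let mut cur_parents = Vec::with_capacity(current.len() / 2);
    let mut new_parents = Vec::with_capacity(current.len() / 2);
    for (pair_cur, pair_new) in current.chunks_exact(2).zip(updated.chunks_exact(2)) {
        // Old roots: always hash the original children.
        cur_parents.push(parent_hash(&pair_cur[0], &pair_cur[1]));
        // New roots: a deleted child promotes its sibling upward.
        new_parents.push(match (&pair_new[0], &pair_new[1]) {
            (Some(l), Some(r)) => Some(parent_hash(l, r)),
            (Some(l), None) => Some(*l),
            (None, Some(r)) => Some(*r),
            (None, None) => None,
        });
    }
    (cur_parents, new_parents)
}
```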

After running the new code for two days with floresta, the speedup is
clear: calculate_nodes_delete takes less than 2% of the CPU time for
block validation, where before this patch it was using more than 40%.

@Davidson-Souza Davidson-Souza changed the title perf: give a mojor rework on how our deleting works perf: give a major rework on how our deleting works Jul 29, 2024
@Davidson-Souza Davidson-Souza force-pushed the feat/calculate-hashes-delete branch 2 times, most recently from ad742e7 to daef321 on July 30, 2024 16:11
The old algorithm is fine for small proofs, but for big proofs, where we
need to shift many elements before inserting into the middle of a
vector, those moves become too expensive. The new algorithm uses two
vectors that are naturally sorted, so we can just push to the end of one
vec.

This patch was tested on floresta and gives better performance while
not causing any incompatibility.
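
As a rough illustration of that strategy (a sketch under assumptions, not the PR's actual code; merge_sorted and the bare u64 positions are made up for brevity): when both inputs are already sorted, a single linear pass of end-pushes replaces repeated middle-inserts.

```rust
/// Merge two already-sorted position vectors using only `push`/`extend`,
/// never `insert`, so no tail elements are ever shifted.
fn merge_sorted(a: &[u64], b: &[u64]) -> Vec<u64> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(a.len() + b.len());
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] {
            out.push(a[i]);
            i += 1;
        } else {
            out.push(b[j]);
            j += 1;
        }
    }
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
    out
}
```

Each element is touched once, so the merge is O(n + m), versus the O(n) tail shift that every single `Vec::insert` into the middle costs.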
Every call to sort takes a big chunk of CPU time, as the vectors can get
really long. The vector isn't supposed to become unsorted while we are
working inside a row, so we can call sort only once after finishing a
row.
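
A small sketch of that pattern, assuming simplified position math (the real forest parent formula differs; `pos / 2` is a stand-in): pushes inside a row may arrive out of order, and one sort at the row boundary is enough.

```rust
/// Derive the next row's parent positions, sorting once per row instead of
/// keeping the vector sorted after every push.
fn next_row_positions(mut row: Vec<u64>) -> Vec<u64> {
    // Tolerate out-of-order pushes while inside the row...
    row.sort_unstable();
    // ...then compute parents; `pos / 2` stands in for the real formula.
    let mut parents: Vec<u64> = row.into_iter().map(|pos| pos / 2).collect();
    parents.dedup(); // sibling pairs collapse to the same parent
    parents
}
```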
@Davidson-Souza Davidson-Souza merged commit e140c8a into mit-dci:main Jul 30, 2024
20 checks passed