
use number of partitions for buffer allocation instead of partition size #339

Closed
TheMenko wants to merge 2 commits

Conversation


@TheMenko TheMenko commented Nov 8, 2024

Buffer allocation was too big when using the partition size: only "num partitions" values would be filled, leaving everything else as 0.
This changes the allocation size to the number of partitions.
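
For context, a minimal sketch of the fix (identifiers here are hypothetical, not the actual Winterfell code): a row is hashed partition by partition, so the digest buffer needs one slot per partition, not one per column in a partition.

```rust
// Illustrative sketch only; names are hypothetical, not Winterfell's API.
fn hash_row_per_partition(row: &[u64], partition_size: usize) -> Vec<u64> {
    let num_partitions = row.len().div_ceil(partition_size);

    // Before: the buffer was allocated with `partition_size` slots; only the
    // first `num_partitions` entries were ever written, the rest stayed 0.
    // After: size the buffer by the number of partitions instead.
    let mut digests = vec![0u64; num_partitions];
    for (i, chunk) in row.chunks(partition_size).enumerate() {
        // stand-in for hashing one partition's columns
        digests[i] = chunk.iter().fold(0, |acc, &x| acc ^ x.rotate_left(7));
    }
    digests
}
```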

@facebook-github-bot

Hi @TheMenko!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g., your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at [email protected]. Thanks!

@facebook-github-bot

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Meta Open Source project. Thanks!

@TheMenko TheMenko marked this pull request as draft November 8, 2024 11:06
```rust
) -> (Self, TracePolyTable<E>) {
    // extend the main execution trace and build a commitment to the extended trace
    let (main_segment_lde, main_segment_vector_com, main_segment_polys) =
        build_trace_commitment::<E, E::BaseField, H, V>(
            main_trace,
            domain,
            partition_option.partition_size::<E::BaseField>(main_trace.num_cols()),
```
Contributor

Compilation error. Needs to be partition_options.

Comment on lines +192 to +193
```rust
let partition_size = partition_options.partition_size::<E>(self.num_cols());
let num_partitions = partition_options.num_partitons();
```
Collaborator

Question: should this be the "specified" number of partitions or the "implied" number of partitions? For example, let's say we have 7 columns in the degree 2 extension field and the specified number of partitions is 4 with min partition size being 8.

With these parameters, the implied number of partitions is actually 2 (because partition size would be 4 columns and there are 7 columns total). So, would we want to use 2 or 4 for the number of partitions?
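
As a sanity check on these numbers, a small sketch that mirrors the partition_size logic (illustrative only, not library code):

```rust
fn main() {
    let num_columns = 7usize;        // logical columns
    let extension_degree = 2usize;   // E::EXTENSION_DEGREE
    let specified_partitions = 4usize;
    let min_partition_size = 8usize; // in base field elements

    // mirrors PartitionOptions::partition_size
    let base_per_partition = usize::max(
        (num_columns * extension_degree).div_ceil(specified_partitions), // 14/4 -> 4
        min_partition_size,                                              // -> 8
    );
    let partition_size = base_per_partition.div_ceil(extension_degree);  // 8/2 -> 4 columns

    // the "implied" number of partitions actually produced
    let implied_partitions = num_columns.div_ceil(partition_size);       // 7/4 -> 2
    assert_eq!((partition_size, implied_partitions), (4, 2));
}
```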

```diff
@@ -383,7 +383,11 @@ impl PartitionOptions {
             self.min_partition_size as usize,
         );

-        base_elements_per_partition.div(E::EXTENSION_DEGREE)
+        base_elements_per_partition.div_ceil(E::EXTENSION_DEGREE)
```
Collaborator

We may need to make this a bit more sophisticated because I think it will produce rather suboptimal results in some situations. For example:

  • num_columns = 7
  • E::EXTENSION_DEGREE = 3
  • self.num_partitions = 4
  • self.min_partition_size = 8

If I did my math correctly, with this approach we'd get a partition size of 3 columns, which would imply 9 base field columns per partition. This would require 2 permutations of the hash function in each partition.

The previous approach would have actually resulted in a better outcome here (i.e., partition size 2, so the 4 partitions would have 2, 2, 2, 1 columns). But this result would have been technically incorrect because we'd have 6 base field elements per partition, which is smaller than min_partition_size.

Maybe instead of min_partition_size we should be specifying the number of base elements that can be absorbed per permutation and then we can adjust this algorithm to output more optimal results.
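
The arithmetic in this example can be checked with a short sketch (illustrative, not library code):

```rust
fn main() {
    let num_columns = 7usize;
    let extension_degree = 3usize; // E::EXTENSION_DEGREE
    let num_partitions = 4usize;
    let min_partition_size = 8usize;

    let base_per_partition = usize::max(
        (num_columns * extension_degree).div_ceil(num_partitions), // 21/4 -> 6
        min_partition_size,                                        // -> 8
    );
    // with div_ceil: 8/3 -> 3 columns = 9 base elements per partition,
    // i.e. 2 hash permutations if one permutation absorbs 8 base elements
    let with_div_ceil = base_per_partition.div_ceil(extension_degree);
    // with the old div: 8/3 -> 2 columns, i.e. partitions of 2, 2, 2, 1
    // columns (6 base elements each), below min_partition_size
    let with_div = base_per_partition / extension_degree;
    assert_eq!((with_div_ceil, with_div), (3, 2));
}
```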

@gswirski
Contributor

gswirski commented Nov 12, 2024

EDIT: Please disregard the comment below. I've created a new PR that takes @irakliyk's feedback into account: #340

Hi @irakliyk, thank you for your initial review. I'm going to pick up this work from @TheMenko.

Here is some context that might be helpful:

  • On the GPU (which prompted partitioning in the first place), we don't pay much attention to extension fields. Extensions get "exploded" into additional columns: columns A, B in extension degree 2 become columns a1, a2, b1, b2.
  • Winterfell splits columns based on "logical" field elements, whereas the GPU looks at base elements only. Right now, the GPU could potentially split extended elements into two different partitions. We need to be more careful about that.
  • This PR is wrong because the hash_row function receives a fixed num_partitions value throughout the proving process, which ends up being incorrect for aux traces. PartitionOptions should be propagated all the way to the hash_row function in the verifier (see the sketch below).
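
A rough sketch of the propagation issue (all signatures here are hypothetical, not the actual Winterfell API): the row hasher needs the full PartitionOptions so it can recompute the layout per trace segment, rather than a single num_partitions baked in at prover setup.

```rust
// Hypothetical signatures to illustrate the propagation issue.
struct PartitionOptions { num_partitions: usize, min_partition_size: usize }

impl PartitionOptions {
    fn partition_size(&self, num_base_columns: usize) -> usize {
        usize::max(
            num_base_columns.div_ceil(self.num_partitions),
            self.min_partition_size,
        )
    }
}

// Wrong shape: a fixed num_partitions is correct for the main trace but not
// for aux segments with a different column count.
// fn hash_row(row: &[u64], num_partitions: usize) -> u64 { ... }

// Better shape: pass the options and derive the layout per segment.
fn hash_row(row: &[u64], opts: &PartitionOptions) -> u64 {
    let size = opts.partition_size(row.len());
    row.chunks(size)
        .map(|chunk| chunk.iter().fold(0u64, |acc, &x| acc ^ x.rotate_left(11)))
        .fold(0u64, |acc, d| acc.wrapping_mul(31).wrapping_add(d))
}

fn main() {
    let opts = PartitionOptions { num_partitions: 4, min_partition_size: 8 };
    let row = [1u64; 14]; // main segment: 14 base columns
    let aux = [1u64; 6];  // aux segment: different width, different layout
    println!("{} {}", hash_row(&row, &opts), hash_row(&aux, &opts));
}
```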

I'm going to rework this PR, but let's first agree on what a desirable solution looks like. To make sure we are on the same page, below are my calculations for 7 "logical" columns and extension degrees 2 and 3.

Extension Degree = 2

```
num_columns = 7
extension_degree = 2 // -> base_element_columns = 14

num_partitions = 4
min_base_element_partition_size = 8
```

yields:

  • first partition with 4 "logical columns" (8 base element columns),
  • second partition with 3 "logical columns" (6 base element columns),
  • the remaining 2 partitions are not used and don't influence the results.

Extension Degree = 3

```
num_columns = 7
extension_degree = 3 // -> base_element_columns = 21

num_partitions = 4
min_base_element_partition_size = 8
```

yields:

  • first partition with 3 "logical columns" (9 base element columns),
  • second partition with 3 "logical columns" (9 base element columns),
  • third partition with 1 "logical column" (3 base element columns),
  • the remaining partition is not used and doesn't influence the results.
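
Both splits above can be reproduced with a small sketch (illustrative only; the splitting helper is hypothetical, mirroring the partition_size logic):

```rust
// returns base-element column counts per used partition
fn split(num_columns: usize, ext: usize, parts: usize, min_size: usize) -> Vec<usize> {
    let base_per_part = usize::max((num_columns * ext).div_ceil(parts), min_size);
    let part_cols = base_per_part.div_ceil(ext); // logical columns per partition
    (0..num_columns)
        .step_by(part_cols)
        .map(|start| usize::min(part_cols, num_columns - start) * ext)
        .collect()
}

fn main() {
    assert_eq!(split(7, 2, 4, 8), vec![8, 6]);    // degree 2: 4 + 3 logical columns
    assert_eq!(split(7, 3, 4, 8), vec![9, 9, 3]); // degree 3: 3 + 3 + 1 logical columns
}
```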

I understand this is not what you want. We have two options here:

  1. Specify min_base_element_partition_size as 6. This way we get a 6, 6, 6, 3 base-element split.
  2. Allow splitting extended elements into multiple partitions. Then we would get an 8, 8, 5 base-element split.

Next Steps

Let me know which option makes more sense to you.

  1. If we choose path 1, we should probably keep the min_partition_size name and refer to "logical columns".
  2. If we choose path 2, I would rename min_partition_size to min_base_elements_partition_size or similar.


@irakliyk
Collaborator

Superseded by #340.

@irakliyk irakliyk closed this Nov 17, 2024