Proposal for switching to a Virtual Bus #1182
EDIT: I adjusted the section after discussing with @Al-Kindi-0 (which cleared up the remaining misunderstanding). It is now dedicated to explaining the "why" behind the Lagrange kernel's transition constraints as described here.

I'd like to zoom in on the Lagrange kernel transition constraints, as it took me a bit of time to understand what was going on, and I would like to make it easier for others. This writeup will use a motivating example with

Problem recap

The so-called "Lagrange kernel" is the above-mentioned function

We want to create a column

Naive constraints

To ensure that

The first constraint is a boundary constraint. Since this note focuses on the transition constraints, we will only look at the last 7.

Rewriting the naive transition constraints

Notice that we can rewrite the naive transition constraints in terms of one another:

Note that there is more than one way to write these constraints; any two rows can be written in terms of one another if their indices differ by 1 bit (i.e. they have a Hamming distance of 1). We chose this way specifically to serve the following sections.

Reminder: group and subgroup generators

Before we introduce the core idea of this note (i.e. using constraint enforcement domains to reduce the number of constraints), I think it's useful to review how generators of subgroups are built.

The VM uses the group of size

These groups are cyclic, meaning that they can be generated by a single element. For our original group, the generator is

We also know how to compute the generator of all other smaller subgroups (of size

Specifically, for the purposes of our example, we will only need the generator for the subgroup of size 8 (the length of our trace):

Introducing constraint enforcement domains

A key identity in STARKs is

As a reminder, the transition constraint formula is as follows (rewritten slightly for readability).

For

for

Let's build our intuition of why this works by going back to our

These domains are derived as described in the previous section. Let's evaluate the transition constraint formula for

Notice that this is the 5th constraint defined in the previous section! For

and with our 3rd and 7th constraints! For
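The subgroup-generator computation described above can be sketched numerically. This is a toy illustration in the small field $\mathbb{F}_{257}$ (not the VM's actual field); the full-group generator 3 and all concrete values are assumptions made only for this example:

```python
# Toy illustration: deriving generators of smaller power-of-two subgroups
# from the generator of the full multiplicative group.
# F_257 has a cyclic multiplicative group of order 256 = 2^8, and 3 is a
# primitive root (assumed generator for this toy example).
P = 257
g = 3  # generator of the full multiplicative group of size 256

def subgroup_generator(size: int) -> int:
    """Generator of the unique subgroup of the given power-of-two size:
    raise the full-group generator to the cofactor 256 // size."""
    assert 256 % size == 0
    return pow(g, 256 // size, P)

# Generator for the subgroup of size 8 (a length-8 toy trace domain)
g8 = subgroup_generator(8)

# Check it really generates a cyclic group of order exactly 8
elems = {pow(g8, i, P) for i in range(8)}
assert len(elems) == 8 and pow(g8, 8, P) == 1

# The STARK identity X^n - 1 = prod_{x in H} (X - x) over the size-8 domain:
# spot-check it at a point X0 outside the subgroup.
X0 = 5
lhs = (pow(X0, 8, P) - 1) % P
rhs = 1
for x in elems:
    rhs = rhs * (X0 - x) % P
assert lhs == rhs
```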
Sometimes it may be useful to test summations different from

Examples of these are the virtual tables of stack overflow and kernel procedures. In the first example, the VM can accommodate programs which start with more than 16 field elements by initializing the stack overflow virtual table with these elements. Since this initialization is a function only of the (overflow) stack elements and their positions on the stack, the verifier can initialize the virtual table on its own using randomness, say

More precisely:

Note that, in the case of a universal bus,
INTRO
In many situations within the verifiable computation paradigm, for example in the context of zkVMs, one is interested in whether a certain algebraic relation, global over the execution trace, holds. To prove that this relation has been computed correctly, the prover can use the AIR (or, more precisely, the RAP) to compute the global relation via a sequence of local (i.e., transition) constraints applied to some intermediate values. These values then become part of the execution trace even though they are artificial with respect to the original zkVM execution trace. From the perspective of both the prover and the verifier, the intermediate values' sole purpose is to witness the correct computation/enforcement of the local steps used in computing/enforcing the global relation/constraint. Unfortunately, this comes at the cost of extra data that the prover needs to commit to, which can be problematic considering that commitment cost is, as of today, still the dominant cost in verifiable computation.
A second option is to use a protocol that is more tailored to the situation at hand, namely evaluating a (global) algebraic relation without any extra commitments (e.g., to the intermediate results of the computation) beyond those already required for the original trace. GKR is an interactive proof protocol for proving correct evaluation of a layered circuit1 and is the ideal candidate for solving our stated problem. This idea was used recently in Lasso in order to enforce a Grand-Product argument, and variations of the same principle appear in earlier works (e.g., Quarks).

LogUp
Consider the LogUp lookup argument. This is a protocol for proving that a certain set $\lbrace w(x_i)\rbrace$ is a subset of another set $\lbrace t(x_i)\rbrace$. In its most basic form, it can be written as the following global algebraic relation
$$\sum_{x\in\mathbb{H}} \frac{m(x)}{\alpha + t(x)} = \sum_{x\in\mathbb{H}} \frac{f(x)}{\alpha + w(x)}$$
where:

- $t$ is the column holding the table values and $w$ is the column holding the values to be looked up,
- $m(x)$ is the multiplicity with which the table value $t(x)$ is looked up, and $f(x)$ is a selector flagging whether a lookup happens at row $x$,
- $\alpha$ is a random challenge sent by the verifier.
Miden VM uses the concept of a bus in order to enforce the above constraint. More precisely, it defines an extra column $p$ with the following constraints:
Notice here the issue discussed in the introduction, namely that LogUp introduces an extra witness $p$ that is, in a sense, extrinsic to the original statement.
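The LogUp relation above can be sanity-checked numerically. The following sketch uses a toy prime field and made-up table, witness, multiplicity, and challenge values (none of these come from Miden):

```python
# Numeric sanity check of the LogUp identity
#   sum m(x)/(alpha + t(x)) = sum f(x)/(alpha + w(x))
# in a small prime field, with toy example data.
P = 2**31 - 1  # toy prime field

def inv(a):  # modular inverse via Fermat's little theorem
    return pow(a % P, P - 2, P)

t = [1, 2, 3, 4]   # table column over the domain H (|H| = 4)
w = [2, 2, 3, 1]   # values looked up at each row
f = [1, 1, 1, 1]   # lookup flag: one lookup per row
m = [1, 2, 1, 0]   # multiplicity of each table entry among the w's

alpha = 123456789  # stand-in for the verifier's random challenge

lhs = sum(mi * inv(alpha + ti) for mi, ti in zip(m, t)) % P
rhs = sum(fi * inv(alpha + wi) for fi, wi in zip(f, w)) % P
assert lhs == rhs  # the LogUp global relation holds
```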
Multivariate formulation
To describe the GKR-based solution to the previous issue, we need to formulate the problem in the language of multivariate polynomials instead of uni-variate ones. More precisely, we viewed the columns of the execution trace as maps from $\mathbb{H}\rightarrow \mathbb{F}_q$ where $\mathbb{F}_q$ is the base field. In the multivariate formulation, we look at each column as a map from $\lbrace 0, 1\rbrace^{\nu}\rightarrow\mathbb{F}_q$ where $\nu = \log_2(n)$, assuming $n$ is a power of $2$.

The next thing we need to do is extend functions $f: \lbrace0, 1\rbrace^{\nu}\rightarrow\mathbb{F}_q$ to functions in $\mathbb{F}_q^{\nu}\rightarrow\mathbb{F}_q$, and this is done using the concept of the (unique) multi-linear extension (MLE) polynomial. This is the unique multi-linear polynomial $\tilde{f}$ in $\mathbb{F}_q[X_0,\dots, X_{\nu - 1}]$ such that for any $x:= (x_0,\dots, x_{\nu-1}) \in \lbrace0, 1\rbrace^{\nu}$ we have $\tilde{f}(x) = f(x)$. One can show2 that the multi-linear extension is unique and it is given3 by

$$\tilde{f}(X_0,\dots, X_{\nu - 1}) = \sum_{y\in\lbrace 0, 1\rbrace^{\nu}} \mathsf{EQ}\left((X_0,\dots, X_{\nu - 1}), y\right)\cdot f(y)$$

where

$$\mathsf{EQ}(x, y) = \prod_{j=0}^{\nu - 1}\left(x_j\cdot y_j + (1-x_j)\cdot(1- y_j)\right)$$
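The extension property of the MLE can be sketched numerically as follows; the field, the number of variables, and the helper names are all assumptions of this toy example:

```python
# Minimal sketch of the multi-linear extension (MLE): evaluate f-tilde at an
# arbitrary field point via the EQ weights, and check that on the boolean
# hypercube the MLE agrees with f itself.
P = 2**31 - 1  # toy prime field
NU = 3         # number of variables, trace length 2^3 = 8

def eq(x, y):
    """EQ(x, y) = prod_j (x_j*y_j + (1-x_j)*(1-y_j)) mod P."""
    out = 1
    for xj, yj in zip(x, y):
        out = out * (xj * yj + (1 - xj) * (1 - yj)) % P
    return out

def bits(i, nu=NU):
    """Index i as a tuple of nu bits, least-significant first."""
    return tuple((i >> j) & 1 for j in range(nu))

def mle_eval(f_vals, r):
    """Evaluate the unique MLE of f at a point r in F^nu."""
    return sum(eq(r, bits(i)) * f_vals[i] for i in range(len(f_vals))) % P

f_vals = [5, 1, 4, 1, 5, 9, 2, 6]  # f on {0,1}^3, indexed by bit pattern
r = (17, 23, 42)                   # arbitrary point in F^3

# Extension property: on the hypercube the MLE agrees with f itself.
for i in range(8):
    assert mle_eval(f_vals, bits(i)) == f_vals[i] % P

v = mle_eval(f_vals, r)  # a well-defined evaluation off the hypercube
```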
Using the new formalism, we can write our constraint as

$$\sum_{x\in\lbrace 0, 1\rbrace^{\nu}} \frac{\tilde{m}(x)}{\widetilde{\left(\alpha + t\right)}(x)} = \sum_{x\in\lbrace 0, 1\rbrace^{\nu}} \frac{\tilde{f}(x)}{\widetilde{\left(\alpha + w\right)}(x)}$$

This can be further simplified by defining

$$p(x_0,\dots, x_{\nu}) := (1 - x_0)\cdot \tilde{m}(x_1,\dots ,x_{\nu}) + x_0\cdot\left(-\tilde{f}(x_1,\dots ,x_{\nu})\right)$$

and

$$q(x_0,\dots, x_{\nu}) := (1 - x_0)\cdot \widetilde{\left(\alpha + t\right)}(x_1,\dots ,x_{\nu}) + x_0\cdot\left(-\widetilde{\left(\alpha + w\right)}(x_1,\dots ,x_{\nu})\right)$$

which are again multi-linear polynomials. Our constraint thus becomes

$$\sum_{x\in\lbrace 0, 1\rbrace^{\nu + 1}} \frac{p(x)}{q(x)} = 0$$

which is equivalent to the previous identity by virtue of the extension property of the MLEs.

The reason for introducing the MLE notion will become clear in the next sections. From now on, we will overload the notation and omit the $\tilde{}$ .
Sum-check
Imagine our task was to prove the simpler relation

$$\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} p(x) = 0$$

instead. The sum-check protocol solves exactly this and we now look at how it works. Assume the generalized problem

$$\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} g\left(p_0(x),\dots,p_v(x)\right) = 0$$

where the $p_i$ are multi-linear polynomials to which the verifier has oracle access and $g$ is a polynomial of degree at most $d$ which the verifier can evaluate on its own.
The sum-check protocol in this case can be described as follows:
To sum up, the sum-check protocol reduces the claim

$$\sum_{x\in\lbrace 0, 1\rbrace^{\mu}} g\left(p_0(x),\dots,p_v(x)\right) = 0$$

to the evaluation of the $p_i$'s at a random point $(r_0,\dots,r_{\mu-1})$, which the verifier can check using one query to each of the oracles $p_i$ (as a multi-linear function).
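The round structure of sum-check can be sketched for the special case of a single multi-linear polynomial $p$ (i.e., $g$ the identity and $d = 1$, so each round polynomial is degree 1). This is a toy simulation of both parties in a small prime field, under assumptions of this sketch only, not a production implementation:

```python
# Toy sum-check for a single multi-linear p, proving
#   sum_{x in {0,1}^mu} p(x) = S.
# The prover holds p's values on the hypercube; since p is multi-linear,
# each round polynomial g_j has degree 1 and is sent as (g_j(0), g_j(1)).
import random

P = 2**31 - 1  # toy prime field

def fold(vals, r):
    """Fix the current first variable of the hypercube table to r."""
    return [((1 - r) * vals[2 * j] + r * vals[2 * j + 1]) % P
            for j in range(len(vals) // 2)]

def sumcheck(vals):
    """Simulate prover + verifier; return (accepted, challenges, claim)."""
    claim = sum(vals) % P            # the claimed sum S
    cur, rs = vals, []
    while len(cur) > 1:
        g0 = sum(cur[0::2]) % P      # g_j(0): sum over points with x_j = 0
        g1 = sum(cur[1::2]) % P      # g_j(1): sum over points with x_j = 1
        if (g0 + g1) % P != claim:   # verifier's round consistency check
            return False, rs, claim
        r = random.randrange(P)      # verifier's random challenge
        claim = ((1 - r) * g0 + r * g1) % P  # new claim: g_j(r)
        cur = fold(cur, r)
        rs.append(r)
    # Final check: the claim must equal p(r_0,...,r_{mu-1}); here we read it
    # off the fully-folded table (in the protocol this is the oracle query).
    return cur[0] % P == claim, rs, claim

vals = [3, 1, 4, 1, 5, 9, 2, 6]      # p on {0,1}^3
ok, rs, _ = sumcheck(vals)
assert ok                            # honest prover is always accepted
```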
GKR for sum of fractions
A Warm-Up
Recall that our main task was reduced to proving that

$$\sum_{x\in\lbrace 0, 1\rbrace^{\nu + 1}} \frac{p(x)}{q(x)} = 0$$

The prover and verifier will run a GKR IP protocol in order to prove the validity of the above constraint. To this end, we will construct a (layered) circuit in order to evaluate the left-hand side expression. To give some intuition, we start with a simple example in the case when $\nu + 1 = 2$.

In this case, the expression becomes:

$$\frac{p(0, 0)}{q(0, 0)} + \frac{p(0, 1)}{q(0, 1)} + \frac{p(1, 0)}{q(1, 0)} + \frac{p(1, 1)}{q(1, 1)}$$

which can be computed/reduced to an expression with half the number of terms

$$\frac{p^{'}(0)}{q^{'}(0)} + \frac{p^{'}(1)}{q^{'}(1)}$$

where

$$\frac{p^{'}(0)}{q^{'}(0)} = \frac{p(0, 0)\cdot q(0, 1) + p(0, 1)\cdot q(0, 0)}{q(0, 0)\cdot q(0, 1)}$$

and

$$\frac{p^{'}(1)}{q^{'}(1)} = \frac{p(1, 0)\cdot q(1, 1) + p(1, 1)\cdot q(1, 0)}{q(1, 0)\cdot q(1, 1)}$$

This can be written as

$$p^{'}(x) := \sum_{y\in\lbrace 0,1\rbrace} \mathsf{EQ}(x, y)\cdot \left(p(y,0)\cdot q(y, 1) + p(y,1)\cdot q(y, 0) \right)$$

and

$$q^{'}(x) := \sum_{y\in\lbrace 0,1\rbrace} \mathsf{EQ}(x, y)\cdot q(y,0)\cdot q(y, 1)$$
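The warm-up reduction can be checked numerically: adding the fractions pairwise yields $p'$ and $q'$ with half as many terms but the same total sum. The field and the values of $p, q$ below are made-up toy data:

```python
# Numeric check of the one-layer reduction (nu + 1 = 2): pairwise fraction
# addition  a/b + c/d = (a*d + c*b)/(b*d)  halves the number of terms
# without changing the total sum.
P = 2**31 - 1  # toy prime field

def inv(a):
    return pow(a % P, P - 2, P)

# p and q on {0,1}^2, indexed as p[(x0, x1)]; toy values
p = {(0, 0): 7, (0, 1): -3, (1, 0): 11, (1, 1): -15}
q = {(0, 0): 2, (0, 1): 5, (1, 0): 3, (1, 1): 4}

# One reduction layer: combine the two fractions sharing the first index
p1 = {y: (p[(y, 0)] * q[(y, 1)] + p[(y, 1)] * q[(y, 0)]) % P for y in (0, 1)}
q1 = {y: (q[(y, 0)] * q[(y, 1)]) % P for y in (0, 1)}

total = sum(p[x] * inv(q[x]) for x in p) % P
reduced = sum(p1[y] * inv(q1[y]) for y in (0, 1)) % P
assert total == reduced  # the layer preserves the sum of fractions
```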
We can now describe how the GKR protocol works in our simplified example. The GKR protocol runs in the opposite direction to that of the evaluation of the circuit, i.e., from the output layer back towards the input layer.
The General Case
The circuit
The circuit we are interested in is composed of $\mu$ layers, with the $\mu$-th layer being the input layer, located at the bottom. The input layer feeds into the layer above it, i.e., layer $\mu-1$, and this goes on until one reaches the $1$-st layer. More precisely:
Notice that we can associate with each layer the following two MLEs
$$p_{\kappa}(x) = \sum_{y\in\lbrace 0,1\rbrace^{\kappa}} \mathsf{EQ}(x, y)\cdot\left( p_{\kappa + 1}(y, 0)\cdot q_{\kappa + 1}(y, 1) + p_{\kappa + 1}(y, 1)\cdot q_{\kappa + 1}(y, 0)\right)$$

and

$$q_{\kappa}(x) = \sum_{y\in\lbrace 0,1\rbrace^{\kappa}} \mathsf{EQ}(x, y)\cdot\left( q_{\kappa + 1}(y, 0)\cdot q_{\kappa + 1}(y, 1) \right)$$

and in particular we can make sense of evaluating $p_{\kappa}$ and $q_{\kappa}$ at tuples in $\mathbb{F}_q$.
The protocol
The GKR protocol works from the output layer backwards, essentially by reducing a claim about the output of one layer to a similar claim on the outputs of the layer feeding into it. This continues until one reaches the input layer (i.e., layer $\mu$), at which point the verifier makes a query to the oracles it received at the beginning of the interaction.
More precisely:
In the context of our basic example of LogUp, the last point entails a claim on $t, m, f, w$ at $(r_{\mu , 1},\dots, r_{\mu, \mu})$ since

$$p_{\mu}(r_{\mu}) = (1-r_{\mu , 0})\cdot m(r_{\mu , 1},\dots, r_{\mu, \mu}) - r_{\mu , 0}\cdot f(r_{\mu , 1},\dots, r_{\mu, \mu})$$

and

$$q_{\mu}(r_{\mu}) = (1-r_{\mu , 0})\cdot (\alpha + t)(r_{\mu , 1},\dots, r_{\mu, \mu}) - r_{\mu , 0}\cdot (\alpha + w) (r_{\mu , 1},\dots, r_{\mu, \mu})$$

where we used $r_{\mu} := (r_{\mu , 0}, r_{\mu , 1},\dots, r_{\mu, \mu})$.
Simulating a Multi-linear Polynomial Commitment scheme
The remaining question now is implementing the final oracle query by the verifier at the random point resulting from the GKR protocol interaction. To be more precise, suppose $t$ is a column of the main trace; then the question is how the prover can compute $t(r_0,\dots,r_{\nu - 1})$ and convince the verifier that it has done so honestly.

First, let's go back to the dual view of $t$, i.e., both as a uni-variate and as a multivariate polynomial. Suppose $\mathbb{H} := \lbrace g^{i}: i\in\lbrace 0,\dots ,2^{\nu}-1\rbrace\rbrace$ for a primitive $2^{\nu}$-th root of unity $g$. This means that for any $y = g^i\in\mathbb{H}$ we have

$$y = g^i = g^{i_0 + i_1\cdot 2 + \dots + i_{\nu-1}\cdot 2^{\nu - 1}}$$

which we can use to define the mapping (with notational overload)

$$t(i_0,\dots ,i_{\nu -1}) = t(g^i) = t(y)$$

This then gives meaning to

$$t(r_0,\dots ,r_{\nu -1}) = \sum_{(i_0,\dots , i_{\nu - 1})\in\lbrace 0, 1\rbrace^{\nu}} \mathsf{EQ}\left((r_0,\dots ,r_{\nu -1}), (i_0,\dots , i_{\nu - 1})\right) t\left(g^{i_0 + i_1\cdot 2 + \dots + i_{\nu-1}\cdot 2^{\nu - 1}}\right)$$

where

$$\mathsf{EQ}((r_0,\dots ,r_{\nu -1}), (i_0,\dots , i_{\nu - 1})) = \prod_{j=0}^{\nu - 1}\left(r_j\cdot i_j + (1-r_j)\cdot(1- i_j)\right)$$

Thus, $t(r_0,\dots ,r_{\nu -1})$ is the inner product of two vectors and hence, given oracle access to the (uni-variate) $t$ and $\mathsf{c} := (c_i)_{0\leq i\leq 2^{\nu} - 1}$ where

$$c_i := \mathsf{EQ}\left((r_0,\dots ,r_{\nu -1}), (i_0,\dots , i_{\nu - 1})\right)$$

for $i=i_0 + i_1\cdot 2 + \dots + i_{\nu-1}\cdot 2^{\nu - 1}$, the verifier can ascertain $t(r_0,\dots ,r_{\nu -1})$ using a uni-variate sum-check protocol. This can be done either as part of the auxiliary trace, at the cost of an extra column and a quadratic transition constraint, or outside using an additional protocol as in e.g., Aurora.
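The inner-product view above can be sketched as follows, with a toy column and a toy random point (all values are assumptions of this example); the fold-based cross-check is simply a second, independent way of evaluating the same MLE:

```python
# Sketch of the final oracle query: t(r_0,...,r_{nu-1}) equals the inner
# product of the column t (over H) with the column c_i = EQ(r, bits(i)).
P = 2**31 - 1  # toy prime field
NU = 3

def eq(x, y):
    out = 1
    for xj, yj in zip(x, y):
        out = out * (xj * yj + (1 - xj) * (1 - yj)) % P
    return out

def bits(i):
    return tuple((i >> j) & 1 for j in range(NU))

t = [2, 7, 1, 8, 2, 8, 1, 8]   # toy trace column, |H| = 8
r = (11, 22, 33)               # stand-in for the GKR random point

c = [eq(r, bits(i)) for i in range(2**NU)]          # the EQ column
t_at_r = sum(ci * ti for ci, ti in zip(c, t)) % P   # inner product

# Cross-check: evaluate the MLE of t at r by folding one variable at a time
cur = t[:]
for rj in r:
    cur = [((1 - rj) * cur[2 * k] + rj * cur[2 * k + 1]) % P
           for k in range(len(cur) // 2)]
assert t_at_r == cur[0]
```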
The Lagrange kernel oracle
As part of the auxiliary trace, and given the random values $(r_0,\dots ,r_{\nu -1})$, the column $\mathsf{c}$ satisfies the following sets of constraints:
The above can be encoded using:
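The structure of the column $\mathsf{c}$ can be illustrated with a toy numeric sketch (assumed small field and random values). Because $\mathsf{EQ}$ is a product of per-bit factors, flipping a single bit $j$ of the index trades a factor $(1-r_j)$ for $r_j$, which is the shape of relation the constraints exploit:

```python
# Numeric illustration of the Lagrange kernel column c_i = EQ(r, bits(i)):
# entries at indices differing in exactly one bit j are related by r_j/(1-r_j).
P = 2**31 - 1
NU = 3
r = (101, 202, 303)  # stand-ins for the verifier's random values

def bits(i):
    return tuple((i >> j) & 1 for j in range(NU))

def eq(x, y):
    out = 1
    for xj, yj in zip(x, y):
        out = out * (xj * yj + (1 - xj) * (1 - yj)) % P
    return out

c = [eq(r, bits(i)) for i in range(2**NU)]

# Boundary-style relation: c_0 = prod_j (1 - r_j)
b = 1
for rj in r:
    b = b * (1 - rj) % P
assert c[0] == b

# Hamming-distance-1 relations: flipping bit j trades (1 - r_j) for r_j
for i in range(2**NU):
    for j in range(NU):
        if not (i >> j) & 1:
            assert c[i | (1 << j)] * (1 - r[j]) % P == c[i] * r[j] % P
```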
Description of the full protocol
Next Steps
The main points that need further considerations are:
These will be discussed in future separate issues.
Footnotes
Log-uniform circuits, i.e., roughly, circuits with a succinct description. ↩
Fact 3.5 in Proofs, arguments and zero-knowledge (PAZK) by Justin Thaler. ↩
Lemma 3.6 in PAZK. ↩