Commit 8882184
Update ZkLean extractor to work with 64-bit Twist & Shout (#1060)
* Re-enable zklean-extractor and minimally make it compile
* Extract T&S instruction MLEs
This commit replaces the `RV32I` struct from L&S jolt with the
`RV32IMInstruction` struct from T&S Jolt and updates the basic iteration,
extraction, and printing functions. This additionally requires adding a
derivation for `EnumIter` to `RV32IMinstruction`, in order to allow iteration.
* Add evaluation tests for MLEs
This also entails temporarily disabling some additional instructions, due to
the different profile used for tests. These will be re-enabled as part of
https://gitlab-ext.galois.com/jb4/jolt-fork/-/issues/14
* Extract operand interleaving for instruction MLEs
The determines how the two input operands of an instruction are combined into a
single vector. Due to the requirement of Twist & Shout that each instruction
MLE be decomposable into prefix and suffix MLEs, some instructions require
their operands to be interleaved, and others concatenated. Jolt determines the
operand interleaving using the circuit flags, so we do the same, here.
* Clean up some cruft
* Make R1CS printing compile with new codebase
This removes cross-step constraints and updates the extraction to use the new
types, module structure, etc.
* Fix incorrectly named test
* Remove subtable-extraction cruft
* Extract Jolt lookup-table MLEs
T&S Jolt associates MLEs with lookup-tables, rather than directly with
instructions. This adds a ZkLeanLookupTable type to the extractor and extracts
the MLE for each variant of LookupTables from jolt_core.
* Replace per-instruction MLEs with references to lookup tables
Now that we extract lookup tables, along with their MLEs, we can avoid printing
out the MLE for each instruction, and refer to the relevant lookup table,
instead. This reduces repetition (since multiple instructions have the same
MLE), reduces the number instructions we need to skip (instructions without
lookup tables can simply indicate this), and better matches the Jolt codebase.
We also remove the tests for instruction-MLE evaluation, since these are now
handled in `lookup.rs`.
* Extract circuit flags set for each instruction
Each instruction sets some of the `R1CSInputs::OpFlags` Boolean R1CS input
variables. This extracts those flags. We currently print them in a comment,
pending a decision on how to incorporate them into ZkLean.
Note that *all* instructions conditionally set the `DoNotUpdateUnexpandedPC`
and `InlineSequenceInstruction` flags when they are part of an inline sequence.
We do not extract these, since they depend on the trace, not the instruction,
itself.
* Incorporate `Var` and `Scalar` nodes into parent node in `MleAst`
This reduces the number of nodes required in ASTs by putting scalars and
variables in the branches of their parent nodes, rather than making separate
nodes for them. Since manyvariables and constants such as 0 and 1 tend to be
reused many times. The cost of this is that the size of each node is increased,
due to needing to contain space for two scalars, as well as additional overhead
for enum tags. We work on compressing node sizes next.
* Reduce size of atomic MLE AST elements
This reduces the size of an AST node by reducing the size of an atomic element
(a scalar or a variable). In particular, we
* store an array of all unique constants in an AST, so that a scalar can be
represented as an index into this array; and
* store the register name in the AST, rather than in each variable node.
Doing so brings the size of an atom down to 4 bytes and the size of a node down
to 10 bytes (due to the overhead of the enum tags). This is sufficient to
enable all 32-bit lookup tables (which we also do).
* Use current revision of ZkLean in extractor
* Increase max heartbeats in extracted Lean file for lookup tables
This is necessary because many of the MLEs are too large for Lean to
process within the default value of `maxHeartbeats`. We could potentially
eliminate the need for this by separating the lookup tables into separate
files.
* Use `sorry` instead of `panic` for instructions w/o MLEs in extractor
Using `panic` causes errors in which Lean can't prove that the
corresponding type is inhabited.
* Update extractor dependencies
* Remove undefined-symbol cruft from extractor package template
* Avoid using identifiers for MLEs we currently can't extract
* Add comment with old Lasso memops in extractor package template
The memops need to be updated to use the new Twist & Shout R1CS
input variables. We put them in a comment for now so we can refer
to them when we do that.
* Use arena allocation for MLE AST nodes in zklean extractor
Our approach to extracting lookup-table MLEs is to have an AST data
structure that implements `JoltField` and keeps track of the
polynomial computation. However, naively, the `Sized` and `Copy`
constraints on `JoltField` means that we are unable to use variably
sized collections in this data structure, and many ASTs get
allocated on the stack over the course of computations. This runs
into scalability issues, since stack space is limited. This commit
avoids this problem by using a static reference to a locked
allocation arena for AST nodes stored on the heap.
* Update instruction types for upstream codebase in extractor
The instruction type has changed in the upstream codebase. This
commit fixes that up in the extractor, post merge. In addition, the
subset of instructions that are "supported" by Jolt has changed,
due to the move to 64 bit. We fix that here, too.
* Update MleAst to fully implement JoltField given upstream changes
There are new constraints, methods, associated types, etc. for
JoltField in the upstream codebase. This commit ensures that MleAst
fully implements the new rendition of the trait. Non-trivial
methods that are not needed for extraction are left as stubs.
* Update cargo lock file
* Update R1CS-constraint data structure in extractor
The data structure for R1CS constraints has changed upstream, in particular, allowing R1CS constraints to be expressed in a constant array, rather than a builder pattern. This commit brings the extractor up to date with that data structure.
One thing to note is that the terms and constants of a linear combination within an R1CS constraint does not seem to be directly accessible. For that reason, we make the `LC::decompose` function public in this commit.
* Make size of extracted lookup tables depend on parameter set
* Implement additional traits on MleAst needed by JoltField
Post merge, there are some additional constraints on JoltField and
its associated types. This commit implements them on MleAst. Since
they're not needed for extraction, they're mostly stubs.
* Fix extraction of the R1CS constraints for current Jolt
This updates our extraction of R1CS constraints to match the
current codebase in 2 ways:
* coefficients and constant terms in LCs are now `i128`
* R1CS constraints are now represented as just `a` and `b` LCs,
with the `c` LC being (implicitly) always 0
* introduce common subexpression elimination to reduce output term size
* introduce definitions rather than let-bindings to help performance
Lean does not handle let-bound sub-expressions efficiently (e.g. a
graph), instead performance degrades exponentially in the presence of
many shared let-bindings.
Using top-level definitions enforce a linear behavior wrt. the number of
introduced sub-definitions. For our purposes, this change makes the
runtime of compiling the output go from well over 2 years to just about
10 minutes.
* add a comment to format_for_lean
* Add evaluation tests in Lean for extracted lookup-table MLEs
We test the lookup tables in Lean as follows:
* Generate 2 * WORD_SIZE random field elements to use as input
* For each lookup-table MLE, compute output := evaluate_mle(input)
in Rust
* Generate a Tests module in Lean with a #guard statement for each
MLE that checks mle(input) = output
This commit implements the above tests. The field is hard-coded to
be the scalar field of BN254 for now, implemented using ZMod p in
Lean. This requires showing that p is prime, which we sorry out for
now. Proving this or using BN254 from arklib is todo.
* have the Display instance print the whole AST as it used to
* Avoid hard-coding list of non-inline-sequence instructions in extractor
When extracting instructions with lookup tables / circuit flags, we
want to avoid inline sequences, because the functions that return
these things panic on inline-sequence instructions. Previously, we
hard-coded the list of supported instructions in the extractor,
which is bad because it requires manual curation. This commit
checks the `inline_sequence` instruction and filters out
instructionst that return non-empty vectors, instead.
* Remove increase to maxHeartbeats in extracted lookup-table module
Since the scaling issue with compiling large MLEs is now solved, we
no longer need to increase maxHeartbeats.
* remove src/ directory from Lean template
* Copy BN254.ScalarField from ArkLib for use in extracted Lean tests
This commit allows us to use the BN254 scalar field as if we were
importing it from ArkLib. For now, there is a Lean4 version
mismatch preventing us from actually depending on ArkLib. Once
that's fixed, we can remove our copy and substitute in the ArkLib
dependency without otherwise changing the Lean code.
* Increase maxHeartbeats in Lean for MLE evaluation tests
The Lean evaluation tests for the extracted MLEs time out with the
default Lean heartbeat limit. This increases it so they can
complete.
* Move Lean tests into separate module in generated package
The Lean tests take a long time to run, since they need to evaluate
large polynomials. We move them into a separate module here so that
they don't automatically run when running `lake build`. They can
now be run with `lake test`.
* Re-enable CI tests for extractor and add `lake test`
* Fix clippy and formatter warnings in extractor
* Remove unused flags module in extractor
* Fix extractor output source directory in CI
* Remove `lake test` from CI for now; Lean tests are too slow
* Removing both Lean compilation and Lean tests from CI for now
Compiling the larger 64-bit MLEs in Lean works locally, but
overflows the stack in lake on CI. We will remove it from CI for
now, until we can find a way to make them more performant.
* Rename WORD_SIZE to XLEN in extractor
---------
Co-authored-by: Valentin Robert <[email protected]>1 parent 6949f31 commit 8882184
File tree
29 files changed
+1548
-952
lines changed- .github/workflows
- jolt-core/src/zkvm
- lookup_table
- r1cs
- tracer/src/instruction
- zklean-extractor
- package-template
- Jolt
- src
- Jolt
- src
- modules
29 files changed
+1548
-952
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
171 | 171 | | |
172 | 172 | | |
173 | 173 | | |
174 | | - | |
175 | | - | |
176 | | - | |
177 | | - | |
178 | | - | |
179 | | - | |
180 | | - | |
181 | | - | |
182 | | - | |
183 | | - | |
184 | | - | |
185 | | - | |
186 | | - | |
187 | | - | |
188 | | - | |
189 | | - | |
190 | | - | |
191 | | - | |
192 | | - | |
193 | | - | |
194 | | - | |
195 | | - | |
196 | | - | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
73 | 73 | | |
74 | 74 | | |
75 | 75 | | |
| 76 | + | |
76 | 77 | | |
77 | 78 | | |
78 | 79 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
19 | 19 | | |
20 | 20 | | |
21 | 21 | | |
22 | | - | |
| 22 | + | |
23 | 23 | | |
24 | 24 | | |
25 | 25 | | |
| |||
117 | 117 | | |
118 | 118 | | |
119 | 119 | | |
120 | | - | |
| 120 | + | |
| 121 | + | |
| 122 | + | |
121 | 123 | | |
122 | 124 | | |
123 | 125 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
227 | 227 | | |
228 | 228 | | |
229 | 229 | | |
230 | | - | |
| 230 | + | |
231 | 231 | | |
232 | 232 | | |
233 | 233 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
375 | 375 | | |
376 | 376 | | |
377 | 377 | | |
378 | | - | |
| 378 | + | |
379 | 379 | | |
380 | 380 | | |
381 | 381 | | |
| |||
0 commit comments