-
Notifications
You must be signed in to change notification settings - Fork 274
Update ZkLean extractor to work with 64-bit Twist & Shout #1060
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This commit replaces the `RV32I` struct from L&S jolt with the `RV32IMInstruction` struct from T&S Jolt and updates the basic iteration, extraction, and printing functions. This additionally requires adding a derivation for `EnumIter` to `RV32IMinstruction`, in order to allow iteration.
This also entails temporarily disabling some additional instructions, due to the different profile used for tests. These will be re-enabled as part of https://gitlab-ext.galois.com/jb4/jolt-fork/-/issues/14
The determines how the two input operands of an instruction are combined into a single vector. Due to the requirement of Twist & Shout that each instruction MLE be decomposable into prefix and suffix MLEs, some instructions require their operands to be interleaved, and others concatenated. Jolt determines the operand interleaving using the circuit flags, so we do the same, here.
…traction' Resolve "Update instruction MLEs for T&S" See merge request jb4/jolt-fork!19
This removes cross-step constraints and updates the extraction to use the new types, module structure, etc.
…ture/ts-extraction' Resolve "Update R1CS constraint extraction for T&S" See merge request jb4/jolt-fork!18
T&S Jolt associates MLEs with lookup-tables, rather than directly with instructions. This adds a ZkLeanLookupTable type to the extractor and extracts the MLE for each variant of LookupTables from jolt_core.
Now that we extract lookup tables, along with their MLEs, we can avoid printing out the MLE for each instruction, and refer to the relevant lookup table, instead. This reduces repetition (since multiple instructions have the same MLE), reduces the number instructions we need to skip (instructions without lookup tables can simply indicate this), and better matches the Jolt codebase. We also remove the tests for instruction-MLE evaluation, since these are now handled in `lookup.rs`.
…ts-extraction' Resolve "Extract lookup tables and their MLEs" See merge request jb4/jolt-fork!21
Each instruction sets some of the `R1CSInputs::OpFlags` Boolean R1CS input variables. This extracts those flags. We currently print them in a comment, pending a decision on how to incorporate them into ZkLean. Note that *all* instructions conditionally set the `DoNotUpdateUnexpandedPC` and `InlineSequenceInstruction` flags when they are part of an inline sequence. We do not extract these, since they depend on the trace, not the instruction, itself.
…o 'feature/ts-extraction' Resolve "Extract R1CS circuit flags for each instruction" See merge request jb4/jolt-fork!22
This reduces the number of nodes required in ASTs by putting scalars and variables in the branches of their parent nodes, rather than making separate nodes for them. Since manyvariables and constants such as 0 and 1 tend to be reused many times. The cost of this is that the size of each node is increased, due to needing to contain space for two scalars, as well as additional overhead for enum tags. We work on compressing node sizes next.
This reduces the size of an AST node by reducing the size of an atomic element (a scalar or a variable). In particular, we * store an array of all unique constants in an AST, so that a scalar can be represented as an index into this array; and * store the register name in the AST, rather than in each variable node. Doing so brings the size of an atom down to 4 bytes and the size of a node down to 10 bytes (due to the overhead of the enum tags). This is sufficient to enable all 32-bit lookup tables (which we also do).
This is necessary because many of the MLEs are too large for Lean to process within the default value of `maxHeartbeats`. We could potentially eliminate the need for this by separating the lookup tables into separate files.
Using `panic` causes errors in which Lean can't prove that the corresponding type is inhabited.
The memops need to be updated to use the new Twist & Shout R1CS input variables. We put them in a comment for now so we can refer to them when we do that.
…ation' into 'feature/ts-extraction' Resolve "Update pretty-printing to match new ZkLean representation" See merge request jb4/jolt-fork!25
…-from-crashing-in-debug-mode
…' into 'feature/ts-extraction' Resolve "Prevent zklean-extractor from crashing in debug mode" See merge request jb4/jolt-fork!23
Our approach to extracting lookup-table MLEs is to have an AST data structure that implements `JoltField` and keeps track of the polynomial computation. However, naively, the `Sized` and `Copy` constraints on `JoltField` means that we are unable to use variably sized collections in this data structure, and many ASTs get allocated on the stack over the course of computations. This runs into scalability issues, since stack space is limited. This commit avoids this problem by using a static reference to a locked allocation arena for AST nodes stored on the heap.
This updates our extraction of R1CS constraints to match the current codebase in 2 ways: * coefficients and constant terms in LCs are now `i128` * R1CS constraints are now represented as just `a` and `b` LCs, with the `c` LC being (implicitly) always 0
Lean does not handle let-bound sub-expressions efficiently (e.g. a graph), instead performance degrades exponentially in the presence of many shared let-bindings. Using top-level definitions enforce a linear behavior wrt. the number of introduced sub-definitions. For our purposes, this change makes the runtime of compiling the output go from well over 2 years to just about 10 minutes.
We test the lookup tables in Lean as follows: * Generate 2 * WORD_SIZE random field elements to use as input * For each lookup-table MLE, compute output := evaluate_mle(input) in Rust * Generate a Tests module in Lean with a #guard statement for each MLE that checks mle(input) = output This commit implements the above tests. The field is hard-coded to be the scalar field of BN254 for now, implemented using ZMod p in Lean. This requires showing that p is prime, which we sorry out for now. Proving this or using BN254 from arklib is todo.
…bit-jolt-instructions' introduce common subexpression elimination to reduce output term size See merge request jb4/jolt-fork!29
When extracting instructions with lookup tables / circuit flags, we want to avoid inline sequences, because the functions that return these things panic on inline-sequence instructions. Previously, we hard-coded the list of supported instructions in the extractor, which is bad because it requires manual curation. This commit checks the `inline_sequence` instruction and filters out instructionst that return non-empty vectors, instead.
Since the scaling issue with compiling large MLEs is now solved, we no longer need to increase maxHeartbeats.
…raction' Resolve "Enable 64-bit Jolt instructions" See merge request jb4/jolt-fork!27
…ction' remove src/ directory from Lean template See merge request jb4/jolt-fork!26
…-evaluation-in-lean
This commit allows us to use the BN254 scalar field as if we were importing it from ArkLib. For now, there is a Lean4 version mismatch preventing us from actually depending on ArkLib. Once that's fixed, we can remove our copy and substitute in the ArkLib dependency without otherwise changing the Lean code.
The Lean evaluation tests for the extracted MLEs time out with the default Lean heartbeat limit. This increases it so they can complete.
The Lean tests take a long time to run, since they need to evaluate large polynomials. We move them into a separate module here so that they don't automatically run when running `lake build`. They can now be run with `lake test`.
…eature/ts-extraction' Resolve "Add tests for polynomial evaluation in Lean" See merge request jb4/jolt-fork!28
| (inputs : JoltR1CSInputs f) | ||
| (mem_reg mem_ram mem_elfaddress mem_bitflags mem_rs1 mem_rs2 mem_rd mem_imm: RAM f) /- We don't currently support storing tuples inside RAM so we've split them into separate ROMs -/ | ||
| : ZKBuilder f PUnit := do | ||
| panic! "TODO: rewrite or extract memory ops for Twist & Shout" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The panic! macro will cause runtime errors when this code is executed. For a more Lean-compatible approach, consider either implementing the proper memory operations for Twist & Shout or using Lean's sorry tactic as a temporary placeholder that allows compilation to proceed. This would better align with Lean's proof system expectations while indicating the code is incomplete.
| panic! "TODO: rewrite or extract memory ops for Twist & Shout" | |
| sorry -- TODO: implement memory operations for Twist & Shout |
Spotted by Graphite Agent
Is this helpful? React 👍 or 👎 to let us know.
| impl<const N: usize> From<ark_ff::BigInt<N>> for MleAst { | ||
| fn from(_value: ark_ff::BigInt<N>) -> Self { | ||
| unimplemented!("hash unimplemented for MleAst") | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message in the From<BigInt> implementation incorrectly refers to hash functionality. It should be updated to "From unimplemented for MleAst" to accurately reflect the unimplemented conversion function rather than hashing.
| impl<const N: usize> From<ark_ff::BigInt<N>> for MleAst { | |
| fn from(_value: ark_ff::BigInt<N>) -> Self { | |
| unimplemented!("hash unimplemented for MleAst") | |
| } | |
| impl<const N: usize> From<ark_ff::BigInt<N>> for MleAst { | |
| fn from(_value: ark_ff::BigInt<N>) -> Self { | |
| unimplemented!("From<BigInt> unimplemented for MleAst") | |
| } | |
Spotted by Graphite Agent
Is this helpful? React 👍 or 👎 to let us know.
Compiling the larger 64-bit MLEs in Lean works locally, but overflows the stack in lake on CI. We will remove it from CI for now, until we can find a way to make them more performant.
0xAndoroid
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just spotted this one thing
|
@moodlezoup should we merge this? |
This PR updates the ZkLean extractor for the current version of Jolt. It includes the following changes:
We make just two changes to
jolt-core.IntoStaticStrforLookupTablesenum. This helps in automatically emitting relevant names for the lookup-table MLEs in Lean.LC::decomposefunctionpub. This makes it possible for us to extract the R1CS constraints.