From 7ef8997a793f381c6f7709180e1eafe83d0370a5 Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Wed, 21 Aug 2024 13:59:25 -0300 Subject: [PATCH 01/14] Add basic doc layout --- docs/zksync-era-integration.md | 105 +++++++++++++++++++++++++++++++++ 1 file changed, 105 insertions(+) create mode 100644 docs/zksync-era-integration.md diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md new file mode 100644 index 00000000..74aa24a3 --- /dev/null +++ b/docs/zksync-era-integration.md @@ -0,0 +1,105 @@ +## Zksync-era integration + +The `era_vm` doesn't do much by itself, it needs someone to tell it what to do and orquestrate it. On ethereum, for example, the execution node would take a block and process all of the transactions within it one by one using the `evm`. In zksync, things are a bit different, here transactions aren't processed one by one but in batch, this is to publish all of the transactions as just a single one, making the l1 post cheaper and distribitued among all the trnansactions withing the block. + +This is what the bootloader do. The bootloader is a `.yul` contract that lives on l1 and takes an array of transactions and executes them in the `era_vm`. + +Part of integrating our vm with the zk stack comes down to integrate the bootloader with our vm. That is, we need to make our vm "talk" with the bootloader. Before diving into the integration, we'll first explore what the bootloader does more in detail, which will explain what parts we need to keep track on our vm. + +### Bootloader + +At the most basic level, the bootloader performs the following steps: + +1. Transaction Validation and Processing +2. State Initialization +3. Execution +4. Publishing final state to the l1. + +## Transaction processing and validation + +### How the bootloader communicates with the vm + +In the `era_vm` there is one special heap that is reserver for the bootloader. 
Only the bootloader can write to that heap and in order to do that there are reserved memory address than when written will stop the vm execution and the return the written value(see [here]()). We call this hooks and they allow the bootloader to gather information and data after every transaction or even in the middle of it. Hooks are used a lot through the execution of transaction. The most important hooks are: + +- PubdataRequested +- PostResult +- NotifyAboutRefunds +- AskOperatorForRefund +- TxHasEnded + +We'll dive deeper into the behaviour of these hooks, but first, you should know that there are two modes of execution: + +- OneTx: executes only one transaction. +- Batch: executes an array of transaction and batches it as a single one to the l1. + +Hooks will react based on the execution node. + +Explain hooks. + +### Rollbacks and snapshots + +When executing a transaction under panics and reverts the changes have to be rollbacked, that is, we have to set them back as they were in the previous frame. Remember that frames get created under `near_call` and `far_call` [opcodes](). That is why, in the struct we see that the fields are marked as `Rollbackable`. Currently, we are handling rollbacks trough snapshots, which are simple copies of the current field state and then when rollbacking we change the state with the snapshot one. In the future, we probably want to include a more performant way of doing it(see [here]() for a current opened discussion). + +The bootloader also has rollbacks, though they differ. Bootloader rollbacks involve a rollback to the whole bootloader state and the whole vm(the execution and the state). The latter are what we call `external snapshots`, this create a snapshot of whole state. [Here]() is some code. Before starting a new transaction, the bootloader crates an snapshot which it will use to make statistics to the last tx and it might rollback the whole vm if the whole transaction has failed. see [here](). 
+ +## Execution + +Here the bootloader calls the `era_vm` to execute of a transaction. + +## State initialization + +At the start of every transaction, the bootloader loads the necessary contracts and prepares the environment for the era_vm. The `era_vm` receive a pointer to a struct that implements the following trait: + +```rust +trait Storage: Debug { + fn decommit(&mut self, hash: U256) -> Option>; + + fn storage_read(&mut self, key: &StorageKey) -> Option; + + fn cost_of_writing_storage(&mut self, key: &StorageKey, value: U256) -> u32; + + fn is_free_storage_slot(&self, key: &StorageKey) -> bool; +} +``` + +[Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L726) you can see the implementation of this function. + +This storage is saved on the vm state and it is used all through the opcodes: + +- decommit: given a hash it returns a contract bytecode. +- storage_read: given a key it returns the value from the initial contract storage +- cost_of_writing_storage: when writing to the contract storage, gas is consumed, but the cost of writing is depends on wether the write is initial or not. +- is_free_storage_slot: if the address to write belongs to a system contract and the key belongs to the bootloader address. + +This functions are used to keep track of refunds and pubdata costs. + +## Publishing final state + +At the end of the batch, the bootloader needs to publish the pubdata to l1. Pubdata consists of the following fields: + +- L2 to L1 Logs: explain +- L2 to L1 Messages: explain +- Smart Contract Bytecodes: explain compression +- Storage writes: explain store, storage keys, in diff. + +[Here]() is an implementation. + +This, requires the era_vm to keep the state. For that we hold this struct + +```rust +struct VMState {} +``` + +You can see the implementation [here](). 
+ +Here is what each field represents: + +- storage_changes: +- pubdata: +- l2_to_l1_messages: +- events: +- refunds: +- decommited_hashes: +- ...: + +For example, at the end of a batch, the bootloader will query the diff changes from the start of the tx, to publish it to ethereum. See [here]() for the implementation. From 083128462f8bbecaf40018d9c97950094afe788b Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Thu, 22 Aug 2024 11:08:03 -0300 Subject: [PATCH 02/14] First version Needs a polish --- docs/zksync-era-integration.md | 195 +++++++++++++++++++++++---------- 1 file changed, 136 insertions(+), 59 deletions(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 74aa24a3..30a41f79 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -1,57 +1,61 @@ -## Zksync-era integration +## Introduction -The `era_vm` doesn't do much by itself, it needs someone to tell it what to do and orquestrate it. On ethereum, for example, the execution node would take a block and process all of the transactions within it one by one using the `evm`. In zksync, things are a bit different, here transactions aren't processed one by one but in batch, this is to publish all of the transactions as just a single one, making the l1 post cheaper and distribitued among all the trnansactions withing the block. +So far we have been talking only about the `era_vm`. But you should know that the vm is only a small part in the zk stack. In fact, the zk stack is composed of many critical components. In this section we are only going to be interested in one particular component: the **nodes**, which can further be decomposed in the following units: -This is what the bootloader do. The bootloader is a `.yul` contract that lives on l1 and takes an array of transactions and executes them in the `era_vm`. 
+- **Operator**: the server that initialises the vm, injects the bootloader bytecode, receives transactions and pushes them into the vm(bootloader memory to be more specific) and start batches and seals them. +- **Bootloader**: a system contract that receives an array of transaction which are processed, validated, executed and then, the final state is published in the l1. +- **era_vm**: the virtual machine where the bootloader(and so all the transactions bytecode) gets executed. -Part of integrating our vm with the zk stack comes down to integrate the bootloader with our vm. That is, we need to make our vm "talk" with the bootloader. Before diving into the integration, we'll first explore what the bootloader does more in detail, which will explain what parts we need to keep track on our vm. +These three components are interacting with each other all the time to process transactions. In the next document, we'll go over general overview of the bootloader. Then, we'll move and do the same with the operator. After that, we'll see how the operators manages the bootloader. Finally, at the end, we'll see how data gets published on the l1. All of this, with the perspective of how this impacts the design of the `era_vm` which is what we care about the most in here. -### Bootloader +## Bootloader -At the most basic level, the bootloader performs the following steps: - -1. Transaction Validation and Processing -2. State Initialization -3. Execution -4. Publishing final state to the l1. - -## Transaction processing and validation +The bootloader is a special system contract whose hash lives on the l1 but its code isn't stored on the l1 nor on l2 but it gets compiled from `.yul` to the `era_vm` assembly with `zksolc` when the operator first initialises the vm(more on that below). 
-### How the bootloader communicates with the vm +The bootloader, unlike ethereum, takes an array of transactions(a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx. More on this below). This approach allows for the batch of transaction to be then posted on the l1 as just a single one, making the processing on ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch. -In the `era_vm` there is one special heap that is reserver for the bootloader. Only the bootloader can write to that heap and in order to do that there are reserved memory address than when written will stop the vm execution and the return the written value(see [here]()). We call this hooks and they allow the bootloader to gather information and data after every transaction or even in the middle of it. Hooks are used a lot through the execution of transaction. The most important hooks are: +At the most basic level, the bootloader performs the following steps: -- PubdataRequested -- PostResult -- NotifyAboutRefunds -- AskOperatorForRefund -- TxHasEnded +1. Reads the initial batch information and make a call to the SystemContext contract to validate the batch. + +2. Loops through all transactions and executes them until the `execute` flag is set to $0$, at that point, it jumps to step 3. +3. Seals l2 block and publish final data to the l1. -We'll dive deeper into the behaviour of these hooks, but first, you should know that there are two modes of execution: +[Here](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) you can see the main loop in the bootloader. -- OneTx: executes only one transaction. -- Batch: executes an array of transaction and batches it as a single one to the l1. 
+If you are curious and want to know more, here is the bootloader [contract](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul) implementation. -Hooks will react based on the execution node. +## Operator -Explain hooks. +Currently the operator is a centralized server(there are plans to make a decentralised consensus see [here]) which can be thought as the entry point of the node, its responsibilities are: -### Rollbacks and snapshots +- Initializing the `era_vm` and keep its state. +- Orchestrating the bootloader and keep its state. +- Keeping storage database and commiting changes. -When executing a transaction under panics and reverts the changes have to be rollbacked, that is, we have to set them back as they were in the previous frame. Remember that frames get created under `near_call` and `far_call` [opcodes](). That is why, in the struct we see that the fields are marked as `Rollbackable`. Currently, we are handling rollbacks trough snapshots, which are simple copies of the current field state and then when rollbacking we change the state with the snapshot one. In the future, we probably want to include a more performant way of doing it(see [here]() for a current opened discussion). +Here is a simplified version of what the vm on the operator end looks like: -The bootloader also has rollbacks, though they differ. Bootloader rollbacks involve a rollback to the whole bootloader state and the whole vm(the execution and the state). The latter are what we call `external snapshots`, this create a snapshot of whole state. [Here]() is some code. Before starting a new transaction, the bootloader crates an snapshot which it will use to make statistics to the last tx and it might rollback the whole vm if the whole transaction has failed. see [here](). 
+```rust +struct OperatorVm { + pub(crate) inner: EraVM, // this would be the actual `era_vm` + pub suspended_at: u16, // last pc when execution stopped because of a hook + pub bootloader_state: BootloaderState, + pub(crate) storage: StorageDb, + pub snapshot: Option, +} +``` -## Execution +### Initializing the vm -Here the bootloader calls the `era_vm` to execute of a transaction. +The process of initializing the vm consists on: -## State initialization +- Loading the bootloader bytecode and initializing its state. +- Setting up the `era_vm` by injecting the bootloader code, loading default contracts, and setup other settings. -At the start of every transaction, the bootloader loads the necessary contracts and prepares the environment for the era_vm. The `era_vm` receive a pointer to a struct that implements the following trait: +When setting up `era_vm`, the operator provides access to the chain storage database with the following API: ```rust -trait Storage: Debug { +trait Storage { fn decommit(&mut self, hash: U256) -> Option>; fn storage_read(&mut self, key: &StorageKey) -> Option; @@ -62,44 +66,117 @@ trait Storage: Debug { } ``` -[Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L726) you can see the implementation of this function. +This storage is saved on the vm state as a pointer. Here is what each function does: -This storage is saved on the vm state and it is used all through the opcodes: +- **decommit**: given a hash it returns a contract bytecode. +- **storage_read**: given a key, it returns the potential value. +- **cost_of_writing_storage**: when writing to the contract storage, gas is consumed, but the cost of writing is depends on whether the write is initial or not. More on that [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/fee_model.md). 
+- **is_free_storage_slot**: if the address to write belongs to the system context and the key belongs to base L2 token address, then the storage_slot is free(doesn't incur in gas charges). -- decommit: given a hash it returns a contract bytecode. -- storage_read: given a key it returns the value from the initial contract storage -- cost_of_writing_storage: when writing to the contract storage, gas is consumed, but the cost of writing is depends on wether the write is initial or not. -- is_free_storage_slot: if the address to write belongs to a system contract and the key belongs to the bootloader address. +A few notes about this storage: -This functions are used to keep track of refunds and pubdata costs. +1. The key has the following structure: -## Publishing final state +```rust +struct StorageKey { + pub address: H160, + pub key: U256, +} +``` + +And this allows us to query a key that belongs to the current executing address. + +2. There isn't any consensus or spec about how storage should be implemented. We came up with this API because it is what we though was more convenient for the requirement. But, for example, the vm1 implements a query logic, where the operator will react based on the provided params. + +[Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L726) you can take a look a the implementation of this trait in detail. + +This functions are specially used in the `era_vm` to calculate refunds and pubdata costs. See [Here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L108-L123) and [here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L132-L173). -At the end of the batch, the bootloader needs to publish the pubdata to l1. Pubdata consists of the following fields: +Finally, here is the [full](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L79-L154) vm initialization code. 
-- L2 to L1 Logs: explain -- L2 to L1 Messages: explain -- Smart Contract Bytecodes: explain compression -- Storage writes: explain store, storage keys, in diff. +### Orchestrating the bootloader -[Here]() is an implementation. +The operator is responsible for managing the bootloader within the `era_vm`. This includes injecting the bootloader code into the `era_vm` and maintaining its state by: -This, requires the era_vm to keep the state. For that we hold this struct +- Pushing transactions into the bootloader. +- Passing necessary parameters into the bootloader's memory. +- Start bootloader execution. +- Rollback the vm state in case of an err. + +Given that both transactions and the bootloader operate within the same `era_vm`, the bootloader has access to a reserved heap where the operator can write any required data. This interaction is continuous, as the bootloader is unaware of the broader state of the `era_vm`. To facilitate communication, the bootloader can write to a special address that triggers a suspension of the `era_vm` execution, allowing the operator to provide necessary data. This are known as hooks, and based on the written value a specific hook will get triggered by the operator. Hooks are extensively used throughout transaction execution, enabling the bootloader to gather information and data both after transactions and during their execution.Here are some of the most important hooks: + +- PostResult: Sets the last transaction result +- TxHasEnded: if the mode of execution is set to **OneTx**, then the execution is stopped and it returns the result collected in the **PostResult** hook. +- NotifyAboutRefunds: Informs the operator about the amount of gas refunded after a transaction. +- AskOperatorForRefund: here the bootloader asks the operator to suggest a refund amount for the transaction. +- PubdataRequested: At the end of the batch, the bootloader ask for the data to publish on the l1(more about this later). 
+ +Now, where does the operator know where to write to? Well, within the `era_vm`, there exists a special heap reserved exclusively for the bootloader. The Operator writes all the data in that heap which has designated slots based on the type of data to write(see more [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/bootloader.md#structure-of-the-bootloaders-memory)). Transactions, for example, are pushed into the `[252189..523261]` slots. + +### Rollbacks and snapshots + +In the `era_vm`, when a transaction encounters a panic or reverts, the vm needs to roll back the changes, restoring only a part of the [state](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L43-L60) to its previous frame. Remember that frames are created under `near_call` and `far_call` opcodes, and to manage state rollbacks, fields in the related structs are marked as `Rollbackable` (see [here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs#L299-L307)). + +Currently, rollbacks are handled through snapshots by copying the current state of fields. If a rollback is necessary, the state is restored from these snapshots. While this method is functional, there’s ongoing discussion about implementing a more efficient rollback mechanism in the future (see [here](#)). + +The Bootloader, can fail sometimes, and it is the job of the Operator to trigger rollbacks. Though this type of rollbacks differ from the former. Bootloader rollbacks involve restoring not only the vm state but the also the bootloader state and the whole vm(the execution and the state). This snapshots are called `external snapshots` and can only be triggered by the bootloader. [Here]() you can see what a full snapshot looks like. Before starting a new batch execution, the operator creates a snapshot. This snapshot is also used at the end of the execution to collect the logs. 
+ +> Notice that when we say vm `state`, we refer to the changes made to the data that lives on the chain, and the vm `execution` is the vm state of registers, memory, etc. This difference is important, since transactions reverts and panics only rollback the vm state (actually just a part of it not all), but a bootloader rollbacks also restore the vm `execution`. + +## Publishing data + +As said above, once the batch of transactions have all been executed, the final step in the bootloader is to publish the final data. The data to be published if composed of: + +- **L2 to L1 Logs**: Logs generated during L2 transactions that need to be recorded on L1. This can be transactions on L1 that have been forwarded to the L2 to lower costs. +- **L2 to L1 Messages**: used to transmit instructions or data from smart contracts on L2 to contracts or systems on L1. +- **Smart Contract Bytecodes**: This involves publishing the bytecodes of smart contracts deployed on L2. Before being sent to L1, these bytecodes are often compressed to save space and reduce costs. +- **Storage writes**: These are records of changes to the storage on L2. Only the final diff from the previous state is included. + +In theory, with this data one should be able to reconstruct the whole state of the l2. + +At the end of the batch, bootloader calls the `PubdataRequested` hook to ask the operator for the final batch state. The operator writes into the bootloader memory(slots [40053..248052]) the collected data from the`era_vm`. [Here]() the hook implementation in detail. + +Now, this requires the `era_vm` to keep a state for all the changes in the L2 state. 
For that we hold the following a struct: ```rust -struct VMState {} +struct VMState { + storage_changes: HashMap, + transient_storage: HashMap, + l2_to_l1_logs: Vec, + events: Vec, + pubdata: Primitive, + pubdata_costs: Vec, + paid_changes: HashMap, + refunds: Vec, + read_storage_slots: HashSet, + written_storage_slots: HashSet, + decommitted_hashes: HashSet, +} ``` -You can see the implementation [here](). - Here is what each field represents: -- storage_changes: -- pubdata: -- l2_to_l1_messages: -- events: -- refunds: -- decommited_hashes: -- ...: +- **storage_changes**: Tracks the changes to the storage keys and their new values during execution. +- **transient_storage**: Temporary storage that last until the end of the transaction, meaning that it gets cleared after every transaction. +- **l2_to_l1_logs**: Logs generated during execution that need to be sent from L2 to L1. +- **events**: Events triggered during contract execution, often used for logging or triggering other actions. +- - **pubdata_costs**: The costs associated with publishing data to L1, used for fee calculations. +- **pubdata**: Holds the sum of `pubdata_costs`. +- **paid_changes**: After every write, tracks the cost to write to a key, to charge the difference in price on a subsequent writes to that key . +- **refunds**: A list of refund amounts that have been calculated during the transaction. +- **read_storage_slots**: A set of storage keys that have been read during execution, used to calculate gas fees and refunds. +- **written_storage_slots**: A set of storage keys that have been written to during execution, used to calculate gas fees and refunds. +- **decommitted_hashes**: Stores the hashes that have been the decommited through the whole execution. When decommiting a hash in a `far_call` or `decommit`, we check if the has been already decommited, if true then the decommit is free of charge. 
+ +And so we end up with two key structures on the `era_vm`: + +- The execution state: the state of registers, heaps, frames, etc. +- The L2 state changes: the changes on the chain that will get publish on L1 and committed on the l2 database. + +You can see the implementation and how we work with it [here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs). + +Finally, everything finishes by the operator committing the changes to its database. + +## Final comment -For example, at the end of a batch, the bootloader will query the diff changes from the start of the tx, to publish it to ethereum. See [here]() for the implementation. +This document aims to give you a brief overview of the `era_vm` integration with the zk-stack and how this impacts the vm design and architecture. For that, we first needed to understood what a node is and its parts: bootloader and operator. In the explanation many details of the bootloader and operator where left behind, we only pick the parts that mostly involved and impacted the `era_vm` design. From cf9c3551649d16fcb227b79e348b9ecc80345a5e Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Thu, 22 Aug 2024 13:09:16 -0300 Subject: [PATCH 03/14] Polish redaction --- docs/zksync-era-integration.md | 96 +++++++++++++++++----------------- 1 file changed, 48 insertions(+), 48 deletions(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 30a41f79..54627aaa 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -1,37 +1,37 @@ ## Introduction -So far we have been talking only about the `era_vm`. But you should know that the vm is only a small part in the zk stack. In fact, the zk stack is composed of many critical components. In this section we are only going to be interested in one particular component: the **nodes**, which can further be decomposed in the following units: +So far we have been talking only about the `era_vm`. 
But you should know that the vm is only a small part of the zk stack. The zk stack is composed of many critical components. In this section, we are only going to be interested in one particular component: the **nodes**, which can further be decomposed into the following units: -- **Operator**: the server that initialises the vm, injects the bootloader bytecode, receives transactions and pushes them into the vm(bootloader memory to be more specific) and start batches and seals them. -- **Bootloader**: a system contract that receives an array of transaction which are processed, validated, executed and then, the final state is published in the l1. +- **Operator/Sequencer**: the server that initializes the vm, injects the bootloader bytecode, receives transactions and pushes them into the vm (the bootloader memory, to be more specific), and starts and seals batches. +- **Bootloader**: a system contract that receives an array of transactions, which it processes, validates, and executes before the final state is published on the l1. - **era_vm**: the virtual machine where the bootloader(and so all the transactions bytecode) gets executed. -These three components are interacting with each other all the time to process transactions. In the next document, we'll go over general overview of the bootloader. Then, we'll move and do the same with the operator. After that, we'll see how the operators manages the bootloader. Finally, at the end, we'll see how data gets published on the l1. All of this, with the perspective of how this impacts the design of the `era_vm` which is what we care about the most in here. +These components interact continuously to process transactions. This document will provide an overview of the bootloader, then explore the operator, its management of the bootloader, and finally, the data publishing process to L1. Throughout, the focus is on how these interactions impact the design of the `era_vm`. 
## Bootloader -The bootloader is a special system contract whose hash lives on the l1 but its code isn't stored on the l1 nor on l2 but it gets compiled from `.yul` to the `era_vm` assembly with `zksolc` when the operator first initialises the vm(more on that below). +The bootloader is a special system contract whose hash resides on L1, but its code isn't stored on either L1 or L2. Instead, it’s compiled from `.yul` to `era_vm` assembly using `zksolc` when the operator first initializes the VM (more on that below). -The bootloader, unlike ethereum, takes an array of transactions(a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx. More on this below). This approach allows for the batch of transaction to be then posted on the l1 as just a single one, making the processing on ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch. +Unlike Ethereum, where transactions are processed one by one, the bootloader takes an array of transactions (a batch) and executes all of them in one run, unless the execution mode of the vm is set to OneTx. This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since fees and gas can be distributed among all the transactions within the posted batch. At the most basic level, the bootloader performs the following steps: -1. Reads the initial batch information and make a call to the SystemContext contract to validate the batch. - +1. Reads the initial batch information and makes a call to the SystemContext contract to validate the batch. + 2. Loops through all transactions and executes them until the `execute` flag is set to $0$, at that point, it jumps to step 3. 3. Seals l2 block and publish final data to the l1. 
-[Here](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) you can see the main loop in the bootloader. +The initial validation of the batch is necessary, since, as we'll see below, the bootloader starts with its memory pre-filled with any data the operator wants. That is why it needs to validate its correctness. -If you are curious and want to know more, here is the bootloader [contract](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul) implementation. +For more details, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul) -## Operator +## Operator/sequencer -Currently the operator is a centralized server(there are plans to make a decentralised consensus see [here]) which can be thought as the entry point of the node, its responsibilities are: +Currently, the operator is a centralized server (there are plans to make a decentralized consensus for operators) which can be thought of as the node entry point. Its responsibilities are: -- Initializing the `era_vm` and keep its state. -- Orchestrating the bootloader and keep its state. -- Keeping storage database and commiting changes. +- Initializing the `era_vm` and keeping its state. +- Orchestrating the bootloader and keeping its state. +- Keeping the storage database and committing changes. Here is a simplified version of what the vm on the operator end looks like: @@ -47,10 +47,10 @@ struct OperatorVm { ### Initializing the vm -The process of initializing the vm consists on: +VM initialization involves: - Loading the bootloader bytecode and initializing its state. -- Setting up the `era_vm` by injecting the bootloader code, loading default contracts, and setup other settings. 
+- Setting up the `era_vm` by injecting the bootloader code, loading default contracts, and configuring other settings.
 When setting up `era_vm`, the operator provides access to the chain storage database with the following API:
@@ -66,12 +66,12 @@ trait Storage {
 }
 ```
-This storage is saved on the vm state as a pointer. Here is what each function does:
+This storage is saved in the VM state as a pointer. Here’s a brief explanation of each function:
 - **decommit**: given a hash it returns a contract bytecode.
 - **storage_read**: given a key, it returns the potential value.
-- **cost_of_writing_storage**: when writing to the contract storage, gas is consumed, but the cost of writing is depends on whether the write is initial or not. More on that [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/fee_model.md).
-- **is_free_storage_slot**: if the address to write belongs to the system context and the key belongs to base L2 token address, then the storage_slot is free(doesn't incur in gas charges).
+- **cost_of_writing_storage**: when writing to the contract storage, gas is consumed, but the cost of writing depends on whether the write is initial or not. More on that [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/fee_model.md).
+- **is_free_storage_slot**: if the address to write belongs to the system context and the key belongs to the base L2 token address, then the storage_slot is free(doesn't incur gas charges).
 A few notes about this storage:
@@ -84,13 +84,13 @@ struct StorageKey {
 }
 ```
-And this allows us to query a key that belongs to the current executing address.
+This allows us to query a key that belongs to the current executing address.
-2. There isn't any consensus or spec about how storage should be implemented. We came up with this API because it is what we though was more convenient for the requirement.
But, for example, the vm1 implements a query logic, where the operator will react based on the provided params.
+2. There isn't any consensus or spec about how storage should be implemented. We came up with this API because it is what we thought was more convenient for the requirement. But, for example, the vm1 implements a query logic, where the operator will react based on the provided params.
 [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L726) you can take a look a the implementation of this trait in detail.
-This functions are specially used in the `era_vm` to calculate refunds and pubdata costs. See [Here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L108-L123) and [here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L132-L173).
+These functions are specially used in the `era_vm` to calculate refunds and pubdata costs. See [Here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L108-L123) and [here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L132-L173).
 Finally, here is the [full](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L79-L154) vm initialization code.
@@ -100,32 +100,32 @@ The operator is responsible for managing the bootloader within the `era_vm`. Thi
 - Pushing transactions into the bootloader.
 - Passing necessary parameters into the bootloader's memory.
-- Start bootloader execution.
-- Rollback the vm state in case of an err.
+- Starting bootloader execution.
+- Rolling back the VM state in case of errors.
-Given that both transactions and the bootloader operate within the same `era_vm`, the bootloader has access to a reserved heap where the operator can write any required data. This interaction is continuous, as the bootloader is unaware of the broader state of the `era_vm`.
To facilitate communication, the bootloader can write to a special address that triggers a suspension of the `era_vm` execution, allowing the operator to provide necessary data. This are known as hooks, and based on the written value a specific hook will get triggered by the operator. Hooks are extensively used throughout transaction execution, enabling the bootloader to gather information and data both after transactions and during their execution.Here are some of the most important hooks:
+Since both transactions and the bootloader run in the same `era_vm`, the bootloader accesses a reserved heap where the operator writes any required data. This interaction is continuous, as the bootloader is unaware of the broader state of the `era_vm`. To facilitate communication, the bootloader can write to a special address that triggers a suspension of the `era_vm` execution, allowing the operator to provide necessary data. These are known as hooks, and based on the written value a specific hook will get triggered by the operator. Here are some of the most important hooks:
-- PostResult: Sets the last transaction result
-- TxHasEnded: if the mode of execution is set to **OneTx**, then the execution is stopped and it returns the result collected in the **PostResult** hook.
-- NotifyAboutRefunds: Informs the operator about the amount of gas refunded after a transaction.
-- AskOperatorForRefund: here the bootloader asks the operator to suggest a refund amount for the transaction.
-- PubdataRequested: At the end of the batch, the bootloader ask for the data to publish on the l1(more about this later).
+- **PostResult**: Sets the last transaction result
+- **TxHasEnded**: if the mode of execution is set to **OneTx**, then the execution is stopped and it returns the result collected in the **PostResult** hook.
+- **NotifyAboutRefunds**: Inform the operator about the amount of gas refunded after a transaction.
+- **AskOperatorForRefund**: here the bootloader asks the operator to suggest a refund amount for the transaction.
+- **PubdataRequested**: At the end of the batch, the bootloader asks for the data to publish on the l1(more about this later).
-Now, where does the operator know where to write to? Well, within the `era_vm`, there exists a special heap reserved exclusively for the bootloader. The Operator writes all the data in that heap which has designated slots based on the type of data to write(see more [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/bootloader.md#structure-of-the-bootloaders-memory)). Transactions, for example, are pushed into the `[252189..523261]` slots.
+Now, where does the operator know where to write to? Again, within the `era_vm`, there exists a special heap reserved exclusively for the bootloader. The Operator writes all the data in that heap which has designated slots based on the type of data to write(see more [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/bootloader.md#structure-of-the-bootloaders-memory)). Transactions, for example, are pushed into the `[252189..523261]` slots.
 ### Rollbacks and snapshots
 In the `era_vm`, when a transaction encounters a panic or reverts, the vm needs to roll back the changes, restoring only a part of the [state](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L43-L60) to its previous frame. Remember that frames are created under `near_call` and `far_call` opcodes, and to manage state rollbacks, fields in the related structs are marked as `Rollbackable` (see [here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs#L299-L307)).
-Currently, rollbacks are handled through snapshots by copying the current state of fields. If a rollback is necessary, the state is restored from these snapshots.
While this method is functional, there’s ongoing discussion about implementing a more efficient rollback mechanism in the future (see [here](#)).
+Currently, rollbacks are perform using snapshots which are just copies of the current state. If a rollback is necessary, the state is restored from these snapshots.
-The Bootloader, can fail sometimes, and it is the job of the Operator to trigger rollbacks. Though this type of rollbacks differ from the former. Bootloader rollbacks involve restoring not only the vm state but the also the bootloader state and the whole vm(the execution and the state). This snapshots are called `external snapshots` and can only be triggered by the bootloader. [Here]() you can see what a full snapshot looks like. Before starting a new batch execution, the operator creates a snapshot. This snapshot is also used at the end of the execution to collect the logs.
+The Bootloader, can fail sometimes, and it is the job of the Operator to trigger rollbacks. However, this type of rollback differs from the ones just mentioned. Bootloader rollbacks involve restoring not only the [full vm state](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/vm.rs#L66-L69) but also the bootloader state. These snapshots are called external snapshots and can only be triggered by the Bootloader. [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/snapshot.rs) you can see what a full snapshot looks like. Before starting a new batch execution, the operator creates a snapshot, which is also used at the end of execution to collect logs (see [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L508-L595) for more details).
-> Notice that when we say vm `state`, we refer to the changes made to the data that lives on the chain, and the vm `execution` is the vm state of registers, memory, etc.
This difference is important, since transactions reverts and panics only rollback the vm state (actually just a part of it not all), but a bootloader rollbacks also restore the vm `execution`.
+> Notice that when we say vm `state`, we refer to the changes made to the data that lives on the chain, and the vm `execution` is the vm state of registers, memory, etc. This difference is important since transactions reverts and panics only rollback the vm `state` (actually just a part of it not all), but bootloader rollbacks also restore the vm `execution`.
 ## Publishing data
-As said above, once the batch of transactions have all been executed, the final step in the bootloader is to publish the final data. The data to be published if composed of:
+As said above, once the batch of transactions has all been executed, the final step in the bootloader is to publish the final data. The data to be published is composed of:
 - **L2 to L1 Logs**: Logs generated during L2 transactions that need to be recorded on L1. This can be transactions on L1 that have been forwarded to the L2 to lower costs.
 - **L2 to L1 Messages**: used to transmit instructions or data from smart contracts on L2 to contracts or systems on L1.
@@ -134,9 +134,9 @@ As said above, once the batch of transactions have all been executed, the final
 In theory, with this data one should be able to reconstruct the whole state of the l2.
-At the end of the batch, bootloader calls the `PubdataRequested` hook to ask the operator for the final batch state. The operator writes into the bootloader memory(slots [40053..248052]) the collected data from the`era_vm`. [Here]() the hook implementation in detail.
+At the end of the batch, the bootloader calls the `PubdataRequested` hook to ask the operator for the final batch state. The operator writes into the bootloader memory(slots [40053..248052]) the collected data from the`era_vm`.
[Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L312-L353) you can see the hook implementation in detail.
-Now, this requires the `era_vm` to keep a state for all the changes in the L2 state. For that we hold the following a struct:
+Now, this requires the `era_vm` to keep a state for all the changes in the L2 state. For that, we hold the following structure:
 ```rust
 struct VMState {
@@ -144,8 +144,8 @@ struct VMState {
     transient_storage: HashMap,
     l2_to_l1_logs: Vec,
     events: Vec,
-    pubdata: Primitive,
     pubdata_costs: Vec,
+    pubdata: Primitive,
     paid_changes: HashMap,
     refunds: Vec,
     read_storage_slots: HashSet,
@@ -157,26 +157,26 @@ struct VMState {
 Here is what each field represents:
 - **storage_changes**: Tracks the changes to the storage keys and their new values during execution.
-- **transient_storage**: Temporary storage that last until the end of the transaction, meaning that it gets cleared after every transaction.
+- **transient_storage**: Temporary storage that lasts until the end of the transaction, meaning that it gets cleared after every transaction.
 - **l2_to_l1_logs**: Logs generated during execution that need to be sent from L2 to L1.
-- **events**: Events triggered during contract execution, often used for logging or triggering other actions.
+- **events**: Events triggered during contract execution.
 - - **pubdata_costs**: The costs associated with publishing data to L1, used for fee calculations.
 - **pubdata**: Holds the sum of `pubdata_costs`.
-- **paid_changes**: After every write, tracks the cost to write to a key, to charge the difference in price on a subsequent writes to that key .
-- **refunds**: A list of refund amounts that have been calculated during the transaction.
+- **paid_changes**: After every write, tracks the cost to write to a key, to charge the difference in price on a subsequent writes to that key.
+- **refunds**: A list of refund amounts that have been calculated during execution.
 - **read_storage_slots**: A set of storage keys that have been read during execution, used to calculate gas fees and refunds.
 - **written_storage_slots**: A set of storage keys that have been written to during execution, used to calculate gas fees and refunds.
 - **decommitted_hashes**: Stores the hashes that have been the decommited through the whole execution. When decommiting a hash in a `far_call` or `decommit`, we check if the has been already decommited, if true then the decommit is free of charge.
 And so we end up with two key structures on the `era_vm`:
-- The execution state: the state of registers, heaps, frames, etc.
-- The L2 state changes: the changes on the chain that will get publish on L1 and committed on the l2 database.
+- [The execution state](https://github.com/lambdaclass/era_vm/blob/main/src/execution.rs#L32-L51): the state of registers, heaps, frames, etc.
+- The L2 state changes: the changes on the chain that will get published on L1 and committed to the l2 database.
-You can see the implementation and how we work with it [here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs).
+[Here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs) is the full code on how we manage the state changes, refunds, pubdata and more.
-Finally, everything finishes by the operator committing the changes to its database.
+Finally, everything finishes with the operator committing the changes to its database.
 ## Final comment
-This document aims to give you a brief overview of the `era_vm` integration with the zk-stack and how this impacts the vm design and architecture. For that, we first needed to understood what a node is and its parts: bootloader and operator.
In the explanation many details of the bootloader and operator where left behind, we only pick the parts that mostly involved and impacted the `era_vm` design.
+This document provides an overview of the `era_vm` integration within the zk-stack, focusing on the bootloader, oand perator, and how their interactions impact the VM's design and architecture. In the explanation many details of the bootloader and operator were left behind, we only picked the parts that mostly involved and impacted the `era_vm` design.

From dd52beb06b8bab606edcdcba47bda964758b842e Mon Sep 17 00:00:00 2001
From: Marcos Nicolau
Date: Thu, 22 Aug 2024 13:24:08 -0300
Subject: [PATCH 04/14] More corrections
---
 docs/zksync-era-integration.md | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md
index 54627aaa..d2be1079 100644
--- a/docs/zksync-era-integration.md
+++ b/docs/zksync-era-integration.md
@@ -1,3 +1,5 @@
+# ZK stack
+
 ## Introduction
 So far we have been talking only about the `era_vm`. But you should know that the vm is only a small part of the zk stack. The zk stack is composed of many critical components. In this section, we are only going to be interested in one particular component: the **nodes**, which can further be decomposed into the following units:
@@ -17,13 +19,12 @@ The bootloader, unlike Ethereum, takes an array of transactions(a batch) and exe
 At the most basic level, the bootloader performs the following steps:
 1. Reads the initial batch information and makes a call to the SystemContext contract to validate the batch.
-
-2. Loops through all transactions and executes them until the `execute` flag is set to $0$, at that point, it jumps to step 3.
+2. Loops through all transactions and executes them until the `execute` flag is set to $0$, at that point, it jumps to step 3.
 3. Seals l2 block and publish final data to the l1.
-The initial validation of the batch is necessary, since, as we'll see below, the the bootloader starts with its memory pre-filled with any data the operator wants. That is why it needs to validate its correctness.
+The initial validation of the batch is necessary, since, as we'll see below, the bootloader starts with its memory pre-filled with any data the operator wants. That is why it needs to validate its correctness.
-For more details, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yu)
+For more details, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yu).
 ## Operator/sequencer
@@ -68,8 +69,8 @@ trait Storage {
 This storage is saved in the VM state as a pointer. Here’s a brief explanation of each function:
-- **decommit**: given a hash it returns a contract bytecode.
-- **storage_read**: given a key, it returns the potential value.
+- **decommit**: given a hash it returns a contract bytecode from the database.
+- **storage_read**: given a key, it returns the potential value from the database.
 - **cost_of_writing_storage**: when writing to the contract storage, gas is consumed, but the cost of writing depends on whether the write is initial or not. More on that [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/fee_model.md).
 - **is_free_storage_slot**: if the address to write belongs to the system context and the key belongs to the base L2 token address, then the storage_slot is free(doesn't incur gas charges).
@@ -90,7 +91,7 @@ This allows us to query a key that belongs to the current executing address.
 [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L726) you can take a look a the implementation of this trait in detail.
-These functions are specially used in the `era_vm` to calculate refunds and pubdata costs. See [Here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L108-L123) and [here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L132-L173).
+These functions are specially used in the `era_vm` to calculate refunds and pubdata costs. See [here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L108-L123) and [here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L132-L173).
 Finally, here is the [full](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L79-L154) vm initialization code.
@@ -103,25 +104,23 @@ The operator is responsible for managing the bootloader within the `era_vm`. Thi
 - Starting bootloader execution.
 - Rolling back the VM state in case of errors.
-Since both transactions and the bootloader run in the same `era_vm`, the bootloader accesses a reserved heap where the operator writes any required data. This interaction is continuous, as the bootloader is unaware of the broader state of the `era_vm`. To facilitate communication, the bootloader can write to a special address that triggers a suspension of the `era_vm` execution, allowing the operator to provide necessary data. These are known as hooks, and based on the written value a specific hook will get triggered by the operator. Here are some of the most important hooks:
+Since both transactions and the bootloader run in the same `era_vm`, the bootloader accesses a reserved heap where the operator writes any required data. This interaction is continuous, as the bootloader is unaware of the broader state of the `era_vm`.
To facilitate communication, the bootloader can write to a special address that triggers a suspension of the `era_vm` execution, allowing the operator to provide necessary data. These are known as **hooks**, and based on the written value a specific hook will get triggered by the operator. Here are some of the most important hooks:
 - **PostResult**: Sets the last transaction result
-- **TxHasEnded**: if the mode of execution is set to **OneTx**, then the execution is stopped and it returns the result collected in the **PostResult** hook.
+- **TxHasEnded**: if the mode of execution is set to **OneTx**, then the execution is stopped and it returns the result collected in the _PostResult_ hook.
 - **NotifyAboutRefunds**: Inform the operator about the amount of gas refunded after a transaction.
 - **AskOperatorForRefund**: here the bootloader asks the operator to suggest a refund amount for the transaction.
-- **PubdataRequested**: At the end of the batch, the bootloader asks for the data to publish on the l1(more about this later).
+- **PubdataRequested**: At the end of the batch, the bootloader asks for the data to publish on the l1 (more on this later).
-Now, where does the operator know where to write to? Again, within the `era_vm`, there exists a special heap reserved exclusively for the bootloader. The Operator writes all the data in that heap which has designated slots based on the type of data to write(see more [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/bootloader.md#structure-of-the-bootloaders-memory)). Transactions, for example, are pushed into the `[252189..523261]` slots.
+Now, where does the operator know where to write to? Again, within the `era_vm`, there exists a special heap reserved exclusively for the bootloader.
The Operator writes all the data in that heap which has designated slots based on the type of data to write (see more [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/bootloader.md#structure-of-the-bootloaders-memory)). Transactions, for example, are pushed into the `[252189..523261]` slots.
 ### Rollbacks and snapshots
-In the `era_vm`, when a transaction encounters a panic or reverts, the vm needs to roll back the changes, restoring only a part of the [state](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L43-L60) to its previous frame. Remember that frames are created under `near_call` and `far_call` opcodes, and to manage state rollbacks, fields in the related structs are marked as `Rollbackable` (see [here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs#L299-L307)).
-
-Currently, rollbacks are perform using snapshots which are just copies of the current state. If a rollback is necessary, the state is restored from these snapshots.
+In the `era_vm`, when a transaction encounters a panic or reverts, the vm needs to roll back the changes, restoring only a part of the [state](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L43-L60) to its previous frame. Remember that frames are created under `near_call` and `far_call` opcodes. Currently, rollbacks are perform using snapshots which are just copies of the current state. If a rollback is necessary, the state is restored from these snapshots.
 The Bootloader, can fail sometimes, and it is the job of the Operator to trigger rollbacks. However, this type of rollback differs from the ones just mentioned. Bootloader rollbacks involve restoring not only the [full vm state](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/vm.rs#L66-L69) but also the bootloader state. These snapshots are called external snapshots and can only be triggered by the Bootloader.
[Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/snapshot.rs) you can see what a full snapshot looks like. Before starting a new batch execution, the operator creates a snapshot, which is also used at the end of execution to collect logs (see [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L508-L595) for more details).
-> Notice that when we say vm `state`, we refer to the changes made to the data that lives on the chain, and the vm `execution` is the vm state of registers, memory, etc. This difference is important since transactions reverts and panics only rollback the vm `state` (actually just a part of it not all), but bootloader rollbacks also restore the vm `execution`.
+> Notice that when we say vm `state`, we refer to the changes made to the data that lives on the chain, and the vm `execution` is the vm state of registers, memory, etc (see more [here](#era_vm-key-structures)). This difference is important since transactions reverts and panics only rollback the vm `state` (actually just a part of it not all), but bootloader rollbacks also restore the vm `execution`.
 ## Publishing data
@@ -160,7 +159,7 @@ Here is what each field represents:
 - **transient_storage**: Temporary storage that lasts until the end of the transaction, meaning that it gets cleared after every transaction.
 - **l2_to_l1_logs**: Logs generated during execution that need to be sent from L2 to L1.
 - **events**: Events triggered during contract execution.
-- - **pubdata_costs**: The costs associated with publishing data to L1, used for fee calculations.
+- **pubdata_costs**: The costs associated with publishing data to L1, used for fee calculations.
 - **pubdata**: Holds the sum of `pubdata_costs`.
 - **paid_changes**: After every write, tracks the cost to write to a key, to charge the difference in price on a subsequent writes to that key.
 - **refunds**: A list of refund amounts that have been calculated during execution.
@@ -168,6 +167,7 @@ Here is what each field represents:
 - **written_storage_slots**: A set of storage keys that have been written to during execution, used to calculate gas fees and refunds.
 - **decommitted_hashes**: Stores the hashes that have been the decommited through the whole execution. When decommiting a hash in a `far_call` or `decommit`, we check if the has been already decommited, if true then the decommit is free of charge.
+
 And so we end up with two key structures on the `era_vm`:
 - [The execution state](https://github.com/lambdaclass/era_vm/blob/main/src/execution.rs#L32-L51): the state of registers, heaps, frames, etc.

From 8782e3b13d5b8ebd5933bcb5c163cbae2fc4bf78 Mon Sep 17 00:00:00 2001
From: Marcos Nicolau
Date: Thu, 22 Aug 2024 13:26:26 -0300
Subject: [PATCH 05/14] Even more polish
---
 docs/zksync-era-integration.md | 50 +++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 25 deletions(-)
diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md
index d2be1079..3adc9d15 100644
--- a/docs/zksync-era-integration.md
+++ b/docs/zksync-era-integration.md
@@ -4,34 +4,34 @@
 So far we have been talking only about the `era_vm`. But you should know that the vm is only a small part of the zk stack. The zk stack is composed of many critical components. In this section, we are only going to be interested in one particular component: the **nodes**, which can further be decomposed into the following units:
-- **Operator/Sequencer**: the server that initializes the vm, injects the bootloader bytecode, receives transactions and pushes them into the vm(bootloader memory to be more specific) and start batches and seals them.
+- **Operator/Sequencer**: the server that initializes the vm, injects the Bootloader bytecode, receives transactions and pushes them into the vm(bootloader memory to be more specific) and start batches and seals them.
 - **Bootloader**: a system contract that receives an array of transactions which are processed, validated, executed, and then, the final state is published in the l1.
-- **era_vm**: the virtual machine where the bootloader(and so all the transactions bytecode) gets executed.
+- **era_vm**: the virtual machine where the Bootloader(and so all the transactions bytecode) gets executed.
-These components interact continuously to process transactions. This document will provide an overview of the bootloader, then explore the operator, its management of the bootloader, and finally, the data publishing process to L1. All of this while primarily focusing on how these interactions impact the design of the `era_vm`.
+These components interact continuously to process transactions. This document will provide an overview of the Bootloader, then explore the operator, its management of the bootloader, and finally, the data publishing process to L1. All of this while primarily focusing on how these interactions impact the design of the `era_vm`.
 ## Bootloader
-The bootloader is a special system contract whose hash resides on L1, but its code isn't stored on either L1 or L2. Instead, it’s compiled from `.yul` to `era_vm` assembly using `zksolc` when the operator first initializes the VM (more on that below).
+The Bootloader is a special system contract whose hash resides on L1, but its code isn't stored on either L1 or L2. Instead, it’s compiled from `.yul` to `era_vm` assembly using `zksolc` when the operator first initializes the VM (more on that below).
-The bootloader, unlike Ethereum, takes an array of transactions(a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx). This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch.
+The Bootloader, unlike Ethereum, takes an array of transactions(a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx). This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch.
-At the most basic level, the bootloader performs the following steps:
+At the most basic level, the Bootloader performs the following steps:
 1. Reads the initial batch information and makes a call to the SystemContext contract to validate the batch.
 2. Loops through all transactions and executes them until the `execute` flag is set to $0$, at that point, it jumps to step 3.
 3. Seals l2 block and publish final data to the l1.
-The initial validation of the batch is necessary, since, as we'll see below, the bootloader starts with its memory pre-filled with any data the operator wants. That is why it needs to validate its correctness.
+The initial validation of the batch is necessary, since, as we'll see below, the Bootloader starts with its memory pre-filled with any data the operator wants. That is why it needs to validate its correctness.
-For more details, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yu).
+For more details, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/Bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yu).
## Operator/sequencer Currently, the operator is a centralized server (there are plans to make a decentralized consensus for operators) which can be thought of as the node entry point, its responsibilities are: - Initializing the `era_vm` and keeping its state. -- Orchestrating the bootloader and keeping its state. +- Orchestrating the Bootloader and keeping its state. - Keeping storage database and committing changes. Here is a simplified version of what the vm on the operator end looks like: @@ -40,7 +40,7 @@ Here is a simplified version of what the vm on the operator end looks like: struct OperatorVm { pub(crate) inner: EraVM, // this would be the actual `era_vm` pub suspended_at: u16, // last pc when execution stopped because of a hook - pub bootloader_state: BootloaderState, + pub Bootloader_state: BootloaderState, pub(crate) storage: StorageDb, pub snapshot: Option, } @@ -50,8 +50,8 @@ struct OperatorVm { VM initialization involves: -- Loading the bootloader bytecode and initializing its state. -- Setting up the `era_vm` by injecting the bootloader code, loading default contracts, and configuring other settings. +- Loading the Bootloader bytecode and initializing its state. +- Setting up the `era_vm` by injecting the Bootloader code, loading default contracts, and configuring other settings. When setting up `era_vm`, the operator provides access to the chain storage database with the following API: @@ -95,24 +95,24 @@ These functions are specially used in the `era_vm` to calculate refunds and pubd Finally, here is the [full](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L79-L154) vm initialization code. -### Orchestrating the bootloader +### Orchestrating the Bootloader -The operator is responsible for managing the bootloader within the `era_vm`. 
This includes injecting the bootloader code into the `era_vm` and maintaining its state by: +The operator is responsible for managing the Bootloader within the `era_vm`. This includes injecting the bootloader code into the `era_vm` and maintaining its state by: -- Pushing transactions into the bootloader. -- Passing necessary parameters into the bootloader's memory. -- Starting bootloader execution. +- Pushing transactions into the Bootloader. +- Passing necessary parameters into the Bootloader's memory. +- Starting Bootloader execution. - Rolling back the VM state in case of errors. -Since both transactions and the bootloader run in the same `era_vm`, the bootloader accesses a reserved heap where the operator writes any required data. This interaction is continuous, as the bootloader is unaware of the broader state of the `era_vm`. To facilitate communication, the bootloader can write to a special address that triggers a suspension of the `era_vm` execution, allowing the operator to provide necessary data. These are known as **hooks**, and based on the written value a specific hook will get triggered by the operator. Here are some of the most important hooks: +Since both transactions and the Bootloader run in the same `era_vm`, the bootloader accesses a reserved heap where the operator writes any required data. This interaction is continuous, as the bootloader is unaware of the broader state of the `era_vm`. To facilitate communication, the bootloader can write to a special address that triggers a suspension of the `era_vm` execution, allowing the operator to provide necessary data. These are known as **hooks**, and based on the written value a specific hook will get triggered by the operator. Here are some of the most important hooks: - **PostResult**: Sets the last transaction result - **TxHasEnded**: if the mode of execution is set to **OneTx**, then the execution is stopped and it returns the result collected in the _PostResult_ hook. 
- **NotifyAboutRefunds**: Inform the operator about the amount of gas refunded after a transaction. -- **AskOperatorForRefund**: here the bootloader asks the operator to suggest a refund amount for the transaction. -- **PubdataRequested**: At the end of the batch, the bootloader asks for the data to publish on the l1 (more on this later). +- **AskOperatorForRefund**: here the Bootloader asks the operator to suggest a refund amount for the transaction. +- **PubdataRequested**: At the end of the batch, the Bootloader asks for the data to publish on the l1 (more on this later). -Now, where does the operator know where to write to? Again, within the `era_vm`, there exists a special heap reserved exclusively for the bootloader. The Operator writes all the data in that heap which has designated slots based on the type of data to write (see more [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/bootloader.md#structure-of-the-bootloaders-memory)). Transactions, for example, are pushed into the `[252189..523261]` slots. +Now, where does the operator know where to write to? Again, within the `era_vm`, there exists a special heap reserved exclusively for the Bootloader. The Operator writes all the data in that heap which has designated slots based on the type of data to write (see more [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/bootloader.md#structure-of-the-bootloaders-memory)). Transactions, for example, are pushed into the `[252189..523261]` slots. ### Rollbacks and snapshots @@ -120,11 +120,11 @@ In the `era_vm`, when a transaction encounters a panic or reverts, the vm needs The Bootloader, can fail sometimes, and it is the job of the Operator to trigger rollbacks. However, this type of rollback differs from the ones just mentioned. 
Bootloader rollbacks involve restoring not only the [full vm state](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/vm.rs#L66-L69) but also the bootloader state. These snapshots are called external snapshots and can only be triggered by the Bootloader. [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/snapshot.rs) you can see what a full snapshot looks like. Before starting a new batch execution, the operator creates a snapshot, which is also used at the end of execution to collect logs (see [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L508-L595) for more details). -> Notice that when we say vm `state`, we refer to the changes made to the data that lives on the chain, and the vm `execution` is the vm state of registers, memory, etc (see more [here](#era_vm-key-structures)). This difference is important since transactions reverts and panics only rollback the vm `state` (actually just a part of it not all), but bootloader rollbacks also restore the vm `execution`. +> Notice that when we say vm `state`, we refer to the changes made to the data that lives on the chain, and the vm `execution` is the vm state of registers, memory, etc (see more [here](#era_vm-key-structures)). This difference is important since transactions reverts and panics only rollback the vm `state` (actually just a part of it not all), but Bootloader rollbacks also restore the vm `execution`. ## Publishing data -As said above, once the batch of transactions has all been executed, the final step in the bootloader is to publish the final data. The data to be published is composed of: +As said above, once the batch of transactions has all been executed, the final step in the Bootloader is to publish the final data. 
The data to be published is composed of: - **L2 to L1 Logs**: Logs generated during L2 transactions that need to be recorded on L1. This can be transactions on L1 that have been forwarded to the L2 to lower costs. - **L2 to L1 Messages**: used to transmit instructions or data from smart contracts on L2 to contracts or systems on L1. @@ -133,7 +133,7 @@ As said above, once the batch of transactions has all been executed, the final s In theory, with this data one should be able to reconstruct the whole state of the l2. -At the end of the batch, the bootloader calls the `PubdataRequested` hook to ask the operator for the final batch state. The operator writes into the bootloader memory(slots [40053..248052]) the collected data from the`era_vm`. [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L312-L353) you can see the hook implementation in detail. +At the end of the batch, the Bootloader calls the `PubdataRequested` hook to ask the operator for the final batch state. The operator writes into the bootloader memory(slots [40053..248052]) the collected data from the`era_vm`. [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L312-L353) you can see the hook implementation in detail. Now, this requires the `era_vm` to keep a state for all the changes in the L2 state. For that, we hold the following structure: @@ -179,4 +179,4 @@ Finally, everything finishes with the operator committing the changes to its dat ## Final comment -This document provides an overview of the `era_vm` integration within the zk-stack, focusing on the bootloader, oand perator, and how their interactions impact the VM's design and architecture. In the explanation many details of the bootloader and operator were left behind, we only picked the parts that mostly involved and impacted the `era_vm` design. 
+This document provides an overview of the `era_vm` integration within the zk-stack, focusing on the Bootloader, and operator, and how their interactions impact the VM's design and architecture. In the explanation many details of the bootloader and operator were left behind, we only picked the parts that mostly involved and impacted the `era_vm` design. From f542642f52dedf789647c09f32098c1fdff70803 Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Thu, 22 Aug 2024 19:11:43 -0300 Subject: [PATCH 06/14] Add storage_change note about sorted diff --- docs/zksync-era-integration.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 3adc9d15..0194896b 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -173,6 +173,8 @@ And so we end up with two key structures on the `era_vm`: - [The execution state](https://github.com/lambdaclass/era_vm/blob/main/src/execution.rs#L32-L51): the state of registers, heaps, frames, etc. - The L2 state changes: the changes on the chain that will get published on L1 and committed to the l2 database. +Note on `storage_change`: The bootloader requires storage changes to be sorted by address first and then by key. This sorting is essential because, before publishing the data, the bootloader invokes a contract called Compressor to compress the state diff and validate it. During validation, the Compressor receives both the compressed diff and the original state diff. The compression process sorts the map automatically, so then when the verification process starts, the provided original state diff must also be sorted. You can find the full validation function in the Compressor contract [here](https://github.com/matter-labs/era-contracts/blob/8670004d6daa7e8c299087d62f1451a3dec4f899/system-contracts/contracts/Compressor.sol#L77-L190) if you're interested. 
As we continue developing the VM, we're considering implementing a new data structure that maintains this order upon insertion, potentially avoiding the costly sorting process when dealing with large lists of changes. + [Here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs) is the full code on how we manage the state changes, refunds, pubdata and more. Finally, everything finishes with the operator committing the changes to its database. From 0164d9896d385c7d7ac42ab8b5f7d835fe396923 Mon Sep 17 00:00:00 2001 From: Marcos Nicolau <76252340+MarcosNicolau@users.noreply.github.com> Date: Fri, 23 Aug 2024 09:42:55 -0300 Subject: [PATCH 07/14] Add blockquote to storage_changes note --- docs/zksync-era-integration.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 0194896b..1575066e 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -167,14 +167,14 @@ Here is what each field represents: - **written_storage_slots**: A set of storage keys that have been written to during execution, used to calculate gas fees and refunds. - **decommitted_hashes**: Stores the hashes that have been the decommited through the whole execution. When decommiting a hash in a `far_call` or `decommit`, we check if the has been already decommited, if true then the decommit is free of charge. +> Note on `storage_change`: The bootloader requires storage changes to be sorted by address first and then by key. This sorting is essential because, before publishing the data, the bootloader invokes a contract called Compressor to compress the state diff and validate it. During validation, the Compressor receives both the compressed diff and the original state diff. The compression process sorts the map automatically, so then when the verification process starts, the provided original state diff must also be sorted. 
You can find the full validation function in the Compressor contract [here](https://github.com/matter-labs/era-contracts/blob/8670004d6daa7e8c299087d62f1451a3dec4f899/system-contracts/contracts/Compressor.sol#L77-L190) if you're interested. As we continue developing the VM, we're considering implementing a new data structure that maintains this order upon insertion, potentially avoiding the costly sorting process when dealing with large lists of changes. + And so we end up with two key structures on the `era_vm`: - [The execution state](https://github.com/lambdaclass/era_vm/blob/main/src/execution.rs#L32-L51): the state of registers, heaps, frames, etc. - The L2 state changes: the changes on the chain that will get published on L1 and committed to the l2 database. -Note on `storage_change`: The bootloader requires storage changes to be sorted by address first and then by key. This sorting is essential because, before publishing the data, the bootloader invokes a contract called Compressor to compress the state diff and validate it. During validation, the Compressor receives both the compressed diff and the original state diff. The compression process sorts the map automatically, so then when the verification process starts, the provided original state diff must also be sorted. You can find the full validation function in the Compressor contract [here](https://github.com/matter-labs/era-contracts/blob/8670004d6daa7e8c299087d62f1451a3dec4f899/system-contracts/contracts/Compressor.sol#L77-L190) if you're interested. As we continue developing the VM, we're considering implementing a new data structure that maintains this order upon insertion, potentially avoiding the costly sorting process when dealing with large lists of changes. - [Here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs) is the full code on how we manage the state changes, refunds, pubdata and more. 
Finally, everything finishes with the operator committing the changes to its database. From 19e7221b2cfa55254ead99a741f5900ecdd3371f Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Tue, 27 Aug 2024 10:21:28 -0300 Subject: [PATCH 08/14] Add refunds section and fix typos --- docs/zksync-era-integration.md | 84 +++++++++++++++++++++++++--------- 1 file changed, 63 insertions(+), 21 deletions(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 1575066e..384d6a62 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -2,33 +2,36 @@ ## Introduction -So far we have been talking only about the `era_vm`. But you should know that the vm is only a small part of the zk stack. The zk stack is composed of many critical components. In this section, we are only going to be interested in one particular component: the **nodes**, which can further be decomposed into the following units: +So far we have been talking only about the `era_vm`. But you should know that the vm is only a small part of the zk stack. The zk stack is composed of many critical components. In this section, we are only going to be interested in one particular component: the **operators**, which can further be decomposed into the following units: -- **Operator/Sequencer**: the server that initializes the vm, injects the Bootloader bytecode, receives transactions and pushes them into the vm(bootloader memory to be more specific) and start batches and seals them. +- **Operator/Sequencer**: the server that initializes the vm, injects the Bootloader bytecode, receives transactions and pushes them into the vm (Bootloader memory to be more specific) and start batches and seals them. - **Bootloader**: a system contract that receives an array of transactions which are processed, validated, executed, and then, the final state is published in the l1. -- **era_vm**: the virtual machine where the Bootloader(and so all the transactions bytecode) gets executed. 
+- **era_vm**: the virtual machine where the Bootloader (and so all the transactions bytecode) gets executed. -These components interact continuously to process transactions. This document will provide an overview of the Bootloader, then explore the operator, its management of the bootloader, and finally, the data publishing process to L1. All of this while primarily focusing on how these interactions impact the design of the `era_vm`. +These components interact continuously to process transactions. This document will provide an overview of the Bootloader, then explore the operator, its management of the Bootloader, and finally, the data publishing process to L1. All of this while primarily focusing on how these interactions impact the design of the `era_vm`. ## Bootloader The Bootloader is a special system contract whose hash resides on L1, but its code isn't stored on either L1 or L2. Instead, it’s compiled from `.yul` to `era_vm` assembly using `zksolc` when the operator first initializes the VM (more on that below). -The Bootloader, unlike Ethereum, takes an array of transactions(a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx). This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch. +The Bootloader takes an array of transactions(a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx). This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch and data publishing costs can be reduced by posting only state diffs. At the most basic level, the Bootloader performs the following steps: 1. 
Reads the initial batch information and makes a call to the SystemContext contract to validate the batch. -2. Loops through all transactions and executes them until the `execute` flag is set to $0$, at that point, it jumps to step 3. -3. Seals l2 block and publish final data to the l1. +2. Loops through all transactions and executes them until the `execute` flag is set to `0`, at that point, it jumps to step `3`. +3. Seals L2 block and publish final data to the l1. + +//TODO +Note that the step `2` will depend on where the transaction came from. The initial validation of the batch is necessary, since, as we'll see below, the Bootloader starts with its memory pre-filled with any data the operator wants. That is why it needs to validate its correctness. -For more details, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/Bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yu). +For more details, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul). ## Operator/sequencer -Currently, the operator is a centralized server (there are plans to make a decentralized consensus for operators) which can be thought of as the node entry point, its responsibilities are: +Currently, the operator is a centralized server (there are plans to make a decentralized consensus for operators) which can be thought of as the entry point, its responsibilities are: - Initializing the `era_vm` and keeping its state. - Orchestrating the Bootloader and keeping its state. @@ -69,7 +72,7 @@ trait Storage { This storage is saved in the VM state as a pointer. 
Here’s a brief explanation of each function: -- **decommit**: given a hash it returns a contract bytecode from the database. +- **decommit**: given a contract hash it returns its corresponding bytecode (if it exists) from the database. - **storage_read**: given a key, it returns the potential value from the database. - **cost_of_writing_storage**: when writing to the contract storage, gas is consumed, but the cost of writing depends on whether the write is initial or not. More on that [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/fee_model.md). - **is_free_storage_slot**: if the address to write belongs to the system context and the key belongs to the base L2 token address, then the storage_slot is free(doesn't incur gas charges). @@ -85,11 +88,11 @@ struct StorageKey { } ``` -This allows us to query a key that belongs to the current executing address. +This allows us to query a storage key belonging to any desired contract through its address. -2. There isn't any consensus or spec about how storage should be implemented. We came up with this API because it is what we thought was more convenient for the requirement. But, for example, the vm1 implements a query logic, where the operator will react based on the provided params. +2. There isn't any consensus or spec about how storage should be implemented. We came up with this API because it is what we thought was more convenient for the requirement. But, for example, the vm1 implements a query logic, where the operator will react based on the [provided params](https://github.com/matter-labs/zksync-era/blob/87768755e8653e4be5f29945b56fd05a5246d5a8/core/lib/types/src/zk_evm_types.rs#L17-L30). -[Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L726) you can take a look a the implementation of this trait in detail. 
+[Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L726) you can take a look at the implementation of this trait in detail. These functions are specially used in the `era_vm` to calculate refunds and pubdata costs. See [here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L108-L123) and [here](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L132-L173). @@ -106,10 +109,10 @@ The operator is responsible for managing the Bootloader within the `era_vm`. Thi Since both transactions and the Bootloader run in the same `era_vm`, the bootloader accesses a reserved heap where the operator writes any required data. This interaction is continuous, as the bootloader is unaware of the broader state of the `era_vm`. To facilitate communication, the bootloader can write to a special address that triggers a suspension of the `era_vm` execution, allowing the operator to provide necessary data. These are known as **hooks**, and based on the written value a specific hook will get triggered by the operator. Here are some of the most important hooks: -- **PostResult**: Sets the last transaction result -- **TxHasEnded**: if the mode of execution is set to **OneTx**, then the execution is stopped and it returns the result collected in the _PostResult_ hook. +- **PostResult**: sets the last transaction result +- **TxHasEnded**: If the mode of execution is set to **OneTx**, then the execution is stopped and it returns the result collected in the _PostResult_ hook. - **NotifyAboutRefunds**: Inform the operator about the amount of gas refunded after a transaction. -- **AskOperatorForRefund**: here the Bootloader asks the operator to suggest a refund amount for the transaction. +- **AskOperatorForRefund**: Here the Bootloader asks the operator to suggest a refund amount for the transaction. 
- **PubdataRequested**: At the end of the batch, the Bootloader asks for the data to publish on the l1 (more on this later). Now, where does the operator know where to write to? Again, within the `era_vm`, there exists a special heap reserved exclusively for the Bootloader. The Operator writes all the data in that heap which has designated slots based on the type of data to write (see more [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/bootloader.md#structure-of-the-bootloaders-memory)). Transactions, for example, are pushed into the `[252189..523261]` slots. @@ -118,7 +121,7 @@ Now, where does the operator know where to write to? Again, within the `era_vm`, In the `era_vm`, when a transaction encounters a panic or reverts, the vm needs to roll back the changes, restoring only a part of the [state](https://github.com/lambdaclass/era_vm/blob/main/src/state.rs#L43-L60) to its previous frame. Remember that frames are created under `near_call` and `far_call` opcodes. Currently, rollbacks are perform using snapshots which are just copies of the current state. If a rollback is necessary, the state is restored from these snapshots. -The Bootloader, can fail sometimes, and it is the job of the Operator to trigger rollbacks. However, this type of rollback differs from the ones just mentioned. Bootloader rollbacks involve restoring not only the [full vm state](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/vm.rs#L66-L69) but also the bootloader state. These snapshots are called external snapshots and can only be triggered by the Bootloader. [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/snapshot.rs) you can see what a full snapshot looks like. 
Before starting a new batch execution, the operator creates a snapshot, which is also used at the end of execution to collect logs (see [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L508-L595) for more details). +The Bootloader may fail sometimes, it is the job of the Operator to trigger the necessary rollbacks. However, this type of rollback differs from the ones just mentioned. Bootloader rollbacks involve restoring not only the [full vm state](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/vm.rs#L66-L69) but also the bootloader state. These snapshots are called external snapshots and can only be triggered by the Bootloader. [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/snapshot.rs) you can see what a full snapshot looks like. Before starting a new batch execution, the operator creates a snapshot, which is also used at the end of execution to collect logs (see [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L508-L595) for more details). > Notice that when we say vm `state`, we refer to the changes made to the data that lives on the chain, and the vm `execution` is the vm state of registers, memory, etc (see more [here](#era_vm-key-structures)). This difference is important since transactions reverts and panics only rollback the vm `state` (actually just a part of it not all), but Bootloader rollbacks also restore the vm `execution`. @@ -133,7 +136,7 @@ As said above, once the batch of transactions has all been executed, the final s In theory, with this data one should be able to reconstruct the whole state of the l2. -At the end of the batch, the Bootloader calls the `PubdataRequested` hook to ask the operator for the final batch state. 
The operator writes into the bootloader memory(slots [40053..248052]) the collected data from the`era_vm`. [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L312-L353) you can see the hook implementation in detail. +At the end of the batch, the Bootloader calls the `PubdataRequested` hook to ask the operator for the final batch state. The operator writes into the bootloader memory(slots [40053..248052]) the collected data from the`era_vm`. [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L309-L350) you can see the hook implementation in detail. Now, this requires the `era_vm` to keep a state for all the changes in the L2 state. For that, we hold the following structure: @@ -165,7 +168,7 @@ Here is what each field represents: - **refunds**: A list of refund amounts that have been calculated during execution. - **read_storage_slots**: A set of storage keys that have been read during execution, used to calculate gas fees and refunds. - **written_storage_slots**: A set of storage keys that have been written to during execution, used to calculate gas fees and refunds. -- **decommitted_hashes**: Stores the hashes that have been the decommited through the whole execution. When decommiting a hash in a `far_call` or `decommit`, we check if the has been already decommited, if true then the decommit is free of charge. +- **decommitted_hashes**: Stores the hashes that have been the decommited through the whole execution. > Note on `storage_change`: The bootloader requires storage changes to be sorted by address first and then by key. This sorting is essential because, before publishing the data, the bootloader invokes a contract called Compressor to compress the state diff and validate it. During validation, the Compressor receives both the compressed diff and the original state diff. 
The compression process sorts the map automatically, so then when the verification process starts, the provided original state diff must also be sorted. You can find the full validation function in the Compressor contract [here](https://github.com/matter-labs/era-contracts/blob/8670004d6daa7e8c299087d62f1451a3dec4f899/system-contracts/contracts/Compressor.sol#L77-L190) if you're interested. As we continue developing the VM, we're considering implementing a new data structure that maintains this order upon insertion, potentially avoiding the costly sorting process when dealing with large lists of changes. @@ -175,10 +178,49 @@ And so we end up with two key structures on the `era_vm`: - [The execution state](https://github.com/lambdaclass/era_vm/blob/main/src/execution.rs#L32-L51): the state of registers, heaps, frames, etc. - The L2 state changes: the changes on the chain that will get published on L1 and committed to the l2 database. -[Here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs) is the full code on how we manage the state changes, refunds, pubdata and more. - Finally, everything finishes with the operator committing the changes to its database. +### Refunds, Storage write/read and Pubdata Costs associated + +We keep several fields in the `VMState` to track refunds and pubdata associated costs. Refunds are a return of ergs spent, since gas is always paid upfront. They can occur during storage operations, specifically when: + +- Reading from storage +- Writing to storage + +The relevant opcodes for these operations are: + +- `far_call` +- `decommit` +- `sstore` +- `ssload` + +#### How Refunds Are Calculated + +Refunds depend on whether a storage key has been accessed before. To manage this, we use the following keys from the `VMState`: + +- `paid_changes` +- `read_storage_slots` +- `written_storage_slots` +- `decommited_hashes` + +#### Decommit behaviour + +Decommits might occur during `far_call` or `decommit`. 
Whenever we decommit a `hash`, we check if that hash has already been decommited, if it is then we return the gas spent for deommit since decommits are paid upfront. Otherwise, we store the hash a already decommited so subsequent decommits to that hash will become free of charge. + +#### Storage Read Behavior + +When performing a storage_read, we check if the slot is free or if the key has already been read. If the key has been read before, a "warm" refund is given. If not, no refund is provided, but the key is marked as read for future refunds. + +#### Storage Write Behavior + +During a storage_write, we first check if the slot is free. If it is, a "warm" refund is given. Otherwise, we calculate the pubdata cost—the current price for writing to storage. If the key has been written to before, we only pay the difference between the new price and the previously paid amount (this difference is what we track as `pubdata_costs`). This difference can be negative, resulting in a refund. Additionally, if the key has been written to before, a "warm" refund is provided. If the key has only been read before and is now being written to, a "cold" write refund is given. + +#### What Defines a Free Slot? + +The operator determines whether a slot is considered "free." This decision is based on whether the key address belongs to the system context contract or if it belongs to the L2_BASE_TOKEN_ADDRESS and is associated with the ETH bootloader’s balance. + +[Here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs) is the full code on how we manage the state changes, refunds, pubdata and more. + ## Final comment This document provides an overview of the `era_vm` integration within the zk-stack, focusing on the Bootloader, and operator, and how their interactions impact the VM's design and architecture. 
In the explanation many details of the bootloader and operator were left behind, we only picked the parts that mostly involved and impacted the `era_vm` design. From ffc01b8b1eac22395ae5a0e36f82f399a1940b3f Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Tue, 27 Aug 2024 10:43:28 -0300 Subject: [PATCH 09/14] Add l1 and l2 transaction processing differences --- docs/zksync-era-integration.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 384d6a62..481cd9ed 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -14,20 +14,23 @@ These components interact continuously to process transactions. This document wi The Bootloader is a special system contract whose hash resides on L1, but its code isn't stored on either L1 or L2. Instead, it’s compiled from `.yul` to `era_vm` assembly using `zksolc` when the operator first initializes the VM (more on that below). -The Bootloader takes an array of transactions(a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx). This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch and data publishing costs can be reduced by posting only state diffs. +The Bootloader takes an array of transactions (a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx). This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch and data publishing costs can be reduced by posting only state diffs. 
At the most basic level, the Bootloader performs the following steps: 1. Reads the initial batch information and makes a call to the SystemContext contract to validate the batch. -2. Loops through all transactions and executes them until the `execute` flag is set to `0`, at that point, it jumps to step `3`. +2. Loops through all transactions and executes them until the `execute` flag is set to `0`, at that point, it jumps to step `3`. 3. Seals L2 block and publish final data to the l1. -//TODO -Note that the step `2` will depend on where the transaction came from. - The initial validation of the batch is necessary, since, as we'll see below, the Bootloader starts with its memory pre-filled with any data the operator wants. That is why it needs to validate its correctness. -For more details, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul). +Note that transaction processing in step 2 varies depending on the origin of the transaction. Transactions can either be pushed from L1 using the [requestL2Transaction](https://github.com/code-423n4/2023-10-zksync/blob/ef99273a8fdb19f5912ca38ba46d6bd02071363d/code/contracts/ethereum/contracts/zksync/facets/Mailbox.sol#L236) method or initiated from L2 by querying the operator API. + +If the transaction originates from L1, the `from` address is assumed to be authorized, so certain steps typically performed during L2 processing are skipped. These include setting the `tx.origin`, `from`, and `ergs_price`, as these details are already provided by the transaction. However, if the transaction comes from L2, it is processed according to the account abstraction model. 
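The three steps above, together with the L1/L2 distinction, can be pictured with the following Rust sketch. It is only an illustration under stated assumptions: `TxOrigin`, `Step`, `TxSlot` and `run_batch` are hypothetical names that exist neither in the bootloader nor in the `era_vm`, and the real logic lives in `bootloader.yul`.

```rust
/// Where a transaction entered the system (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TxOrigin {
    L1,
    L2,
}

/// Coarse-grained steps recorded by the sketch (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Step {
    ValidateBatch,
    AccountAbstractionFlow,
    ExecuteTx,
    SealAndPublish,
}

/// One transaction slot as the loop sees it (hypothetical layout).
struct TxSlot {
    /// Mirrors the `execute` flag: 0 means there is nothing left to run.
    execute: u8,
    origin: TxOrigin,
}

/// Walks the slots the way the three steps describe and returns what was done.
fn run_batch(slots: &[TxSlot]) -> Vec<Step> {
    // Step 1: the batch information is validated before anything runs.
    let mut steps = vec![Step::ValidateBatch];

    // Step 2: loop until a slot has its `execute` flag set to 0.
    for slot in slots {
        if slot.execute == 0 {
            break;
        }
        // L1 transactions are assumed to be authorized already, so only
        // L2 transactions go through the account abstraction flow.
        if slot.origin == TxOrigin::L2 {
            steps.push(Step::AccountAbstractionFlow);
        }
        steps.push(Step::ExecuteTx);
    }

    // Step 3: seal the L2 block and publish the final data.
    steps.push(Step::SealAndPublish);
    steps
}
```

Calling `run_batch` with an L1 slot, an L2 slot and a final slot whose `execute` flag is `0` records the validation step, one plain execution, one account-abstraction flow followed by an execution, and the sealing step.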
+ +For more details, you can read about L1 transactions processing [here](https://github.com/code-423n4/2023-10-zksync/blob/main/docs/Smart%20contract%20Section/Handling%20L1%E2%86%92L2%20ops%20on%20zkSync.md) and L2 transactions [here](https://docs.zksync.io/build/developer-reference/account-abstraction/design#transaction-flow). + +For more details on the bootloader, you can see the [main loop](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul#L3962-L3965) or the [full contract code](https://github.com/matter-labs/era-contracts/blob/main/system-contracts/bootloader/bootloader.yul). ## Operator/sequencer From cb6a64c06f78ea342508debab224c5c016035f28 Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Tue, 27 Aug 2024 10:51:14 -0300 Subject: [PATCH 10/14] Add link to vm1 storage implementation --- docs/zksync-era-integration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 481cd9ed..132939cd 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -93,7 +93,7 @@ struct StorageKey { This allows us to query a storage key belonging to any desired contract through its address. -2. There isn't any consensus or spec about how storage should be implemented. We came up with this API because it is what we thought was more convenient for the requirement. But, for example, the vm1 implements a query logic, where the operator will react based on the [provided params](https://github.com/matter-labs/zksync-era/blob/87768755e8653e4be5f29945b56fd05a5246d5a8/core/lib/types/src/zk_evm_types.rs#L17-L30). +2. There isn't any consensus or spec about how storage should be implemented. We came up with this API because it is what we thought was more convenient for the requirement. 
But, for example, the vm1 implements a query logic, where the operator will react based on the [provided params](https://github.com/matter-labs/zksync-era/blob/87768755e8653e4be5f29945b56fd05a5246d5a8/core/lib/types/src/zk_evm_types.rs#L17-L30) ([here](https://github.com/lambdaclass/zksync-era/blob/611dc845b4e01c3e14586c91b2169770c8667d7e/core/lib/multivm/src/versions/vm_1_3_2/oracles/storage.rs#L212-L272) you can see their implementation in detail). [Here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/core/lib/multivm/src/versions/era_vm/vm.rs#L726) you can take a look at the implementation of this trait in detail. From 45a6f0e4c34e17923168e4d11a98fe1238c6748e Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Tue, 27 Aug 2024 10:55:17 -0300 Subject: [PATCH 11/14] Add fee model link --- docs/zksync-era-integration.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 132939cd..fbd47b81 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -77,7 +77,7 @@ This storage is saved in the VM state as a pointer. Here’s a brief explanation - **decommit**: given a contract hash it returns its corresponding bytecode (if it exists) from the database. - **storage_read**: given a key, it returns the potential value from the database. -- **cost_of_writing_storage**: when writing to the contract storage, gas is consumed, but the cost of writing depends on whether the write is initial or not. More on that [here](https://github.com/lambdaclass/zksync-era/blob/era_vm_integration_v2/docs/specs/zk_evm/fee_model.md). +- **cost_of_writing_storage**: when writing to the contract storage, gas is consumed, but the cost of writing depends on whether the write is initial or not. More on that [here](#refunds-storage-writeread-and-pubdata-costs-associated). 
- **is_free_storage_slot**: if the address to write belongs to the system context and the key belongs to the base L2 token address, then the storage_slot is free(doesn't incur gas charges). A few notes about this storage: @@ -222,7 +222,7 @@ During a storage_write, we first check if the slot is free. If it is, a "warm" r The operator determines whether a slot is considered "free." This decision is based on whether the key address belongs to the system context contract or if it belongs to the L2_BASE_TOKEN_ADDRESS and is associated with the ETH bootloader’s balance. -[Here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs) is the full code on how we manage the state changes, refunds, pubdata and more. +[Here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs) is the full code on how we manage the state changes, refunds, pubdata and more. Finally, for a full explanation on refunds and the fee model, go [here](https://github.com/lambdaclass/zksync-era/blob/611dc845b4e01c3e14586c91b2169770c8667d7e/core/lib/multivm/src/versions/vm_1_3_2/oracles/storage.rs#L212-L272). ## Final comment From 4e562dd1a6d826b1468bae78101c3f4fed7044d2 Mon Sep 17 00:00:00 2001 From: Marcos Nicolau Date: Tue, 27 Aug 2024 11:01:59 -0300 Subject: [PATCH 12/14] Storage write clarify pubdata costs --- docs/zksync-era-integration.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index fbd47b81..88dc6484 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -216,11 +216,11 @@ When performing a storage_read, we check if the slot is free or if the key has a #### Storage Write Behavior -During a storage_write, we first check if the slot is free. If it is, a "warm" refund is given. Otherwise, we calculate the pubdata cost—the current price for writing to storage. 
If the key has been written to before, we only pay the difference between the new price and the previously paid amount (this difference is what we track as `pubdata_costs`). This difference can be negative, resulting in a refund. Additionally, if the key has been written to before, a "warm" refund is provided. If the key has only been read before and is now being written to, a "cold" write refund is given. +During a storage_write, we first check if the slot is free. If it is, a "warm" refund is given. Otherwise, we calculate the pubdata cost—the current price for writing to storage. If the key has been written to before, we only pay the difference between the new price and the previously paid amount (this difference is what we track as `pubdata_costs`). Notice that of this difference is negative (i.e the new price is lower that what has already been paid), then they cost becomes free of charge, but it doesn't actually result in a refund. Additionally, if the key has been written to before, a "warm" refund is provided. If the key has only been read before and is now being written to, a "cold" write refund is given. #### What Defines a Free Slot? -The operator determines whether a slot is considered "free." This decision is based on whether the key address belongs to the system context contract or if it belongs to the L2_BASE_TOKEN_ADDRESS and is associated with the ETH bootloader’s balance. +The operator determines whether a slot is considered "free." This decision is based on whether the key address belongs to the system context contract or if it belongs to the `L2_BASE_TOKEN_ADDRESS` and is associated with the ETH account bootloader’s balance. [Here](https://github.com/lambdaclass/era_vm/blob/zksync-era-integration-tests/src/state.rs) is the full code on how we manage the state changes, refunds, pubdata and more. 
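The pubdata rule for storage writes described above can be sketched in Rust as follows. This is a minimal illustration, not the `era_vm` code: `SlotKey`, `WriteTracker` and `pubdata_charge` are hypothetical names, the key is simplified to a pair of integers, and both treating a free slot as a zero charge and keeping the highest amount paid so far are assumptions of this sketch.

```rust
use std::collections::HashMap;

/// Simplified (address, key) pair; hypothetical stand-in for a storage key.
type SlotKey = (u64, u64);

/// Tracks how much has already been paid per slot, playing the role the
/// text assigns to `paid_changes` (hypothetical struct).
#[derive(Default)]
struct WriteTracker {
    paid_changes: HashMap<SlotKey, u32>,
}

impl WriteTracker {
    /// Returns the pubdata amount charged for writing `key` when the
    /// current price of the write is `current_price`.
    fn pubdata_charge(&mut self, key: SlotKey, current_price: u32, is_free: bool) -> u32 {
        // Assumption: a free slot incurs no pubdata charge at all.
        if is_free {
            return 0;
        }

        let already_paid = self.paid_changes.get(&key).copied().unwrap_or(0);

        // Only the difference is paid. A negative difference makes the
        // write free of charge but never turns into a refund.
        let to_pay = current_price.saturating_sub(already_paid);

        // Assumption: remember the highest amount paid so far for the slot.
        if current_price > already_paid {
            self.paid_changes.insert(key, current_price);
        }
        to_pay
    }
}
```

With this sketch, a first write priced at 65 is charged 65, a second write at the same price is charged 0, a later write priced at 34 is also charged 0 (no refund), and a write priced at 80 is charged only the remaining 15.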
Finally, for a full explanation on refunds and the fee model, go [here](https://github.com/lambdaclass/zksync-era/blob/611dc845b4e01c3e14586c91b2169770c8667d7e/core/lib/multivm/src/versions/vm_1_3_2/oracles/storage.rs#L212-L272). From 82b801e44566ad1650e4635580ceee67bcdf4aae Mon Sep 17 00:00:00 2001 From: Marcos Nicolau <76252340+MarcosNicolau@users.noreply.github.com> Date: Fri, 30 Aug 2024 18:04:45 -0300 Subject: [PATCH 13/14] Fix typos --- docs/zksync-era-integration.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index 88dc6484..c14bd7a5 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -208,7 +208,7 @@ Refunds depend on whether a storage key has been accessed before. To manage this #### Decommit behaviour -Decommits might occur during `far_call` or `decommit`. Whenever we decommit a `hash`, we check if that hash has already been decommited, if it is then we return the gas spent for deommit since decommits are paid upfront. Otherwise, we store the hash a already decommited so subsequent decommits to that hash will become free of charge. +Decommits might occur during `far_call` or `decommit`. Whenever we decommit a `hash`, we check if that hash has already been decommited, if it is then we return the gas spent for decommit since decommits are paid upfront. Otherwise, we store the hash as already decommited so subsequent decommits to that hash will become free of charge. #### Storage Read Behavior @@ -216,7 +216,7 @@ When performing a storage_read, we check if the slot is free or if the key has a #### Storage Write Behavior -During a storage_write, we first check if the slot is free. If it is, a "warm" refund is given. Otherwise, we calculate the pubdata cost—the current price for writing to storage. 
If the key has been written to before, we only pay the difference between the new price and the previously paid amount (this difference is what we track as `pubdata_costs`). Notice that of this difference is negative (i.e the new price is lower that what has already been paid), then they cost becomes free of charge, but it doesn't actually result in a refund. Additionally, if the key has been written to before, a "warm" refund is provided. If the key has only been read before and is now being written to, a "cold" write refund is given. +During a storage_write, we first check if the slot is free. If it is, a "warm" refund is given. Otherwise, we calculate the pubdata cost (the current price for writing to storage). If the key has been written to before, we only pay the difference between the new price and the previously paid amount (this difference is what we track as `pubdata_costs`). Notice that if this difference is negative (i.e. the new price is lower than what has already been paid), the cost becomes free of charge, but it doesn't actually result in a refund. Additionally, if the key has been written to before, a "warm" refund is provided. If the key has only been read before and is now being written to, a "cold" write refund is given. #### What Defines a Free Slot? From e542eaa7c8f16d2acb59b87c56bcffa7612c6cdc Mon Sep 17 00:00:00 2001 From: Marcos Nicolau <76252340+MarcosNicolau@users.noreply.github.com> Date: Tue, 3 Sep 2024 10:43:49 -0300 Subject: [PATCH 14/14] Update zksync-era-integration.md --- docs/zksync-era-integration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/zksync-era-integration.md b/docs/zksync-era-integration.md index c14bd7a5..05c24b71 100644 --- a/docs/zksync-era-integration.md +++ b/docs/zksync-era-integration.md @@ -14,7 +14,7 @@ These components interact continuously to process transactions. 
This document wi The Bootloader is a special system contract whose hash resides on L1, but its code isn't stored on either L1 or L2. Instead, it’s compiled from `.yul` to `era_vm` assembly using `zksolc` when the operator first initializes the VM (more on that below). -The Bootloader takes an array of transactions (a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx). This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since taxes and gas can be distributed among all the transactions within the posted batch and data publishing costs can be reduced by posting only state diffs. +The Bootloader takes an array of transactions (a batch) and executes all of them in one run (unless specified not to, that is, if the execution mode of the vm is set to OneTx). This approach allows the transaction batch to be posted on the l1 as just a single one, making the processing on Ethereum cheaper, since gas can be distributed among all the transactions within the posted batch and data publishing costs can be reduced by posting only state diffs. At the most basic level, the Bootloader performs the following steps: