SIMD-0177: Program Runtime ABI v2 #177

134 changes: 134 additions & 0 deletions proposals/0177-program-runtime-abiv2.md
@@ -0,0 +1,134 @@
---
simd: '0177'
title: Program Runtime ABI v2
authors:
- Alexander Meißner
category: Standard
type: Core
status: Idea
created: 2024-10-01
feature: TBD
extends: SIMD-0184, SIMD-0185
---

## Summary

Align the layout of the virtual address space to large pages in order to avoid
account data copies while maintaining a simple address translation logic.

## Motivation

At the moment all validator implementations have to copy (and compare) data in
and out of the virtual memory of the virtual machine. There are four possible
account data copy paths:

- Serialization: Copy from program runtime (host) to virtual machine (guest)
- CPI call: Copy from virtual machine (guest) to program runtime (host)
- CPI return: Copy from program runtime (host) to virtual machine (guest)
- Deserialization: Copy from virtual machine (guest) to program runtime (host)

To avoid this, a feature named "direct mapping" was designed. It uses the
address translation logic of the virtual machine to emulate the serialization
and deserialization without actually performing copies.

Implementing direct mapping in the current ABI v0 and v1 was deemed too complex
because of unaligned virtual memory regions and memory accesses overlapping
multiple virtual memory regions. Instead, the layout of the virtual address
space should be adjusted so that all virtual memory regions are aligned to
4 GiB.
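
As a rough illustration of why 4 GiB alignment keeps address translation simple, the sketch below splits a virtual address into a region index (upper 32 bits) and an offset (lower 32 bits). The `Region` type, the `translate` function, and the numbering of regions from zero are assumptions made for this example; they are not taken from the proposal or from any validator implementation.

```rust
/// With every region starting at a multiple of 4 GiB, the upper 32 bits of a
/// virtual address select the region and the lower 32 bits are the offset.
const REGION_SHIFT: u32 = 32;
const OFFSET_MASK: u64 = (1u64 << REGION_SHIFT) - 1;

/// Hypothetical host-side description of one mapped region.
pub struct Region {
    /// Number of valid bytes in the region.
    pub len: u64,
    /// Whether guest stores are permitted.
    pub writable: bool,
}

/// Translate a guest access of `access_len` bytes at `vaddr` into
/// `(region index, offset)`. Returns `None` if the region does not exist,
/// the access is a store to a readonly region, or the access would run
/// past the end of the region.
pub fn translate(
    regions: &[Region],
    vaddr: u64,
    access_len: u64,
    is_store: bool,
) -> Option<(usize, u64)> {
    let index = (vaddr >> REGION_SHIFT) as usize;
    let offset = vaddr & OFFSET_MASK;
    let region = regions.get(index)?;
    if is_store && !region.writable {
        return None;
    }
    let end = offset.checked_add(access_len)?;
    if end > region.len {
        return None;
    }
    Some((index, offset))
}
```

Because the bounds check rejects any access that runs past the end of its region, an access can never overlap two regions in this scheme, which is the complication named above for ABI v0 and v1.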

## Alternatives Considered

What alternative designs were considered and what pros/cons does this feature
have relative to them?

## New Terminology

None.

## Detailed Design

SDKs will have to support both ABI v1 and v2 for a transition period. The
program runtime must only use ABI v2 if all programs in a transaction support
it. Programs signal their support through their SBPF version field (TBD) while
Comment on lines +51 to +53

| The program runtime must only use ABI v2 if all programs in a transaction support it.

Why this restriction? It's possible to execute some programs with direct mapping and others without in the same transaction.

Contributor Author

@Lichtso Lichtso Oct 4, 2024

Yes, technically the restriction could be relaxed to be within the CPI tree of one top-level instruction to prevent caller and callee from having different ABI versions. However, unlike all programs in a transaction, we don't know what is going to be called in CPI beforehand.

@seanyoung seanyoung Oct 8, 2024

My point is: why prevent caller and callee from having different ABI versions?

I meant to say that it is possible to execute some programs with direct mapping and some without, within the same instruction/CPI tree.

Sure, that needs some refactoring in agave, but it might be worth it.

Contributor Author

Our implementation of direct mapping is explicitly built on the assumption that caller and callee either both use direct mapping or both do not. Mixing them would lead to trouble, e.g. in account payload reallocation.

the program runtime signals which ABI is chosen through the serialized magic
field.

### The serialization interface

- Writing to readonly accounts fails the transaction, even if the data written
  is identical to what is already there, i.e. even if no change occurs.
- The is-executable-flag is never set.
- The next rent collection epoch is not serialized.
- Readonly instruction accounts have no growth capacity.
- For writable instruction accounts additional capacity is allocated and mapped
  for potential account growth. The maximum capacity is the length of the account
  payload at the beginning of the transaction plus 10 KiB. CPI cannot grow

Contributor Author

They are independent. SIMD-0163 is about the program being called, that is not affected in this SIMD.

Contributor

@buffalojoec buffalojoec Oct 10, 2024

Contributor Author

We could increase the account realloc / resize limit if there are no interactions of ABI v0/v1 and ABI v2 programs in the CPI call tree of a top level instruction. See the discussion with Sean below.

beyond what the caller allowed as top-level instructions limit the potential
growth. Thus, it makes sense to preallocate this capacity at the beginning of
the transaction, when the writable accounts are copied in case the transaction
needs to be rolled back.
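
The capacity rule in the list above can be sketched as a small helper. The function name `payload_capacity` and the use of `u32` for lengths (chosen to match the `u32` length and capacity fields of the layout) are assumptions for illustration only.

```rust
/// Maximum growth per writable instruction account: 10 KiB.
const MAX_GROWTH: u32 = 10 * 1024;

/// Capacity to allocate and map for an instruction account, given the
/// payload length at the beginning of the transaction.
/// Readonly accounts get no growth capacity, so capacity equals length.
/// Returns `None` if the sum would not fit into a `u32`.
pub fn payload_capacity(len_at_tx_start: u32, writable: bool) -> Option<u32> {
    if writable {
        len_at_tx_start.checked_add(MAX_GROWTH)
    } else {
        Some(len_at_tx_start)
    }
}
```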

### The serialization layout

The following memory regions must be mapped into the virtual machine,
each starting at a 4 GiB boundary in virtual address space:

- Writable header:
  - Magic: `u32`: `0x76494241` ("ABIv" encoded in ASCII)
  - ABI version: `u32`: `0x00000002`
  - Pointer to instruction data: `u64`
  - Length of instruction data: `u32`

I was thinking about this and I thought that if data was presented in a way which makes sense to Rust, e.g. regular slices with a `u64` pointer and a `u64` length, then Rust programs do not have to do any entry processing at all, and can just cast the 4 GiB address to a type and be done.

  - Number of unique instruction accounts: `u16`
  - Number of instruction accounts: `u16`
  - Program key: `[u8; 32]`
  - For each unique instruction account:
    - Key: `[u8; 32]`
    - Owner: `[u8; 32]`
    - Flags: `u64` (bit 8 is signer, bit 16 is writable)
    - Lamports: `u64`
    - Pointer to account payload: `u64`
    - Account payload length: `u32`
    - Account payload capacity: `u32`
  - Instruction account index indirection for aliasing:
    - Index to unique instruction account: `u16`
- Readonly instruction data
- Writable payload of account #0
- Readonly payload of account #1
- Writable payload of account #2
- Writable payload of account #3
- ...
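
One way to picture the header fields listed above is as `#[repr(C)]` structs. This is only a sketch: it assumes the fields are stored in the listed order, little-endian, with natural alignment and no padding, which the proposal does not spell out. The type and field names (`Header`, `AccountMeta`, and so on) are made up for this example.

```rust
use std::mem::size_of;

/// Reads as "ABIv" when stored little-endian.
pub const ABI_V2_MAGIC: u32 = 0x7649_4241;
pub const ABI_V2_VERSION: u32 = 2;

/// Fixed-size prefix of the writable header region (assumed layout).
#[repr(C)]
pub struct Header {
    pub magic: u32,
    pub abi_version: u32,
    pub instruction_data_ptr: u64,
    pub instruction_data_len: u32,
    pub num_unique_accounts: u16,
    pub num_accounts: u16,
    pub program_key: [u8; 32],
}

/// One entry per unique instruction account (assumed layout).
#[repr(C)]
pub struct AccountMeta {
    pub key: [u8; 32],
    pub owner: [u8; 32],
    pub flags: u64,
    pub lamports: u64,
    pub payload_ptr: u64,
    pub payload_len: u32,
    pub payload_capacity: u32,
}

/// Under the assumptions above, every field is naturally aligned,
/// so neither struct contains padding.
pub const HEADER_SIZE: usize = size_of::<Header>();
pub const ACCOUNT_META_SIZE: usize = size_of::<AccountMeta>();
```

Under these assumptions the fixed header prefix is 56 bytes and each account entry is 96 bytes, so both the account array and the trailing `u16` index array start at predictable offsets.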

With this design a program SDK can (but no longer needs to) eagerly deserialize
all account metadata at the entrypoint. Because this layout is strictly aligned
and uses proper arrays, it is possible to directly calculate the offset of a
single account's metadata with only one indirect lookup and no need to scan all
preceding metadata. This allows a program SDK to offer a lazy interface which
only interacts with the account metadata fields which are needed, only of the
accounts which are of interest and only when necessary.
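
The single indirect lookup described above might look like the following sketch. It assumes a 56-byte fixed header prefix, 96-byte account entries, little-endian fields, and an index array placed directly after the account entries; these sizes are derived from the field list under a no-padding assumption and are not stated in the proposal. The function names are hypothetical.

```rust
/// Assumed size of the fixed header prefix (magic through program key).
const HEADER_SIZE: usize = 56;
/// Assumed size of one unique instruction account entry.
const ACCOUNT_META_SIZE: usize = 96;
/// Assumed offsets of the two account counters inside the header.
const NUM_UNIQUE_OFFSET: usize = 20;
const NUM_ACCOUNTS_OFFSET: usize = 22;

/// Read a little-endian `u16` at `offset`, or `None` if out of bounds.
fn read_u16(region: &[u8], offset: usize) -> Option<u16> {
    let bytes = region.get(offset..offset.checked_add(2)?)?;
    Some(u16::from_le_bytes([bytes[0], bytes[1]]))
}

/// Byte offset (within the header region) of the metadata belonging to
/// instruction account `i`, resolved through the aliasing index array.
/// Only the two counters and one index entry are read; no metadata of
/// other accounts is touched.
pub fn account_meta_offset(region: &[u8], i: usize) -> Option<usize> {
    let num_unique = read_u16(region, NUM_UNIQUE_OFFSET)? as usize;
    let num_accounts = read_u16(region, NUM_ACCOUNTS_OFFSET)? as usize;
    if i >= num_accounts {
        return None;
    }
    // The index array follows the array of unique account entries.
    let index_array = HEADER_SIZE + num_unique * ACCOUNT_META_SIZE;
    let unique = read_u16(region, index_array + 2 * i)? as usize;
    if unique >= num_unique {
        return None;
    }
    Some(HEADER_SIZE + unique * ACCOUNT_META_SIZE)
}
```

Two instruction accounts that alias the same unique account resolve to the same offset, so an SDK built this way would not need a separate duplicate marker.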

## Impact

This change is expected to drastically reduce CU costs when all programs in
a transaction support it, as the cost will no longer depend on the length of the
instruction account payloads or instruction data.

Otherwise, the change will be hidden in the SDK and thus be invisible to the
dApp developer.

## Security Considerations

What security implications/considerations come with implementing this feature?
Are there any implementation-specific guidance or pitfalls?

## Drawbacks *(Optional)*

Why should we not do this?

## Backwards Compatibility

The magic field (`u32`) and version field (`u32`) of ABI v2 are placed at the
beginning, where ABI v0 and v1 would otherwise indicate the number of
instruction accounts as a `u64`. Because the older ABIs will never serialize
more than a few hundred accounts, it is possible to differentiate the ABI
that way without breaking the older layouts.
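
A program entrypoint could tell the ABIs apart along these lines. This is a sketch only: the `Abi` enum and `detect_abi` function are hypothetical names, and it assumes all fields are little-endian.

```rust
/// Reads as "ABIv" when stored little-endian.
pub const ABI_V2_MAGIC: u32 = 0x7649_4241;

/// Result of inspecting the first eight bytes of the serialized input.
#[derive(Debug, PartialEq, Eq)]
pub enum Abi {
    /// ABI v0 or v1: the first eight bytes are the account count.
    Legacy { num_accounts: u64 },
    /// ABI v2 or later: magic followed by a version number.
    V2 { version: u32 },
}

/// Classify the input by its first eight bytes.
/// Returns `None` if fewer than eight bytes are available.
pub fn detect_abi(input: &[u8]) -> Option<Abi> {
    let first = input.get(..8)?;
    let magic = u32::from_le_bytes([first[0], first[1], first[2], first[3]]);
    if magic == ABI_V2_MAGIC {
        let version = u32::from_le_bytes([first[4], first[5], first[6], first[7]]);
        Some(Abi::V2 { version })
    } else {
        let mut count = [0u8; 8];
        count.copy_from_slice(first);
        Some(Abi::Legacy { num_accounts: u64::from_le_bytes(count) })
    }
}
```

A legacy account count would have to be on the order of two billion before its low 32 bits could equal the magic value, which is far beyond the few hundred accounts the older ABIs serialize.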