Develop a way to create and a package format for smart contracts #550
Comments
A couple of preliminary comments:
Account code
We are moving toward using MAST as the format for serializing Miden VM programs. So, the code included in the package will probably be serialized MAST. This means two things:
Account storage
This is indeed the trickiest part. I actually think what we are looking for is a way to define an "account template" which would consist of MASM/MAST code + the storage layout that is required to support this code. This account template can then be instantiated into a specific account (at this time, the account ID would also be generated), and then this account can be serialized as
The code part should probably be in the package format discussed in 0xPolygonMiden/compiler#129 (cc @bitwalker, @greenhat). The storage part could be some metadata file describing which storage slots should be initialized and how. Just to illustrate and using JSON format, for basic wallet the storage template could look like so:
{
"slots": [
{
"index": 0,
"name": "pub_key",
"value": {
"arity": 0,
"value": "{{auth::rpo_falcon512::pub_key}}"
}
}
]
}
This basically specifies that when an account with this template is instantiated, we'll need to get a RpoFalcon512 public key from the user and put it into the first slot of account storage (the expected type of the storage slot would be a simple value with arity 0 - i.e., just a single word). Of course, we'll actually need to define all the details of this format (what I have above is just a very basic illustration) - but I don't think that's going to be too difficult.
A more difficult question to resolve is what to do when a user combines different components into a single contract (e.g., basic wallet + something else). The main issue is potential conflicts in how different components try to access storage. The simplest solution is to just panic on conflicts. A more sophisticated one would be to try to resolve conflicts by dynamically modifying the code - but this is probably best left to the compiler. |
I think to some degree this will tie back to the fundamentals of packaging and distribution of libraries/programs. A package that contains a library which implements an account/contract built out of other, more foundational, libraries/components, is going to be doing so either by referencing code from other packages, or by generating code that integrates behavior provided by another package (e.g. basically along the lines of the difference between calling an I would recommend that we define the details of storage layout abstractly, but deterministically, much like how Rust defines how it lays out types in memory. So if you know the offset at which your storage slots start, you can compute the offset of specific storage keys by adding a zero-based index to that offset. This is akin to how one can compute the memory address of a specific struct field in a complex nested data structure, all you need to know is the offset of the data structure, and from there the rules for deriving the address of a specific field are entirely deterministic. This would make it possible to combine together storage requirements from multiple "parent" contracts/components in a new contract, as long as they can all fit in 255 slots - each need only know (or be provided) the offset at which their storage begins. Because the derived contract knows all of the components that it is mixing together, it can compute the offset for each of those components statically, so accessing specific slots doesn't necessarily even require any runtime calculations. Admittedly, I have no idea where we're at with regard to account storage in general (i.e. how it is specified, how it gets initialized, etc.), so maybe some of those details have already been pinned down. 
It'll certainly be something @greenhat will need to get deeper into pretty soon (right now we've largely been focused on surfacing the low-level primitives, but coming up with a natural idiomatic Rust way of expressing storage is something we haven't really done yet). We probably should make accommodations in the package format for surfacing additional arbitrary custom package metadata (e.g. Rollup-specific details like whether a package provides an account or note script, what its storage requirements are, etc.). Tooling specific to the Rollup will want to be able to easily access that information, but I don't think we want to bake in Rollup-specific details in the package format, nor do I think we would want to have multiple package formats. Cargo is a good example of this, custom tools can describe metadata in With that in mind, we'd then only really need some lightweight tooling, built on top of the more general packaging infra, that can validate how specific packages are used together (i.e. ensuring the package types are compatible, validating that there are no storage conflicts, etc.). In any case, definitely interested to understand more of the gritty detail of account storage and how some of the interactions will work - definitely a weak spot in my knowledge of the overall system right now. |
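To make the static-offset idea from this comment concrete, here is a minimal Rust sketch. It assumes each component only declares how many contiguous slots it needs; `ComponentLayout`, `assign_offsets`, and `slot_index` are hypothetical names used for illustration and are not part of any Miden crate. The 255-slot limit is the figure quoted in this thread.

```rust
/// Slot limit assumed from the "255 slots" figure mentioned in this thread.
const MAX_SLOTS: usize = 255;

/// Hypothetical summary of one component: its name and how many contiguous slots it needs.
struct ComponentLayout {
    name: &'static str,
    num_slots: usize,
}

/// Lays components out back to back and returns `(name, base_offset)` pairs.
/// Returns `None` when the combined requirement exceeds `MAX_SLOTS`.
fn assign_offsets(components: &[ComponentLayout]) -> Option<Vec<(&'static str, usize)>> {
    let mut next = 0usize;
    let mut offsets = Vec::with_capacity(components.len());
    for component in components {
        offsets.push((component.name, next));
        next = next.checked_add(component.num_slots)?;
        if next > MAX_SLOTS {
            return None;
        }
    }
    Some(offsets)
}

/// Absolute slot index for a slot addressed relative to a component's base offset.
fn slot_index(base: usize, relative: usize) -> usize {
    base + relative
}

fn main() {
    let components = [
        ComponentLayout { name: "basic_wallet", num_slots: 1 },
        ComponentLayout { name: "my_component", num_slots: 4 },
    ];
    let offsets = assign_offsets(&components).expect("components exceed the slot limit");
    for (name, base) in &offsets {
        println!("{name} starts at slot {base}");
    }
    // Third slot of the second component, addressed relative to its base.
    println!("absolute index: {}", slot_index(offsets[1].1, 2));
}
```

Because the derived contract knows every component it mixes in, all of this can run at compile time, which is why no runtime calculation is strictly needed in this scheme.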
Yes - totally agreed: we should have a single package format for the VM programs - but allow users to add custom metadata to these packages (i.e., for rollup purposes).
This could work, but has one potential downside: if the offsets are part of the code, then MAST roots of public procedures would change depending on what other components are added to the same contract. In some cases this may be fine, but in other cases, it may break things (e.g., notes would not have a stable MAST root to call on an account interface). It would be ideal if we could somehow provide these offsets at runtime in a way that doesn't change MAST roots - but I'm not sure it is possible. There are other solutions to this (e.g., using only storage maps and salting the keys) - but these come with their own set of pros and cons. |
I'm not sure I follow how this happens - wouldn't the slot indices be fixed when publishing the contract anyway? In other words, if you are "mixing in" elements from other contracts acting as templates, you end up with code that is either unique to the derived contract (and so published with it), or code which is generic across all derivations (and so external to the derived contract and published with the base contract). In the latter case, presumably the base contract would expect to be provided with the offset at which its storage slots begin, at runtime, so that contracts which derive its functionality, but that have storage offset at potentially different slot indices, can all share the same code. In the former case, the slot indices are computed/known statically during compilation, and the code is published with those indices baked in (or, possibly also compiled to be relative to some runtime-provided base offset, if it is intended to itself be derivable). I'm not sure if what I'm saying is applicable here or not, since I only have a rough understanding of how account storage is supposed to work, but since contracts are going to have to be compiled to MAST, whatever mechanism we come up with here, has to be something that is either known at the time the MAST is compiled (and so baked in), or is provided at runtime. Either way, the code never changes (and can't change anyway), regardless of how the contract is used downstream. It kind of sounds like the mechanism would need to be based on runtime-provided offsets, but I'm definitely way out of my depth at this point, so I hesitate to opine too much on what makes sense here. 
The problem statement stood out to me as very similar to how the concept of ABIs, type layouts in memory, and how linkers allocate segments of memory to different parts of a program that are potentially linked in dynamically at runtime; so I do think there is prior art on how to approach that kind of thing when interop is in play - but it might be that there are other elements here that make it a fundamentally unique problem. |
This is correct - but let me explain what I mean with a concrete example. Let's use our BasicWallet as an example. This is not the best example because procedures in this module do not touch the storage at all - but for the purposes of the example, let's pretend that they do. So, we have two procedures defined as a part of
Now, let's say a user wants to create an account which combines functionality of
So, basically, in the ideal scenario we want to achieve 2 goals:
There are many ways to achieve the above - I think the tricky part is finding a way which minimizes performance overhead. Here is one approach: By convention, we designate storage slot 0 to store offsets for all components deployed in the account. The type of this slot would be storage map (basically a key-value map with 256-bit keys and 256-bit values). Each component would be identified by some unique 256-bit ID and so, when a component is added to the account, we'd insert a key-value pair for this component into storage slot zero which would look like Then, when a component needs to access storage, it would look up its offset in this map, and then use it accordingly. For example, one of the first steps of This does create quite a bit of overhead though. Specifically:
Also, there is a question of how we guarantee that the 256-bit component IDs are indeed unique (or whether we can even guarantee this at all). So, I do wonder if there is a better alternative. |
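The slot-0 lookup described in this comment could be sketched roughly as follows. This is only an illustration under simplifying assumptions: a plain `HashMap` stands in for the storage map, the offset is stored as a `u8` rather than a 256-bit value, and `OffsetMap`, `register`, and `resolve` are made-up names. Rejecting a repeated ID on registration is one way to surface the uniqueness concern raised above; it does not solve it.

```rust
use std::collections::HashMap;

/// 256-bit component identifier, as suggested in the comment above.
type ComponentId = [u8; 32];

/// Stand-in for the storage map kept in slot 0 (component ID -> storage offset).
struct OffsetMap {
    offsets: HashMap<ComponentId, u8>,
}

impl OffsetMap {
    fn new() -> Self {
        Self { offsets: HashMap::new() }
    }

    /// Records the offset for a component; rejects an ID that is already present.
    fn register(&mut self, id: ComponentId, offset: u8) -> Result<(), String> {
        if self.offsets.contains_key(&id) {
            return Err("component ID already registered".to_string());
        }
        self.offsets.insert(id, offset);
        Ok(())
    }

    /// Translates a component-relative slot into an absolute slot index.
    /// Returns `None` for an unknown component or on index overflow.
    fn resolve(&self, id: &ComponentId, relative_slot: u8) -> Option<u8> {
        let offset = self.offsets.get(id)?;
        offset.checked_add(relative_slot)
    }
}

fn main() {
    let wallet: ComponentId = [1u8; 32];
    let mut map = OffsetMap::new();
    // Slot 0 holds the map itself, so the first component starts at slot 1 in this sketch.
    map.register(wallet, 1).expect("first registration succeeds");
    println!("{:?}", map.resolve(&wallet, 2));
}
```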
Another approach, which is simpler in many ways but also less flexible in others: A component which doesn't want to have any potential conflicts with other components would always use a storage map in some designated slot (let's say storage slot 0 again). This slot would always be of type storage map - and so could hold a practically unlimited amount of data. To avoid key collisions, before inserting values into the storage map, we'd compute the new key as:
As long as component ID can be easily identified, this key mapping can be handled by the kernel. |
I think I see where we're diverging here. My expectation would be that we would not be compiling the code of
With those assumptions in mind, the only actual requirement imposed on If instead, That said, assuming I'm still missing the key thing here that breaks my mental model as laid out above, what you've described as a possible solution seems sound to me. We can likely lean on the compiler to generate some of the more repetitive/annoying code as you mentioned, I don't see any reason why that wouldn't be possible as long as some pretty minimal requirements are met (i.e. that we know the storage layout somehow, that we know all of the call sites where storage is being accessed, and that at each call site we know what "component" it belongs to), and I'm pretty sure that will be the case in general. |
It does occur to me that one of the things I'm obviously forgetting to account for here are the boundaries of the system with regard to storage. When we talk about Do we have an up-to-date description/sketch of how we have been expecting storage to work up until now, and how it relates to Miden execution contexts (to the degree there is a relationship there)? I think that would probably sort me out. |
Unfortunately, there isn't a single comprehensive description of this yet (@Dominik1999 let's create an issue to add this to the docs). The basic design of the storage is described in #239 (comment) and following comments. To summarize:
Yes, agreed. I think the tricky part here is how exactly to provide these at runtime in a way which introduces minimal overhead both in terms of performance and complexity. There is another thing we may want to consider: not all account procedures added to the account have the same requirements. Specifically, I think we have 4 cases:
So far, we've been focusing on use case 3 (and use case 1 is already handled), but I think we should figure out if we want to (and are able to) support use cases 2 and 4. |
Interesting discussion, indeed. I will create an issue to explain how account storage works in the docs. I wanted to wait until we merge However, aren't we overthinking this here a bit? So, the problem, as I understand it, lies in the combination of different contracts or procedures that have conflicting storage assumptions. And please correct me, if I am oversimplifying it.
|
Revisiting this again, I think the solution is actually much simpler than what I originally thought. First, for every storage access we'll implicitly apply an offset. So, for example, when the user executes account::get_item or account::set_item procedures, the kernel will add an offset to the specified slot index. The offset will come from the The process of performing the offset lookup would be conceptually similar to what is happening in the current authenticate_procedure procedure. In fact, we may want to create a separate
One of the implications of this is that both storage reads and storage writes would need to be authenticated (currently, we authenticate writes but not reads) - but I think that's a good idea anyway. To support the above, we'll need to modify how account code tree is built. Specifically, right now this is a Merkle tree where each leaf is a root of a procedure in the public interface of the account. We'll need to change this to Separately, it may be a good idea to move away from using a Merkle tree here in favor of a sequential hash which would be "unhashed" into the kernel memory during the transaction prologue. This will substantially reduce the cost of accessing storage and authenticating procedures in general. The overall mechanism would be that at account instantiation time (or at upgrade time) we'll set storage offsets for all procedures in the account's public interface. These offsets could be computed by the compiler as it combines different components into a single contract, or could be done manually for simpler contracts. Overall, I think with this we achieve all desired properties:
One other implication is that we should make the public account procedures accessible only via the
On the "account template" side (mentioned in #550 (comment)), the additional storage metadata associated with account code may need to look something like this:
{
"slots": [
{
"index": 0,
"name": "pub_key",
"value": {
"arity": 0,
"value": "{{auth::rpo_falcon512::pub_key}}"
}
}
],
"offsets": {
"0x12345...": 0,
"0x23456...": 0,
"0x34567...": 5
}
} Where procedures with MAST roots |
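As a rough illustration of the per-procedure offsets discussed in this comment, the Rust sketch below adds the calling procedure's offset to every storage access. It is not the kernel implementation: `Account`, `absolute_slot`, and the `u64` slot values are simplifications invented here, the MAST roots are shortened placeholder strings, and the real `account::get_item` / `account::set_item` procedures would obtain the caller and authenticate it inside the kernel.

```rust
use std::collections::HashMap;

/// Simplified account: each slot holds one u64 instead of a full word.
struct Account {
    /// Procedure MAST root (placeholder string) -> storage offset.
    proc_offsets: HashMap<&'static str, u8>,
    storage: Vec<u64>,
}

impl Account {
    fn new(offsets: &[(&'static str, u8)], num_slots: usize) -> Self {
        Self {
            proc_offsets: offsets.iter().copied().collect(),
            storage: vec![0; num_slots],
        }
    }

    /// Looks up the caller's offset and adds it to the requested slot index.
    fn absolute_slot(&self, caller_root: &str, slot: u8) -> Result<usize, String> {
        let offset = self
            .proc_offsets
            .get(caller_root)
            .ok_or_else(|| format!("procedure {caller_root} is not in the account interface"))?;
        let index = usize::from(*offset) + usize::from(slot);
        if index >= self.storage.len() {
            return Err(format!("slot {index} is out of bounds"));
        }
        Ok(index)
    }

    /// Read with the offset applied; unknown callers are rejected, so reads are checked too.
    fn get_item(&self, caller_root: &str, slot: u8) -> Result<u64, String> {
        Ok(self.storage[self.absolute_slot(caller_root, slot)?])
    }

    /// Write with the offset applied.
    fn set_item(&mut self, caller_root: &str, slot: u8, value: u64) -> Result<(), String> {
        let index = self.absolute_slot(caller_root, slot)?;
        self.storage[index] = value;
        Ok(())
    }
}

fn main() {
    let mut account = Account::new(&[("0x12345", 0), ("0x23456", 0), ("0x34567", 5)], 8);
    account.set_item("0x34567", 1, 42).expect("slot is in range");
    // The procedure asked for slot 1 but, with offset 5, it touched slot 6.
    println!("{:?}", account.get_item("0x34567", 1));
}
```

The procedure code only ever names relative slot indices, so its MAST root does not depend on which other components share the account.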
With implementation of #667, we now have the ability to associate storage offsets with every account procedure. This will be enhanced a little with #866 - but overall, this aspect is near its final form. We still need to have a way to specify this info via MASM, but this will be addressed with procedure annotations in Miden Assembly (0xPolygonMiden/miden-vm#1434). Taking a step back, for account package format, I think we need to have the following pieces:
In my mind, for all of these, except for the last item, we have a pretty good idea of what needs to be done. So, it probably is a good idea to start brainstorming what we want to do with the last item as well. Perhaps, a good way to start is to take a concrete example. Let's say we want to instantiate a fungible faucet account. For this, the storage could look roughly like this:
We could describe the above with something like this:
[
{
"type": "value",
"value": ["0", "0", "0", "0"]
},
{
"type": "value",
"name": "pub_key",
"value": "{{auth::rpo_falcon512::pub_key}}"
},
{
"type": "value",
"name": "metadata",
"value": [
"{{decimals}}",
"{{symbol}}",
"{{max_supply}}",
"0"
]
}
] The main idea is that values in
I think the way we want to go will probably depend on the answer to this last question. For example, one approach could be that with every account we need to provide an "initialization function" (maybe represented by a WASM program). This function would take some set of user inputs and output the |
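For the simple case where placeholders are filled directly from user inputs, the substitution step could look something like the Rust sketch below. It assumes a placeholder is a whole element of the form `{{name}}` and keeps every element as a string; converting elements into field elements, and anything an "initialization function" might do beyond lookup, is deliberately left out. `resolve_element` and `resolve_word` are hypothetical helpers.

```rust
use std::collections::HashMap;

/// Resolves one template element: a `{{name}}` placeholder is looked up in `inputs`,
/// anything else is treated as a literal and returned unchanged.
fn resolve_element(raw: &str, inputs: &HashMap<&str, String>) -> Result<String, String> {
    match raw.strip_prefix("{{").and_then(|rest| rest.strip_suffix("}}")) {
        Some(key) => inputs
            .get(key.trim())
            .cloned()
            .ok_or_else(|| format!("missing input `{}`", key.trim())),
        None => Ok(raw.to_string()),
    }
}

/// Resolves all four elements of a word template.
fn resolve_word(
    template: [&str; 4],
    inputs: &HashMap<&str, String>,
) -> Result<[String; 4], String> {
    let mut word: [String; 4] = Default::default();
    for (dst, raw) in word.iter_mut().zip(template) {
        *dst = resolve_element(raw, inputs)?;
    }
    Ok(word)
}

fn main() {
    let mut inputs = HashMap::new();
    inputs.insert("decimals", "8".to_string());
    inputs.insert("symbol", "POL".to_string());
    inputs.insert("max_supply", "1000000".to_string());

    // Mirrors the "metadata" slot from the faucet example above.
    let metadata = resolve_word(["{{decimals}}", "{{symbol}}", "{{max_supply}}", "0"], &inputs);
    println!("{metadata:?}");
}
```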
With the introduction of account components in #935, the approach changed slightly from what I described above. Now, An instantiated account component currently looks like so: pub struct AccountComponent {
library: Library,
storage_slots: Vec<StorageSlot>,
supported_types: BTreeSet<AccountType>,
} Basically, it contains the code of the component, the storage of the component, and a set of account types to which the component can be added (e.g., some components can be added only to faucet accounts). So, a component package should contain enough information to instantiate a component (or at least to describe how one should be instantiated). One approach is to say that a component consists of 2 files:
We could also add other files to this (e.g., WIT file that could be useful for the compiler) but I'll omit them for now. The metadata file could look something like this: name = "MyComponent"
description = "Description of my component"
version = "0.1.0"
targets = ["MutableAccount", "ImmutableAccount", "FungibleFaucet", "NonFungibleFaucet"]
# simple value slot with the value initialized to 0
[[storage]]
name = "simple-slot"
description = "some description of what this slot is for"
slot = 0
type = "value"
value = ["0", "0", "0", "0"]
# simple value slot which requires initialization
[[storage]]
name = "simple-slot"
slot = 1
type = "value"
value = "{{foo.bar.baz}}"
# sometimes a logical value can span multiple slots; would be good to handle this somehow but also this
# could be something left for the future
[[storage]]
name = "multi-slot"
slots = [2, 3]
# a map storage slot initialized with 2 key-value pairs
[[storage]]
name = "map-slot"
slot = 4
type = "map"
values = [
{ key = "0x123", value = "0x456" },
{ key = "0x234", value = "0x567" }
] In the above the The To instantiate an account from the above component, we'd need to provide the data for foo.bar.baz = "0x123456" Then, we could have a constructor that given these files can create an account component: impl AccountComponent {
pub fn read_from_file(
component_package: PathBuf,
component_data: PathBuf,
) -> Result<Self, AccountError> {
...
}
} |
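One small piece of such a constructor is turning the `targets` list from the metadata file into the set of supported account types. The sketch below shows that step only and makes several assumptions: `TargetType` is a local stand-in enum whose variants simply copy the strings from the example metadata (it is not the `AccountType` used by `AccountComponent`), and `parse_targets` is a hypothetical helper that receives the strings after the TOML has already been parsed by some other means.

```rust
use std::collections::BTreeSet;
use std::str::FromStr;

/// Local stand-in for the account types a component can target.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum TargetType {
    MutableAccount,
    ImmutableAccount,
    FungibleFaucet,
    NonFungibleFaucet,
}

impl FromStr for TargetType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "MutableAccount" => Ok(Self::MutableAccount),
            "ImmutableAccount" => Ok(Self::ImmutableAccount),
            "FungibleFaucet" => Ok(Self::FungibleFaucet),
            "NonFungibleFaucet" => Ok(Self::NonFungibleFaucet),
            other => Err(format!("unknown target `{other}`")),
        }
    }
}

/// Converts the `targets` strings into a set, failing on the first unknown name.
fn parse_targets(targets: &[&str]) -> Result<BTreeSet<TargetType>, String> {
    targets.iter().map(|t| t.parse()).collect()
}

fn main() {
    let targets = ["MutableAccount", "ImmutableAccount", "FungibleFaucet", "NonFungibleFaucet"];
    match parse_targets(&targets) {
        Ok(set) => println!("supported types: {set:?}"),
        Err(err) => eprintln!("invalid metadata: {err}"),
    }
}
```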
This sounds like the |
Yes, I was thinking about this too. One thing we need to decide is whether the account component package is basically the same as a Miden VM package just with extra metadata, or if they are different and the account component package can be derived from the VM package. For example, a Miden VM package can contain either executable or library code. But an account component package is always a library. A Miden VM package could also be used to describe a note - but for notes we might want to have a slightly different format (e.g., the metadata would be different). So, the question is whether the compiler produces "generic packages" or something more specific like "account component package" or "note package". Or maybe it does both - first produces a more generic package and then this gets converted into more specific packages. Regardless though, one of the main things in this PR is to define a format for the account storage schema. Once we have this, we can move this definition to other places, if needed. |
After reading this issue and taking a look at the approach on Miden VM, I have a couple of questions/comments: We probably want to make a new crate for this, right? I guess an alternative could be to add it to Regarding formats, I don't think I have too many notes on @bobbinth's schema as it seems to cover all bases while looking very usable. One thing that stands out is that it seems that defining the slot indices seems unnecessary because they are arranged contiguously in the component's Another small thing is that we could add a |
I was thinking this would be in
I think there are still a few open questions for me here:
Initially that's how I thought about them as well, but if we do want to add support for entries that span multiple slots (e.g., a public key that requires 64 bytes) we might have to specify slots indexes explicitly.
I generally agree with the approach not to have redundant info (e.g., if it is clear from the
I've assumed that we'd actually "merge" the
I was thinking |
Regarding these, I think it would be best to be as flexible as possible. Basically, I don't see any significant downside to supporting most (if not all) the variations you mentioned, as most of them represent a valid scenario, at least at first glance: For values intended as single words, simple hex notation is easier. For cases that require partial dynamic substitution, an array-like syntax is beneficial. Scenarios where you are matching against a subset of elements make sense for things such as assets where you are trying to represent something like One possible downside is that this might make parsing a bit more complex, but I don't see this as a blocker (unless it's much more complex than I am originally thinking).
I think this could still be supported without making the slot indices explicit, but rather specifying the length of the storage section that the logical slot will use, and assuming a length of one word otherwise (in your example, instead of |
I'd rather keep everything in a single file. Since we plan to have a repository for Miden packages, any path might be either local or remote, which complicates things. |
Do you mean to keep the contents of the |
Oh, no. I agreed with you, i.e, merging |
I agree with that as well. What I meant was to keep the single |
Closed by #1015. |
Feature description
Background
Currently, the Miden client only allows users to create accounts with standardized code (basic wallet and faucets) and storage. We want users to be able to create accounts with custom code and storage and deploy them on Miden using the Miden client. This is the equivalent of deploying Ethereum smart contracts, and it will widen the space for use cases as users can define their account interfaces.
We need to devise a way for users to write their own smart contracts and initialize their custom storage, which results in a .mac file that can be injected into the Miden client. See 0xPolygonMiden/miden-client#204.
We could come up with a simple Rust tool at the beginning.
We can mostly follow how we create accounts in here. To create an account, we need the following:
Account id
The account ID should be ground at account creation. We should notify the user that this might take some time.
Account vault and nonce
It should be clear that the vault is empty and the nonce is 0. Users can always populate the account's vault in the first transaction.
Account storage
That might be the most complex part. To initialize the AccountStorage, we need to know the type (simple Value, Map, Array) and the content of each slot. We have 255 slots. In technical terms, we need to construct a Vec<SlotItem>, where a SlotItem consists of (u8, (StorageSlotType, Word)).
One problem is that certain slots are reserved. The basic authentication smart contract assumes that the public key is always at slot 0. So, our tool or package needs to warn users if they occupy that slot with something else. We should also clearly document this.
Another problem is that we must ensure that the user provides the necessary data if the StorageSlotType is Map or Array.
We should also think of conflict resolution. The tool can notify users if they try to occupy the same slot twice.
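A minimal Rust sketch of the checks mentioned above, assuming the SlotItem shape given in this section. The types are local stand-ins defined here for illustration (a `Word` is simplified to four `u64` values, and `StorageSlotType` carries no payload), and `check_slots` and `SlotIssue` are hypothetical names. It reports problems instead of failing so that a tool could show all of them to the user at once.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for a storage word.
type Word = [u64; 4];

/// Slot types named in this section.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StorageSlotType {
    Value,
    Map,
    Array,
}

/// Shape taken from the description above: (slot index, (type, content)).
type SlotItem = (u8, (StorageSlotType, Word));

/// Slot assumed to be used by the basic authentication code.
const RESERVED_AUTH_SLOT: u8 = 0;
/// "We have 255 slots", so valid indices are 0..=254 in this sketch.
const NUM_SLOTS: usize = 255;

#[derive(Debug, PartialEq, Eq)]
enum SlotIssue {
    ReservedSlot(u8),
    Duplicate(u8),
    OutOfRange(u8),
}

/// Collects warnings for reserved, repeated, or out-of-range slot indices.
fn check_slots(items: &[SlotItem]) -> Vec<SlotIssue> {
    let mut seen = BTreeSet::new();
    let mut issues = Vec::new();
    for (index, _) in items {
        if usize::from(*index) >= NUM_SLOTS {
            issues.push(SlotIssue::OutOfRange(*index));
        }
        if *index == RESERVED_AUTH_SLOT {
            issues.push(SlotIssue::ReservedSlot(*index));
        }
        // `insert` returns false when the index was already present.
        if !seen.insert(*index) {
            issues.push(SlotIssue::Duplicate(*index));
        }
    }
    issues
}

fn main() {
    let items: Vec<SlotItem> = vec![
        (0, (StorageSlotType::Value, [1, 2, 3, 4])),
        (1, (StorageSlotType::Map, [0; 4])),
        (1, (StorageSlotType::Value, [0; 4])),
    ];
    for issue in check_slots(&items) {
        println!("warning: {issue:?}");
    }
}
```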
Account code
After users have written their MASM or Rust account code, we need to parse the code, pick the right assembler and create the AccountCode. I think this should not be an issue here.
Why is this feature needed?
We want generalized smart contracts. And we want users to write them.