Indexing metadata section in Forc manifest #21

Merged: 8 commits, Nov 19, 2024
96 changes: 96 additions & 0 deletions text/rfcs/0006-metadata-in-forc-manifest.md
@@ -0,0 +1,96 @@
- Feature Name: Metadata Section in Forc Manifest
- Start Date: 2024-01-25
- Author(s): @deekerno
- RFC PR: [FuelLabs/rfcs#21](https://github.com/FuelLabs/rfcs/pull/21)

## Summary

An optional section describing metadata should be added to the Forc manifest.

## Motivation

The objective of this change is to allow a Sway contract developer to describe arbitrary information that will be available for Forc plugins to process. The immediate use of this change will be to allow for Sway contract developers to specify information for indexing the contract in the Forc manifest, if desired. The expected outcome is that developers will be able to define indexing-specific _metadata_ in the same place and manner that they define other important information, e.g. authoring information, dependencies, etc.

_Note: This document will describe the change in the context of an indexer-specific example._

## Guide-Level Explanation

_Indexing-specific metadata_ refers to the set of information necessary to allow an indexing service to begin processing the data that results from contract execution. The service should use this information to create distinct and separate tables for contract data as well as to create routes on a publicly available querying API.

Consider the following example of a Forc project manifest:
```toml
[project]
authors = ["User"]
entry = "main.sw"
license = "Apache-2.0"
name = "my-fuel-project"

[project.metadata.index]
namespace = "example-namespace"
identifier = "example-identifier"
schema_path = "/path/to/autogenerated/schema"
```

Review discussion on the `namespace` field:

Member: How will we ensure that namespace is unique across projects?

Contributor Author: I thought about this for a while and I'm still not sure how to address this in the indexing metadata section.

For context, the namespace functionality was actually used as a way to separate indexers that may have the same name. Due to the original indexer's ability to be self-hosted, it was unlikely that namespaces would collide since teams were generally expected to run their own indexer service deployment. Additionally, the original indexer also contained functionality to use `forc wallet` to place indexer uploading/stopping/removal behind authenticated routes.

The next iteration of the indexer is intended to be a sort of hosted "public good" service. Thus, the indexer team expects users to deploy their indexers to the service, and in order to ensure that database tables are distinct from one another (as they've generally been named after their GraphQL or Sway types), we need to determine a unique prefix for tables associated with the same contract. The prefix should be user-chosen, as the developer experience with a user-generated identifier is likely to be better than with something generated that has no meaning to them. I did think about using a contract ID, but that will change if the content of the contract changes.

Member: Maybe the indexer website should provide the user with a unique key, which can be used as a prefix?

Are users paying for the indexing service? If so, it should be easy to have a key (or even user-chosen unique namespace) linked to their billing account.

Contributor Author: That's probably a good idea; users could request one using `forc wallet` and the service could handle generating a unique key in an idempotent manner.

An indexing service should use the data shown above in the `[project.metadata.index]` section to generate unique prefixes for database tables (e.g. `example-namespace_example-identifier`) as well as create a public API route through which dapps can query the service for information (e.g. `https://indexer.fuel.network/api/example-namespace/example-identifier`). The _namespace_ functions as the topmost organizational level for an indexer; one can think of it as a sort of folder in which distinct indexers (each identified by a unique _identifier_) are located. The _schema path_ will be a relative path to an autogenerated schema file; the schema will be created by an as-yet-to-be-implemented `forc index` command and could be placed in the `out` folder in the contract structure.
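
As a rough illustration of the derivation described above, the following sketch builds the table prefix and the API route from a namespace and an identifier. The `IndexMetadata` type, its method names, and the `base_url` parameter are assumptions made for this example; the RFC does not prescribe an implementation.

```rust
/// Hypothetical in-memory form of the indexing metadata section.
/// Field names mirror the manifest keys; the type itself is an assumption.
pub struct IndexMetadata {
    pub namespace: String,
    pub identifier: String,
}

impl IndexMetadata {
    /// Prefix for database tables: `<namespace>_<identifier>`.
    pub fn table_prefix(&self) -> String {
        format!("{}_{}", self.namespace, self.identifier)
    }

    /// Public API route under `base_url`: `<base_url>/api/<namespace>/<identifier>`.
    /// A trailing slash on `base_url` is tolerated.
    pub fn api_route(&self, base_url: &str) -> String {
        format!(
            "{}/api/{}/{}",
            base_url.trim_end_matches('/'),
            self.namespace,
            self.identifier
        )
    }
}
```

Keeping both derivations on one type means the table prefix and the route cannot drift apart, since both are computed from the same pair of values.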

For context, this was done on the original indexer in which tables were created for each entity in a user's GraphQL schema; using the namespace and identifier above, an `Order` entity would be accessible in Postgres at `example-namespace.example-identifier.order`. In the next implementation, tables will be created for each Sway struct in a contract.

Sway developers should find this section to be a convenient way to specify information necessary for indexing their data as it would be defined in the same location and format as project metadata. It can be added manually by the user or via a `forc` plugin (e.g. a yet-to-be-developed `forc index create`).

As mentioned earlier, this section should be optional as some developers may not have a desire to index their Sway contract. However, developers who do want to index their Sway contract should use the `forc index deploy` command after ensuring that the `[project.metadata.index]` section has been correctly defined and that their contract has been successfully built and deployed. The command should leverage the required project metadata as well as the contract ABI to create an asset package that will be uploaded to an indexing service; the service should process the asset package and begin indexing the Sway contract.

## Reference-Level Explanation

To be able to support an indexing subsection of a `[project.metadata]` section, an equivalent mapping for metadata should be added to [the `PackageManifest` type](https://github.com/FuelLabs/sway/blob/d06f3b4f6ed88f1ed9a3f8e601870ce5615b17c0/forc-pkg/src/manifest.rs#L141-L153):
```rust
/// A direct mapping to a `Forc.toml`.
#[derive(Serialize, Deserialize, Clone, Debug, PartialEq, Eq)]
#[serde(rename_all = "kebab-case")]
pub struct PackageManifest {
    pub project: Project,
    pub network: Option<Network>,
    pub dependencies: Option<BTreeMap<String, Dependency>>,
    pub patch: Option<BTreeMap<String, PatchMap>>,
    /// A list of [configuration-time constants](https://github.com/FuelLabs/sway/issues/1498).
    pub build_target: Option<BTreeMap<String, BuildTarget>>,
    build_profile: Option<BTreeMap<String, BuildProfile>>,
    pub contract_dependencies: Option<BTreeMap<String, ContractDependency>>,
}
```

This could be done in the way that `cargo` does it in which there is a `metadata` field with a value of type `Option<toml::Value>`, which would satisfy the optional property and ensure that the value is correctly-constructed TOML syntax. A Forc plugin dedicated to indexing (e.g. `forc index`) should then parse the manifest. If the necessary indexing metadata section is present, the plugin should leverage the adjusted `PackageManifest` type and other information from Forc in order to build an asset bundle and upload it to an indexing service. If it is not present, then the plugin should return an error. At no point should a well-defined metadata section lead to an error in `forc` or the Sway compiler.
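
The plugin-side behaviour described above might look roughly like the following. To keep the sketch free of external crates, a small `Value` enum stands in for `toml::Value`, and `Manifest`, `IndexMetadata`, `MetadataError`, and `index_metadata` are hypothetical names rather than part of `forc-pkg`. The `index` key follows the manifest example earlier in this document.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for `toml::Value` (only the variants this sketch needs).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    String(String),
    Table(BTreeMap<String, Value>),
}

/// Hypothetical, trimmed-down manifest: only the optional metadata field is modelled.
#[derive(Debug, Default)]
pub struct Manifest {
    pub metadata: Option<Value>,
}

/// Hypothetical typed view of the indexing metadata.
#[derive(Debug, PartialEq)]
pub struct IndexMetadata {
    pub namespace: String,
    pub identifier: String,
}

#[derive(Debug, PartialEq)]
pub enum MetadataError {
    MissingSection,
    MissingField(&'static str),
}

/// Reads a string-valued key from a table, or reports which key is missing.
fn string_field(
    table: &BTreeMap<String, Value>,
    key: &'static str,
) -> Result<String, MetadataError> {
    match table.get(key) {
        Some(Value::String(s)) => Ok(s.clone()),
        _ => Err(MetadataError::MissingField(key)),
    }
}

/// What an indexing plugin might do: return an error if the `index` table is
/// absent. The manifest itself stays valid for `forc` whether or not the
/// metadata is present.
pub fn index_metadata(manifest: &Manifest) -> Result<IndexMetadata, MetadataError> {
    let index = match &manifest.metadata {
        Some(Value::Table(meta)) => match meta.get("index") {
            Some(Value::Table(index)) => index,
            _ => return Err(MetadataError::MissingSection),
        },
        _ => return Err(MetadataError::MissingSection),
    };
    Ok(IndexMetadata {
        namespace: string_field(index, "namespace")?,
        identifier: string_field(index, "identifier")?,
    })
}
```

Because the metadata is stored as an untyped value, only the plugin that cares about a given subsection needs to validate it; `forc` and the compiler never interpret its contents.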

## Drawbacks

There are no foreseeable drawbacks. This optional section is intended to be leveraged by tools separate from the Sway compiler and should be able to be ignored with no side effects.

## Rationale and Alternatives

The original version of the indexer required users to create a separate project directory and write/edit three separate files in order to be able to index their Sway smart contract:
- a configuration manifest: specifies various aspects of how an indexer will function, e.g. unique name, contract to monitor, contract ABI, start/end block, etc.
- a GraphQL schema: contains the data structures which will be persisted as rows into a database
- a Rust module: uses the `#[indexer]` macro to create functions that process data from the Fuel blockchain into the data structures from the schema and save the data into database records

While it allowed for powerful customization, it increased the mental load on smart contract developers; furthermore, this is in addition to the mental load placed upon them by other user constraints arising from the nature of the original version's architecture. Thus, as the indexing team considers other ways to reduce the set of responsibilities on the user, I think this would be a beneficial addition that (due to its proposed optional status) should not have any detrimental effects on the rest of the developer experience.

An alternative would be to specify this information in a separate file and keep the information in `Forc.toml` purely related to building the contract. However, I do believe that having this optional indexing metadata section inline with the other project metadata makes for a more cohesive developer experience.

## Prior Art

### Cargo

The [Cargo manifest format](https://doc.rust-lang.org/cargo/reference/manifest.html) contains a number of sections, many of which are actually optional and not explicitly needed for compiling a Rust project (e.g. `[badges]`); in fact, there is a [`[metadata]` section](https://doc.rust-lang.org/cargo/reference/manifest.html#the-metadata-table) available in the manifest format which mirrors how the indexing metadata section should be used.

### Fuel Indexer

The original version of the Fuel indexer uses [a YAML manifest](https://docs.fuel.network/docs/indexer/project-components/manifest/) that contains the necessary data to start processing and storing blockchain data in a database; it contains the same data specified in the example snippet above as well as other metadata used to change the behavior of contract indexing.

In the next implementation of the indexer, not all of the fields that were present in the original manifest will be necessary. A user should only have to specify the namespace and identifier in an `index` section of the metadata section of `Forc.toml`. A `forc index` plugin will locate and process the contract ABI in order to generate the schema file; it will then modify the Forc manifest to add the `schema_path` field to the aforementioned section. If necessary, the original sections may be added as a new indexer service is built out; however, the namespace and identifier are the only things that will be required to build the artifacts that will be uploaded to the service.

### Smart Contract Languages

As far as I can tell, there does not seem to be equivalent prior art (with regard to indexing) in any of the most popular smart contract languages.

## Future Possibilities

One possibility is to implement a way for `forc index` to determine the block height at which a contract was deployed and add that field as part of the optional indexing section. This would allow the indexing service to skip directly to that point in the chain and prevent the unnecessary processing of blocks prior to contract deployment. If added, it would not need to be exact as the indexing service has historically been fast enough to process thousands of blocks in a short period of time; however, at millions of blocks, it would be beneficial at scale. As far as I can tell, `forc` does not persist the chain height at time of contract deployment; it may be prudent to log the information somewhere in the project directory.
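
To make the idea concrete, here is a sketch of how an indexing service could use such a field. `start_block` is a hypothetical field name and `blocks_to_process` a hypothetical helper; neither exists in `forc` or the indexer today, and the sketch assumes the chain starts at height 0.

```rust
use std::ops::RangeInclusive;

/// Returns the block heights to index, given the current chain height and an
/// optional (hypothetical) `start_block` taken from the manifest.
/// Without a start block, indexing begins at height 0.
/// A start block above the chain height yields an empty range.
pub fn blocks_to_process(start_block: Option<u64>, chain_height: u64) -> RangeInclusive<u64> {
    let first = start_block.unwrap_or(0);
    first..=chain_height
}
```

Since the field only sets a lower bound, an approximate deployment height that is slightly too early costs a few extra blocks of processing, whereas one that is too late would skip contract data.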