Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose component model in C api #9812

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

lanfeust69
Copy link

The goal is to be able to host wasm components in a wasmtime runtime embedded through the C api.

This was discussed in issue #8036.

Thanks to @rockwotj for the basis of the initial commit, in #7801.

I didn't see tests specific to the c-api, currently only tested through the example (and in our internal use !).

I'm not sure it the sizeable component.rs file should be split...

@lanfeust69 lanfeust69 requested a review from a team as a code owner December 13, 2024 11:17
@lanfeust69 lanfeust69 requested review from fitzgen and removed request for a team December 13, 2024 11:17
@github-actions github-actions bot added wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:c-api Issues pertaining to the C API. labels Dec 13, 2024
@rockwotj
Copy link
Contributor

Nice work! BTW one of the more complex things I had punted on in my POC was handling resources. From my conversations with @alexcrichton those significantly changed the Rust API when adding them, and might do the same in the C API (although maybe this has changed in the time since I did the POC?). I notice there is no resource support in this PR - it's probably worth a plan for that. At least that was the sentiment prior.

@fitzgen fitzgen requested review from alexcrichton and removed request for fitzgen December 13, 2024 13:46
@lanfeust69
Copy link
Author

Nice work! BTW one of the more complex things I had punted on in my POC was handling resources. From my conversations with @alexcrichton those significantly changed the Rust API when adding them, and might do the same in the C API (although maybe this has changed in the time since I did the POC?). I notice there is no resource support in this PR - it's probably worth a plan for that. At least that was the sentiment prior.

Indeed, having started from your change, and not needing resources on my side, I decided not to support that for now, but I should have made that clear.

I'm currently looking at the CI failures :

  • I've been "fooled" by clang-format, because there is no .clang-format file in the repo, so my default one would be picked up, with a significantly different formatting, and I assumed there was no strict formatting rules for C files.
  • Not sure if a function only used with some features should be marked #[allow(dead_code)] (which may miss the time it is no longer used) or #[cfg(any(feature = "feature1", feature = "feature2"))] (which gets unwieldy, and may prevent discoverability if useful elsewhere)
  • CI (reasonably) wants to run the C example for component, but the workaround in component/main.rs is not available, something else is needed.

@lanfeust69 lanfeust69 force-pushed the component_model_c_api branch from d6c996f to 25a626c Compare December 13, 2024 16:55
@dicej
Copy link
Contributor

dicej commented Dec 13, 2024

Regarding resources: we'll definitely need to support them eventually, given that they're already used heavily by WASIp2 and beyond; plus they're just really useful in general. This PR doesn't need to address that, but we should start thinking about what a follow-up PR might look like.

A minimal implementation would include:

Otherwise, I think it's just a matter of passing the uint32_t reps (representing either host-implemented or guest-implemented resource instances) back and forth between the host and guest, letting Wasmtime handle the details of lifting and lowering. @alexcrichton does that sound plausible to you?

@lanfeust69 lanfeust69 force-pushed the component_model_c_api branch from 25a626c to a9c5dbb Compare December 13, 2024 17:14
@lanfeust69 lanfeust69 force-pushed the component_model_c_api branch from a9c5dbb to f14d457 Compare December 13, 2024 17:21
@fitzgen
Copy link
Member

fitzgen commented Dec 13, 2024

Not sure if a function only used with some features should be marked #[allow(dead_code)] (which may miss the time it is no longer used) or #[cfg(any(feature = "feature1", feature = "feature2"))] (which gets unwieldy, and may prevent discoverability if useful elsewhere)

We generally prefer the cfg-based approach, even though it can get unweildy, because it provides better compile times and code sizes. Discoverability is not an issue because we build docs for docs.rs with feature(doc_auto_cfg) which automatically adds "this method depends on blah feature" to all cfged functions.

Copy link
Member

@alexcrichton alexcrichton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for your work here on this! I've left a few comments below, but I want to also gauge a bit on how much time/effort you have for this. This is a pretty big PR and there's quite a lot to bikeshed here, so I'd personally prefer to take this one-piece-at-a-time and perhaps break this PR up into a few smaller PRs. Would you have time for that? For example I'd like to land the "compile a component" first, then perhaps "define a function in a linker and instantiate a component", then "load a function from a component", then finally "call the component function". I realize that it wouldn't actually be useful until the end, but I think it'd help to talk about a piece-at-a-time since this is a pretty major feature being added.

@@ -49,6 +49,7 @@ logging = ['dep:env_logger']
disable-logging = ["log/max_level_off", "tracing/max_level_off"]
coredump = ["wasmtime/coredump"]
addr2line = ["wasmtime/addr2line"]
component-model = ["wasmtime/component-model", "cranelift"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally the component-model feature wouldn't necessarily imply cranelift, so would it be ok to gate some functions (e.g. component compilation) behind both features?

Comment on lines +19 to +26
/**
* \brief Whether or not to enable support for the component model in
* Wasmtime.
*
* For more information see the Rust documentation at
* https://docs.wasmtime.dev/api/wasmtime/struct.Config.html#method.wasm_component_model
*/
WASMTIME_CONFIG_PROP(void, component_model, bool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be moved to conifg.h?

(and also probably gated by WASMTIME_FEATURE_COMPONENT_MODEL

Comment on lines +8 to +9
#ifndef WASMTIME_COMPONENT_H
#define WASMTIME_COMPONENT_H
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to this #ifdef guard I think this whole file will want to be gated on WASMTIME_FEATURE_COMPONENT_MODEL since it'll otherwise not be available when that feature is disabled.

Comment on lines +119 to +121
/// A variable sized bitset.
WASMTIME_COMPONENT_DECLARE_VEC(val_flags, uint32_t);
WASMTIME_COMPONENT_DECLARE_VEC_NEW(val_flags, uint32_t);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasmtime's API doesn't quite reflect this yet but we're moving in a direction where flags is just a 64-bit integer, so I think it's reasonable to go ahead and define this on the C side as a small struct with a uint64_t member

Comment on lines +402 to +436
/// Representation of a component in the component model.
typedef struct wasmtime_component_t wasmtime_component_t;

/**
* \brief Compiles a WebAssembly binary into a #wasmtime_component_t
*
* This function will compile a WebAssembly binary into an owned
#wasmtime_component_t.
*
* It requires a component binary, such as what is produced by Rust `cargo
component` tooling.
*
* This function does not take ownership of any of its arguments, but the
* returned error and component are owned by the caller.
* \param engine the #wasm_engine_t that will create the component
* \param buf the address of the buffer containing the WebAssembly binary
* \param len the length of the buffer containing the WebAssembly binary
* \param component_out on success, contains the address of the created
* component
*
* \return NULL on success, else a #wasmtime_error_t describing the error
*/
wasmtime_error_t *
wasmtime_component_from_binary(const wasm_engine_t *engine, const uint8_t *buf,
size_t len,
wasmtime_component_t **component_out);

/**
* \brief Deletes a #wasmtime_component_t created by
* #wasmtime_component_from_binary
*
* \param component the component to delete
*/
void wasmtime_component_delete(wasmtime_component_t *component);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I might recommend if you're up for it is to organize component model support the same way as the core wasm support. For example in the wasmtime crate there's wasmtime::Foo for core wasm things and wasmtime::component::Foo for component-model things, basically everything component-related is in its own namespace.

For the C API we've got wasmtime/foo.h for a specific core-wasm thing and wasmtime.h for "everything core wasm". For components what do you think about wasmtime/component/foo.h for a specific item (e.g. this part here would be wasmtime/component/component.h like wasmtime/module.h) and then there'd be wasmtime/component.h which would include all the component headers.

While this header isn't too too big just yet I could imagine it becoming much larger over time as it supports more features of the component model.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I'll look at this.

Comment on lines +473 to +474
// declaration from store.h
typedef struct wasmtime_context wasmtime_context_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be removed in favor of a #include <wasmtime/store.h>?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to favor forward declarations when possible, mostly for build time reasons (more to include is slower, and changes in the included header imply a re-build that could be avoided). What are the reasons for preferring a full #include ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK build-time concerns matter a lot for C++ but far less so for C, so I'd prefer to not duplicate definitions because structures like this can change over time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus one - C is super cheap to build and this is a trivial amount of just declarations.

Comment on lines +540 to +550
/**
* \brief Builds the linker, providing the host functions defined by calls to
* #wasmtime_component_linker_define_func
*
* \param linker the #wasmtime_component_linker_t to build
*
* \return wasmtime_error_t* On success `NULL` is returned, otherwise an error
* is returned which describes why the build failed.
*/
wasmtime_error_t *
wasmtime_component_linker_build(wasmtime_component_linker_t *linker);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still only reviewing the header file so far, but I'm a bit confused by this in the sense that a linker at least in Rust doesn't have a "build" operation, it can just always be used to instantiate after an item is added to the linker. Given that I'm not sure what this is doing internally and what the expected contract would be with respect to when callers need to use this in relation to instantiate below?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the way I tried to "hide" the difference between Linker and LinkerInstance, the latter didn't seem useful to expose to the C api. But in order to add host functions to the linker, you need the appropriate LinkerInstance, and that can only be created once (through .root() of .instance(name)). An alternative could be to keep the LinkerInstances around in a map, or to provide a lookup mechanism to Linker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I think it's fine to change the Rust APIs to better accomodate the C APIs if necessary, but otherwise I'd prefer to keep the two relatively in-sync to make them more managable to maintain, so having a LinkerInstance equivalent in the C API I think would be reasonable (or changing the Rust API to not be based on LinkerInstance)

void wasmtime_component_linker_delete(wasmtime_component_linker_t *linker);

/// Representation of a component instance
typedef struct wasmtime_component_instance_t wasmtime_component_instance_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a missing wasmtime_component_instance_delete?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I missed something (very possible !) these instances are always created inside a store, belong to it, and will be deleted when the store goes down.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true yeah, but as an opaque pointer isn't this allocating something to make it pointer-sized in the implementation?

wasmtime_component_instance_t **instance_out);

/// Representation of an exported function in Wasmtime component model.
typedef struct wasmtime_component_func_t wasmtime_component_func_t;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a missing wasmtime_component_func_delete?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly, I assumed get_func would return some kind of handle to existing data, but even (not so sure of that looking at the code) there is still the C wrapper to delete.

Comment on lines +586 to +588
bool wasmtime_component_instance_get_func(
const wasmtime_component_instance_t *instance, wasmtime_context_t *context,
const char *name, size_t name_len, wasmtime_component_func_t **item_out);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I might recommend doing here is to match the get_export style from the Rust API where lookup-by-name isn't what's done in the component model exactly but rather it's "lookup within this optional instance this name" which return "an exported thing" and then the "exported thing" can be downcast into a function. That way this can handle nested exports in instances and such.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, as this was part of the initial POC I started from, and matched my needs, I didn't look much more at the underlying structures on the Rust side. This may be a bit similar to my attempt to "hide" the LinkerInstances nested structures within the Linker.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I can see how this works for getting top-level exports, but ideally we'll want to at least set this API design up for extending to the entire component model even if it initially doesn't support it. In that sense I think it'd be good to model the nested instance structure of exports here.

@lanfeust69
Copy link
Author

@alexcrichton : Thanks for a very quick first look !

I personally prefer sizeable PR broken down through meaningful commits (because of the logical coupling, an oversight in an earlier PR may only be really visible when used later on), but unfortunately GitHub (as well as GitLab) make the review process much more PR/MR oriented than commit-oriented, which kind of defeats this approach. I'll try to see if I can find a split that makes sense to me (without looking closely, I have a feeling that some of the steps you suggest might be too small, at least compared to others).

I'll try to answer your code-related comments tomorrow. I'm reasonably confident I'll eventually be able to actually address what should be done, though I won't make any promise on the timeline 😉.

@lanfeust69
Copy link
Author

Not sure if a function only used with some features should be marked #[allow(dead_code)] (which may miss the time it is no longer used) or #[cfg(any(feature = "feature1", feature = "feature2"))] (which gets unwieldy, and may prevent discoverability if useful elsewhere)

We generally prefer the cfg-based approach, even though it can get unweildy, because it provides better compile times and code sizes. Discoverability is not an issue because we build docs for docs.rs with feature(doc_auto_cfg) which automatically adds "this method depends on blah feature" to all cfged functions.

Well, clippy barked at the allow(dead_code) version, so I ended up using the cfg-based version.

@alexcrichton
Copy link
Member

Personally I'd prefer to split this into separate PRs if you're ok with that because there's a fair amount to discuss here and it's easy to lose track of discussions when it's all in one PR. I agree this is an unfortunate artifact of using github, but there's not a whole lot we can do about that other than work around it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wasmtime:api Related to the API of the `wasmtime` crate itself wasmtime:c-api Issues pertaining to the C API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants