-
Notifications
You must be signed in to change notification settings - Fork 975
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[metal] Use objc2-metal
#5641
base: trunk
Are you sure you want to change the base?
[metal] Use objc2-metal
#5641
Conversation
This may technically be a breaking change if the user implemented these protocols themselves on a `Mutable` class, but that'd be unsound anyhow, so I'll consider this a correctness fix. This is useful for wgpu, see gfx-rs/wgpu#5641, and the hack will become unnecessary after #563.
bd69f0d
to
943f2a7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a great step forward from having to maintain our own set of bindings manually.
I do have some questions/comments:
- We still seem to be using the
msg_send!
macro in a bunch of places, could we use the generated methods instead? - We also use
autoreleasepool
s in a bunch of places because we had issues with leaks but as far as I understand from the docs ofRetained
, a lot of these (if not all) should now be unnecessary. Is this correct? I think it would make sense to remove them in this PR. - While I appreciate the naming being more consistent with metal's, some of the enum variants now have the enum name as their prefix which feels redundant; but not always which feels odd. Examples:
MTLFeatureSet
's variants are not prefixedMTLLanguageVersion
's variants are all prefixed- one of
MTLReadWriteTextureTier
's variants is not prefixed while the other 2 are)
- Some of the function names are quite long now (ex:
copyFromTexture_sourceSlice_sourceLevel_sourceOrigin_sourceSize_toBuffer_destinationOffset_destinationBytesPerRow_destinationBytesPerImage_options
😆) but I'm not sure how they can be shortened by the generator while also keeping things easy to search for on Apple's docs. - It would be great if all objects had docs from Apple's docs website and/or at least a link to the page on said website but no hurry, just giving my +1 for it :).
wgpu-hal/src/metal/device.rs
Outdated
// TODO: `newComputePipelineStateWithDescriptor:error:` is not exposed on | ||
// `MTLDevice`, is this always correct? | ||
fn new_compute_pipeline_state_with_descriptor( | ||
device: &ProtocolObject<dyn MTLDevice>, | ||
descriptor: &MTLComputePipelineDescriptor, | ||
) -> Result<Retained<ProtocolObject<dyn MTLComputePipelineState>>, Retained<NSError>> { | ||
unsafe { msg_send_id![device, newComputePipelineStateWithDescriptor: descriptor, error: _] } | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's odd that it doesn't exist since newRenderPipelineStateWithDescriptor:error:
does.
We could use newComputePipelineStateWithDescriptor:options:reflection:error:
though and pass MTLPipelineOptionNone
for options
and nil
for reflection
.
so we did use it for a while, but then were able to remove the use on it, so yes, its not necessary anymore |
Thanks for taking a look!
I've already done this in a separate branch, but there were some issues around the semantics not being quite the same, so I wanted to keep it for a separate PR where that discussion could be more focused.
There's a lot of nuance to this question, including the fact that the optimization that allows us to avoid an autorelease is not guaranteed, only likely (depends on the exact emitted assembly, inlining, and the phase of the moon). What has changed though is that autorelease pools no longer have any effect on the program, other than in terms of memory usage (autoreleased objects can be reclaimed sooner) and runtime performance (pushing and popping an autorelease pool has a slight overhead). I'd really prefer to keep it out of this PR, mostly because it'll screw with the diff even more than it already is (indentation of large function bodies will change), but also partly because I do not know why each of these are here, and I'd like to retain that memory-usage profile until deemed wise to do otherwise.
Yup, that's a bug, the logic for implementing this name translation is very simplistic and a bit hastily thrown together - Swift has the correct rules written down, but it'll be breaking to change, so I'll fix it in
Yeah :/ Feel free to comment on madsmtm/objc2#284 if you think of a good solution (or just any solution)!
There's madsmtm/objc2#309 open for it, the local Xcode documentation is stored in an undocumented format that I spent a few hours on, but couldn't immediately reverse-engineer. Linking is similarly difficult, not for class names, but for methods, they have some ID in them, which is why I didn't pursue this - but I guess just linking to the class name would still be a step up, will try to prioritize it. |
I just tried to actually benchmark this, but it's not an art that I'm familiar with, and the results seemed inconclusive (both improvements and regressions, without any clear pattern that I could discern). I'd suggest that someone more familiar with that tries to benchmark this. |
@madsmtm For what it's worth we used to see a huge amount of time in |
I've opened #6107 to fix the semantic issues, btw. Whichever of these two PRs merge first, I'll update the other to remove the remaining |
980b16b
to
86a12fb
Compare
This PR has been open since the spring, so I want to let folks know what's going on with it. We (the wgpu maintainers) think Firefox has been incorporating wgpu Firefox uses a supply-chain auditing tool called But this means that taking this PR means that Firefox cannot update the wgpu sources in Mozilla Central until we have audited every line of The next steps for this PR are:
What I'm hoping to get across here is that, despite the delay, this PR is important to us, and although we have some very stern constraints, we're moving the process along. |
Thank you very much for the update @jimblandy! It's really nice to see that you're taking supply chain attacks seriously, it's something we very much need to be better at in our industry! Regarding the ability to audit/review the Code-wise, I can give a quick primer:
Feel free to ask if there's something here that you need help with, or if there's something I can change to make things easier to audit (now and in the future)! On the social/people side, would it help if we had an online meeting or something? Though I guess that could be construed as a form of social engineering attack, and wouldn't really add anything trust-wise. In any case, thanks again for the update, and I'm totally fine with waiting a few more months, or however long it takes you! |
In our meeting today, the wgpu maintainers decided to mark this as draft, since we'll be getting back to this in 2025. |
1e2ad44
to
03eff06
Compare
To keep the diff smaller and easier to review, this uses a temporary fork of `objc2-metal` and `objc2-quartz-core` whose methods use the naming scheme of the `metal` crate. One particular difficult part with this is that the `metal` crate has several methods where the order of the arguments are swapped relative to the corresponding Objective-C methods. This includes most perilously (since these have both an offset and an index argument, both of which are integers): - `set_bytes` - `set_vertex_bytes` - `set_fragment_bytes` - `set_buffer` - `set_vertex_buffer` - `set_fragment_buffer` - `set_threadgroup_memory_length` But also: - `set_vertex_texture` - `set_fragment_texture` - `set_sampler_state` - `set_vertex_sampler_state` - `set_fragment_sampler_state`
Small update: I've released a new version of I also tried running the benchmarks again, seems to be better than last time I tried it, now all changes are within ~2% of the current |
Description
Use the
objc2-metal
crate instead of themetal
crate. This will:Arc
, as Metal objects are already reference-counted.objc_retainAutoreleasedReturnValue
underneath the hood to avoid putting objects into the autorelease pool when possible, reducing memory pressure.Background
The
metal
crate contains bindings to the Metal framework. This usesobjc
to manually perform the message sends, which is quite error-prone, see gfx-rs/metal-rs#284, gfx-rs/metal-rs#319 and gfx-rs/metal-rs#209 for a few examples of unsoundness (out of many).To solve such problems in the Rust ecosystem in general, I created a successor to
objc
calledobjc2
, which contains most notably the smart-pointerRetained
and an improvedmsg_send!
macro, that together ensure that Objective-C's memory-management rules are upheld correctly.This is only part of the solution though - we'd still have to write the bindings manually. To solve this, I created a tool (planning to integrate it with
bindgen
, but that's likely a multi-year project) to generate such framework crates automatically. In acknowledgement that this tool is by far not perfect, and never will be, I've ensured that there's a bunch of options to modify each generated crate.The modifications for
objc2-metal
in particular are currently just a few hundred lines of code, weak evidence that the generator is fairly good at this point. I'll also bring attention to the file whereunsafe
methods are marked safe - I have plans to investigate ways to semi-automatically figure out if something is safe or not, or at least reduce the burden of doing so, see madsmtm/objc2#685, but it's a hard problem to ensure is completely sound, so for now it's a bit verbose.Connections
gpu-allocator
has transitioned toobjc2-metal
in Traverse-Research/gpu-allocator#225, andblade
has transitioned in kvark/blade#229.gfx-rs/metal-rs#241 is an old open PR for using
objc2
inmetal
internally instead. There currently isn't really a clear path forwards there, and it's a lot of work for less direct benefits to the ecosystem (wgpu-hal
is by far the biggest user ofmetal
). But more fundamentally IMO, it's a problem of separation of concerns;metal
defines several Foundation types likeNSArray
,NSString
,NSError
and so on, and evenCAMetalLayer
from QuartzCore, and that's a bad idea for interoperability compared to having separate crates for each of these frameworks.#5752 removed the
link
feature which is not available in theobjc2
crates.Implementation
The first commit implements the actual migration, by using a branch of
objc2
, with a method naming scheme that (more closely) matchesmetal
, to make it easier to review and test what's changed. Especially confusing is arguments that are re-ordered between the Objective-C API, and the API thatmetal-rs
exposes, see gfx-rs/metal-rs#144.The second commit moves to the real naming scheme that
objc2
uses.I'd strongly recommend you review these two commits separately.
Testing
Tested by using the checklist below, as well as running each example individually, and checking that they seem to work.
During the development of this I made two quite critical typos, which were luckily found by the test suite, but there's bound to be at least one more lurking in here somewhere, please test this thoroughly!
Checklist
cargo fmt
.cargo clippy
.cargo xtask test
to run tests.CHANGELOG.md
. See simple instructions inside file.