Skip to content

Conversation

SupaMaggie70Incorporated
Copy link
Collaborator

@SupaMaggie70Incorporated SupaMaggie70Incorporated commented Aug 14, 2025

Connections
Mostly split off from #7930
Works towards #7197

Description
This PR adds mesh shading info to naga IR so that parsers and writers have an interface to use.

Testing
No testing yet (coming in later PRs, the code here has been tested in #7930)

Squash or Rebase?

Squash

Checklist

  • Run cargo fmt.
  • Run taplo format.
  • Run cargo clippy --tests. If applicable, add:
    • --target wasm32-unknown-unknown
  • Run cargo xtask test to run tests.
  • If this contains user-facing changes, add a CHANGELOG.md entry.

@SupaMaggie70Incorporated
Copy link
Collaborator Author

I'm unsure why the MSRV minimal versions thing is failing, it doesn't look related to this PR since it happens in another crate and this PR doesn't touch anything cargo.

@cwfitzgerald
Copy link
Member

cwfitzgerald commented Aug 17, 2025

Should be fixed with #8112, thanks @andyleiserson!

/// Optional `blend_src` index used for dual source blending.
/// See <https://www.w3.org/TR/WGSL/#attribute-blend_src>
blend_src: Option<u32>,
per_primitive: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we have some docs for this? The Location variant's docs talk about passing values from the vertex stage to the fragment stage; we should make sure the story told here makes sense for readers working on mesh shaders too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried to write some docs. Let me know if you have further comments.

@SupaMaggie70Incorporated
Copy link
Collaborator Author

Going to ping you again @jimblandy just to make sure that this doesn't get forgotten about until the next meeting :)

@jimblandy
Copy link
Member

@SupaMaggie70Incorporated I'll try to get to this tonight, but it may be Saturday.

@jimblandy
Copy link
Member

@SupaMaggie70Incorporated Are these comments at the top of mesh_shading.md still accurate?

Currently naga has no support for mesh shaders beyond recognizing the additional shader stages. For this reason, all shaders must be created with Device::create_shader_module_passthrough.

@SupaMaggie70Incorporated
Copy link
Collaborator Author

@SupaMaggie70Incorporated Are these comments at the top of mesh_shading.md still accurate?

Currently naga has no support for mesh shaders beyond recognizing the additional shader stages. For this reason, all shaders must be created with Device::create_shader_module_passthrough.

I've updated the top of mesh_shading.md. Until further PRs it remains true that you will need to use passthrough shaders but it now has more "processing" ability so the statement wasn't strictly true.

@jimblandy
Copy link
Member

jimblandy commented Aug 25, 2025

(This is not a review comment on this PR, and not directed at SupaMaggie, just commenting on the pre-existing state of the docs.)

This is not a thorough explanation of mesh shading and how it works. Those wishing to understand mesh shading more broadly should look elsewhere first.

This is just not adequate. If we're going to add syntax to Naga that is not covered by the WGSL specification, we need to document that syntax. Knowing how mesh shading works in Vulkan does not magically explain the syntax for mesh shaders in WGSL, or how to invoke them in wgpu.

@SupaMaggie70Incorporated
Copy link
Collaborator Author

(This is not a review comment on this PR, and not directed at SupaMaggie, just commenting on the pre-existing state of the docs.)

This is not a thorough explanation of mesh shading and how it works. Those wishing to understand mesh shading more broadly should look elsewhere first.

This is just not adequate. If we're going to add syntax to Naga that is not covered by the WGSL specification, we need to document that syntax. Knowing how mesh shading works in Vulkan does not magically explain the syntax for mesh shaders in WGSL, or how to invoke them in wgpu.

It wouldn't be part of this PR, but if you think that having a writeup somewhere to describe mesh shaders would be useful, I'm happy to do that separately. However, I have never written a comprehensive GPU API spec before, so I don't exactly think it would be of comparable quality to e.g. the webgpu spec :)

Also, that specific disclaimer was copy-pasted directly from the RT spec.

@jimblandy
Copy link
Member

It wouldn't be part of this PR, but if you think that having a writeup somewhere to describe mesh shaders would be useful, I'm happy to do that separately. However, I have never written a comprehensive GPU API spec before, so I don't exactly think it would be of comparable quality to e.g. the webgpu spec :)

Well, the alternative is just saying "read the code and the examples and figure it out". You want people to actually use your work, right?

I just pushed some docs; do they look correct?

@SupaMaggie70Incorporated
Copy link
Collaborator Author

@jimblandy Not to bug you too much but this is really the one PR that I want to get in as soon as possible, since it blocks everything else (which will be harder to work on as school ramps up). So if you could prioritize getting a review and subsequent merge like within the next week that'd be very much appreciated.

@SupaMaggie70Incorporated
Copy link
Collaborator Author

@jimblandy This PR has now been dead for more than 3 weeks, will you have time to review it before the face to face?

@SupaMaggie70Incorporated
Copy link
Collaborator Author

SupaMaggie70Incorporated commented Oct 1, 2025

@jimblandy I see your comment in the meeting notes about getting to it today or reassigning it, I will hold you to that :)

At this point the only deadline is the next release (28.0) so I'm not in a huge rush, more just hoping that subsequent PRs that will likely require more back-and-forth can be landed in a more reasonable amount of time.

@ErichDonGubler
Copy link
Member

@SupaMaggie70Incorporated: But...our next release is planned to be today, innit? Or were you referring to your own downstream release?

@SupaMaggie70Incorporated
Copy link
Collaborator Author

SupaMaggie70Incorporated commented Oct 1, 2025

@SupaMaggie70Incorporated: But...our next release is planned to be today, innit? Or were you referring to your own downstream release?

@ErichDonGubler I meant the release after this one. My goal had been for this release but that (obviously) won't be happening.

- Extensive revisions to `docs/api-specs/mesh_shading.md`.

- Doc comments.

- Ensure `Module` stays at the bottom of `ir/mod.rs`.

- Avoid a clone.

- Rename some arguments to be more specific.

- Minor readability tweaks.
@jimblandy
Copy link
Member

jimblandy commented Oct 2, 2025

I was finally able to make time to look at this again today. I've pushed a commit with some further changes; please check them out to see if they look right. Here's my current review checklist for the code:

  • Reviewing Naga mesh shader patches
    • enable mesh_shading required?
    • @payload checks
      • is @payload optional on @task shaders, or required? It's optional in SPIR-V.
      • size/type restrictions?
    • @task checks
      • @Workgroup size attribute present
      • Returns vec3?
      • Return type marked with @builtin(mesh_task_size)?
    • SPIR-V requirements
      • TaskEXT
      • MeshEXT
      • OutputPrimitivesEXT
      • OutputVertices
    • @mesh checks
      • @Workgroup size attribute present
      • has no return value
      • has a @primitive_output(P, PN) attribute
        • P is a struct type
        • Exactly one of the following @builtins is present: triangle_indices, line_indices, point_indices
        • @Builtin / @Locations are unique (perhaps already enforced)
        • Every member with @location must also have @per_primitive
        • conditions on PN?
      • has a @vertex_output(V, VN) attribute
        • V satisfies the same requirements as a @vertex shader return value
    • @vertex checks
    • @Fragment checks
      • @per_primitive attribute is present only on inputs that have @location
    • pipeline creation time checks
    • General
      • If your change iterates over a collection, did you ensure the
        order of iteration was deterministic? Using HashMap and
        HashSet is fine, as long as you don't iterate over it.
      • If you insert elements into a set or map that you expect are not
        already present, did you make an assertion about insert's
        return value?
      • If you added a new feature to WGSL that is not covered by the
        WebGPU specification:
        • Did you add a Capability flag for it?
        • Did you document the feature fully in that flag's doc comment?
        • Did you ensure the validator rejects programs that use the
          feature unless its capability is enabled?
      • If your change adds or removes Handles from the IR:
        • Did you update handle validation in valid::handles?
        • Did you update the compactor in compact?
        • Did you update back::pipeline_constants::adjust_expr?
      • If your change adds a new operation:
        • Did you update the typifier in proc::typifier?
        • Did you update the validator in valid::expression?
        • If the operation can be used in constant expressions, did you
          update the constant evaluator in proc::constant_evaluator?
      • If your change introduces any new identifiers to generated code,
        how did you ensure they won't conflict with the users'
        identifiers? (This is usually not relevant to the SPIR-V
        backend.)
        • Did you use the Namer to generate a fresh identifier?
        • Did you register the identifier as a reserved word with the the Namer?
        • Did you use a reserved prefix registered with the Namer?

- `@primitive_output(P, NP)`: This indicates that the mesh shader workgroup will generate at most `NP` primitives, each of type `P`.

Each mesh shader entry point invocation must call the `setMeshOutputs(numVertices: u32, numPrimitives: u32)` builtin function exactly once, in uniform control flow. The values passed by each workgroup's first invocation (that is, the one whose `local_invocation_index` is `0`) determine how many vertices (values of type `V`) and primitives (values of type `P`) the workgroup must produce. This call essentially establishes two implicit arrays of vertex and primitive values, shared across the workgroup, for invocations to populate.
Before generating any results, each mesh shader entry point invocation must call the `setMeshOutputs(numVertices: u32, numPrimitives: u32)` builtin function exactly once, in uniform control flow. The values passed by each workgroup's first invocation (that is, the one whose `local_invocation_index` is `0`) determine how many vertices (values of type `V`) and primitives (values of type `P`) the workgroup must produce. This call essentially establishes two implicit arrays of vertex and primitive values, shared across the workgroup, for invocations to populate.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it used to be this, but after writing the SPIR-V writer I decided that would be unnecessary. On SPIR-V, we have a temporary output buffer that is then copied from when finishing execution, so the order doesn't matter. I think we would do the same with HLSL. I haven't thought about MSL but that should work fine too.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might still good to document this limitation, even if it isn't a concern yet, as it would also allow us to change writer behavior down the line.

@SupaMaggie70Incorporated
Copy link
Collaborator Author

@jimblandy Thanks for getting to this! The code/documentation changes all look great to me. I'll probably spend some time looking over the spec since it looks like you focused a lot on that. The task list you laid out looks long and intimidating but I think each individual item should be very quick. Hopefully we can get this in soon-ish! And hopefully that represents a more solidified API that won't be changing much as I realize more limitations.

One note though, some of the items aren't covered in this PR. The SPIR-V changes come later, and I really want to push pipeline creation checks to another PR so I can get this one in ASAP and have multiple PRs in the works afterward. The last few points though definitely I will think about on my WGSL parser and SPIR-V writer PRs

* `Limits::max_task_workgroup_total_count` - the maximum total number of workgroups from a `draw_mesh_tasks` command or similar. The dimensions passed must be less than or equal to this limit when multiplied together.
* `Limits::max_task_workgroups_per_dimension` - the maximum for each of the 3 workgroup dimensions in a `draw_mesh_tasks` command. Each dimension passed must be less than or equal to this limit.
* `max_mesh_multiview_count` - The maximum number of views used when multiview rendering with a mesh shader pipeline.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When #8206 lands, this will change to max_mesh_multiview_view_count I think. Something to keep in mind, for whichever lands first (hopefully this one!)


A mesh shader entry point must have a `@workgroup_size` attribute, meeting the same requirements as one appearing on a compute shader entry point.

If the mesh shader pipeline has a task shader entry point with a `@payload(G)` attribute, then the pipeline's mesh shader entry point must also have a `@payload(G)` attribute, naming the same variable. Mesh shader invocations can read, but not write, this variable, which is initialized to whatever value was written to it by the task shader workgroup that dispatched this mesh shader grid.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HLSL requires that task shaders have a non-zero sized task payload, so we will probably reflect that in naga.

- `cull_primitive`: The annotated member must be of type `bool`. If it is true, then the primitive is skipped during rendering.

Additionally, the `@location` attributes from the vertex and primitive outputs can't overlap.
Every member of `P` with a `@location` attribute must either have a `@per_primitive` attribute, or be part of a struct type that appears in the primitive data as a struct member with the `@per_primitive` attribute.
Copy link
Collaborator Author

@SupaMaggie70Incorporated SupaMaggie70Incorporated Oct 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not exactly sure what this means. Maybe we should just require that the @per_primitive attribute only be applied to struct members with @location. Unsure

@SupaMaggie70Incorporated
Copy link
Collaborator Author

SupaMaggie70Incorporated commented Oct 2, 2025

I'm super super pleased in general with the new spec :) No further comments

Just one thing I want to note. This is the naga IR pull request, if we could defer most of the spec/documentation work to later PRs and tweaks I'd appreciate it since as I mentioned I kinda want the code here to get in sooner rather than later.

@jimblandy
Copy link
Member

Yes, hopefully the checklist is fine-grained enough that it won't actually take that long.

I see that the backends are stubbed out for now, but I wanted to at least write down all the validation that seemed to be necessary. If some of the checklist items are carried over to later PRs that's fine.

@jimblandy
Copy link
Member

I think it used to be this, but after writing the SPIR-V writer I decided that would be unnecessary. On SPIR-V, we have a temporary output buffer that is then copied from when finishing execution, so the order doesn't matter. I think we would do the same with HLSL. I haven't thought about MSL but that should work fine too.

@SupaMaggie70Incorporated So the idea is that you wouldn't do the OpSetMeshOutputsEXT until the very end of the function, and it'd be followed by a copy? Because the SPV_EXT_mesh_shader spec says:

OpSetMeshOutputsEXT must be called before any variable from Output storage class is written to.

That would require two copies of the vertex and primitive buffers. Is there no way to avoid that?

@SupaMaggie70Incorporated
Copy link
Collaborator Author

I think it used to be this, but after writing the SPIR-V writer I decided that would be unnecessary. On SPIR-V, we have a temporary output buffer that is then copied from when finishing execution, so the order doesn't matter. I think we would do the same with HLSL. I haven't thought about MSL but that should work fine too.

@SupaMaggie70Incorporated So the idea is that you wouldn't do the OpSetMeshOutputsEXT until the very end of the function, and it'd be followed by a copy? Because the SPV_EXT_mesh_shader spec says:

OpSetMeshOutputsEXT must be called before any variable from Output storage class is written to.

That would require two copies of the vertex and primitive buffers. Is there no way to avoid that?

In the SPIR-V writer part I document why there must be a temporary buffer for both. It boils down to the fact that the output buffers must match the max vertices and max primitives for the entry point. So if a function is called by multiple mesh shader entry points with differnet max vertex and max primitive counts, it can still only write to one buffer, so a private variable with size equal to the largest of the entry points is used. Its super hacky and unfortunate, but that's the best I could come up with.

The idea would be that you could call setMeshOutputs anywhere, multiple times, and possibly not at all (it would default to zero). Then when the entry point exits, it would call that and then write all of the vertices/primitives. The other benefit of this approach is that validation is much easier, since when exiting you never write past the values given in setMeshOutputs without giving the user the opportunity to write beyond the limit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants