-
Notifications
You must be signed in to change notification settings - Fork 64
Rework pipeline shader spec info #871
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: stagesless_shaders
Are you sure you want to change the base?
Conversation
include/nbl/asset/ICPUPipeline.h
Outdated
core::smart_refctd_ptr<IShader> shader = nullptr; | ||
std::string entryPoint = ""; | ||
IPipelineBase::SUBGROUP_SIZE requiredSubgroupSize : 3 = IPipelineBase::SUBGROUP_SIZE::UNKNOWN; //!< Default value of 8 means no requirement | ||
uint8_t requireFullSubgroups : 1 = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm maybe we should have a ICPUComputePipeline::SShaderSpecInfo : ICPUPipeline::SShaderSpecInfo
to enforce even more strong typing (no requireFullSubgroup
in pipelines not using the compute or mesh stage)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I see that since you saw that compute only has one stage, you hoisted the requireFullSubgroup
into the creation parameters, which is acceptable, unfortunately you took the requiredSubgroupSize
with it which was not supposed to happen
return {}; | ||
} | ||
|
||
|
||
inline virtual bool valid() const override final | ||
{ | ||
if (!m_layout) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
layout valid() check needed too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
inline core::smart_refctd_ptr<base_t> clone_impl(core::smart_refctd_ptr<const ICPUPipelineLayout>&& layout, uint32_t depth) const override final | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cannot be publid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
inline virtual bool valid() const override final | ||
{ | ||
if (!m_layout) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
layout valid check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
inline virtual std::span<const SShaderSpecInfo> getSpecInfo(hlsl::ShaderStage stage) const override final | ||
{ | ||
if (stage==hlsl::ShaderStage::ESS_COMPUTE && isMutable()) | ||
return {&m_specInfo,1}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const method does not need to check isMutable()
, the non-const (which I take is implemented in the base class) does
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
include/nbl/asset/ICPUPipeline.h
Outdated
virtual std::span<SShaderSpecInfo> getSpecInfo(const hlsl::ShaderStage stage) = 0; | ||
inline std::span<const SShaderSpecInfo> getSpecInfo(const hlsl::ShaderStage stage) const | ||
{ | ||
return getSpecInfo(stage); | ||
} | ||
virtual std::span<const SShaderSpecInfo> getSpecInfo(const hlsl::ShaderStage stage) const = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where's our murable variant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I see its in ICPUPipeline, but why not here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isMutable is a method of ICPUPipeline, not ICPUPipelineBase
} | ||
return newSpecInfo; | ||
if (!isMutable()) return {}; | ||
const auto specInfo = static_cast<const this_t*>(this)->getSpecInfo(stage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const_cast
not static cast, you're not going up or down the hierarchy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
size_t childCount = 0; | ||
size_t childrenVisited = 0; | ||
const IAsset* asset; | ||
core::unordered_set<const IAsset*> unvisitedChilds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
grammar nitpick, the plural of child is children
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a more efficient way to do this, you can keep one unordered set (the alreadyVisited
) and instead of pushing the asset + the returned unordered_map onto a stack, you quickly iterate the returned unordered_map and push their contents onto the stack if they're not in alreadyVisited
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
SKIP_BUILT_IN_PRIMITIVES = 1<<12, | ||
SKIP_AABBS = 1<<13, | ||
NO_NULL_ANY_HIT_SHADERS = 1<<14, | ||
NO_NULL_CLOSEST_HIT_SHADERS = 1<<15, | ||
NO_NULL_MISS_SHADERS = 1<<16, | ||
NO_NULL_INTERSECTION_SHADERS = 1<<17, | ||
ALLOW_MOTION = 1<<20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think ICPURayTracingPipeline also needs to know about these flags and they must be settable on it ( so the cached creation params have the flags then)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
esp the SKIP and ALLOW_MOTION, the NO_NULL we can have a debate about
// Nabla requires device's reported subgroup size to be between 4 and 128 | ||
enum class SUBGROUP_SIZE : uint8_t | ||
{ | ||
// No constraint but probably means `gl_SubgroupSize` is Dynamically Uniform | ||
UNKNOWN = 0, | ||
// Allows the Subgroup Uniform `gl_SubgroupSize` to be non-Dynamically Uniform and vary between Device's min and max | ||
VARYING = 1, | ||
// The rest we encode as log2(x) of the required value | ||
REQUIRE_4 = 2, | ||
REQUIRE_8 = 3, | ||
REQUIRE_16 = 4, | ||
REQUIRE_32 = 5, | ||
REQUIRE_64 = 6, | ||
REQUIRE_128 = 7 | ||
}; | ||
|
||
struct SCachedCreationParams final | ||
{ | ||
SUBGROUP_SIZE requiredSubgroupSize : 3 = SUBGROUP_SIZE::UNKNOWN; //!< Default value of 8 means no requirement |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nooo, you can require a subgroup size on any shader! needs to go back into ShaderSpecInfo
its only requireFullSubgroups
thats a compute-specific thing
include/nbl/asset/IPipeline.h
Outdated
// Nabla requires device's reported subgroup size to be between 4 and 128 | ||
enum class SUBGROUP_SIZE : uint8_t | ||
{ | ||
// No constraint but probably means `gl_SubgroupSize` is Dynamically Uniform | ||
UNKNOWN = 0, | ||
// Allows the Subgroup Uniform `gl_SubgroupSize` to be non-Dynamically Uniform and vary between Device's min and max | ||
VARYING = 1, | ||
// The rest we encode as log2(x) of the required value | ||
REQUIRE_4 = 2, | ||
REQUIRE_8 = 3, | ||
REQUIRE_16 = 4, | ||
REQUIRE_32 = 5, | ||
REQUIRE_64 = 6, | ||
REQUIRE_128 = 7 | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should have stayed here
core::smart_refctd_ptr<ICPUPipelineLayout> layout; | ||
if (_depth>0u && getLayout()) | ||
if (_depth > 0u) | ||
layout = core::smart_refctd_ptr_static_cast<ICPUPipelineLayout>(getLayout->clone(_depth-1u)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax error, no ()
on the getLayout
?
inline base_t* clone_impl(core::smart_refctd_ptr<const ICPUPipelineLayout>&& layout, uint32_t depth) const override final | ||
inline core::smart_refctd_ptr<base_t> clone_impl(core::smart_refctd_ptr<const ICPUPipelineLayout>&& layout, uint32_t depth) const override final |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, cannot be public
if (!specInfo.shader) return false; | ||
stagePresence != stage; | ||
return true; | ||
}; | ||
if (!processSpecInfo(vertexShader)) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're returning false if a shader is missing, its only the vertex shader that's required, I think you should replace
if (!specInfo.shader) return false;
stagePresence != stage;
return true;
with
if (specInfo.shader)
{
if (specInfo.valid()) // does this check spec constants being correctly set up etc?
return false;
stagePresence |= stage;
}
return false;
you also have a typo with !=
having the not operator instead of |
const auto stage = info.stage; | ||
if ((stage & ~hlsl::ShaderStage::ESS_ALL_RAY_TRACING) != 0) | ||
return false; | ||
if (!std::has_single_bit<std::underlying_type_t<hlsl::ShaderStage>>(stage)) | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stage is now implied from the member position and doesn't need to be checked
// every shader must not be null. use SIndex::Unused to represent unused shader. | ||
return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
now nullptr
shader stages will imply and Unused index
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you only need to check that raygen is present and valid
return impl_valid([](const SShaderSpecInfo& info)->bool | ||
{ | ||
if (!info.valid()) | ||
return false; | ||
return false; | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thats just repeating over and over, I don't think we need callbacks to check shader stages anymore
auto getShaderStage = [this](size_t index) -> hlsl::ShaderStage | ||
{ | ||
return shaders[index].stage; | ||
}; | ||
|
||
auto isValidShaderIndex = [this, getShaderStage](size_t index, hlsl::ShaderStage expectedStage, bool is_unused_shader_forbidden) -> bool | ||
{ | ||
if (index == SShaderGroupsParams::SIndex::Unused) | ||
return !is_unused_shader_forbidden; | ||
if (index >= shaders.size()) | ||
return false; | ||
if (getShaderStage(index) != expectedStage) | ||
return false; | ||
return true; | ||
}; | ||
|
||
if (!isValidShaderIndex(shaderGroups.raygen.index, hlsl::ShaderStage::ESS_RAYGEN, true)) | ||
{ | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all of this is stale code and needs to go
for (const auto& shaderGroup : shaderGroups.hits) | ||
{ | ||
// https://docs.vulkan.org/spec/latest/chapters/pipelines.html#VUID-VkRayTracingPipelineCreateInfoKHR-flags-03470 | ||
if (!isValidShaderIndex(shaderGroup.anyHit, | ||
hlsl::ShaderStage::ESS_ANY_HIT, | ||
bool(flags & FLAGS::NO_NULL_ANY_HIT_SHADERS))) | ||
return false; | ||
|
||
// https://docs.vulkan.org/spec/latest/chapters/pipelines.html#VUID-VkRayTracingPipelineCreateInfoKHR-flags-03471 | ||
if (!isValidShaderIndex(shaderGroup.closestHit, | ||
hlsl::ShaderStage::ESS_CLOSEST_HIT, | ||
bool(flags & FLAGS::NO_NULL_CLOSEST_HIT_SHADERS))) | ||
return false; | ||
|
||
if (!isValidShaderIndex(shaderGroup.intersection, | ||
hlsl::ShaderStage::ESS_INTERSECTION, | ||
false)) | ||
return false; | ||
} | ||
|
||
for (const auto& shaderGroup : shaderGroups.misses) | ||
{ | ||
if (!isValidShaderIndex(shaderGroup.index, | ||
hlsl::ShaderStage::ESS_MISS, | ||
false)) | ||
return false; | ||
} | ||
|
||
for (const auto& shaderGroup : shaderGroups.callables) | ||
{ | ||
if (!isValidShaderIndex(shaderGroup.index, hlsl::ShaderStage::ESS_CALLABLE, false)) | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rewrite needed, correct?
@@ -15,6 +15,147 @@ class IGPURayTracingPipeline : public IGPUPipeline<asset::IRayTracingPipeline<c | |||
using pipeline_t = asset::IRayTracingPipeline<const IGPUPipelineLayout>; | |||
|
|||
public: | |||
struct SCreationParams |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you have both this and SCreationParams
below?
return core::smart_refctd_ptr<ICPURayTracingPipeline>(retval,core::dont_grab); | ||
} | ||
|
||
inline core::smart_refctd_ptr<base_t> clone_impl(core::smart_refctd_ptr<const ICPUPipelineLayout>&& layout, uint32_t depth) const override final |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cannot be public
inline virtual std::span<const SShaderSpecInfo> getSpecInfo(hlsl::ShaderStage stage) const override final | ||
{ | ||
switch (stage) | ||
{ | ||
case hlsl::ShaderStage::ESS_RAYGEN: | ||
return { &m_raygen, 1 }; | ||
case hlsl::ShaderStage::ESS_MISS: | ||
return m_misses; | ||
case hlsl::ShaderStage::ESS_ANY_HIT: | ||
return m_hitGroups.anyHits; | ||
case hlsl::ShaderStage::ESS_CLOSEST_HIT: | ||
return m_hitGroups.closestHits; | ||
case hlsl::ShaderStage::ESS_INTERSECTION: | ||
return m_hitGroups.intersections; | ||
case hlsl::ShaderStage::ESS_CALLABLE: | ||
return m_callables; | ||
|
||
} | ||
return {}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need a mutable getter of the vector<ShaderSpecInfo>&
itself because otherwise we can't add/remove elements after creation only mutate ones which are already there
// TODO(kevinyu): Fix this temporary dummy code | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're on the same page, then this is baiscally checking a layout and raygen shaders are there
All other stages are optional, but if they are present, they need to be checked for validity (spec constants and stuff)
Remember that we can patch NO_NULL flags in the asset converter if we detect incompatibility (missing stages), those will be in our patch.
Rework SShaderSpecInfo for ICPUPipeline to be more mutable:
IPipelineBase::SShaderSpecInfo should be templated on a boolean being mutable or not, such that members are conditional_t, and for CPU: