diff --git a/src/SUMMARY.md b/src/SUMMARY.md index a653bdb..f48d19c 100644 --- a/src/SUMMARY.md +++ b/src/SUMMARY.md @@ -17,4 +17,16 @@ - [Tutorial](./tutorial/introduction.md) - [Getting Started](./tutorial/getting_started.md) - [Loading Models](./tutorial/loading_models.md) - - [Physics](./tutorial/physics.md) + - [Tick Function & Systems](./tutorial/systems.md) + - [Coordinates & Transformations](./tutorial/coordinates.md) + - [Physics & Colliders](./tutorial/physics.md) + - [Dynamic Textures](./tutorial/dynamic_texturing.md) + - [Drawing Text](./tutorial/drawing_text.md) + - [The Vulkan Context](./tutorial/vulkan_context.md) + - [The Render Context](./tutorial/render_context.md) + - [Custom Rendering Part 1](./tutorial/custom_rendering_1.md) + - [Custom Rendering Part 2](./tutorial/custom_rendering_2.md) + - [Custom Rendering Part 3](./tutorial/custom_rendering_3.md) + - [Custom Rendering Part 4](./tutorial/custom_rendering_4.md) + - [Resources & Conclusion](./tutorial/conclusion.md) + diff --git a/src/images/coordinate-system.svg b/src/images/coordinate-system.svg new file mode 100644 index 0000000..333fb06 --- /dev/null +++ b/src/images/coordinate-system.svg @@ -0,0 +1,8 @@ + + + + + + + ++y+x+z \ No newline at end of file diff --git a/src/images/hotham-icon.png b/src/images/hotham-icon.png new file mode 100644 index 0000000..6e8b58a Binary files /dev/null and b/src/images/hotham-icon.png differ diff --git a/src/images/openxr-grip.png b/src/images/openxr-grip.png new file mode 100644 index 0000000..5dca9e6 Binary files /dev/null and b/src/images/openxr-grip.png differ diff --git a/src/images/posz-properties.png b/src/images/posz-properties.png new file mode 100644 index 0000000..203de71 Binary files /dev/null and b/src/images/posz-properties.png differ diff --git a/src/images/quest2-icon.png b/src/images/quest2-icon.png new file mode 100644 index 0000000..299b9ce Binary files /dev/null and b/src/images/quest2-icon.png differ diff --git a/src/images/spark-reactor-icon.png b/src/images/spark-reactor-icon.png new file mode 100644 index 0000000..32d1791 Binary files /dev/null and b/src/images/spark-reactor-icon.png differ diff --git a/src/images/vulkan-logo-scaled.png b/src/images/vulkan-logo-scaled.png new file mode 100644 index 0000000..0fd713e Binary files /dev/null and b/src/images/vulkan-logo-scaled.png differ diff --git a/src/tutorial/conclusion.md b/src/tutorial/conclusion.md new file mode 100644 index 0000000..b21a0ec --- /dev/null +++ b/src/tutorial/conclusion.md @@ -0,0 +1,77 @@ +# Resources & Conclusion + +This set of 13 introductory lessons barely scratched the surface of how to use the library. Things such as the audio context and the +animation context have not been covered, as they are fairly thin wrappers around existing third-party APIs such as +oddio, and their APIs do not require extensive knowledge of computer graphics to use effectively.
+ +Hotham at its current stage of 0.2.0 is a library rather than a huge hulking framework, and this often means there is no right or +wrong way to do things with the library. The key pattern is to mix and match the systems you need to develop your solution, and +implement those that do not exist or which are found wanting in their existing implementation. + +This degree of freedom means the potential for novel shaders to be used in conjunction with the existing world/physics/model loading +systems, and to dynamically choose a series of created pipelines depending on the existing world-situation. + +To take hotham to its higher potential, knowledge of existing emergent paradigms of GPU based rendering which have evolved over +the past 15-20 years is useful, including the standards that have arisen through the work of the Khronos Group, such as OpenGL, +and Vulkan, as well as file formats such as KTX2 and GLB/GLTF. + +It can be extremely daunting when first encountering the world of GPU based rendering. To help those new to the field, I have +included a short set of references below that I have found personally useful in one form or another, or that I am aware are somewhat +definitive within the field. I highly recommend going full-immersion into this type of material for a month or more and getting +the mental muscle memory required to make these concepts less formidable. Once you have developed the necessary abstractions, +your own personal libraries to complement Hotham's low level exposure of a number of complex APIs, you can move forward with +the real work of developing your perfect immersive UX. + +If there is call for a video version of these tutorials to explain the concepts more in depth, these may be forthcoming in +the future, however in the meantime feel free to email me if you have any questions, or better yet, come on over to the Discord +and meet others who are using the library to get some help! + +Thanks for hanging out with us here, and good luck on the rest of your journey! + +Matt from Spark Reactor, September 2023. + +![Spark Reactor](../images/spark-reactor-icon.png) ![Oculus](../images/quest2-icon.png) ![Hotham](../images/hotham-icon.png) ![Vulkan](../images/vulkan-logo-scaled.png) + +## Books which develop skills in computer graphics + +* **Pawel Lapinski**'s [Vulkan Cookbook](https://www.amazon.com/Vulkan-Cookbook-potential-generation-graphics/dp/1786468158) is a recipe based book on Vulkan that nevertheless also provides coherent explanations for all of its recipes. +* **Marco Castorina** and **Gabriel Sassone** created [Mastering Graphics Programming with Vulkan](https://www.amazon.com.au/Mastering-Graphics-Programming-Vulkan-state/dp/1803244798), a book on creating a rendering engine from first principles using Vulkan. +* **Graham Sellers** and **John Kessenich** created [The Vulkan Programming Guide](https://www.vulkanprogrammingguide.com/), the official guide to learning Vulkan +* Multiple authors including **Tomas Akenine-Moller** contributed to produce [Real Time Rendering](https://www.realtimerendering.com/), the resource page linked above includes chapters on collision detection and ray tracing. They also provide a [book recommendations page](https://www.realtimerendering.com/books.html) which includes a lot of **free** books such as Principles of Digital Image Synthesis, Immersive Linear Algebra, and more. +* **Richard S. 
Wright**, **Nicholas Haemel** and others contributed to the [OpenGL SuperBible](https://www.opengl.org/sdk/docs/books/SuperBible/), now in its sixth edition. This is *the book* on OpenGL, with principles that are relevant to Vulkan and GLSL shaders. +* **Eric Lengyel** produced [Foundations of Game Engine Development](https://foundationsofgameenginedev.com/), a four-volume series dedicated to the algorithms and mathematical underpinnings of the craft. + +## Computer graphics: Transformations + +* GPU Open has a [Matrix Compendium](https://gpuopen.com/learn/matrix-compendium/matrix-compendium-intro/), a collection of information on matrix math for transformations in one place. +* [This YouTube tutorial](https://www.youtube.com/watch?v=zjMuIxRvygQ) links to an [interactive series](https://eater.net/quaternions) on visualising quaternions and 3D rotation. + +## Shaders and OpenGL + +* [The Book of Shaders](https://thebookofshaders.com/) is a good site for information on programmatic fragment shaders. +* [Inigo Quilez](https://iquilezles.org/articles/) has articles on numerous computer graphics topics, also focusing on fragment shaders. +* [Learn OpenGL](https://learnopengl.com/) is a resource devoted to, as the name suggests, learning about OpenGL. Both OpenGL and Vulkan are +maintained by the Khronos Group and share a number of key similarities, especially in their use of shaders and the GPU +parallelism concepts that lie at the heart of real-time rendering. +* [glslEditor](https://github.com/patriciogonzalezvivo/glslEditor) is a project which lets you develop programmatic fragment shaders in real time, and +is available to use live at [this website address](http://editor.thebookofshaders.com/). + +## Vulkan Resources + +* The original resource for learning Vulkan is the well-known [Vulkan tutorial](https://vulkan-tutorial.com/), which focuses on drawing a single +triangle before moving on to slightly more advanced topics such as mipmaps, multisampling and depth buffering. +* Another good written tutorial is [VulkanGuide](https://vkguide.dev/), which has an excellent selection of tutorials and links to relevant +websites such as GPU Open, different sets of Vulkan samples and more. +* Khronos Group's [GitHub page](https://github.khronos.org/) lists all of the official Khronos repositories, including the [Vulkan Samples](https://github.com/KhronosGroup/Vulkan-Samples) repository. This includes multiple gigabytes of examples in different languages. +* [Vulkan Tutorial in Rust](https://github.com/unknownue/vulkan-tutorial-rust) is the above Vulkan tutorial converted to Rust and Ash. This is worth a look to see the coding techniques used and the API calls translated into design patterns that can be replicated in your own applications. +* The GPU Open website has [a section](https://gpuopen.com/learn/developing-vulkan-apps/) dedicated to developing Vulkan applications. This includes blog posts, sample code, libraries and tools. +* The [Vulkan YouTube channel](https://www.youtube.com/@Vulkan) has a variety of talks which help to shed light on difficult topics. Within the last year, this channel has also posted a video from Vulkanised 2023 with a list of developer resources.
Other useful talks: + * [The low level mysteries of pipeline barriers](https://www.youtube.com/watch?v=aIR3x_X92y8) + * [Vulkan subpasses](https://www.youtube.com/watch?v=M0upwYZqRMI) + * [Render passes in Vulkan](https://www.youtube.com/watch?v=yeKxsmlvvus) +* Brendan Galea has a series of 31 videos on developing a game engine using Vulkan which covers the material of the Vulkan tutorial and more in a coherent, code-driven way. You can find his tutorial [here](https://www.youtube.com/playlist?list=PL8327DO66nu9qYVKLDmdLW_84-yE4auCR). +* [Voxelphile](https://www.youtube.com/@Voxelphile/videos) on YouTube has a number of videos on Vulkan in Rust, including: + * [Vulkan graphics pipeline and buffer creation](https://youtu.be/h0CvNOLIggY) + * [How to make a Renderer (in Rust)](https://youtu.be/qkdy5yL-EjU) +* [Tantan](https://www.youtube.com/@Tantandev/videos)'s videos on writing a voxel engine in Rust are IMO the best videos I've seen on the topic, and I recommend keeping an eye on this channel for useful content. The author is using Bevy for their implementation, but the concepts are OpenGL/Vulkan-related at their core. +* [Mike Bailey](https://web.engr.oregonstate.edu/~mjb/vulkan/)'s Vulkan page provides Vulkan material licensed under a Creative Commons Attribution/NonCommercial/NoDerivatives 4.0 license. He also links to a very good summary of Vulkan's structures called *Vulkan in 30 minutes*, available [here](https://renderdoc.org/vulkan-in-30-minutes.html). diff --git a/src/tutorial/coordinates.md b/src/tutorial/coordinates.md new file mode 100644 index 0000000..92d75e7 --- /dev/null +++ b/src/tutorial/coordinates.md @@ -0,0 +1,306 @@ +# Coordinates & Transformations + +Back in the last tutorial, we saw how to load our objects from a series of .glb files into the world. To provide more context, here is the full code for the function that snippet was a part of: + +```rust,noplayground +fn add_skybox(models: &std::collections::HashMap<String, World>, world: &mut World) { + let negx = add_model_to_world("negx", models, world, None); + let posx = add_model_to_world("posx", models, world, None); + let negy = add_model_to_world("negy", models, world, None).unwrap(); + let posy = add_model_to_world("posy", models, world, None).unwrap(); + let negz = add_model_to_world("negz", models, world, None).unwrap(); + let posz = add_model_to_world("posz", models, world, None).unwrap(); + + let rigid_posz = RigidBody { + body_type: BodyType::Fixed, + ..Default::default() + }; + + let collider_posz = Collider::new(SharedShape::cuboid(2.5, 0.05, 2.5)); + + world.insert_one(posz, rigid_posz); + world.insert_one(posz, collider_posz); + + let rigid_negz = RigidBody { + body_type: BodyType::Fixed, + ..Default::default() + }; + + let collider_negz = Collider::new(SharedShape::cuboid(2.5, 0.05, 2.5)); + + world.insert_one(negz, rigid_negz); + world.insert_one(negz, collider_negz); + + let bust = add_model_to_world("horned one", models, world, None).unwrap(); + { + let mut local_transform = world.get::<&mut LocalTransform>(bust).unwrap(); + local_transform.translation.y = -1.5; + local_transform.translation.x = 2.; + local_transform.translation.z = 0.; + local_transform.rotation = Quat::from_rotation_y((-90.0_f32).to_radians()); + } + let collider = Collider::new(SharedShape::ball(0.15)); + world.insert(bust, (collider, Grabbable {})).unwrap(); + + let photo_frame = add_model_to_world("Photo Frame", models, world, None).unwrap(); + { + let mut local_transform = world.get::<&mut LocalTransform>(photo_frame).unwrap(); + local_transform.translation.y = -1.5; + local_transform.translation.x = 2.;
+ local_transform.translation.z = 0.5; + local_transform.rotation = Quat::from_rotation_y((-90.0_f32).to_radians()); + } + + let rigid_photo = RigidBody { + body_type: BodyType::Dynamic, + mass: 0.2, + ..Default::default() + }; + + let collider = Collider::new(SharedShape::cuboid(0.15, 0.15, 0.02)); + world.insert(photo_frame, (collider, Grabbable {}, rigid_photo)).unwrap(); +} +``` + +Here's what you'll see if you take a look at the source file in Blender: + +![Positive Z cuboid Properties](../images/posz-properties.png) + +As we discussed previously, Blender's Z axis is the same as Hotham and OpenXR's Y axis. The coordinate system is outlined in the [OpenXR specification](https://registry.khronos.org/OpenXR/specs/1.0/html/xrspec.html#coordinate-system), and as you will see, positive Z moves toward the camera, while negative Z is into the screen. Positive X is to your right, negative X to your left, and positive Y is the upward direction. + +![Coordinate system](../images/coordinate-system.svg) + +Consider the Blender file shown above. The Z coordinate is 2.55 metres above the origin, and the object is 10 cm thick. This means that its thickness spreads 5 cm above and below 2.55 m, with the bottom of the cuboid touching Z = 2.5 m. This is the 0.05 that you notice in the parameters passed to create the collider for the floor and ceiling. In 3D, objects are typically quantified with respect to a point of origin, with length, width and breadth spreading out around that centre. Thus, a cuboid is specified by its half-widths (2.5 and 0.05 in the above example), and a ball shape is defined by its radius. + +In an ideal world, you might expect to load up a scene like this and find yourself in the middle of the room floating in space, at the origin, looking at the centre of the wall. In practice, this is not what happens. + +When you boot up your headset, you initially start at the origin of what is called the *globally oriented stage*: the centre of your Guardian. As you walk about, your translation from the centre of your Guardian changes. However, the position of the head mounted display is *above the floor*. If you print the HMD's transform: +```rust,noplayground +println!("{:?}", *engine.world.get::<&mut LocalTransform>(engine.hmd_entity).unwrap()); +``` + +you will find that the translation of the HMD above the origin varies as you move your head left, right, up and down, as does the rotation. Typically, if you are sitting down and you are of average height, the display will be between 60 and 70 cm off the floor. + +If you examine `src/engine.rs` in the crate, you will find this code within `engine.update()`: + +```rust,noplayground + // Since the HMD is parented to the Stage, its LocalTransform (ie. its transform with respect to the parent) + // is equal to its pose in stage space. + let hmd_in_stage = self.input_context.hmd.hmd_in_stage(); + let mut transform = self + .world + .get::<&mut LocalTransform>(self.hmd_entity) + .unwrap(); + transform.update_from_affine(&hmd_in_stage); +``` + +The position of the head mounted display is obtained from OpenXR itself, and it is treated just like any other input device. If you have head tracking enabled in your app and you walk around, your view (including your translation and rotation) changes. Moreover, the value returned by the input context is entirely independent of any global transform you might set on the stage entity.
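+As a concrete sketch (not code from the Hotham examples), you could read this value yourself during the tick loop. The snippet below assumes `LocalTransform` is imported from `hotham::components` and reuses the same `engine.world` and `engine.hmd_entity` fields shown above; the helper name is hypothetical:
+
+```rust,noplayground
+use hotham::components::LocalTransform;
+
+/// Hypothetical helper: how high is the headset above the stage origin (the floor)?
+fn hmd_height_above_floor(engine: &hotham::Engine) -> f32 {
+    let hmd_transform = engine
+        .world
+        .get::<&LocalTransform>(engine.hmd_entity)
+        .unwrap();
+    // Y is "up" in the OpenXR coordinate system, so the Y component of the
+    // translation is the headset's height above the stage origin.
+    hmd_transform.translation.y
+}
+```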
+ +# Transformations + +Let's look a little closer at the engine setup: + +```rust,noplayground +fn create_tracking_entities(world: &mut hecs::World) -> (hecs::Entity, hecs::Entity) { + let stage_entity = world.spawn(( + Stage {}, + LocalTransform::default(), + GlobalTransform::default(), + )); + let hmd_entity = world.spawn(( + HMD {}, + Parent(stage_entity), + LocalTransform::default(), + GlobalTransform::default(), + )); + (stage_entity, hmd_entity) +} +``` + +Notice how the head mounted display mentioned earlier is a child of the stage object. The implication of this for navigation is spelled out in the documentation for the stage component: + +*"In short, the final position of the player in the game simulation (ie. global space) is: `stage.position * hmd.position`. **You** are responsible for controlling the `Stage`, and the **engine** will update the `HMD`."* + +Why is this so important? Take a look at the implementation of `Mul` for two affine transformations. + +```rust,noplayground +impl Mul for Affine3A { + type Output = Affine3A; + + #[inline] + fn mul(self, rhs: Affine3A) -> Self::Output { + Self { + matrix3: self.matrix3 * rhs.matrix3, + translation: self.matrix3 * rhs.translation + self.translation, + } + } +} +``` + +You'll notice that the rotation of the parent system, which is part of the 3x3 matrix of the affine `self` (on the lhs), gets multiplied by the translation of the local system on the rhs, before adding its own translation to the final result. Scale and rotation are applied first, then translation. + +A `LocalTransform` is simply multiplied by the parent transform, as in the following code from `update_global_transform_recursively` in `update_global_transform_with_parent`: + +```rust,noplayground + { + let child_matrix = &mut world.get::<&mut GlobalTransform>(*child).unwrap().0; + *child_matrix = *parent_matrix * *child_matrix; + } +``` + +**This means that if I get up and move away from the origin of my Guardian, and then rotate the stage about its own origin, I will rotate on a circle around the stage object's global translation, with the radius of that circle being my headset's translation from the origin of the Guardian.** + +If you are not careful in considering the parenting of objects, you can end up describing circles within circles and getting quite confused. At first this seems counterintuitive, until you understand the larger picture. A visual aid is helpful. Intuitively, it is easy to mentally reverse the order in which rotation and translation occur, because we normally visualise rotation as occurring along a radius, which we conflate with the translation. But this would be an incorrect interpretation of the situation. + +Instead, you should picture all child objects of some parent (such as the head mounted display) as being *oriented* with respect to their parent space, with an *origin* determined by the parent's translation. Scale spreads out uniformly from this translation point, and then the child is translated, with all axes having been rotated around the origin by the parent's rotation and scaled accordingly. Spend some time getting comfortable with this image, because it will serve you well when figuring out more complicated situations. Meditate on it, run the transformation forward and backward in your mind. + +See the additional arcs of rotation occurring, but the translation point remaining unchanged.
See the child, translated with respect to that rotation, moving by degrees around a circle, keeping its own up, left and right each time, but with that up, left and right constrained, as if it is being moved through toffee, or better yet, through a magnetic field. + +We see this type of relative motion in nature with physical frames of reference such as a planet, a solar system, or a galaxy. If you imagine your transformations in this way, you won't get confused. + +# Summary of methods + +To adjust the position of an object, we usually query its `LocalTransform`, as in the example above: + +```rust,noplayground + let mut local_transform = world.get::<&mut LocalTransform>(bust).unwrap(); +``` + +We then call the implementation functions on that borrowed memory, or set the fields within the borrowed struct, to adjust the transformation like this: + +```rust,noplayground + local_transform.translation.y = -1.5; + local_transform.translation.x = 2.; + local_transform.translation.z = 0.; + local_transform.rotation = Quat::from_rotation_y((-90.0_f32).to_radians()); +``` + +or like this example, from a struct which stores the borrowed world when it is constructed: + +```rust,noplayground + pub fn set_affine(&self, entity: &Entity, affine: Affine3A) { + let mut local_trans = self.world.get::<&mut LocalTransform>(*entity).unwrap(); + local_trans.update_from_affine(&affine); +} +``` + +If an object has a parent, we always use the `LocalTransform`. The `GlobalTransform` is useful for getting the position in global space. + +# GOS and Global from Stage + +If you read through the example code in the Hotham crate, you may come across terms such as `gos_from_stage`, `global_from_stage`, +and `gos_from_global`. These are multiplied together and can look overwhelming at first glance, but here's an explanation: + +* `stage::global_from_stage(&world)` gets the `GlobalTransform` of the `stage_entity`. +* `gos_from_global` is often the inverse of the translation taken from this global transform, ignoring rotation. +* `gos_from_stage` is `gos_from_global * global_from_stage`. This changes the `GlobalTransform` of the stage object to remove the translation. + +These terms are used in order to transform everything into global coordinates without affecting their orientation. This is used, for instance, +in computing the real world positions of all objects with respect to global space for the shader in the rendering code. + +Removing translation while allowing rotation is useful when you don't care about an object's orientation and only care about the position of +objects with respect to one another. This `gos_from_global` transformation is used in conjunction with their `GlobalTransform` to prepare a +series of objects for scaling and rotation. + + + +These explanations can remain abstract without an example, so let's look at the navigation system code from the Hotham examples. + +```rust,noplayground + // Get the stage transform. + let mut stage_transform = world.get::<&mut LocalTransform>(stage_entity).unwrap(); + let global_from_stage = stage_transform.to_affine(); + + // Get the hand transforms. + let stage_from_left_grip = input_context.left.stage_from_grip(); + let stage_from_right_grip = input_context.right.stage_from_grip(); + + // Update grip states.
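+    // (Added note) Hotham names its transforms using an `a_from_b` convention: a value
+    // called `a_from_b` maps coordinates expressed in space `b` into space `a`. Composition
+    // reads right to left, so `global_from_stage * stage_from_left_grip` below yields a
+    // `global_from_left_grip` transform, i.e. the grip pose expressed in global space.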
+ if input_context.left.grip_button_just_pressed() { + state.global_from_left_grip = Some(global_from_stage * stage_from_left_grip); + } + if input_context.right.grip_button_just_pressed() { + state.global_from_right_grip = Some(global_from_stage * stage_from_right_grip); + } + if input_context.right.grip_button() && input_context.left.grip_button_just_released() { + // Handle when going from two grips to one + state.global_from_right_grip = Some(global_from_stage * stage_from_right_grip); + } + if !input_context.left.grip_button() { + state.global_from_left_grip = None; + state.scale = None; + } + if !input_context.right.grip_button() { + state.global_from_right_grip = None; + } +``` + +In this first snippet, the positions of the left and right grips returned from OpenXR are transformed using the `LocalTransform` of the stage object. This puts them into global space. The remaining logic ensures that the stored global-space transforms of these grips remain consistent. Note that although `GlobalTransform` and `LocalTransform` should be consistent at the start of a frame after the appropriate systems have run, `LocalTransform` is being used here in case it has been updated during this frame, as the navigation system runs before `update_global_transform_system` and `update_global_transform_with_parent_system`. When these two systems run after the navigation system, any changes made by this code are applied to the relevant objects so the rendering of the frame can occur.
This implies that in the transform of the stage there may already be a scale factor applied, as well as a rotation. The translation components of these two stored and transformed grips will have a distance between them in global space. This distance is given by the `lhs.distance(rhs)` function pattern, which computes the length of the vector difference between the two. + +But why does this code compute the `unscaled_global_from_stage` transformation as `global_from_stored_left_grip * stage_from_left_grip.inverse()`? Recall that `global_from_stored_left_grip` is taken from the stored value `global_from_stage * stage_from_left_grip`, which is stored just once when the grip button is first pressed. This stores the rotation, translation and scale of the controller at the time of the grip, expressed in global space, to use "as an anchor". Multiplying this by the inverse of the current left grip removes the rotation and translation of the left grip (it has no scale) and gives a kind of global anchor of scale, translation and rotation at the time the button was pressed, offset by however much the user has rotated or moved the left grip since. This allows the scene to be rotated or translated using a single grip alone, while the scale is calculated from the distance between the hands. + +By transforming the left and right grips' current translations into that unscaled coordinate system, and then dividing the originally stored distance by the distance measured between them there, a scale factor can be computed and maintained. + +The final line of this code block: + +```rust,noplayground + stage_transform.update_from_affine( + &(global_from_stored_left_grip + * stored_left_grip_from_left_grip + * stage_from_left_grip.inverse()), + ); +``` + +simply multiplies the original stored global left grip by an affine transform scaling the scene in all three directions by this calculated scale factor. It then cancels out the original orientation and translation of the controller by multiplying by the inverse, leaving only the differential compensation. + +Notice how the inverse matrix is on the right hand side of the multiplication. Let's call the original transformation of the controller T. If the controller does not move or rotate, T · T⁻¹ is the identity matrix. Recall that the stored grip is the stage transformation matrix S multiplied by T, thus S · T · T⁻¹ = S. + +Suppose that the controller rotates or moves by an amount quantified by a transformation U applied in stage space, so that the new controller pose is T' = U · T. + +Multiplying the original stage transform S by the original controller transformation T, then multiplying this by the inverse of T', we get S · T · T'⁻¹ = S · T · T⁻¹ · U⁻¹ = S · U⁻¹. + +You might think this would end up turning the model in the opposite direction to what was intended, as applying the inverse of U rotates or translates in the opposite direction. I will leave it as an exercise for you to determine how the definition of the grip axes in the OpenXR specification causes this unexpected result. + +![Grip and AIM in OpenXR](../images/openxr-grip.png) + +If an object has a rigid body, its position is controlled by the physics simulation and should not be set manually. We will touch on the physics simulation next.
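+Finally, before moving on, here is a small, self-contained sketch (illustrative numbers only, written directly against the `glam` types Hotham builds on) that you can run to convince yourself of the parent-times-child composition rule discussed in this chapter:
+
+```rust,noplayground
+use glam::{Affine3A, Quat, Vec3};
+
+fn main() {
+    // Stage: rotated 90 degrees about +Y and translated 5 m along +X.
+    let global_from_stage = Affine3A::from_rotation_translation(
+        Quat::from_rotation_y(90.0_f32.to_radians()),
+        Vec3::new(5.0, 0.0, 0.0),
+    );
+    // HMD: 1 m in front of the stage origin (-Z), 1.7 m above the floor.
+    let stage_from_hmd = Affine3A::from_translation(Vec3::new(0.0, 1.7, -1.0));
+
+    // parent * child: the child's translation is rotated (and scaled) by the
+    // parent before the parent's translation is added.
+    let global_from_hmd = global_from_stage * stage_from_hmd;
+
+    // The HMD's -1 m along the stage's Z axis becomes -1 m along the global X axis,
+    // so the player ends up at roughly (4.0, 1.7, 0.0) in global space.
+    println!("{:?}", global_from_hmd.translation);
+}
+```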
diff --git a/src/tutorial/custom_rendering_1.md b/src/tutorial/custom_rendering_1.md new file mode 100644 index 0000000..4044f2d --- /dev/null +++ b/src/tutorial/custom_rendering_1.md @@ -0,0 +1,77 @@ +# Custom Rendering Part 1 + +So far in this series, we have looked at how to work with Hotham's existing rendering process. Let's do a brief overview of that process. + +In the shaders folder, there are three sets of shaders that execute to create the frame-by-frame output we see on screen: + +- First of all, a compute shader `shaders/culling.comp` sets the visibility of every primitive based on data passed to the shader about the left and right clip planes and the bounding sphere that has been calculated for each primitive. +- Secondly, a PBR shader for the vertex and fragment stages calculates the position of each vertex based on the view projection and the vertex's position in globally oriented space. It calculates the color as RGB and discards alpha channel information, specifying no blend operation, outputting F16 tonemapped data. +- Thirdly, a vertex and fragment shader for the GUI components preserves the alpha channel for these samples and translates the screen coordinates into Vulkan coordinates, creating a final color for each pixel by multiplying the input color by the sampled font texture. + +When the `engine.update()` function is called, it eventually calls `begin_frame` to prepare OpenXR for a new frame and to begin recording a command buffer for the frame. This command buffer is then available as a field within each indexed frame inside the render context. + +When a Hotham program calls `draw_gui::draw_gui_system` during the tick loop, it iterates over each panel/UI panel combination in the world and calls `paint_gui`, which then performs a render pass for each of these panels to render them into the frame buffer. Later, the `rendering_system` organizes the data from all the primitive meshes in the scene together and then executes the compute shader against this primitive data to determine which of them are visible. The rendering system begins a render pass for the PBR shader and, for the list of primitives that were determined to be visible, executes a draw command for each of them. + +Finally, when `engine.finish()` is called, the command buffers are executed. Due to the way color blending and the font texture are set up, along with the memory barrier and dependencies defined in the PBR render and GUI render passes, the grey coverage in the font texture is sampled at the point on the GUI fragment corresponding with the relevant UV coordinate, and is multiplied with the existing color at that coordinate. Then the blending operation takes place, with the `SRC` color fully represented (`vk::BlendFactor::ONE`) and the `DST` color weighted by `1 - SRC alpha`. This effectively means that, using the default operation of `COLOR_OP_ADD`, pixels from the GUI render passes only contribute to the final output in accordance with the alpha value of the source pixel. Since this source pixel is the pixel from the font texture, the final output retains the original color only to the extent that the source pixel (the new one from the font texture) is transparent.
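+Written out as plain math, the blend described above (a sketch of the fixed-function blend stage, assuming the same factors are applied to the alpha channel) combines each GUI fragment with the color already in the attachment like this:
+
+```rust,noplayground
+// src factor = ONE, dst factor = ONE_MINUS_SRC_ALPHA, blend op = ADD.
+// `src` is the freshly shaded GUI fragment, `dst` is the color already written
+// by the PBR pass; all values are normalised RGBA.
+fn blend_add(src: [f32; 4], dst: [f32; 4]) -> [f32; 4] {
+    let inv_src_alpha = 1.0 - src[3];
+    [
+        src[0] + dst[0] * inv_src_alpha,
+        src[1] + dst[1] * inv_src_alpha,
+        src[2] + dst[2] * inv_src_alpha,
+        src[3] + dst[3] * inv_src_alpha,
+    ]
+}
+```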
+ +The two pipelines define a set of dependencies like so: + +| Dependency Component | PBR Render Pass | GUI renderpass | +|----------------------|-------------------------------------------------------|---------------------------| +| SRC | SUBPASS_EXTERNAL | 0 | +| DST | 0 | SUBPASS_EXTERNAL | +| SRC MASK | empty | MEMORY READ, MEMORY WRITE | +| SRC STAGE | COLOR_ATTACHMENT_OUTPUT EARLY_FRAGMENT_TESTS | ALL GRAPHICS | +| DST MASK | COLOR_ATTACHMENT_WRITE DEPTH_STENCIL_ATTACHMENT_WRITE | MEMORY READ, MEMORY WRITE | +| DST STAGE | COLOR_ATTACHMENT_OUTPUT EARLY_FRAGMENT_TESTS | ALL GRAPHICS | + +This means the following: +1. The first synchronisation scope for the PBR render pass is an external subpass, vis a vis: The GUI renderpass and any other prior relevant pass. +2. The second synchronisation scope for the PBR render pass is itself. +3. There is no particular mask specifying the types of commands the first synchronisation scope refers to +4. The first synchronisation scope's dependency stages being refered to are color attachment output, which is referred to in the Vulkan Specification as "[... specifying] the stage of the pipeline after blending where the final color values are output from the pipeline. This stage includes blending, logic operations, render pass load and store operations for color attachments, render pass multisample resolve operations, and vkCmdClearAttachments." Similarly, it is also dependent on early fragment tests, which is defined as "the stage of the pipeline where early fragment tests (depth and stencil tests before fragment shading) are performed." +5. The destination mask is color attachment write and depth stencil attachment write bits. These are defined as "write access to a color, resolve, or depth/stencil resolve attachment during a render pass or via certain render pass load and store operations. Such access occurs in the VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT pipeline stage." +6. The destination stages are the same. +7. Putting all of this together, this restricts output from being written to the color attachment or depth buffers until the entire stage of writing such output on the prior render pass has completed. + +This is done to ensure sanity in fragment blending and ensure the different systems write their color output sequentially. + +# What constitutes custom rendering? + +Effectively, custom rendering constitutes *anything, particularly Vulkan commands or shader modifications which changes the Vulkan rendering pipeline which Hotham uses to display the world.* + +Here are a couple of examples of what might be considered custom rendering: +* Rendering custom object types such as quadrics +* Performing color blending operations to change the way Hotham deals with transparency. +* Rendering objects with a different tint/hue depending on whether they are in a special state (eg selected) +* Pre-processing operations, for example to display an overlay on the entire scene such as a HUD. +* Rendering into a texture some sub-set of the world, for example to create a virtual mirror in which the user can see their avatar when standing in front of it. +* Replacing the entire rendering process. + +# How does Hotham deal with custom rendering? + +As with all systems within Hotham, it is optional to call the rendering system provided. Hotham sets up a render pass which includes color and depth attachments with multiple sample anti aliasing, which are resolved from four samples down to one in the process of rendering. 
It also sets up multiviews, which allows for rendering a slightly different image for each eye. However, the attachments within the swapchain which it creates by default can be used by other render passes instead, and there is no reason the fixed processes elaborated in the render context discussion earlier need to happen. + +A Hotham program could choose instead to generate its own pipeline, for example using outlined primitives, or generate its own vertex and/or position buffers from the world without recourse to the existing buffers created by loading in meshes. Or it could choose to combine the two, using both world-based data and other state-based data collected in the course of the program. + +So Hotham takes the approach of allowing the user maximum flexibility to take over the rendering process, particularly during the command buffer recording phase. By calling, skipping, or reusing different parts of the existing rendering system, different intents can be achieved. + +# How does this impact your own approach to custom rendering? + +Conversely, this also means the burden is on you as the developer to ensure you work with the existing sub-systems and with Vulkan to achieve your outcome. There are several considerations you will need to make. + +1. What objects are visible on screen? For this, you will most likely want to use the compute shader and primitive cull data buffer described in the prior section. +2. What objects exist within the world or the program state, other than those ordinarily recorded within the primitive buffer, that you may need to take into account when adapting your code to call the compute shader? +3. In what order do the different types of objects within the world need to be drawn to be displayed correctly on screen? For example, transparent or semi-transparent objects often need to be rendered last and in reverse order of distance from the camera. +4. What pipeline barriers need to be inserted to ensure correct synchronisation? +5. What additional resources do you need to provide to Vulkan in order to ensure correct output? For example: uniform buffers, textures, lighting information. + +# Some general considerations for managing custom rendering + +Make sure you have some flag, struct or other method within your world data to identify objects you want to render differently. + +To separate the world objects which need to be treated differently, I recommend the use of marker structs. An example of this would be abstracting the `Visible {}` marker which the rendering system checks to gather objects prior to sending the data to the compute shader. Visible objects which are handled by another shader may be marked with an attribute such as `SemiTransparent` or `SpecialObject` to define items that go to one pipeline, multiple pipelines, or none. + +Also be aware that in order to achieve many effects, synchronisation is going to be invaluable. The command `cmd_pipeline_barrier` provides a flexible way of setting up complex dependencies for commands within a render pass, while the render pass definition itself provides a means of identifying dependencies between earlier and later passes. These dependencies and synchronisation methods are crucial to ensuring correct outcomes without undefined behavior. + +In the remaining parts of this tutorial, we are going to look at the general steps such as pipeline creation, culling, memory barriers and swapchain images. We'll also look at how you might structure a change in the rendering to suit a particular use case.
Specifically, we'll take a quick look at color blending. It is common to want objects to be partially transparent, in order to implement things such as wrist menus, captioning, heads up displays and other billboarding solutions. This is not as simple as it appears, but we will take the first steps to create a passable output before leaving you to experiment further. I will be focusing on code in the next section and specifically the use of `ash::vk` to set up the pipeline. diff --git a/src/tutorial/custom_rendering_2.md b/src/tutorial/custom_rendering_2.md new file mode 100644 index 0000000..9fff2fc --- /dev/null +++ b/src/tutorial/custom_rendering_2.md @@ -0,0 +1,577 @@ +# Custom Rendering Part 2 + +One of the great difficulties of understanding Vulkan graphics pipelines is understanding how the different parts of the system work together. Let's start by listing the types of data you will access on the way to rendering your scene: + +1. Push Constants. These are defined during pipeline creation using the `push_constant_ranges` function of the `PipelineLayoutCreateInfo` builder pattern, and are pushed to the command buffer using the `device.cmd_push_constants` function prior to a command which uses the push constants using the data. They are pushed on a per stage basis, typically once per draw command. +2. Vertex Data. This is accessed by the vertex shader and the format of this data is defined when creating the pipeline in which you define the custom rendering behavior. Specifically the `vertex_input_state` part of pipeline creation defines the layout and format of the data, which is then used in conjunction with the binding of vertex and index buffers to a command buffer prior to submitting draw commands relying on that format. +3. Objects located in the framebuffer. Note that the framebuffer exists on a per render pass basis.
In the code to the quadrics custom rendering example provided with Hotham 0.2.0, the custom rendering code upon switching to the Quadrics shader binds the adjusted descriptor sets but does not begin a new render pass. In this case, this custom render code is piggy backing off the PBR shader's initial established render pass which bound the frame buffer containing the fixed foveated rendering attachments, depth buffer attachments, stencil attachments and so on. Some objects within the frame buffer may be referenced within the shaders via descriptor sets bound to the command buffer. For example, the GUI pipeline and set of render passes sets up a descriptor set to access the font texture which contains its unique visual information by defining it as an image sampler in its descriptor set at binding location zero, and then referencing this as a uniform sampler2D in the fragment shader code. +4. Objects not in the framebuffer but accessed via descriptor sets. Each `hotham::rendering::buffer::Buffer` object can call a function `update_descriptor_set`, as can be seen in the code for `Frame::new` which updates the binding for each buffer. It is in this function that a `vk::WriteDescriptorSet` is built with the `buffer_info` member set to the buffer, the binding number specified from the constants in `rendering/descriptors.rs`, and the set passed in from `Frame::new`. This method is used to update both storage and uniform buffers. + +Furthermore, there are other considerations as expressed previously: +- Does this change to the rendering require an entirely new render pass, or just a change to an existing shader? +- To produce the desired output, are multiple subpasses required? What are the dependencies between the subpasses including attachments later used as input attachments? +- What synchronisation methods are required to ensure commands execute in desired order? +- Can we use existing framebuffers or should we create our own? How do we ensure the final output goes into the images acquired from the swap chain so they can be presented when `engine.finish` is called to the operating system by Open XR? + +It is easy to overcomplicate the process of rendering in Vulkan. One of the reasons for this is that commands recorded into command buffers may execute out of order, and it can be challenging identifying the stage or region markers, memory barriers and so on that will enable the correct operation of the rendering pipeline. Another reason is the great amount of information that is needed in order to construct a pipeline, before a render pass involving that pipeline can even be instantiated. + +As we can only focus on one area of concern at a time, let's focus first on how we should create a pipeline. + +# Creating a pipeline + +If this is your first time engaging in custom rendering, I recommend you leave the existing render pass alone and choose to render into an empty texture which you then display to screen. This prevents embarassing situations such as starting your example program and seeing garbage on the screen due to incorrect synchronisation or incorrect image format assumptions. You can transition your code to render directly to the screen when you're confident the rendering is accurate. + +The process of creating a pipeline is a long, involved one. Let's summarize it: +1. You must create a pipeline layout. This involves defining push constant ranges and descriptor set layouts which will be used during pipeline execution. +2. You must define the input assembly state. 
This specifies the topology of the data received in the index and/or vertex buffers. Most of the time this will just be a triangle list. +3. You must define the vertex input state. This is the definition of the input data to the vertex shader, which includes binding and attribute definitions. +4. You must define the viewport and scissors as appropriate. Most of the time, this is simply the whole screen. +5. You must define the rasterization state. This includes how to draw the primitives assembled, how to cull polygons where appropriate, depth bias configuration if needed, and line width. +6. You must define the multi-sampling state, i.e. whether this render pass performs multisample anti-aliasing/downsampling. +7. You must define the depth stencil state, which includes whether to perform a depth test, whether to use a depth stencil test, and the minimum and maximum depth bounds. +8. You must define the color blend state, which defines for each relevant attachment its channels and how they are blended with existing fragments, if at all. +9. You must define a render pass with at least one subpass, including the render pass transitions, the subpass dependencies, and so on. +10. If you have defined attachments in your render pass, you must create a framebuffer to reference those attachments, after you have created the render pass. This also implies the need to create images and image views to be referenced as color or depth attachments, etc. +11. You must load the compiled SPIR-V shader bytecode and associate it with the relevant stages. +12. You must finally combine all of the above information into a `GraphicsPipelineCreateInfo` which is then provided to the `unsafe` function `create_graphics_pipelines` on the device instance in question. +13. You then need to store the pipeline, your descriptor sets, your framebuffers and render pass somewhere to use in the execution of your pipeline. + +Let's now go through each of these stages and provide an example. You can also follow along with the [Vulkan tutorial](https://vulkan-tutorial.com/Drawing_a_triangle/Graphics_pipeline_basics/Fixed_functions) for the fixed function stages to get more context. + +# Pipeline Layout + +```rust,noplayground + let push_constant_range = vk::PushConstantRange::builder() + .offset(0) + .size(std::mem::size_of::<Material>() as _) + .stage_flags(vk::ShaderStageFlags::FRAGMENT).build(); + + let push_constant_ranges = [push_constant_range]; + let set_layouts = [render_context.descriptors.graphics_layout]; + let pipeline_layout_info = vk::PipelineLayoutCreateInfo::builder() + .push_constant_ranges(&push_constant_ranges) + .set_layouts(&set_layouts).build(); + + let pipeline_layout = unsafe { vulkan_context.device.create_pipeline_layout(&pipeline_layout_info, None).unwrap() }; +``` + +In this example, we define for the fragment shader a single push constant range, stating that the push constant will be just the material being used to render this primitive. Secondly, we use the pre-defined graphics layout mentioned in the section on the Render Context to ensure the buffers for draw data, scene data, textures and so on are available when we use this pipeline. + +# Vertex input state + +```rust,noplayground + // Vertex input state -- we choose the same position and vertex bindings and + // corresponding attribute descriptions used by the default render context as + // our shaders will mimic the PBR shader while allowing the alpha channel through + // for final blending for seamless textures for all objects on the specific + // render pipeline.
+ let position_binding_description = vk::VertexInputBindingDescription::builder() + .binding(0) + .stride(std::mem::size_of::<Vec3>() as _) + .input_rate(vk::VertexInputRate::VERTEX) + .build(); + let vertex_binding_description = vk::VertexInputBindingDescription::builder() + .binding(1) + .stride(std::mem::size_of::<Vertex>() as _) + .input_rate(vk::VertexInputRate::VERTEX) + .build(); + let vertex_binding_descriptions = [position_binding_description, vertex_binding_description]; + let vertex_attribute_descriptions = Vertex::attribute_descriptions(); + + let vertex_input_state = vk::PipelineVertexInputStateCreateInfo::builder() + .vertex_attribute_descriptions(&vertex_attribute_descriptions) + .vertex_binding_descriptions(&vertex_binding_descriptions); +``` + +Here we see the two types of data which are sent in per vertex by the typical PBR shader. Notice how stride and binding are used to define what is bound where. The `Vertex::attribute_descriptions()` function returns the attribute descriptions for both binding 0 and binding 1 defined above. If you examine the code, `VERTEX_FORMAT` is just an alias for `R32G32B32_SFLOAT`, and the remaining attribute descriptions correspond with the locations defined in the vertex shader. + +Another important thing to remember when constructing these structures is to ensure that anything you pass in by reference lives long enough for the create function to act on it. For this reason, in the above code the vertex attribute descriptions and the vertex binding descriptions are saved to local variables which are then passed in to the `builder()` pattern to construct the final object. It is a good idea to adopt this as a design pattern to avoid accidentally causing a segmentation fault by accessing freed memory. + +# Input assembly state and viewport + +```rust,noplayground + // Input assembly state -- all objects passed to the input assembler are + // a series of triangles composed of positions within the vertex buffer. + let input_assembly_state = vk::PipelineInputAssemblyStateCreateInfo::builder() + .topology(vk::PrimitiveTopology::TRIANGLE_LIST); + + let render_area = render_context.swapchain.render_area; + + // Viewport State -- our pipeline affects the entire screen as designated by + // the render area defined in the render context + let viewport = vk::Viewport { + x: 0.0, + y: 0.0, + width: render_area.extent.width as _, + height: render_area.extent.height as _, + min_depth: 0.0, + max_depth: 1.0, + }; + let viewports = [viewport]; + + // Scissors + let scissors = [render_area]; + + // Viewport state: We select the entire viewport as the area to operate on + let viewport_state = vk::PipelineViewportStateCreateInfo::builder() + .viewports(&viewports) + .scissors(&scissors); +``` + +Here, as described previously, the topology is set to triangle list. The `render_context` contains a swapchain field which includes the render area, which is typically the entire screen. The minimum and maximum depth are set to cover the full 0.0 to 1.0 range. + +Notice how each scissor is a simple `vk::Rect2D`, and we use the same `render_area` to define both the viewport and the scissors, following the same design principle mentioned previously. + +Also, for all of these patterns, although you may see example code within Hotham which does not perform the final `build()` call, you should do so to convert the builder into the final structure used later on.
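+As a quick illustration of that advice (a sketch only, reusing the `viewports` and `scissors` arrays from the snippet above), the two forms below often behave the same because `ash` builders dereference to the structure they wrap, but the second is the explicit form recommended here:
+
+```rust,noplayground
+    // Left as a builder: still usable in many places because the builder
+    // derefs to the underlying create-info structure.
+    let viewport_state_builder = vk::PipelineViewportStateCreateInfo::builder()
+        .viewports(&viewports)
+        .scissors(&scissors);
+
+    // Explicitly built: a plain vk::PipelineViewportStateCreateInfo, which is
+    // what the pipeline creation call ultimately consumes.
+    let viewport_state = vk::PipelineViewportStateCreateInfo::builder()
+        .viewports(&viewports)
+        .scissors(&scissors)
+        .build();
+```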
+ +# Rasterisation state + +```rust,noplayground + // Rasterization state -- we cull back faces, fill the polygons with their texture + // and for now leave depth bias off, although we will test with depth bias enabled + // at a later stage. + let rasterization_state = vk::PipelineRasterizationStateCreateInfo::builder() + .polygon_mode(vk::PolygonMode::FILL) + .cull_mode(vk::CullModeFlags::BACK) + .front_face(vk::FrontFace::COUNTER_CLOCKWISE) + .rasterizer_discard_enable(false) + .depth_clamp_enable(false) + .depth_bias_enable(false) + .depth_bias_constant_factor(0.0) + .depth_bias_clamp(0.0) + .depth_bias_slope_factor(0.0) + .line_width(1.0); + + // Multisample state + let multisample_state = + vk::PipelineMultisampleStateCreateInfo::builder().rasterization_samples(vk::SampleCountFlags::TYPE_1); +``` + +Here we continue to see the builder pattern without the final `build()` call. This works in many instances because the builder object contains the same fields, in the same order, as the final structure's definition. + +The builder pattern simply assigns to the fields of the struct in a way that ensures consistency, such as setting a count of attachments when attachments are specified, and so on. Thus taking a reference to the builder structure often results in the same outcome. It is not recommended, however, to rely on this fortuitous behavior, which may not always hold true. + +Most of the time you will want to use `FILL` mode; however, other options are available, including line (outline) rendering. Note that whether a face is considered front-facing or back-facing is determined by the winding order of its vertices. You should pay attention to this when you are constructing your meshes. + +# Depth stencil state + +```rust,noplayground + let depth_stencil_state = vk::PipelineDepthStencilStateCreateInfo::builder() + .depth_test_enable(true) + .depth_write_enable(true) + .depth_compare_op(vk::CompareOp::GREATER) + .depth_bounds_test_enable(false) + .min_depth_bounds(0.0) + .max_depth_bounds(1.0) + .stencil_test_enable(false); +``` + +Most of the time, unless you are doing something special, you will want to enable the depth test. Despite what may be implied elsewhere, regardless of whether you use a depth stencil test or a depth bounds test, if you enable the depth test you will need a depth attachment in your framebuffer with a suitable number of samples, as the depth test needs somewhere to write its information to. Pay attention to the fact that the `create_image` function within Hotham 0.2.0 makes some assumptions about the number of samples you want in your image based on whether you specify usage flags for a transient attachment. So if you are requesting a non-MSAA depth buffer, you should specify a non-transient usage, or adapt the image creation code used in the Vulkan context for your own purposes. + +In OpenXR, the Z value becomes more negative as it goes out from the camera. In Vulkan, 1.0 is the far view plane, while 0.0 is the near view plane. Hotham takes care of translating between OpenXR's globally oriented stage space and Vulkan's normalized space via the view projection matrices passed to the vertex shader. + +The compare op is key. The GUI render pass, for example, uses the comparison op `ALWAYS`, which is guaranteed to evaluate to true. This means it performs no depth test (because all panels are recorded at the same depth), but records the depth information for use in the next render pass.
On the other hand, both the PBR render context / pipeline and the above example use the GREATER op. This might seem counter intuitive given that most example code works off the description of samples being further away from the camera as depth increases, as described above. + +The render code already performs a visibility test and knows that all objects drawn are visible between the right and left clip planes. Typically, the LESS operation is used for opaque objects. However, when transparent objects enter the scene, the ordering in which fragments are drawn matters. By rendering all GUI content to the screen at depth 1.0, the GUI fragments will always be on top, with their depth buffer information recorded, and then fragments with a greater depth value will be rendered next to allow color blending to occur. This is known as transparency sorting. + +# Color blending + +```rust,noplayground + let color_blend_attachment = vk::PipelineColorBlendAttachmentState::builder() + .color_write_mask( + vk::ColorComponentFlags::R + | vk::ColorComponentFlags::G + | vk::ColorComponentFlags::B + | vk::ColorComponentFlags::A, + ) + .blend_enable(true) + .src_color_blend_factor(vk::BlendFactor::SRC_ALPHA) + .dst_color_blend_factor(vk::BlendFactor::ONE_MINUS_SRC_ALPHA) + .build(); + + let color_blend_attachments = [color_blend_attachment]; + + let color_blend_state = vk::PipelineColorBlendStateCreateInfo::builder() + .attachments(&color_blend_attachments); +``` + +Here we configure the color blending on a per attachment basis, specifically focusing on the color attachment. Notice how each attachment is included within an array of attachments which is then passed to the builder function. Note again the lack of a `build()` call; avoid this where possible. + +Finally, to work with the comparison operator of GREATER specified earlier, this may not be the best blend factor. Experiment to find one that works for you. Depending on which comparison operator is being used, you may find source blend factor one and destination blend factor one minus source alpha to be more appropriate. + +# Creating your shaders + +Let's discuss this example code: + +```rust,noplayground + // The following two lines are external to any function and are stored within + // the module file, compiled to constant data on the heap. + const SEMI_TRANSPARENT_VERT: &[u32] = include_glsl!("src\\shaders\\new.vert"); + const SEMI_TRANSPARENT_FRAG: &[u32] = include_glsl!("src\\shaders\\new.frag"); + // ... Additional code ... + // SHADERS: Vertex and fragment only + let (vertex_shader, vertex_stage) = create_shader( + SEMI_TRANSPARENT_VERT, + vk::ShaderStageFlags::VERTEX, + vulkan_context, + ) + .expect("Unable to create vertex shader"); + + let (fragment_shader, fragment_stage) = create_shader( + SEMI_TRANSPARENT_FRAG, + vk::ShaderStageFlags::FRAGMENT, + vulkan_context, + ) + .expect("Unable to create fragment shader"); + + let pipeline_shader_stages = [vertex_stage, fragment_stage]; +``` + +When specifying the path to your shaders, usually they will be relative to the path to your crate / sub-project, where you have stored your `Cargo.toml` file. While compiling your shaders, if there are any compilation errors, this will be included in the compilation output of your Rust file, and the Rust code will fail to compile if the shaders are unable to compile to SPIR-V format. + +You do not need the shader modules once they have been used to create your pipeline and you can destroy them at the end of your pipeline creation to reclaim memory. 
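+For example, once `create_graphics_pipelines` has returned, the modules can be freed (a sketch; `vertex_shader` and `fragment_shader` are the module handles returned by `create_shader` above):
+
+```rust,noplayground
+    unsafe {
+        // The pipeline keeps its own copy of the compiled code, so the shader
+        // modules themselves are no longer needed once the pipeline exists.
+        vulkan_context.device.destroy_shader_module(vertex_shader, None);
+        vulkan_context.device.destroy_shader_module(fragment_shader, None);
+    }
+```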
+ +In this example, the pipeline stages returned from the creation of the shader modules will be passed to the pipeline creation function; hence they are included in the array `pipeline_shader_stages`. + +# Creating your attachments + +```rust,noplayground + let rp_color_attachment = vk::AttachmentDescription::builder() + .format(COLOR_FORMAT) + .samples(vk::SampleCountFlags::TYPE_1) + .load_op(vk::AttachmentLoadOp::CLEAR) + .store_op(vk::AttachmentStoreOp::STORE) + .stencil_load_op(vk::AttachmentLoadOp::DONT_CARE) + .stencil_store_op(vk::AttachmentStoreOp::DONT_CARE) + .initial_layout(vk::ImageLayout::UNDEFINED) + .final_layout(vk::ImageLayout::COLOR_ATTACHMENT_OPTIMAL) + .build(); + + let rp_depth_attachment = vk::AttachmentDescription::builder() + .format(DEPTH_FORMAT) + .samples(vk::SampleCountFlags::TYPE_1) + .load_op(vk::AttachmentLoadOp::CLEAR) + .store_op(vk::AttachmentStoreOp::DONT_CARE) + .stencil_load_op(vk::AttachmentLoadOp::DONT_CARE) + .stencil_store_op(vk::AttachmentStoreOp::DONT_CARE) + .initial_layout(vk::ImageLayout::UNDEFINED) + .final_layout(vk::ImageLayout::DEPTH_STENCIL_ATTACHMENT_OPTIMAL) + .build(); +``` + +To create a render pass, you need to define the attachments that will be used in it, and how they will be handled. This is the purpose of the `AttachmentDescription` builder pattern. `COLOR_FORMAT` and `DEPTH_FORMAT` are defined in Hotham's lib.rs at the crate root. + +You should make the number of samples in your attachments consistent with the multi sample state defined earlier. If this number of samples is greater than one, you may need to specify resolve attachments to resolve into the final color image. + +In this example we clear the image in accordance with the default clear colors specified by Hotham (which are specified when the framebuffer is bound). We also clear the depth buffer to zeros. This is in accordance with the transparency sorting outlined earlier. + +In most cases you will not need to set the stencil load or store operations to anything other than don't care, unless you are going to be doing work with a depth stencil test. In those cases, you can refer to the online Vulkan documentation to determine an appropriate load/store op. In almost all cases, it is fine to leave the initial layout undefined and transition to the final optimal layout; but consult the documentation if you are unsure. + +# Specifying your subpass and dependencies + +```rust,noplayground + let subpass = [vk::SubpassDescription::builder() + .pipeline_bind_point(vk::PipelineBindPoint::GRAPHICS) + .color_attachments(&[vk::AttachmentReference::builder() + .attachment(0) + .layout(vk::ImageLayout::COLOR_ATTACHMENT_OPTIMAL) + .build()]) + .depth_stencil_attachment(&vk::AttachmentReference::builder() + .attachment(1) + .layout(vk::ImageLayout::DEPTH_STENCIL_ATTACHMENT_OPTIMAL) + .build()) + .build()]; + + let dependencies = [vk::SubpassDependency::builder() + .src_subpass(0) + .dst_subpass(vk::SUBPASS_EXTERNAL) + .src_access_mask(vk::AccessFlags::MEMORY_READ | vk::AccessFlags::MEMORY_WRITE) + .dst_access_mask(vk::AccessFlags::MEMORY_READ | vk::AccessFlags::MEMORY_WRITE) + .src_stage_mask(vk::PipelineStageFlags::ALL_GRAPHICS) + .dst_stage_mask(vk::PipelineStageFlags::ALL_GRAPHICS) + .build()]; + +``` + +Note how in this example code, each attachment has a number associated with it. This corresponds to its position within the framebuffer. 
+You also specify, for each attachment reference, the layout the attachment will be in during the subpass, as this information is provided to a different part of the driver than the part which handles the initial and final layout transitions.
+
+In this example code, I have simply reproduced the dependencies specified for the GUI pipeline, which ensure independence between this render pass and the next one in terms of memory accesses. Generally, the dependencies you set up are going to depend on how the pipeline/render pass will interact with other render passes to produce the final output. I recommend a video on the Vulkan YouTube channel, [Vulkan render passes](https://youtu.be/yeKxsmlvvus).
+
+# Creating the render pass
+
+```rust,noplayground
+    let attachment_list = [rp_color_attachment, rp_depth_attachment];
+    let render_pass = unsafe {
+        vulkan_context.device.create_render_pass(&vk::RenderPassCreateInfo::builder()
+            .attachments(&attachment_list)
+            .subpasses(&subpass)
+            .dependencies(&dependencies).build(), None).expect("Unable to create render pass!")
+    };
+```
+
+`create_render_pass` is an unsafe function which returns a `Result` containing either the render pass or an error. The render pass should be saved so that it can be used to begin render passes with the bound pipeline later on.
+
+# Creating your framebuffer
+
+```rust,noplayground
+    let depth_image = vulkan_context.create_image(
+        DEPTH_FORMAT,
+        &render_context.swapchain.render_area.extent,
+        vk::ImageUsageFlags::DEPTH_STENCIL_ATTACHMENT,
+        1,
+        1,
+    )
+    .unwrap();
+
+    let attachment_views = [render_img.view, depth_image.view];
+
+    let framebufferlist = unsafe { vulkan_context.device.create_framebuffer(&vk::FramebufferCreateInfo::builder()
+        .render_pass(render_pass)
+        .attachments(&attachment_views)
+        .width(render_context.swapchain.render_area.extent.width)
+        .height(render_context.swapchain.render_area.extent.height)
+        .layers(1).build(), None)
+        .expect("Unable to create framebuffer!") };
+```
+
+Note that the default render pass for the PBR engine uses two layers. Sometimes multiple layers are required; one example is multiview rendering. In this example render pass I have specified one layer only, and removed the transient attachment requirement in the call to `create_image` to ensure a single-sample image is returned. If you need a multi-sample depth buffer, you can specify the transient option so that Hotham picks four samples per pixel.
+
+It is important when creating the framebuffer that the attachments list lives long enough for the function to access it; otherwise you may encounter a segmentation fault with no further explanation in the program output.
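+
+One thing referenced in the next section but not shown in these snippets is the pipeline layout passed to the pipeline create info. Below is a minimal sketch of creating one with Ash; the `set_layouts` slice and the single-`u32` push constant range are assumptions for illustration, so substitute the descriptor set layouts and push constant ranges your shaders actually declare:
+
+```rust,noplayground
+    // Illustrative sketch only. Assumes `set_layouts: &[vk::DescriptorSetLayout]` holds the
+    // descriptor set layouts you intend to bind (for example, the same layouts used by the
+    // existing render context).
+    let push_constant_ranges = [vk::PushConstantRange::builder()
+        .stage_flags(vk::ShaderStageFlags::FRAGMENT)
+        .offset(0)
+        .size(std::mem::size_of::<u32>() as u32) // e.g. a single material ID
+        .build()];
+
+    let layout_create_info = vk::PipelineLayoutCreateInfo::builder()
+        .set_layouts(set_layouts)
+        .push_constant_ranges(&push_constant_ranges);
+
+    let pipeline_layout = unsafe {
+        vulkan_context
+            .device
+            .create_pipeline_layout(&layout_create_info, None)
+            .expect("Unable to create pipeline layout!")
+    };
+```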
+
+# Tying it all together - creating the pipeline
+
+```rust,noplayground
+    let create_info = vk::GraphicsPipelineCreateInfo::builder()
+        .stages(&pipeline_shader_stages)
+        .vertex_input_state(&vertex_input_state)
+        .input_assembly_state(&input_assembly_state)
+        .viewport_state(&viewport_state)
+        .rasterization_state(&rasterization_state)
+        .multisample_state(&multisample_state)
+        .depth_stencil_state(&depth_stencil_state)
+        .color_blend_state(&color_blend_state)
+        .layout(pipeline_layout)
+        .render_pass(render_pass)
+        .subpass(0)
+        .build();
+
+    let create_infos = [create_info];
+
+    println!("Creating pipeline");
+    let pipelines = unsafe {
+        vulkan_context.device.create_graphics_pipelines(
+            vk::PipelineCache::null(),
+            &create_infos,
+            None,
+        )
+    };
+    if pipelines.is_err() {
+        panic!("Unable to create custom rendering pipeline!");
+    }
+
+    let pipelines = pipelines.unwrap();
+
+    unsafe {
+        vulkan_context
+            .device
+            .destroy_shader_module(vertex_shader, None);
+        vulkan_context
+            .device
+            .destroy_shader_module(fragment_shader, None);
+    }
+```
+
+Finally, all of the components described above are referenced to create the pipeline object. Multiple pipelines can be created at once; in this example, our pipeline is returned in `pipelines[0]`. Notice that we destroy the shader modules with a call to `destroy_shader_module` after the pipeline is created.
+
+Any structures you will want to re-use when executing your pipeline should be saved. For this reason I recommend wrapping your pipeline creation in a struct with a suitable impl block, so that a new instance stores the pipeline, images and image views, framebuffer, and the render pass object itself in the returned struct.
+
+At this point you're halfway there! Grab yourself a coffee, or maybe something stronger :/
+
+To begin working with the existing PBR render context, it is important to know how it functions, so let's start by taking a look at the code of the `begin` function:
+
+```rust,noplayground
+pub unsafe fn begin(
+    world: &mut World,
+    vulkan_context: &VulkanContext,
+    render_context: &mut RenderContext,
+    views: &[xr::View],
+    swapchain_image_index: usize,
+) {
+    // First, we need to walk through each entity that contains a mesh, collect its primitives
+    // and create a list of instances, indexed by primitive ID.
+    //
+    // We use primitive.index_buffer_offset as our primitive ID as it is guaranteed to be unique between
+    // primitives.
+    let meshes = &render_context.resources.mesh_data;
+
+    // Create transformations to globally oriented stage space
+    let global_from_stage = stage::get_global_from_stage(world);
+
+    // `gos_from_global` is just the inverse of `global_from_stage`'s translation - rotation is ignored.
+    let gos_from_global =
+        Affine3A::from_translation(global_from_stage.translation.into()).inverse();
+
+    let gos_from_stage: Affine3A = gos_from_global * global_from_stage;
+
+    for (_, (mesh, global_transform, skin)) in
+        world.query_mut::<With<(&Mesh, &GlobalTransform, Option<&Skin>), &Visible>>()
+    {
+        let mesh = meshes.get(mesh.handle).unwrap();
+        let skin_id = skin.map(|s| s.id).unwrap_or(NO_SKIN);
+        for primitive in &mesh.primitives {
+            let key = primitive.index_buffer_offset;
+
+            // Create a transform from this primitive's local space into gos space.
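+            // `global_transform.0` is this entity's local -> global matrix, so composing it with
+            // `gos_from_global` yields a single local -> globally-oriented-stage (GOS) transform.
+            // Both the instance data consumed by the vertex shader and the bounding sphere used
+            // by the culling shader are expressed in this GOS space.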
+            let gos_from_local = gos_from_global * global_transform.0;
+
+            render_context
+                .primitive_map
+                .entry(key)
+                .or_insert(InstancedPrimitive {
+                    primitive: primitive.clone(),
+                    instances: Default::default(),
+                })
+                .instances
+                .push(Instance {
+                    gos_from_local,
+                    bounding_sphere: primitive.get_bounding_sphere_in_gos(&gos_from_local),
+                    skin_id,
+                });
+        }
+    }
+```
+
+Here, all visible meshes with a global transform are iterated, and then each of their primitives is iterated in turn. A cloned copy of the primitive, keyed on its offset within the index buffer, is inserted into the primitive map with an empty vec of instances if it is not already present. A single instance containing the primitive's bounding sphere, its location and its optional skin is then pushed into that entry's list of instances.
+
+```rust,noplayground
+    // Next organize this data into a layout that's easily consumed by the compute shader.
+    // ORDER IS IMPORTANT HERE! The final buffer should look something like:
+    //
+    // primitive_a
+    // primitive_a
+    // primitive_c
+    // primitive_b
+    // primitive_b
+    // primitive_e
+    // primitive_e
+    //
+    // ..etc. The most important thing is that each instances are grouped by their primitive.
+    let frame = &mut render_context.frames[render_context.frame_index];
+    let cull_data = &mut frame.primitive_cull_data_buffer;
+    cull_data.clear();
+
+    for instanced_primitive in render_context.primitive_map.values() {
+        let primitive = &instanced_primitive.primitive;
+        for (instance, i) in instanced_primitive.instances.iter().zip(0u32..) {
+            cull_data.push(&PrimitiveCullData {
+                bounding_sphere: instance.bounding_sphere,
+                index_instance: i,
+                primitive_id: primitive.index_buffer_offset,
+                visible: false,
+            });
+        }
+    }
+```
+
+Here, `cull_data` is a `Buffer<PrimitiveCullData>`: a buffer of primitive instances built from the map populated by the previous loop, identified by the primitive's offset within the index buffer together with the number of the instance, since one primitive may be drawn more than once at different locations. Initially, each potentially visible primitive is marked as invisible within the cull data buffer.
+
+```rust,noplayground
+    // This is the VERY LATEST we can possibly update our views, as the compute shader will need them.
+    render_context.update_scene_data(views, &gos_from_global, &gos_from_stage);
+
+    // Execute the culling shader on the GPU.
+    render_context.cull_objects(vulkan_context);
+```
+
+Then, the culling shader is invoked with the compute shader's descriptor sets bound. The culling shader only updates data within the primitive cull data buffer up to the number of draw calls recorded in the cull parameters passed to the shader, which is set to the length of the primitive cull data buffer in the call to `cull_objects`. Thus the intersection test with the clip planes is performed only for genuine instances.
+
+After the render pass is begun, the primitives are drawn only if they're visible. The following code from `draw_world` illustrates the process:
+
+```rust,noplayground
+    draw_data_buffer.clear();
+
+    let mut instance_offset = 0;
+    let mut current_primitive_id = u32::MAX;
+    let mut instance_count = 0;
+    let cull_data = frame.primitive_cull_data_buffer.as_slice();
+
+    for cull_result in cull_data {
+        // If we haven't yet set our primitive ID, set it now.
+        if current_primitive_id == u32::MAX {
+            current_primitive_id = cull_result.primitive_id;
+        }
+
+        // We're finished with this primitive. Record the command and increase our offset.
+        if cull_result.primitive_id != current_primitive_id {
+            // Don't record commands for primitives which have no instances, eg. have been culled.
+            if instance_count > 0 {
+                let primitive = &render_context
+                    .primitive_map
+                    .get(&current_primitive_id)
+                    .unwrap()
+                    .primitive;
+                draw_primitive(
+                    material_buffer,
+                    render_context.pipeline_layout,
+                    primitive,
+                    device,
+                    command_buffer,
+                    instance_count,
+                    instance_offset,
+                );
+            }
+
+            current_primitive_id = cull_result.primitive_id;
+            instance_offset += instance_count;
+            instance_count = 0;
+        }
+
+        // If this primitive is visible, increase the instance count and record its draw data.
+        if cull_result.visible {
+            let instanced_primitive = render_context
+                .primitive_map
+                .get(&cull_result.primitive_id)
+                .unwrap();
+            let instance = &instanced_primitive.instances[cull_result.index_instance as usize];
+            let draw_data = DrawData {
+                gos_from_local: instance.gos_from_local.into(),
+                local_from_gos: instance.gos_from_local.inverse().into(),
+                skin_id: instance.skin_id,
+            };
+            draw_data_buffer.push(&draw_data);
+            instance_count += 1;
+        }
+    }
+```
+
+Notice that the first `if` block means the draw-recording branch is effectively skipped on the first iteration, as `cull_result.primitive_id` will be identical to `current_primitive_id`. The code then checks whether the instance being examined is visible and, if it is, pushes the draw data for the primitive (its GOS transformation and its inverse) into the draw data buffer and increments the instance count. The `draw_primitive` function extracts the material ID from the primitive, pushes it as a push constant for the fragment shader stage of the subsequent indexed draw, and then executes:
+
+```rust,noplayground
+    device.cmd_draw_indexed(
+        command_buffer,
+        primitive.indices_count,
+        instance_count,
+        primitive.index_buffer_offset,
+        primitive.vertex_buffer_offset as _,
+        instance_offset,
+    );
+```
+
+You'll notice, if you search for the `vertex_buffer_offset` and `index_buffer_offset` attributes within the `Primitive` struct, that they are set within the `Primitive::new` function when a primitive is created.
+
+Recall that when we created our rectangular mesh to draw text on in a previous section, we called `Primitive::new` with the vertex and index data in order to create the `Vec<Primitive>` that was provided to `MeshData::new`, which creates the data that eventually becomes a `Mesh`. `Mesh::new` allocates space for the mesh data passed to it and returns an `Id` from `id_arena`.
+
+A final call is placed after the loop described above to draw the last visible primitive, if any, using the same method. The render pass is then ended.
+
+What does this tell us about working with the existing render pass and its descriptor sets? How can we adapt the same code to send multiple instanced primitives to different pipelines, while retaining the same structure for the accompanying primitive data?
+
+First of all, we must look at the ownership of the data. Each frame rendered holds a `primitive_cull_data_buffer`, which is generated in the second section of code outlined above in `render_context.begin` from the `primitive_map` populated at the start of the `begin()` function. This `primitive_map` is cleared during `render_context.end()` and is *only used* in `systems/rendering.rs`. Each primitive inserted into the primitive map is a clone consisting of simple u32 fields and vector data which reference the index and vertex buffers used when the primitive was created.
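+
+Before moving on, it helps to picture roughly what each entry of `primitive_map` carries. The sketch below is illustrative only: the `InstancedPrimitive` and `Instance` shapes mirror the fields used in the `begin` code above, while the primitive fields are simply the ones the draw code touches (see `rendering/primitive.rs` for the real definitions):
+
+```rust,noplayground
+// Illustrative sketch only -- not the actual Hotham definitions.
+// (`Affine3A` and `Vec4` are the glam types used throughout Hotham.)
+struct InstancedPrimitiveSketch {
+    primitive: PrimitiveSketch,      // cloned once per unique primitive
+    instances: Vec<InstanceSketch>,  // one entry per time this primitive is drawn
+}
+
+struct PrimitiveSketch {
+    index_buffer_offset: u32,  // also used as the key into `primitive_map`
+    vertex_buffer_offset: u32,
+    indices_count: u32,
+    material_id: u32,          // pushed as a push constant by `draw_primitive`
+}
+
+struct InstanceSketch {
+    gos_from_local: Affine3A,  // local -> globally oriented stage transform
+    bounding_sphere: Vec4,     // xyz = centre in GOS space, w = radius
+    skin_id: u32,              // NO_SKIN when the mesh has no skin
+}
+```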
+
+Consider that the only thing preventing multiple index and vertex buffers from being used here is the use of the index buffer offset as the key used to index into the hashmap `primitive_map`. In the `custom_rendering.rs` of the custom rendering example within Hotham, this problem is worked around by ORing the value of the u32 key with a suitable bit flag to indicate the separate shader. The primitive cull data buffer maximum size is currently set at 100K; since 131,072 is 2 to the 17th power, offsets that stay below this limit fit within the low 17 bits of the key, leaving roughly the upper 15 bits of the u32 free to use as flags -- effectively 32,768 unique shader combinations sharing the one primitive map for culling, which is plenty for any use that might be made of Hotham.
+
+Vulkan indexes into the index and vertex buffers currently bound to the command buffer to draw a given instanced primitive. This is not a problem if we bind the new index and vertex buffers along with the pipeline before the subsequent draw calls for the new set of primitives. By binding each pipeline only once and then recording its related primitives in order, sorted by pipeline, multiple pipelines can be executed within the same render pass.
+
+It must be pointed out as well that there is no reason the same index and vertex buffers could not be used for multiple pipelines using different shaders. The only thing which causes all primitives to be drawn via the same pipeline at present is the way the draw loop is constructed. There is no reason you could not construct a struct which accepts parameters for the construction of multiple pipelines that use the same layout and descriptor sets but different shaders, blend modes and (if the index and vertex buffers were rebound) even different topologies or input data formats.
+
+To prove this, let's create a custom rendering implementation that will make use of what we've seen so far.
diff --git a/src/tutorial/custom_rendering_3.md b/src/tutorial/custom_rendering_3.md
new file mode 100644
index 0000000..1d907bc
--- /dev/null
+++ b/src/tutorial/custom_rendering_3.md
@@ -0,0 +1,211 @@
+# Custom Rendering Part 3
+
+Once you have a working pipeline, you need to use it. This means:
+- beginning a render pass
+- sending bind calls for the pipeline, vertex buffers, index buffers and descriptor sets
+- making draw calls into the command buffer after the pipeline bind point to begin sending the data to the GPU.
+
+In this series we will not cover the trivial case where the only custom rendering change is a change to the default shaders. If you are doing this, all you will need to do is consider the implications of your changes on the rendering and nothing more.
+
+We previously mentioned several examples of custom rendering. Most versions of custom rendering which are non-trivial will involve changes to the primary render pass conducted by the rendering system. The two possible paths your custom rendering may take are:
+1. Execute your render pass prior to the PBR render pass, with appropriate dependencies set up to ensure the output of your own render pass is taken into consideration by the PBR render pass.
+2. Alter or remove the PBR render pass, replacing it with one which selectively renders what you want it to render. Execute further render passes before and/or afterwards if necessary to render information into other images.
+
+In both of these cases, but more so with case number two, you may need to make use of the methods or objects which the existing render pass uses.
Specifically, you may need to: +- Render into the existing acquired swap chain images. +- Perform culling using the compute shader +- Ensure commands run in a specific order. + +This section will focus on these tasks. Using the compute shader has already been covered under the Render Context +page and will be elaborated on again later. Let's start with rendering into the swapchain images. + +# Swapchain Images + +During the `engine.finish()` function, the render context is first called to end the frame. This ends the command +buffer and submits the finished commands. It then adjusts the frame index, and calls OpenXR to stream the images +to the device using a Frame stream. + +Similarly, when a frame is begun, the XR context `begin_frame` is called, which then uses Open XR's `FrameWaiter` +construct to wait on the correct time to begin rendering, then calls begin on the `frame_stream` object. If the +frame waiter advises that rendering should continue, a swapchain image is acquired, and then waited on using the +existing `Swapchain` object to ensure the acquired image index has been read by the compositor. This image +index is returned by the `engine.update()` function to the main hotham program, to be used during the rendering. + +So far, so good. But this does not tell us how the render context renders into the swapchain image. If we examine +the render context, we see it has a `Swapchain` component. A `Swapchain` object is created by the render context +in `new_from_swapchain_info`, which calls `Swapchain::new` with the `SwapchainInfo` structure created by calling +`from_openxr_swapchain`. Looking at `rendering/swapchain.rs` we find this function on Android calls the function +`get_swapchain_images_with_ffr` to return the FFR and ordinary swap chain images. For this it passes in the +`Swapchain` handle it acquired from OpenXR. + +The `get_swapchain_images_with_ffr` function returns the FFR and ordinary swapchain images directly from the OpenXR +API. The swapchain information complete with images is returned to the `Swapchain::new` function which passes it +into the `create_framebuffers` function to create the framebuffer for the primary render pass. Then the swapchain +information structure is dropped and the returned Swapchain object only contains the framebuffer with the swapchain +images in it. + +`create_framebuffers` takes the swapchain images previously returned by OpenXR and simply adds the color and depth +images already created by the swapchain, and creates one framebuffer per swapchain image. What is then done with +these swapchain images which are at this point only able to be accessed via this temporary image view? + +Well, the framebuffer itself is simply a handle. Vulkan provides no API to extract from the framebuffer the images +that went into it. So we have two options: + +The first is to use the same per frame framebuffer that the original rendering system would have otherwise used to +render to the screen. If you do not need to add other attachments to your render pass that will be accessible by +the render pass, this is probably your best bet. This removes any need for you to allocate images yourself into +which the scene will be rendered. + +The render context framebuffer resolves the multi-sampled original colour attachment into the 3rd or 4th image in +the framebuffer, according to whether this is Android and FFR images are being used. 
+The zero-based index of the swapchain image within the framebuffer is held in the `RESOLVE_ATTACHMENT` constant, a `u32` within the render context module itself.
+
+There is nothing inherently wrong with two views onto the same image (including the swapchain images) existing, as long as they are not written to at the same time. Moreover, using this framebuffer does not preclude you from using multiple render passes. So there is some flexibility here.
+
+The second option, which provides more control, is to create your own framebuffer. The `from_openxr_swapchain` function which instantiates a `SwapchainInfo` structure returns the swapchain images and FFR images, which can then be used to construct image views into the swapchain and FFR images. This would then allow you to use multiple render passes which include images not present in the primary render context framebuffers.
+
+The swapchain module provides example code for creating an image view of the FFR images like so:
+
+```rust,noplayground
+    let ffr_image_view = vulkan_context
+        .create_image_view(
+            &swapchain_info.ffr_images[0].image,
+            vk::Format::R8G8_UNORM,
+            vk::ImageViewType::TYPE_2D_ARRAY,
+            2,
+            1,
+            DEFAULT_COMPONENT_MAPPING,
+        )
+        .unwrap();
+```
+
+Note the two layers, for multiview rendering. The format is `R8G8_UNORM`, which means each sample is a 16-bit value.
+
+If you use the render context's method of resolving into the swapchain image, make sure that the attachments in your render pass creation specify the correct number of samples for each image. At present, this simply means passing the transient usage flag to `create_image` so that it uses the provided constant number of samples (currently four) for the MSAA images, and stipulating the swapchain image as single sample.
+
+# Command ordering / pipeline barriers
+
+You may wish to set up pipeline barriers to ensure that commands execute at the correct point in the execution sequence. A common point of failure when learning Vulkan is understanding the massively parallel nature of GPU-based rendering, where many fragments or vertices can be processed at once, not in a contiguous, ordered sense but in whatever order best optimises the GPU's access to the memory it needs to read and write.
+
+As most of us are CPU programmers, the dominant paradigm of linear, synchronous processing is familiar and safe. Even in the context of asynchronous programming or threading, some vestige of the linear nature of CPU-bound execution remains. A highly non-linear, massively parallel architecture such as a GPU uses many techniques for optimisation. One example is foveated rendering, commonly implemented on tile-based GPUs, in which greater levels of detail are rendered for the areas in the eye's direct focus. Tile-based rendering solutions may store image data in memory and process it quite differently from the traditional linear representation of an image as a contiguous sequence of bytes. This is one reason Vulkan has the concept of image layout transitions between stages or subpasses, which ensure the image is laid out in a way that is optimal for its next use.
+
+This idea of a program which executes in parallel and out of order is critical to understanding dependencies. A trivial example is the rendering of transparent objects.
+Typically, transparent objects in a scene are rendered last and in an appropriate order to ensure that blending happens in a deterministic way. On a parallel architecture this needs to be explicitly requested, otherwise the transparent objects might render before the other objects in the framebuffer, or both before and after.
+
+Pipeline barriers give you more control over the ordering of your execution than the dependencies between passes alone. Here is an example of a pipeline barrier using Ash:
+
+```rust,noplayground
+    // A deliberately trivial, conservative barrier: make prior memory writes available
+    // to subsequent reads.
+    let memory_barrier = [vk::MemoryBarrier::builder()
+        .src_access_mask(vk::AccessFlags::MEMORY_WRITE)
+        .dst_access_mask(vk::AccessFlags::MEMORY_READ)
+        .build()];
+    unsafe {
+        device.cmd_pipeline_barrier(
+            command_buffer,
+            vk::PipelineStageFlags::BOTTOM_OF_PIPE,
+            vk::PipelineStageFlags::TOP_OF_PIPE,
+            vk::DependencyFlags::empty(),
+            &memory_barrier,
+            &[],
+            &[],
+        );
+    };
+```
+
+The ubiquitous GPU Open site provides information about pipeline barriers [here](https://gpuopen.com/learn/vulkan-barriers-explained/). There are other articles about them; [here is another](https://arm-software.github.io/vulkan_best_practice_for_mobile_developers/samples/performance/pipeline_barriers/pipeline_barriers_tutorial.html) that discusses best practices for mobile devices and specifically mentions the tiling example touched on above.
+
+The type of pipeline barrier described here can be used within a render pass to ensure specific commands execute *after* others in the pipe. This is outlined in the [Vulkan documentation](https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/vkCmdPipelineBarrier.html) for the function, which explains that *"If vkCmdPipelineBarrier was recorded inside a render pass instance, the first synchronization scope includes only commands that occur earlier in submission order within the same subpass."* Similarly, it specifies that *"If vkCmdPipelineBarrier was recorded inside a render pass instance, the second synchronization scope includes only commands that occur later in submission order within the same subpass."*
+
+A pipeline barrier can also occur outside a render pass, in which case **all** commands that occurred earlier, even in a prior render pass, fall into the first synchronisation scope, while the second synchronisation scope becomes **all** subsequent commands, including those in future render passes.
+
+What this means is that you can use render pass dependencies to control the higher-level flow between different subpasses, and use pipeline barriers within this context to control execution flow within the subpasses themselves where relevant. Combining these two techniques gives you full control over the execution ordering. The bottom-of-pipe/top-of-pipe example above constitutes a trivial example.
+
+You should review the documentation [here](https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VkPipelineStageFlagBits.html) for information on the meaning of the different stages. It is important to note that *VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT is equivalent to VK_PIPELINE_STAGE_ALL_COMMANDS_BIT with VkAccessFlags set to 0 when specified in the second synchronization scope, **but specifies no stage of execution when specified in the first scope.*** The converse holds for VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT, which is equivalent to ALL_COMMANDS in the first synchronization scope but specifies no stage of execution in the second.
+
+# Culling
+
+Here, as a reminder, is an example of using the culling compute shader -- this version also ORs a per-primitive shader flag, looked up from a `shaderlist` map, into the key, as discussed at the end of the previous part.
+ +```rust,noplayground + let frame = &mut render_context.frames[render_context.frame_index]; + let cull_data = &mut frame.primitive_cull_data_buffer; + cull_data.clear(); + + for instanced_primitive in render_context.primitive_map.values() { + let primitive = &instanced_primitive.primitive; + for (instance, i) in instanced_primitive.instances.iter().zip(0u32..) { + cull_data.push(&PrimitiveCullData { + bounding_sphere: instance.bounding_sphere, + index_instance: i, + primitive_id: primitive.index_buffer_offset | shaderlist.get(&primitive.index_buffer_offset).unwrap(), + visible: false, + }); + } + } + + // This is the VERY LATEST we can possibly update our views, as the compute shader will need them. + render_context.update_scene_data(views, &gos_from_global, &gos_from_stage); + + // Execute the culling shader on the GPU. + render_context.cull_objects(vulkan_context); + + // Begin the render pass, bind descriptor sets. + render_context.begin_pbr_render_pass(vulkan_context, swapchain_image_index); +``` + +The key function is `render_context.cull_objects`. Examining what this function does, it sets the contents of the cull +params storage buffer, into which the view projecion and number of objects in the buffer are placed, binds the compute +command buffer which is separate from the existing command buffer being recorded, binds its descriptor sets, dispatches +the command to call the compute shader, and waits for it to complete. After execution, the `primitive_cull_data_buffer` +which is part of the render frame in question should have visibility updated for all objects. + +The bounding sphere being pushed into the primitive cull data is what tells us whether an object is visible. The +multiplication of the center of the bounding sphere by the projection matrix in the clip planes data translates it into +Vulkan's normalized space. Then, if the center of the sphere is closer to zero than the radius of the sphere, we know +that part of that bounding sphere rests on a point greater than zero that lies within the positive normalized space. + +Also very important is `render_context.update_scene_data` The `views` parameter passed to it originates from calling +`engine.xr_context.update_views()` which updates the eye views based on where OpenXR expects the users view will be at +the predicted display time. `update_scene_data` takes these views and uses them to generate the frustum and camera +positions that will be included in the scene data buffer, which it updates. `cull_objects` then makes use of this +updated scene data to populate its own cull parameters buffer. + +If you have any non Mesh related primitive you need to calculate a bounding sphere for, `rendering/primitive.rs` exposes +a `pub fn calculate_bounding_sphere` which takes a slice of `Vec3` points and returns a `Vec4` with the w component equal +to the bounding sphere's radius. There is also a related function for a given primitive, get_bounding_sphere_in_gos, +which takes a parameter of an `&Affine3A` and transforms the primitive's bounding sphere into globally oriented space using +that Affine. These functions will be useful for those working with custom objects. diff --git a/src/tutorial/custom_rendering_4.md b/src/tutorial/custom_rendering_4.md new file mode 100644 index 0000000..8b40598 --- /dev/null +++ b/src/tutorial/custom_rendering_4.md @@ -0,0 +1,563 @@ +# Custom Rendering Part 4 + +## Tying it all together + +Back at the beginning of the custom rendering section I mentioned transparency. 
Transparency can be problematic to implement efficiently because of a few factors: +- The parallel processing nature of GPU based rendering +- The need to examine depth information when implementing transparency. + +One method often used is what's known as a Z-only pass, where the first subpass updates the depth buffer without updating the color attachment, and the second subpass then renders into the color attachment with the depth attachment read only, first drawing all the opaque objects and then drawing the transparent/semi-transparent objects in order of distance from the camera, from farthest to nearest, thus allowing blending to be done according to depth. This is also known as a Depth Pre-pass. + +The problem with this approach is that when multiple transparent objects overlap, due to the transparent objects often not being written to the depth buffer, Vulkan cannot tell which transparent surface is on top. Thus, if I render a red mesh plane with varying transparency from 0 to 255/1.0, behind which is a wall with a mural on it, the mural will show through, tinted by the red transparent surface. However, if I then render an object with fully transparent pixels on top of it, the fully opaque pixels will render fine while the overlapping transparent pixels will expose the pixels of the opaque object behind it, the wall with the mural. Put simply, the translucent red screen will have a hole in it (regardless of how the blending of the alpha channels is handled). + +The problem is outlined well [here](https://www.gamedev.net/blogs/entry/2271383-transparency-with-depth-sorting/) and is further discussed for OpenGL [here](https://www.khronos.org/opengl/wiki/Transparency_Sorting). + +A Z-only pass *will slow down* your render. But if you need real translucency the methods outlined above are your best bet. + +However, the most common use for transparency is to have some pixels that are opaque with the remainder fully transparent. An example of this is rendering text and other 2-dimensional graphics around a mesh plane but leaving the rest of the mesh plane transparent, so that the graphics can float in mid air. This is used for things like heads up displays, billboards, and texture overlays. If this is all you need, you're in luck. Within the fragment shader, you have the opportunity to discard the fragment (pixel) if the alpha value is less than a threshold. + +This is a case of "designing your solution to fit your use-case". As the Gamedev article points out, *"Typically game developers get around that problem by simply designing their assets so that multiple transparent surfaces don’t stack up."*. It isn't common in most games to see large numbers of floating transparent panels overlapping. More often you will have fully opaque surfaces with fully transparent pixels surrounding them to allow you to see through to the back of the object. To do this in your fragment shader, you'll need to do something like: + +```glsl + if (outColor.a < 0.05) { + discard; + } +``` + +If this was all you needed, we could end the tutorial here and say thanks for reading. + +However, there are a couple of gotchas and further considerations for the rendering. + +When converting your render pass to render to screen, there are a few places you can go wrong. Let's start by having a look at an example of how to set up framebuffers. 
+ +```rust,noplayground + let swapchain_info = SwapchainInfo::from_openxr_swapchain(&xr_context.swapchain, render_area.extent).unwrap(); + + let mut framebuffers = vec![]; + + let depth_image = vulkan_context.create_image( + DEPTH_FORMAT, + &render_context.swapchain.render_area.extent, + vk::ImageUsageFlags::DEPTH_STENCIL_ATTACHMENT | vk::ImageUsageFlags::TRANSIENT_ATTACHMENT, + 2, + 1, + ).unwrap(); + + let msaa_image = vulkan_context.create_image( + COLOR_FORMAT, + &render_context.swapchain.render_area.extent, + vk::ImageUsageFlags::COLOR_ATTACHMENT | vk::ImageUsageFlags::TRANSIENT_ATTACHMENT, + 2, + 1, + ).unwrap(); + + // Build an array of framebuffers as large as the number of swapchain images, with a separate depth buffer + // and msaa image for each frame. + for img_index in 0..swapchain_info.images.len() { + let swapchain_view = vulkan_context.create_image_view(&swapchain_info.images[img_index], COLOR_FORMAT, + vk::ImageViewType::TYPE_2D_ARRAY, 2, 1, DEFAULT_COMPONENT_MAPPING).unwrap(); + + let ffr_view = vulkan_context.create_image_view(&swapchain_info.ffr_images[0].image, vk::Format::R8G8_UNORM, + vk::ImageViewType::TYPE_2D_ARRAY, 2, 1, DEFAULT_COMPONENT_MAPPING).unwrap(); + + let attachment_views = [msaa_image.view, depth_image.view, ffr_view, swapchain_view]; + + let framebuffer0 = unsafe { vulkan_context.device.create_framebuffer(&vk::FramebufferCreateInfo::builder() + .render_pass(render_pass) + .attachments(&attachment_views) + .width(render_context.swapchain.render_area.extent.width) + .height(render_context.swapchain.render_area.extent.height) + .layers(1).build(), None) + .expect("Unable to create framebuffer!") }; + + framebuffers.push(framebuffer0); + } +``` + +If you add this within your own code and run it, even with variable names adjusted, you'll run into a problem. In Hotham 0.2.0, +`SwapchainInfo::from_openxr_swapchain` is `pub(crate)`, not pub. The simple solution is to alter the method signature to make it +pub, however you will also need to add a perfunctory comment with triple slashes to each of the methods you expose for documentation purposes. + +Here are a few other places the conversion of your render pass can go wrong. + +1. **Problem:** Objects display in the wrong places on the screen, and navigation is problematic
+ **Solution:** Check the `instance_offset` parameter and ensure it matches the position at which the object's draw data was inserted.
+2. **Problem:** Scene displays normally but disappears after several seconds
+**Solution:** Check to ensure you are clearing the `draw_data` buffer before populating it again. This also applies to the cull data buffer and the primitive map.
+3. **Problem:** Objects display out of order and in pieces, or a segmentation fault occurs
+**Solution:** Check your render pass creation code to ensure the references that pass in attachment definitions live long enough. A +segmentation fault can occur due to short lived references passed to framebuffer creation for example. A segfault will also occur if +an end render pass is called without a matching begin render pass. +4. Make sure that the number of begin render passes is equal to the number of end render passes. +5. When drawing to dynamic textures, mind your casting. Conversions between f32, u32 and usize when calculating buffer indexes can be +sensitive to bracketing/ordering and cause unexpected results. Ditto for casting your buffer as a mut pointer; make sure the scoping +drops the mutable reference when you're done with it. + +To make matters easier for my own future work, I created some basic classes to manage a collection of semi transparent mesh planes with +RefCell borrowing of a mutable reference to the buffer and graphics implementation functions on the mesh object itself. Here's a short snippet. + +```rust,noplayground +#[derive(Debug,Default)] +pub struct STMeshManager { + mesh_list: HashMap, +} + +impl STMeshManager { + pub fn add(&mut self, engine: &mut Engine, mesh_name: String, transform: LocalTransform, width: f32, height: f32, ppi: f32, parent: Option ) { + let mesh=STMesh::new(engine, transform, width, height, ppi, parent); + self.mesh_list.insert(mesh_name, mesh.clone()); + } + + pub fn get(&mut self,mesh_name: String) -> Option<&mut STMesh> { + self.mesh_list.get_mut(&mesh_name) + } + + pub fn len(&self) -> usize { + self.mesh_list.len() + } + + pub fn keys(&self) -> Vec { + self.mesh_list.keys().map(|s| s.clone()).collect() + } +} +``` + +Here we are abstracting the process of adding a mesh to a list of meshes with a descriptive name. The keys function clones the strings for safety to +ensure that once a key is added to the list, we don't mangle the data in the original reference. + +```rust,noplayground +impl STMesh { + pub fn new(engine: &mut Engine, transform: LocalTransform, width: f32, height: f32, ppi: f32, parent: Option ) -> Self { + let position_buffer: Vec = vec![[width*-0.5 , height*0.5, 0.].into(), + [width*-0.5, height*-0.5, 0.].into(), + [width*0.5, height*-0.5, 0.].into(), + [width*0.5, height*0.5, 0.].into() + ]; + + let index_buffer = [0, 1, 2, 0, 2, 3]; + + let vertex_buffer: Vec = vec![ Vertex::new([0., 0., 1.].into(), [0., 0.].into(), 0, 0), + Vertex::new([0., 0., 1.].into(), [0., 1.].into(), 0, 0), + Vertex::new([0., 0., 1.].into(), [1., 1.].into(), 0, 0), + Vertex::new([0., 0., 1.].into(), [1., 0.].into(), 0, 0), + ]; + + // Since each object's size is specified in metres, the resolution of our texture + // will be determined by the PPI that fits into the specified area, ie *100/2.54 = 39.37 + // conversion factor to inches or about 40 inches per metre by PPI. 
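+        // For example, a 0.5 m wide plane at 96 PPI works out to 0.5 * 39.37 * 96, i.e. roughly 1890 texels across.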
+ let x_width=width*39.37*ppi; + let y_height=height*39.37*ppi; + let dyn_texture=Texture::empty(&engine.vulkan_context, &mut engine.render_context, + vk::Extent2D::builder().width(x_width as u32).height(y_height as u32).build()); + let mut mat_flags = MaterialFlags::empty(); + mat_flags.insert(MaterialFlags::HAS_BASE_COLOR_TEXTURE); + mat_flags.insert(MaterialFlags::UNLIT_WORKFLOW); + + let dyn_material=Material { + packed_flags_and_base_texture_id: mat_flags.bits() | (dyn_texture.index << 16), + packed_base_color_factor: 0, + packed_metallic_roughness_factor: 0, + }; + + let mat_dyntext = unsafe { engine.render_context.resources.materials_buffer.push(&dyn_material) }; + let prim_dynscreen = vec![Primitive::new(&position_buffer[..], &vertex_buffer[..], &index_buffer[..], mat_dyntext, &mut engine.render_context)]; + let meshdata_dynscreen = MeshData::new(prim_dynscreen); + let mesh_dynscreen = Mesh::new(meshdata_dynscreen, &mut engine.render_context); + let mut entity; + + if parent.is_some() { + entity=engine.world.spawn((renderimagebuff::SemiTransparent {}, mesh_dynscreen, + parent.unwrap(), transform.clone(), GlobalTransform::default())); + } else { + entity=engine.world.spawn((renderimagebuff::SemiTransparent {}, mesh_dynscreen, + transform.clone(), GlobalTransform::default())); + } + let texture_buffer = vec![0; x_width.round() as usize*y_height.round() as usize*4]; + let t_width=x_width.round() as u32; + let t_height=y_height.round() as u32; + + Self { + entity: entity, + texture_buffer: RefCell::new(texture_buffer), + material: Rc::new(dyn_material), + texture_object: Rc::new(dyn_texture), + t_width: t_width, + t_height: t_height, + } + } + + pub fn update(&self, vulkan_context: &VulkanContext) { + // Caller must take care to drop any exclusive/&mut references so that borrow will succeed. + vulkan_context.upload_image(&self.texture_buffer.borrow(), 1, vec![0], &self.texture_object.image); + } +``` + +This takes our dynamic texturing code and wraps it into a function to return an entity added to the +world passed in, with a `RefCell` for the singular object we will need to mutate, and material and +texture objects stored as immutable `Rc`'s. + +The update function uploads the changed texture to the GPU. + +Further content can be added to perform basic 2D graphics functionality, or it can be farmed out +to 3rd party modules: + +```rust,noplayvground + pub fn line(&mut self, x1: u32, y1: u32, x2: u32, y2: u32, color: font_texturing::FontColor) { + let mut buff = self.texture_buffer.borrow_mut(); + let buff = buff.as_mut_slice(); + let (mut sx, mut sy)=(x1 as f32, y1 as f32); + let (ex, ey)=(x2 as f32, y2 as f32); + let mut iters=0; + if (ex-sx).abs() > (ey-sy).abs() { + iters=(ex-sx).abs() as u32; + } else { + iters=(ey-sy).abs() as u32; + } + + if iters == 0 { + iters=1; + } + + let dx=(ex-sx)/(iters as f32); + let dy=(ey-sy)/(iters as f32); + for i in 0..iters { + let buff_pos=((sx as u32)+(sy as u32)*self.t_width)*4; + let buff_slice=&mut buff[buff_pos as usize..(buff_pos+4) as usize]; + buff_slice.copy_from_slice(&[color.r as u8, color.g as u8, color.b as u8, color.a as u8]); + sx+= dx; sy +=dy; + } + } +} +``` + +You're going to have a lot of use statements in your custom renderer code. 
Here's the ones in my custom +rendering module: + +```rust,noplayground +use std::{slice, collections::HashMap, borrow::BorrowMut}; +use hotham::{rendering::{vertex::Vertex, primitive::Primitive, texture::{Texture, DEFAULT_COMPONENT_MAPPING}, material::{Material, MaterialFlags}, + mesh_data::MeshData, buffer::Buffer, resources::Resources, image::Image, swapchain::SwapchainInfo}, COLOR_FORMAT, DEPTH_FORMAT, contexts::XrContext, VIEW_COUNT, vk::Handle}; +use hotham::contexts::{vulkan_context::VulkanContext, render_context::RenderContext}; +use hotham::contexts::render_context::{Instance, InstancedPrimitive}; +use hotham::rendering::resources::{DrawData, PrimitiveCullData}; +use hotham::ash::vk; +use glam::{Vec3, Mat4, Affine3A}; +use hotham::vk_shader_macros::include_glsl; +use hotham::contexts::render_context::create_shader; +use hotham::components::stage; +use hotham::components::skin::{Skin, NO_SKIN}; +use hotham::components::{GlobalTransform, Mesh, Visible}; +use hotham::Engine; +use hecs::{With, World}; +use hotham::xr; +use hotham::systems::rendering::draw_primitive; +``` + +Here are the fields I ended up having to store in order to implement my transparent rendering: + +```rust,noplayground +pub struct CustomRenderBuff { + pub pipeline: vk::Pipeline, + pub pipeline_layout: vk:: PipelineLayout, + pub framebuffers: Vec, + pub swapchain_info: SwapchainInfo, + pub renderpass: vk::RenderPass, + prim_index: u32, +} +``` + +This list could probably be reduced. I originally stored the u32 of the primitive into which I would be +rendering the scene output. That way I could filter that primitive out of the world while drawing it. + +The render pass is needed to begin the render pass. The pipeline needs to be bound to the render pass. +The pipeline layout is needed for when your descriptor sets are bound. + +The framebuffers, if you are doing full custom rendering to replace the default PBR pass, should be +a vector or list of framebuffers, one for each swapchain image index. Each framebuffer can re-use +the same color and depth attachments etc, as the contents of the attachments are no longer needed +once the swapchain image is rendered and can be discarded. + +Other than that, I stored the swapchain information just in case I need to access those again later +as iterating the swapchain through openxr is an expensive operation. + +For the implementation of this custom render context I decided to have the rendering system function +form part of the functions that can be called on mut self. + +```rust,noplayground + fn rendering_system_inner(&self, world: &mut World, vulkan_context: &VulkanContext, + render_context: &mut RenderContext, views: &[xr::View], + swapchain_image_index: usize) { + unsafe { + self.begin( + world, + vulkan_context, + render_context, + views, + swapchain_image_index, + ); + self.render_pass2(vulkan_context, render_context, swapchain_image_index); + self.end(vulkan_context, render_context); + } + } + + pub fn rendering_system(&self, engine: &mut Engine, swapchain_image_index: usize) { + let world = &mut engine.world; + let vulkan_context = &mut engine.vulkan_context; + let render_context = &mut engine.render_context; + + // Update views just before rendering. 
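+        // `update_views` returns the eye views predicted for the frame's display time.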
+ let views = engine.xr_context.update_views(); + self.rendering_system_inner(world, vulkan_context, render_context, views, swapchain_image_index); + } +``` + +We've already covered a lot of the `CustomRenderBuff::new` function, but here is part of the pattern that wasn't +covered in the pipeline setup in part 1 + +```rust,noplayground + let ffr_reference = vk::AttachmentReference::builder() + .attachment(2) + .layout(vk::ImageLayout::FRAGMENT_DENSITY_MAP_OPTIMAL_EXT) + .build(); + + let mut ffr_info = vk::RenderPassFragmentDensityMapCreateInfoEXT::builder() + .fragment_density_map_attachment(ffr_reference).build(); + let attachment_list = [rp_color_attachment, rp_depth_attachment, rp_ffr_attachment, rp_resolve_attachment]; + let view_masks = [!(!0 << VIEW_COUNT)]; + let mut multiview = vk::RenderPassMultiviewCreateInfo::builder() + .view_masks(&view_masks) + .correlation_masks(&view_masks) + .build(); + + let render_pass = unsafe { + let mut rp_info=vk::RenderPassCreateInfo::builder() + .attachments(&attachment_list) + .subpasses(&subpass) + .dependencies(&dependencies) + .push_next(&mut multiview); + let rp_info=rp_info.push_next(&mut ffr_info).build(); + + vulkan_context.device.create_render_pass(&rp_info, None).expect("Unable to create render pass!") + }; +``` + +FFR images are added via the fragment density map extension and do not form part of the `SubpassDescription` builder +pattern. For those looking to replicate the type of setup used in the PBR render pass, here are the attachments: + +```rust,noplayground + let rp_color_attachment = vk::AttachmentDescription::builder() + .format(COLOR_FORMAT) + .samples(vk::SampleCountFlags::TYPE_4) + .load_op(vk::AttachmentLoadOp::CLEAR) + .store_op(vk::AttachmentStoreOp::STORE) + .stencil_load_op(vk::AttachmentLoadOp::DONT_CARE) + .stencil_store_op(vk::AttachmentStoreOp::DONT_CARE) + .initial_layout(vk::ImageLayout::UNDEFINED) + .final_layout(vk::ImageLayout::COLOR_ATTACHMENT_OPTIMAL) + .build(); + + let rp_depth_attachment = vk::AttachmentDescription::builder() + .format(DEPTH_FORMAT) + .samples(vk::SampleCountFlags::TYPE_4) + .load_op(vk::AttachmentLoadOp::CLEAR) + .store_op(vk::AttachmentStoreOp::DONT_CARE) + .stencil_load_op(vk::AttachmentLoadOp::DONT_CARE) + .stencil_store_op(vk::AttachmentStoreOp::DONT_CARE) + .initial_layout(vk::ImageLayout::UNDEFINED) + .final_layout(vk::ImageLayout::DEPTH_STENCIL_ATTACHMENT_OPTIMAL) + .build(); + + let rp_ffr_attachment = vk::AttachmentDescription::builder() + .format(vk::Format::R8G8_UNORM) + .samples(vk::SampleCountFlags::TYPE_1) + .load_op(vk::AttachmentLoadOp::DONT_CARE) + .store_op(vk::AttachmentStoreOp::DONT_CARE) + .stencil_load_op(vk::AttachmentLoadOp::DONT_CARE) + .stencil_store_op(vk::AttachmentStoreOp::DONT_CARE) + .initial_layout(vk::ImageLayout::UNDEFINED) + .final_layout(vk::ImageLayout::FRAGMENT_DENSITY_MAP_OPTIMAL_EXT) + .build(); + + + let rp_resolve_attachment = vk::AttachmentDescription::builder() + .format(COLOR_FORMAT) + .samples(vk::SampleCountFlags::TYPE_1) + .load_op(vk::AttachmentLoadOp::DONT_CARE) + .store_op(vk::AttachmentStoreOp::STORE) + .stencil_load_op(vk::AttachmentLoadOp::DONT_CARE) + .stencil_store_op(vk::AttachmentStoreOp::DONT_CARE) + .initial_layout(vk::ImageLayout::UNDEFINED) + .final_layout(vk::ImageLayout::COLOR_ATTACHMENT_OPTIMAL) + .build(); +``` + +In this example, the resolve attachment and the FFR attachments are provided to us by Vulkan +through openxr and the swapchain. The code at the top of part 4 outlines the setup process. 
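+
+The subpass description that wires these four attachments together is not reproduced here, so below is a minimal sketch of how it might look, assuming the attachments are passed to the render pass in the order listed above (0 = MSAA colour, 1 = depth, 2 = FFR, 3 = resolve). Note that the FFR attachment is not referenced by the subpass at all; it is attached through the fragment density map create info shown earlier:
+
+```rust,noplayground
+    // Sketch only: attachment indices assume the order [colour, depth, FFR, resolve].
+    let color_refs = [vk::AttachmentReference::builder()
+        .attachment(0)
+        .layout(vk::ImageLayout::COLOR_ATTACHMENT_OPTIMAL)
+        .build()];
+
+    let depth_ref = vk::AttachmentReference::builder()
+        .attachment(1)
+        .layout(vk::ImageLayout::DEPTH_STENCIL_ATTACHMENT_OPTIMAL)
+        .build();
+
+    // One resolve reference per colour attachment: the 4x MSAA colour image at index 0
+    // is resolved into the single-sample swapchain image at index 3.
+    let resolve_refs = [vk::AttachmentReference::builder()
+        .attachment(3)
+        .layout(vk::ImageLayout::COLOR_ATTACHMENT_OPTIMAL)
+        .build()];
+
+    let subpass = [vk::SubpassDescription::builder()
+        .pipeline_bind_point(vk::PipelineBindPoint::GRAPHICS)
+        .color_attachments(&color_refs)
+        .resolve_attachments(&resolve_refs)
+        .depth_stencil_attachment(&depth_ref)
+        .build()];
+```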
+ +Here is how the render pass is set up: + +```rust,noplayground + fn start_pass(&self, vulkan_context: &VulkanContext, render_context: &mut RenderContext, swapchain_image_index: usize) { + let fb = self.framebuffers[swapchain_image_index]; + let rp_begin_info = vk::RenderPassBeginInfo::builder() + .framebuffer(fb) + .render_pass(self.renderpass) + .render_area(render_context.render_area()) + .clear_values(&hotham::contexts::render_context::CLEAR_VALUES).build(); + + let command_buffer = render_context.frames[render_context.frame_index].command_buffer; + + + unsafe { + println!("Beginning render pass using framebuffer {:?}, image index {}", fb, swapchain_image_index); + vulkan_context.device.cmd_begin_render_pass(command_buffer, &rp_begin_info, vk::SubpassContents::INLINE); + + vulkan_context.device.cmd_bind_pipeline( + command_buffer, + vk::PipelineBindPoint::GRAPHICS, + self.pipeline, + ); + + vulkan_context.device.cmd_bind_descriptor_sets( + command_buffer, + vk::PipelineBindPoint::GRAPHICS, + self.pipeline_layout, + 0, + slice::from_ref(&render_context.descriptors.sets[render_context.frame_index]), + &[], + ); + vulkan_context.device.cmd_bind_vertex_buffers( + command_buffer, + 0, + &[render_context.resources.position_buffer.buffer, render_context.resources.vertex_buffer.buffer], + &[0,0]); + vulkan_context.device.cmd_bind_index_buffer( + command_buffer, + render_context.resources.index_buffer.buffer, + 0, + vk::IndexType::UINT32); + }; + } +``` + +This is pretty simple. Bind the framebuffers, pipeline, vertex buffers, index buffers and descriptor sets. + +```rust,noplayground + fn end(&self, vulkan_context: &VulkanContext, render_context: &mut RenderContext) { + // OK. We're all done! + render_context.primitive_map.clear(); + render_context.end_pbr_render_pass(vulkan_context); + } +``` + +End is trivial. Close off your render pass and clear any temporary buffers. + +Here is the real meat and potatoes of the rendering code that works with the populated cull data buffer as +described earlier: + +```rust,noplayground + unsafe fn render_pass2(&self, vulkan_context: &VulkanContext, render_context: &mut RenderContext, swapchain_image_index: usize) { + // Parse through the cull buffer and record commands. This is a bit complex. 
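+        // Roughly what follows: collect the visible primitives from the cull buffer (skipping the
+        // primitive used as the output screen), sort them by primitive id -- which carries the
+        // shader flag in its upper bits -- and then by distance from the camera, push one DrawData
+        // entry per instance, and record a draw call for each, issuing a conservative pipeline
+        // barrier whenever a semi-transparent primitive is encountered.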
+ println!("Entering RP2"); + let device = &vulkan_context.device; + let frame = &mut render_context.frames[render_context.frame_index]; + let command_buffer = frame.command_buffer; + let draw_data_buffer = &mut frame.draw_data_buffer; + let material_buffer = &mut render_context.resources.materials_buffer; + let mut instance_offset: u32 = 0; + let mut current_primitive_id = u32::MAX; + let mut current_shader = 0; + let instance_count = 1; + let cull_data = frame.primitive_cull_data_buffer.as_slice(); + draw_data_buffer.clear(); + + let fov = render_context.cameras[0].position_in_gos(); + let camera_pos = Vec3::new(fov.x, fov.y, fov.z); + let mut visible_solids: Vec = cull_data.iter() + .filter(|cull_res| cull_res.visible && cull_res.primitive_id != self.prim_index) + .map(|p| p.clone()).collect(); + + visible_solids.sort_by(|a,b| { + if a.primitive_id != b.primitive_id { + a.primitive_id.cmp(&b.primitive_id) + } else { + let dist1=Vec3::new(a.bounding_sphere.x, a.bounding_sphere.y, a.bounding_sphere.z); + let dist2=Vec3::new(b.bounding_sphere.x, b.bounding_sphere.y, b.bounding_sphere.z); + dist1.distance_squared(camera_pos).partial_cmp(&dist2.distance_squared(camera_pos)).unwrap() + } + }); + + for prim in visible_solids { + let new_shader = prim.primitive_id & SEMI_TRANSPARENT_BIT; + + if new_shader > 0 { + unsafe { device.cmd_pipeline_barrier(command_buffer,vk::PipelineStageFlags::BOTTOM_OF_PIPE, + vk::PipelineStageFlags::BOTTOM_OF_PIPE, vk::DependencyFlags::empty(), + &[], &[], &[]); }; + current_shader = new_shader; + } + + let instanced_primitive = render_context + .primitive_map + .get(&prim.primitive_id) + .unwrap(); + + let instance = &instanced_primitive.instances[prim.index_instance as usize]; + let draw_data = DrawData { + gos_from_local: instance.gos_from_local.into(), + local_from_gos: instance.gos_from_local.inverse().into(), + skin_id: instance.skin_id, + }; + + instance_offset = draw_data_buffer.push(&draw_data); + + draw_primitive( + material_buffer, + self.pipeline_layout, + &instanced_primitive.primitive, + device, + command_buffer, + instance_count, + instance_offset, + ); + } + } +``` + +This code is a bit messy. The instance offset updating from the original custom render example is no longer required, +nor is the `current_shader` or the `current_primitive_id` or the instance count which will always be one. +The custom render code for drawing quadrics manually updated `instance_offset` as it parsed through which primitives +were visible and pushed their data into the draw data buffer. However, the vertex shader indexes into the draw data +buffer using the gl instance index generated by `instance_offset`. Thus it made sense to simply use the index returned by the buffer push. + +The doubled up draw loop visible in the custom rendering example and the PBR render code within `render_context.rs` +has been reduced to a filter/map/collect statement. The sort statement sorts primitives into order of the instance +and shader bit, and then into order of distance from the camera, in this case using the left eye. + +The pipeline barrier inserts a conservative barrier to ensure all prior commands are finished before rendering the +semi transparent objects. + +For brevity, I will not reproduce the same example code for the population of the cull data buffer here. + +We have now covered: +- Pipeline creation +- Accessing information about the swapchain +- Creating framebuffers +- Mimicking the attachment setup of the primary PBR render pass. 
+- Culling non-visible objects
+- Beginning the render pass and binding the framebuffer, descriptors and other important buffers
+- Iterating the visible objects to execute draw commands
+- Adding pipeline barrier commands
+
+The example code produced here is an inefficient, first-principles attempt at combining Hotham's IBL/PBR model
+with partial and full transparency using a combination of different techniques. If I were to design this for
+efficiency, I would probably:
+- Split the transparent/semi-transparent objects and the opaque objects into separate sub-passes.
+- Split the two categories of object out into separate lists and only sort the transparent objects
+- Allow the opaque objects pass to populate the depth buffer
+- Insert a pipeline barrier to ensure depth buffer writes have finished
+- Draw the remaining objects in order of distance from the camera
+
+I'd also consider whether partial transparency is truly needed: if it isn't, the `discard` keyword makes full
+transparency trivial.
diff --git a/src/tutorial/drawing_text.md b/src/tutorial/drawing_text.md
new file mode 100644
index 0000000..73b1017
--- /dev/null
+++ b/src/tutorial/drawing_text.md
@@ -0,0 +1,197 @@
+# Drawing Text
+
+Drawing text in a 3D world is not a simple matter. There are a number of approaches, but the most common is to rasterize the text/font data into a texture of suitable resolution for display on something like a mesh plane, or a more complex mesh such as a curved screen. This can then be parented to the head-mounted display if you want an object which is always in front of the user's eyes, or it can be parented to a specific object to serve the purposes of a billboard.
+
+Building on the example of dynamic texturing we provided in the last section, I'll be presenting a reasonably performant implementation of updating a buffer with text. I have tested this on an Oculus Quest 2 headset and found that the text updates so quickly, despite the frame-by-frame texture uploading and continuous polling of the input devices, that the changing text blurs into the values displayed on the next frame.
+
+For this example, we'll be using the `ab_glyph` library, which supports the loading of several different font formats. It is also important to know a bit about how typography works. Apple's developer website has a [good overview of typographical concepts](https://developer.apple.com/library/archive/documentation/TextFonts/Conceptual/CocoaTextArchitecture/TypoFeatures/TextSystemFeatures.html) in their discussion of the Cocoa text architecture. The concepts will serve us well for this tutorial.
+
+Each glyph of a font is positioned relative to a baseline: the distance the character rises above the baseline is called the *ascent*, and the distance it dips below the baseline is called the *descent*. As each character of a font may be a different size, each glyph has its own *bounding box*. Finally, there is a distance which the cursor should move left or right, depending on the directionality of the font, known as the *advance*.
+
+Each pixel of a glyph may be of a specific *coverage*. This can be up to 64 separate levels of color, from the absence of a pixel ("black") to its complete presence ("white"). To correctly handle colored text, we need to be able to interpolate between the background color we are drawing onto and the foreground color, based on the degree of coverage. For example, a pixel with a coverage of 0.25 should end up 25% of the way from the background color towards the foreground color.
+ +To begin with, lets create a representation of a color where the individual components are `f32` between 0 and 255, for use with the drawing library. + +```rust,noplayground +#[derive(Debug,Clone,Copy)] +pub struct FontColor { + r: f32, + g: f32, + b: f32, + a: f32, +} + +impl FontColor { + #[inline] + pub fn new(r: f32, g: f32, b: f32, a: f32) -> Self { + Self { + r, + g, + b, + a, + } + } + + #[inline] + pub fn from_slice(color_slice: &[u8]) -> Self { + Self { + r: color_slice[0] as f32, + g: color_slice[1] as f32, + b: color_slice[2] as f32, + a: color_slice[3] as f32, + } + } + + #[inline] + pub fn interpolate(&self, bgcolor: Self, factor: f32) -> Self { + Self { + r: (self.r-bgcolor.r)*factor+bgcolor.r, + g: (self.g-bgcolor.g)*factor+bgcolor.g, + b: (self.b-bgcolor.b)*factor+bgcolor.b, + a: self.a, + } + } +} +``` + +To make this slightly more performant, because the functions we'll be using are simple expressions without any conditional logic, I've specified the `#[inline]` directive to suggest that the compiler optimise these functions to avoid a far call to external code every time they are used, as these functions will be used quite regularly. + +The interpolate function will produce a color close to or equal to bgcolor as the coverage factor passed in approaches zero. As coverage approaches 1.0, it will reach a maximum of the foreground color itself. + +Here is the code for an example draw function which I will break down shortly. + +```rust,noplayground + pub fn draw_buffer(&self, text: String, pt_size: f32, x: u32, y: u32, buff: &mut [u8], extent: vk::Extent2D) { + let fontref = self.fontvec.as_ref().unwrap(); + let scaledfont=fontref.as_scaled(pt_size); + let ascent=scaledfont.ascent(); + let descent=scaledfont.descent(); + let glyph_height=ascent-descent; + + let mut xposn=x; + let mut yposn=y+(ascent as u32); + // To place a dot directly to the left of the character, at the + // baseline, uncomment the below two lines, so you can see the + // relationship between the different parameters. + //let tmp_posn=((xposn-1+yposn*extent.width)*4) as usize; + //(&mut buff[tmp_posn..tmp_posn+4]).copy_from_slice(&[0,0,255,255]); + for c in text.chars() { + let gid = fontref.glyph_id(c); + let g = gid.with_scale(pt_size); + let advance=scaledfont.h_advance(gid); + + if let Some(g_outline) = fontref.outline_glyph(g) { + let glyph_bounds = g_outline.px_bounds(); + + let mut glyph_top_x = xposn; + let mut glyph_top_y = yposn; + + if glyph_bounds.min.x < 0. { + glyph_top_x -= glyph_bounds.min.x.abs() as u32; + } else { + glyph_top_x += glyph_bounds.min.x as u32; + } + + if glyph_bounds.min.y < 0. { + glyph_top_y -= glyph_bounds.min.y.abs() as u32; + } else { + glyph_top_y += glyph_bounds.min.y as u32; + } + + g_outline.draw(|x1, y1, cov| { + let buf_position: usize = ((glyph_top_x+x1 + (glyph_top_y+y1) * extent.width)* 4) as usize; + let buf_slice = &mut buff[buf_position..buf_position+4]; + if cov >= 0.0 { + let bgcolor=FontColor::from_slice(&buf_slice); + let interp = self.fgcolor.interpolate(bgcolor,cov); + buf_slice.copy_from_slice(&[interp.r as u8,interp.g as u8,interp.b as u8,interp.a as u8]); + }; + }); + } else { + println!("Could not get outline for glyph!"); + } + xposn+=(advance as u32); + }; + } +``` + +In this function, &self is the Font object which we provided some initial code for in tutorial 1 [Getting Started](getting_started.md). We unwrap the font, scale it, and get the ascent and descent of the tallest characters in the font. 
Because in this example we want the top of the characters to appear at the coordinates we specified, the baseline position is stored in `xposn`, `yposn`. The `px_bounds()` function, which is different to the `glyph_bounds()` function, returns a conservative pixel bounding box whose coordinates are calculated relative to the baseline as the origin. + +Because the bounding box coordinates can be positive or negative, and our `xposn` and `yposn` are `u32`, we ensure the `f32` is positive before casting to throw away the fractional part. + +Note that this example code does not cover kerning, or advancing the cursor to the next line if the string passes the extent width. It does nothing but lay out each character within the buffer in a linear fashion, interpolating the background color already stored in the buffer with the foreground color saved within the font struct. It also does no error checking to ensure the string does not overflow the end of the buffer. Calculating whether the text will overflow the buffer will depend on knowing your layout strategy as well as the width and height of each glyph. + +I leave it as an exercise for the reader to sanitize this code to prevent the program from panicing due to an invalid index into the slice. I also recommend removing the `println!()` call within the function, as it is incredibly common for a glyph (such as a space) to not appear in a font; in these cases this example code simply ignores the non existent character as if it did not exist. + +As a final exercise we'll add a function that will be called each tick to test how performant this code is. This function will print various information from the controllers and other input contexts onto this huge 4 square metre display for you to interact with. + +```rust,noplayground +fn update_dynscreen(engine: &mut Engine, state: &mut State) { + let dyn_texture = state.dyn_texture.as_ref().unwrap(); + let mut vec_buff = state.texture_buffer.as_mut().unwrap(); + let mut text_slice=vec_buff.as_mut_slice(); + let mut n: u32 = 0; + let mut m: u32 = 0; + + while m < 1024 { + while n < 1024 { + let c = ((255*n/1024) & 255) as u8; + let slice_index = (n*4+m*4096) as usize; + text_slice[slice_index..slice_index+4].copy_from_slice(&[c,0,0,255]); + n += 1; + } + n = 0; + m += 1; + } + + state.fontref.draw_buffer("Hello Hotham 1234".to_string(), 18.0, 10, 10, text_slice, dyn_texture.image.extent); + let affine1=engine.input_context.right.stage_from_grip(); + let (_, rotation, translation) = affine1.to_scale_rotation_translation(); + let (axis, angle) = rotation.to_axis_angle(); + let fmt1 = format!("Right Grip Rotation: {:.2} deg around {:.2?}", angle.to_degrees(), axis ); + let fmt2 = format!("Right Grip Translation: {:.2?}", translation); + let affine2=engine.input_context.left.stage_from_grip(); + let (_, rotation, translation) = affine2.to_scale_rotation_translation(); + let (axis, angle) = rotation.to_axis_angle(); + let fmt3 = format!("Left Grip Rotation: {:.2} deg around {:.2?}", angle.to_degrees(), axis ); + let fmt4 = format!("Left Grip Translation: {:.2?}", translation); + let affine3 = engine.world.get::<&LocalTransform>(engine.hmd_entity).unwrap(); + let (rotation, translation) = (affine3.rotation, affine3.translation); + let (axis, angle) = rotation.to_axis_angle(); + let fmt5 = format!("HMD Rotation: {:.2} deg around {:.2?}", angle.to_degrees(), axis ); + let fmt6 = format!("HMD Translation: {:.2?}", translation); + + state.fontref.draw_buffer(fmt1, 18.0, 10, 30, text_slice, dyn_texture.image.extent); 
+ state.fontref.draw_buffer(fmt2, 18.0, 10, 50, text_slice, dyn_texture.image.extent); + state.fontref.draw_buffer(fmt3, 18.0, 500, 30, text_slice, dyn_texture.image.extent); + state.fontref.draw_buffer(fmt4, 18.0, 500, 50, text_slice, dyn_texture.image.extent); + state.fontref.draw_buffer(fmt5, 18.0, 10, 70, text_slice, dyn_texture.image.extent); + state.fontref.draw_buffer(fmt6, 18.0, 10, 90, text_slice, dyn_texture.image.extent); + state.fontref.draw_buffer(format!("A {} B {} X {} Y {}", + engine.input_context.right.a_button(), + engine.input_context.right.b_button(), + engine.input_context.left.x_button(), + engine.input_context.left.y_button()), + 18.0, 10, 110, text_slice, dyn_texture.image.extent); + state.fontref.draw_buffer(format!("Grips: Left {} Right {}", engine.input_context.left.grip_button(), + engine.input_context.right.grip_button()), 18.0, 10, 130, text_slice, dyn_texture.image.extent); + state.fontref.draw_buffer(format!("Triggers: Left {} Right {}", engine.input_context.left.trigger_button(), + engine.input_context.right.trigger_button()), 18.0, 10, 150, text_slice, dyn_texture.image.extent); + engine.vulkan_context.upload_image(text_slice, 1, vec![0], &(dyn_texture.image)); +} +``` + +This example function can fit into your tick function. If you run the example code, you will be able to see clicks on the +a, b, x and y buttons, the grips and triggers, and the rotation and translation of your headset and controllers as you move +about the scene. You could add an example model into your scene which you update with a local translation based off of the +data returned from the input context. In this way, you can compare the numbers shown in the textual output with the visual +appearance on screen, and understand the input you are getting back from your controllers in a very concrete way which will +remove any ambiguities provided by the wording of the specification. + +The code presented above, despite the large number of floating point operations undertaken to draw the content to the buffer, +is remarkably fast. It is so fast that the numbers received from the input context, changing each frame as they do, will blur +into one another on screen because the frame rate is higher than your eyes can perceive. At least, such is the outcome perceived +by running this code on an Oculus Quest 2 with its ARM64 architecture. + +You could further speed up the above code by caching the x, y and coverage data returned by ab_glyph to avoid unnecessary +per glyph calculations being repeated for each instance of the specific glyph. diff --git a/src/tutorial/dynamic_texturing.md b/src/tutorial/dynamic_texturing.md new file mode 100644 index 0000000..82fee53 --- /dev/null +++ b/src/tutorial/dynamic_texturing.md @@ -0,0 +1,165 @@ +# Dynamic meshes and textures + +So far, we have dealt with loading objects from glb files. But what if +you wanted to load a PDF, or draw some fancy graphics and buttons on +a surface that were not known at compile time? To do this, we need to +get a little more low level. + +If you example the code for the loading of models from glb files, you'll +find that they call associated load functions within Hotham's variable +rendering component structs. These are the structs of note: + +```rust +rendering::{vertex::Vertex, primitive::Primitive, texture::Texture, material::{ + Material, MaterialFlags}, mesh_data::MeshData, +}, +``` + +To break it down: +* A `components::mesh::Mesh` object is constructed from a `MeshData` object and a render context. 
+* A `MeshData` object is constructed from a vector of `Primitive` structs.
+* Each primitive is constructed out of a position buffer of `Vec3` structs, a vertex buffer, an index buffer of `u32`'s, a material (also a `u32`), and a render context.
+* The vertex buffer, in turn, is constructed out of `Vertex` objects.
+
+We'll now break these down further to explain how to create your own meshes.
+
+The vertex data, materials and textures as well as skins are all sent
+into the vertex shader and fragment shader (located in the `src/shaders`
+folder within the crate).
+* So to map a texture onto a mesh, the shader needs to know the UV texture coordinates of each vertex.
+* It also needs the normals of each vertex for other purposes such as lighting.
+* Finally, if the object in question is skinned, it needs information about how the vertices are skinned.
+
+A simple case is the creation of a mesh plane with two triangles.
+Suppose we want to create a 4m x 4m plane onto which to project some
+scene. We would start by creating a position buffer containing the
+locations of each vertex:
+
+```rust
+    let position_buffer: Vec<Vec3> = vec![
+        [-2.0, 2.0, -2.0].into(),
+        [-2.0, -2.0, -2.0].into(),
+        [2.0, -2.0, -2.0].into(),
+        [2.0, 2.0, -2.0].into()
+    ];
+```
+
+This example lists the top left, bottom left, bottom right and top right
+in order. The index buffer is constructed of a series of triangles,
+something like this:
+
+```rust
+    let index_buffer = [0, 1, 2, 0, 2, 3];
+```
+
+This re-uses vertices 0 and 2, the top left and bottom right, which are
+shared between the two triangles of the square mesh plane.
+
+Next, we need to specify the texture coordinates and vertex normals.
+We can do this using `Vertex::new`, like so:
+
+```rust
+let vertex_buffer: Vec<Vertex> = vec![
+    Vertex::new([0., 0., 1.].into(), [0., 0.].into(), 0, 0),
+    Vertex::new([0., 0., 1.].into(), [0., 1.].into(), 0, 0),
+    Vertex::new([0., 0., 1.].into(), [1., 1.].into(), 0, 0),
+    Vertex::new([0., 0., 1.].into(), [1., 0.].into(), 0, 0),
+];
+```
+
+The normal in this example is in the positive Z direction and is contained in the first parameter to `Vertex::new`. Don't forget that this is *towards the camera* in OpenXR. This example has the mesh plane situated 2 metres forward of the world origin, with its normal pointing back toward the viewer, so light reflected off it bounces back toward the camera.
+
+The texture coordinates are a `Vec2` of `f32` from `0.0` to `1.0`, with `0.0, 0.0` representing the top left of the texture and `1.0, 1.0` representing the bottom right.
+
+Next we need a texture to map onto these coordinates. The `rendering::texture::Texture` implementation provides a couple of functions, but for simplicity we'll only look at `Texture::empty`. You need to pass a `vk::Extent2D` to specify the size of the empty texture that you'll draw onto, like so:
+
+```rust
+    state.dyn_texture = Some(Texture::empty(
+        &engine.vulkan_context,
+        &mut engine.render_context,
+        vk::Extent2D::builder().width(1024).height(1024).build()
+    ));
+```
+
+This example is storing the new texture in a mutable state variable
+as an `Option` for later unwrapping and adjustment.
+
+Finally, we need a material to use this texture. Other texturing
+options like normal maps are available, and most PBR maps are supported.
+Consult the documentation or module code for more information. In
+this example we will simply create a material with a base color texture
+and unlit workflow, such as you might use to display a two-dimensional
+image or rasterised text on screen.
+ +```rust +let mut mat_flags = MaterialFlags::empty(); +mat_flags.insert(MaterialFlags::HAS_BASE_COLOR_TEXTURE); +mat_flags.insert(MaterialFlags::UNLIT_WORKFLOW); + +let dyn_material = Material { + packed_flags_and_base_texture_id: mat_flags.bits() | (state.dyn_texture.as_ref().unwrap().index << 16), + packed_base_color_factor: 0, + packed_metallic_roughness_factor: 0, +}; +``` + +Here we insert each material flag separately for clarity, although +these could simply be cast and or'ed together. If you consult the code +for the Material struct implementation, as well as the shader common +code, you’ll see that the texture id is a word (16 bits) packed into the +most significant byte of the aptly named +`packed_flags_and_base_texture_id` field. + +In this case, we simply ignore the other fields. If you want to create more complex textures with normal maps or other options, I would strongly suggest reading the material.rs in the rendering/ folder and the shader code to understand +how the various textures, materials and so on interact. + +Next, you'll want to add this material to the materials buffer. You'll +find something like this peppered throughout the hotham code: + +```rust +let mat_dyntext = unsafe { + engine.render_context.resources.materials_buffer.push(&dyn_material) +}; +``` + +This is ensuring that the newly constructed material is available on +the GPU, and returns the material id. You most likely won't need to +store this unless you want to duplicate the same texture on other +meshes. + +Now that you have your beautiful empty texture and the coordinates +and objects to set up your plane, you can finish the job with something +like this: + +```rust +let prim_dynscreen = vec![Primitive::new(&position_buffer[..], &vertex_buffer[..], &index_buffer[..], mat_dyntext, &mut engine.render_context)]; + +let meshdata_dynscreen = MeshData::new(prim_dynscreen); + +let mesh_dynscreen = Mesh::new(meshdata_dynscreen, &mut engine.render_context); +``` + +We've called the object mesh_dynscreen as it represents a dynamic +"screen" onto which we’re going to draw text and other content. This +later needs to get inserted into the world with other components to +position it properly. + +```rust +world.spawn((Visible{}, mesh_dynscreen, LocalTransform {..Default::default()}, GlobalTransform::default())); +``` + +Because we've specified a fixed location for the plane in space in this +case, I've just used no translation or rotation on the Local and +GlobalTransform components above. You might want to create your +plane or shape to align with an axis and then transform and translate it +or parent it to the camera or to your hands in the scene. + +Finally, you'll want to draw something on your image before +uploading it to the GPU with something like: + +```rust +engine.vulkan_context.upload_image(state.texture_buffer.as_ref().unwrap(), 1, vec![0], &(state.dyn_texture.as_ref().unwrap().image)); +``` + +The texture buffer you are uploading should be in sRGBA format, with +each byte being `0..255` and the final byte being the alpha channel. diff --git a/src/tutorial/getting_started.md b/src/tutorial/getting_started.md index bad5562..71fe81f 100644 --- a/src/tutorial/getting_started.md +++ b/src/tutorial/getting_started.md @@ -1 +1,240 @@ # Getting Started + +The typical format of a Hotham program is going to include a `main.rs`, +a `lib.rs`, and additional modules for your separately written systems. 
The +`main.rs` looks something like this: + +```rust,noplayground +use hotham::HothamResult; + +fn main() -> HothamResult<()> { + simple_with_nav::real_main() +} +``` + +Within `lib.rs`, a configuration directive marks the true entry point of the program using `ndk_glue`: + +```rust,noplayground +#[cfg_attr(target_os = "android", ndk_glue::main(backtrace = "on"))] +pub fn main() { + println!("[HOTHAM_WITH_NAV] MAIN!"); + real_main().expect("Error running app!"); + println!("[HOTHAM_WITH_NAV] FINISHED! Goodbye!"); +} +``` + +In the documentation of the now deprecated [ndk-macro](https://github.com/rust-mobile/ndk-glue/tree/main/ndk-macro), the full parameters to this attribute macro are described. You can configure the android logger, and override the default path to the crate. + +In your `real_main()` function or within the marked `main()` itself, you start by initializing the `Engine`, and optionally checking for required OpenXR extensions. You also create and initialize any state variable as well as any textures, models or meshes the program may need. Then the tick loop occurs (discussed later in this section). + +# Engine creation + +To create the engine, the simplest way of doing it is: + +```rust,noplayground + let mut engine = Engine::new(); +``` + +However, in some cases, such as if you want to enable some specific OpenXR extensions, you may wish to use `EngineBuilder`. Here is a full example of using the `EngineBuilder` pattern with OpenXR, checking for supported extensions. + +```rust,noplayground + println!("Getting list of supported extensions..."); + let xr_entry = unsafe { xr::Entry::load().expect("Cannot instantiate OpenXR entry object!") }; + &xr_entry.initialize_android_loader(); + let xr_supported_extensions=&xr_entry.enumerate_extensions().unwrap(); + println!("Got list of supported extensions, eye gaze interaction is {}", xr_supported_extensions.ext_eye_gaze_interaction); + + let mut extension_set = xr::ExtensionSet::default().clone(); + if xr_supported_extensions.ext_eye_gaze_interaction { + extension_set.ext_eye_gaze_interaction = true; + } + + let extensions_required = Some(extension_set); + let application_name = Some("Hotham Navigation Test"); + let application_version = Some(1); + let mut engine_builder = EngineBuilder::new(); + &engine_builder.application_name(application_name); + &engine_builder.application_version(application_version); + &engine_builder.openxr_extensions(extensions_required); + let mut engine = engine_builder.build(); + +``` + +There are two things of note here. First, the code above assumes this is being built solely for an android device such as the Oculus Quest. Depending on your version of openxr, you may or may not need to include reference to initializing the Android loader as shown above. Check the documentation for the version of openxr you are using in your dependencies for more information. You may wish to adjust the openxr entry point loading code above to take into account your target os using `#cfg` directives. + +Secondly, the engine_builder object has a lifetime which is based on the application name string. The functions to set the engine builder fields in hotham v0.2.0 return &mut self, while build() takes ownership of self. Therefore, to avoid borrow checker issues, using a pattern such as the above is suggested. + +# State initialization + +Concurrently with the engine being instantiated, you will want to initialize any models, meshes and textures you may need. 
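+
+Throughout these tutorials, anything that has to outlive a single tick (fonts, dynamic textures, OpenXR actions and so on) is kept in a `State` struct that is created in `real_main()` and passed to each system as `&mut State`. The exact contents are entirely up to you; the sketch below simply collects the fields used later in this tutorial series, so treat the names as illustrative rather than as part of Hotham's API (`AndroidHelper` and `FontObject` are helpers we write ourselves in these tutorials).
+
+```rust,noplayground
+// A sketch only: your own State can hold whatever your systems need to share
+// between ticks. Types assume `openxr` is imported as `xr` and that Texture
+// comes from hotham::rendering::texture.
+#[derive(Default)]
+pub struct State {
+    pub should_quit: bool,
+    // OpenXR eye gaze extras, created in the snippet below.
+    pub eye_gaze_set: Option<xr::ActionSet>,
+    pub eye_gaze_action: Option<xr::Action<xr::Posef>>,
+    pub eye_gaze_path: Option<xr::Path>,
+    pub eye_gaze_space: Option<xr::Space>,
+    // Dynamic texturing (covered in a later tutorial): the GPU texture and the
+    // CPU-side sRGBA buffer (width * height * 4 bytes) we draw into each tick.
+    pub dyn_texture: Option<Texture>,
+    pub texture_buffer: Option<Vec<u8>>,
+    // Text drawing helpers written in these tutorials.
+    pub android_helper: Option<AndroidHelper>,
+    pub fontref: FontObject,
+}
+```
+
+Because everything that is awkward to clone lives behind this single value, the rest of the code only ever has to reason about one `&mut State` travelling through your systems.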
+
+As an example, you may wish to initialize any action sets you will use with OpenXR separately from the input context, and store them in a place that will be accessible from the rest of your codebase:
+
+```rust,noplayground
+    state.eye_gaze_set = match engine.xr_context.instance.create_action_set("eye_gaze_interaction_scene_actions", "Eye Gaze Interaction Scene Actions", 1) {
+        Ok(actionset) => Some(actionset),
+        Err(_) => None,
+    };
+
+    if state.eye_gaze_set.is_some() {
+        println!("Eye gaze set created! Creating eye gaze action...");
+        state.eye_gaze_action = match state.eye_gaze_set.as_ref().unwrap().create_action::<xr::Posef>("gaze_action", "Gaze Action", &[]) {
+            Ok(gaze_action) => Some(gaze_action),
+            Err(_) => None,
+        };
+
+        if state.eye_gaze_action.is_some() {
+            println!("Eye gaze action was created! Creating eye gaze path...");
+            state.eye_gaze_path = match engine.xr_context.instance.string_to_path("/user/eyes_ext/input/gaze_ext/pose") {
+                Ok(some_path) => Some(some_path),
+                Err(_) => None,
+            };
+
+            let eyegaze_ip_path = engine.xr_context.instance.string_to_path("/interaction_profiles/ext/eye_gaze_interaction").unwrap();
+
+            println!("Suggesting interaction profile bindings...");
+            engine.xr_context.instance.suggest_interaction_profile_bindings(eyegaze_ip_path, &[
+                xr::Binding::new(state.eye_gaze_action.as_ref().unwrap(), state.eye_gaze_path.unwrap()),
+            ]);
+
+            state.eye_gaze_space = match state.eye_gaze_action.as_ref().unwrap().create_space(engine.xr_context.session.clone(),
+                xr::Path::NULL,
+                xr::Posef::IDENTITY,
+            ) {
+                Ok(space) => Some(space),
+                Err(_) => None,
+            };
+
+        }
+    }
+```
+
+In my example, an `ab_glyph::FontVec` is also initialized for any font I may wish to use to present text to the user on screen:
+
+```rust,noplayground
+    state.android_helper = match AndroidHelper::new() {
+        Ok(helper) => Some(helper),
+        Err(_) => panic!("Could not get android context!"),
+    };
+
+    state.fontref = FontObject::new("fonts/font-001.otf", &mut state);
+```
+
+This is a typical pattern: in this example I pass in a state variable that already stores a helper object used to access APK resources. Let's have a quick look at that:
+
+```rust,noplayground
+#[derive(Debug)]
+pub struct FontObject {
+    fontvec: Option<FontVec>,
+    fontdata: Vec<u8>,
+    fgcolor: FontColor,
+}
+
+impl Default for FontObject {
+    fn default() -> Self {
+        Self {
+            fontdata: vec![],
+            fontvec: None,
+            fgcolor: FontColor::new(0.0, 0.0, 255.0, 255.0),
+        }
+    }
+}
+
+impl FontObject {
+    pub fn new(file_name: &str, state: &mut State) -> Self {
+        let fontdata = state.android_helper.as_mut().unwrap().read_asset_as_bytes(file_name).unwrap();
+        let fontvec = FontVec::try_from_vec(fontdata.clone()).unwrap();
+
+        println!("Loaded {:?} glyphs from font-001", fontvec.codepoint_ids().count());
+        Self {
+            fontdata: fontdata,
+            fontvec: Some(fontvec),
+            ..Default::default()
+        }
+    }
+
+/// Further code follows...
+}
+```
+
+`ab_glyph::FontVec` in this case does not implement Clone or Copy. To ensure the object remains accessible to the main program, I store problematic objects within a `State` variable which can simply stay allocated and owned in the `real_main()` function and carefully passed around as a borrowed mutable reference. When it is only ever passed as a mutable reference, the struct does not need to implement Clone, Copy or Debug, only Default (whether derived or specially written).
+
+I recommend splitting the initialization up into separate functions, to avoid potential issues with the borrowing of the state variable.
For example: + +```rust,noplayground + init(&mut engine, &mut state); + add_dynscreen(&mut engine, &mut state); +``` + +In my example here, the `init` function loads models and sets up physics engine properties. Then the `add_dynscreen` function further manipulates the state variable to set up dynamic meshes and textures not loaded from a glb file. The separation of concerns means that updates to the engine and the state that may cause borrow checker concerns can be minimised. + +# The Tick Function + +Before we look at loading models, a quick word about what happens post-initialization. + +Typically a loop like this is used to run each of the game systems in sequence before updating the engine itself: + +```rust,noplayground + while let Ok(tick_data) = engine.update() { + tick(tick_data, &mut engine, &mut state); + engine.finish()?; + if state.should_quit { + break; + } + } +``` + +The tick_data contains the current and prior OpenXR session state, +which can be used to react to events such as a loss of focus or return to +focus of the app. + +Basically, the update function handles polling and processing android +events, shutting down the app if a destroy event is received, handling +XR context state changes, and beginning the render frame. The finish +function ends the render frame appropriately, and ends performance +timers for the current tick. + +Here is an example of a tick function: + +```rust,noplayground +fn tick(tick_data: TickData, engine: &mut Engine, _state: &mut State) { + if tick_data.current_state == xr::SessionState::FOCUSED { + hands_system(engine); + grabbing_system(engine); + physics_system(engine); + animation_system(engine); + let mover = Movement { + world: &engine.world, + stage_entity: engine.stage_entity, + hmd_entity: engine.hmd_entity, + }; + input::handle_input(&engine.input_context, &mover, _state); + update_dynscreen(engine, _state); + update_global_transform_system(engine); + update_global_transform_with_parent_system(engine); + skinning_system(engine); + debug_system(engine); + } + + rendering_system(engine, tick_data.swapchain_image_index); +} +``` + +In this example tick function, the rendering system is the last to be called, after the physics system, grabbing system, animation system and other systems have updated object positions and relevant textures, and global transforms have been updated by the transform systems listed. The rendering system is passed the image index in the swapchain that was calculated by `engine.update()` to ensure the correct frame buffer gets updated. + +The following lines have been added for the sake of the dynamic texturing example which will be described in a later tutorial. They are not hotham systems. + +```rust,noplayground + input::handle_input(&engine.input_context, &mover, _state); + update_dynscreen(engine, _state); +``` + +# Debug System + +If you wonder why your colors get all messed up when you press the buttons on the controllers, you probably have the debug system enabled, as per the second to last system showing in the example above. This is a system designed to, as its comments indicate, help in debugging the fragment shader. It makes some changes to the params variable which is a Vec4 passed to the shader. + +Depending on the value of the third value in params, it will change the output color to display the base color, the normal, the occulusion, emission, roughness or metallic texture sampled. + +If you don't need to debug your output textures, comment this line out or remove it to *turn this off*. 
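+
+If you would rather pick the debug view yourself than cycle through it with the controller buttons, you can leave `debug_system` out of the tick function and set the value directly. The helper below is only a sketch: `set_shader_debug_mode` is a name of my own invention, and it assumes that `scene_data.params` is the `Vec4` described above, with the debug view selected by its third component.
+
+```rust,noplayground
+/// A sketch of a stand-in for debug_system: pin the fragment shader's debug
+/// output to a single mode instead of cycling it with the controller buttons.
+/// `mode` corresponds to the values of `params.z` checked in shaders/pbr.frag;
+/// 0.0 leaves the normal PBR output in place.
+fn set_shader_debug_mode(engine: &mut Engine, mode: f32) {
+    engine.render_context.scene_data.params.z = mode;
+}
+```
+
+You could call this once during initialisation, or wire it up to your own input handling in place of the stock debug system.
+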
I will discuss the different systems that can be enabled shortly.
+
+In the next section, we will look at things that typically happen in the initialisation of the program, including the loading of game models.
diff --git a/src/tutorial/introduction.md b/src/tutorial/introduction.md
index 4f50ecc..6f3e421 100644
--- a/src/tutorial/introduction.md
+++ b/src/tutorial/introduction.md
@@ -1 +1,23 @@
-# Tutorial
+# Introduction
+
+G'day, Matt from Spark Reactor here. Welcome to Hotham!
+
+![Spark Reactor](../images/spark-reactor-icon.png) ![Oculus](../images/quest2-icon.png) ![Hotham](../images/hotham-icon.png) ![Vulkan](../images/vulkan-logo-scaled.png)
+
+I'll be your friendly tutor here as we guide you through the process of learning about the wonderful world of 3D graphics, OpenXR and Vulkan!
+
+The goal of this series of tutorials is to provide an overview of how to accomplish a number of common tasks in developing virtual reality apps using Hotham. These techniques will be useful not only for ordinary game development, but for other apps that require realistic physics and access to the Vulkan API.
+
+As an example, an app wanting to display a PDF or view a wall of images sourced from the local device or the web will need to display two-dimensional content on a planar or curved surface. In the current version of Hotham, this is a non-trivial task as it involves understanding the Vulkan context, the shaders and how the renderer deconstructs and uses meshes, materials and textures together.
+
+The goal is therefore to demystify the process by providing a clear explanation of how the multiple primitive components of Hotham work together.
+
+I am working primarily with the Oculus Quest 2, but this tutorial aims to be as generic as possible and the concepts discussed within are applicable to any application utilising the 3D rendering and physics functionality which Hotham provides.
+
+This version of the tutorial covers Hotham 0.2.0; if you are following
+this further down the track, aspects of the process such as compilation
+or minor aspects of the API may have changed. I will endeavour to keep these
+tutorials in line with the current version of Hotham; however, if you get
+stuck, jump on the Discord linked in the [Getting Started](https://github.com/leetvr/hotham/wiki/Getting-started) guide to find someone to help you out :-)
+
+I hope you enjoy your learning journey!
diff --git a/src/tutorial/loading_models.md b/src/tutorial/loading_models.md
index 8c1a200..4b4ac22 100644
--- a/src/tutorial/loading_models.md
+++ b/src/tutorial/loading_models.md
@@ -1 +1,105 @@
 # Loading Models
+
+The simplest way to load your models is demonstrated in the [simple-scene](https://github.com/leetvr/hotham/tree/main/examples/simple-scene) example provided within the hotham crate:
+
+```rust,noplayground
+    let mut glb_buffers: Vec<&[u8]> = vec![
+        include_bytes!("../../../test_assets/floor.glb"),
+        include_bytes!("../../../test_assets/left_hand.glb"),
+        include_bytes!("../../../test_assets/right_hand.glb"),
+    ];
+    let models =
+        asset_importer::load_models_from_glb(&glb_buffers, vulkan_context, render_context)?;
+```
+
+The `asset_importer::load_models_from_glb` function takes a `Vec<&[u8]>` as its first parameter. Each `&[u8]` represents the binary content of a .glb file. The `vulkan_context` and `render_context` are accessible within the engine object.
+
+Including all models within the executable will quickly get out of hand if you're creating a professional app or game.
Generally speaking, you will want to load
+your models directly from an assets folder within your APK, or from
+some writeable folder on the storage of the device.
+
+To load from APK assets, Hotham provides a convenience function in `hotham::util` called `get_asset_from_path`. Here is the method signature:
+
+```rust,noplayground
+pub(crate) fn get_asset_from_path(path: &str) -> Result<Vec<u8>>
+```
+Currently, this uses `ndk_glue::native_activity().asset_manager()` to access the asset manager and read the entire buffer into a `Vec<u8>`.
+
+In my own example code which I used to learn the Hotham library, I abstracted the logic from this function into a separate library which cached the `native_activity` context rather than making multiple function calls for each file. I then set up a list of glb files to read into a `Vec<Vec<u8>>` like this:
+
+```rust,noplayground
+    let mut vec_buff: Vec<Vec<u8>> = Vec::new();
+
+    let asset_names = vec![ "glb/floor.glb", "glb/left_hand.glb",
+        "glb/right_hand.glb", "glb/damaged_helmet_squished.glb",
+        "glb/meditation retreat.glb", "glb/horned one.glb",
+        "glb/photo frame.glb",
+    ];
+
+    for asset in asset_names.into_iter() {
+        vec_buff.push(match android_helper.read_asset_as_bytes(&asset) {
+            Ok(buff) => buff,
+            Err(_) => return Err(hotham::HothamError::Other(anyhow!("Can't open asset: {:?}", asset))),
+        }.to_owned());
+    };
+
+    let glb_buffers: Vec<&[u8]> = vec_buff.iter().map(|x| &x[..]).collect();
+```
+
+To do something like this, you'll want to reference an assets folder in your Cargo.toml file for `cargo apk` to pack into your APK, like this:
+
+```toml
+[package.metadata.android]
+assets = "D:\\hotham-main\\hotham-main\\examples\\simple-with-nav\\assets"
+```
+
+You could further simplify the above for loop to use `iter().map()` on the `Vec<&str>` to create a non-mutable `Vec<Vec<u8>>`, or use `as_slice()` to transform the returned `Vec<u8>`s into slices before a single collect call. I'll leave it up to you to experiment with what approach works best for you.
+
+The native activity also exposes the `external_data_path` and `internal_data_path` properties: paths to the folders on the Android file system where public and private application data can be stored respectively. These are Rust `Path` objects whose documentation can be found [here](https://doc.rust-lang.org/nightly/std/path/struct.Path.html).
+
+# Adding Models to the World
+
+Once you have your models as above, you'll want to extract specific objects from the `Models`:
+
+```rust,noplayground
+pub type Models = HashMap<String, World>;
+```
+
+Models is really just a simple hashmap, with each object comprising a world of its own (cue The Seekers! 😂)
+
+`add_model_to_world` takes this series of worlds and extracts one particular named set of objects for insertion into the world. It grabs the raw entities from the glb models that were loaded and inserts them with appropriate local or global transforms that reflect the transforms saved in the glb files. It also handles meshes, skins, colliders (more on them later), parenting, and preservation of visibility.
+
+The `add_model_to_world` function returns a `Result`, which
+can be unwrapped and used with `world.get::<&mut LocalTransform>` or
+`GlobalTransform` to adjust the transform of the entity with respect to its
+parent or world space.
+ +Here is an example of using it: + +```rust,noplayground + let negz = add_model_to_world("negz", models, world, None).unwrap(); + let posz = add_model_to_world("posz", models, world, None).unwrap(); + let rigid_posz = RigidBody { + body_type: BodyType::Fixed, + ..Default::default() + }; + + let collider_posz = Collider::new(SharedShape::cuboid(2.5, 0.05, 2.5)); + + world.insert_one(posz, rigid_posz); + world.insert_one(posz, collider_posz); + + let rigid_negz = RigidBody { + body_type: BodyType::Fixed, + ..Default::default() + }; + + let collider_negz = Collider::new(SharedShape::cuboid(2.5, 0.05, 2.5)); + + world.insert_one(negz, rigid_negz); + world.insert_one(negz, collider_negz); + ``` + +This example requires a little explanation. In my `glb/meditation retreat.glb` example file above, are six walls of a room, a rudimentary cube map. posz and negz are actually the positive and negative y, as Blender switches the y and z axes. Put simply, they are the ceiling and the floor respectively. The final `None` parameter in `add_model_to_world` is the parent object. If you specify it, it should be `Some(entity)`, where `entity` is the unwrapped result of adding the parent to the world. Otherwise, `None` will make the object independent, which you will want if you want the physics engine to be able to control it. + +Generally speaking, most of the time you will want your floor to be solid and immoveable. That's where colliders and rigid bodies come in. We'll discuss those in an upcoming tutorial. diff --git a/src/tutorial/physics.md b/src/tutorial/physics.md index a0d251c..4d27bfc 100644 --- a/src/tutorial/physics.md +++ b/src/tutorial/physics.md @@ -1 +1,67 @@ # Physics + +Hotham uses rapier3d as its physics engine, with a thin wrapper over +the top of RigidBody to allow it to run the physics simulation on any +physics engine controlled objects. + +To adjust physics within the world, you can adjust the +`engine.physics_context` fields such as the vector for gravity. By default, +no gravity exists and any newly created objects will have no impulse on +them unless this is set up. + +Example: +```rust,noplayground + engine.physics_context.gravity = [0., -9.81 / 2., 0.].into(); +``` + +To set the gravity to half earth normal. + +- Upon releasing a grabbed object, an impulse from the hand may be imparted to the object. +- This means that without setting up gravity, released rigid bodies + which have their body type set back to dynamic will spin or move + in space according to the impulse that was on them at the time of + release. This can cause them to fly off into space. + +To make a game object a rigid body, simply insert a hotham +RigidBody struct into the dynamic bundle for the associated entity. +Rigid bodies receive their initial position from their parent object/entity, +and then update the associated game objects as the physics pipeline +unfolds progressively across engine ticks. + +Fixed rigid bodies do not move and are considered to have infinite +mass. + +For more information, visit the documentation for rapier3d at [https://www.rapier.rs/](https://www.rapier.rs/). + +# Colliders + +In hotham, there are several ways of specifying colliders. +- You can specify them as part of the glb files you load, by creating an object name suffixed by one of the following tags: + - **.HOTHAM_COLLIDER_WALL**: A wall colliders + - **.HOTHAM_COLLIDER_SENSOR**: A sensor only collider + - **.HOTHAM_COLLIDER**: An ordinary (non-sensor) collider. 
+- You can attach a collider to a dynamic bundle for an entity. These colliders will then be updated by the physics simulation if the entity in question has a rigid body. Here is an example: +```rust,noplayground + + let collider = Collider::new(SharedShape::ball(0.35)); + world + .insert(helmet, (collider, RigidBody::default())) + .unwrap(); +``` +- You can create a sensor collider and use it separately without inserting it into the world, using Collider::new and using +ray casting. + +The position of a collider which has an associated rigid body will be set based on the relative position of the entity itself. If +you create and maintain a collider separate from a rigid body, you will want to set its position using the methods provided within +rapier3d itself. + +Colliders are often used to prevent an object from flying off into space or falling forever. To do this, certain objects within the +scene which other objects may contact due to falling or moving need to be **fixed rigid bodies**, which will not be moved by collision +by another rigid body. These are objects such as the floor, ceiling or walls of a room. + +To specify the shape of your collider, you will want to examine the [SharedShape](https://docs.rs/rapier3d/latest/rapier3d/geometry/struct.SharedShape.html) +struct and its implementation within the Rapier3d library. There are many options available include ball shaped, cuboids, cones, +capsules and various types of mesh. + +As suggested above, a collider can be specified using a complex mesh. Review the source of the `asset_importer` module to find out more. + diff --git a/src/tutorial/render_context.md b/src/tutorial/render_context.md new file mode 100644 index 0000000..24177f9 --- /dev/null +++ b/src/tutorial/render_context.md @@ -0,0 +1,187 @@ +# The Render Context + +The Render Context, located as `engine.render_context` with a struct type of `RenderContext`, will likely be the most useful to you in creating your own custom render pipeline, in conjunction with the previously mentioned `VulkanContext`. + +It makes use of a number of struct types and constants defined elsewhere, particularly in the `hotham::rendering` hierarchy. + +Here are some of the aspects of the render context which may be useful in building a pipeline. + +# Descriptors + +The `render_context` contains a field `descriptors` which is a struct containing the graphics_layout descriptor set and other descriptor sets using in the PBR pipeline. Specifically, the `graphics_layout` binds the following types of data: + +- **Draw Data** (for the vertex shader): This is the `gos_from_local` and `local_from_gos` transformations used to calculate the true position of an object from the vertex input to the shader. The draw data also contains the skin id, as skinning affects the final position of a mesh. +- **Skins Binding** (for the vertex shader): As above, the skins has its own buffer of data relevant to calculating the position of meshes. This is defined as part of the graphics binding. +- **Scene Data** (vertex and fragment stages): The Scene Data is a readonly uniform buffer containing the view projection matrices, camera positions, lights, and the params struct. The params struct is solely for the purposes of debugging the shader. The view projection matrices, on the other hand, are used to calculate the final position of each vertex given the position in globally oriented space. There is one view projection for each eye: left first, right second. 
The camera positions are similarly one for each eye, and are indicated in globally oriented space. +- **Texture Bindings** (fragment stage): This is pretty obvious: it binds the uniform sampler2D of textures, including all textures referenced by the materials buffer. There are a maximum of 10000 of these. +- **Cube Texture Bindings** (fragment stage): These are the same but for cube textures. These are used in calculating diffuse light for materials that do not have an unlit workflow (ie materials with metallic or roughness factor). + +If you are creating a custom renderer which accesses textures loaded by the asset importer, or other functions within the rendering hierarchy, or if you need to access the GOS position of objects, or the view projection matrices, *you will most likely need to **bind this descriptor set**.* + +The descriptor sets corresponding with this `graphics_layout` are located in the `sets` field of the descriptors struct. There is one set allocated per eye. + +The compute pipeline's descriptor set also present as `compute_layout` and `compute_sets`, with one descriptor set allocated per eye, to assist in making it easier to render the multi-view. Typically, you will access the sets when binding your pipeline as: +```rust,noplayground + slice::from_ref(&render_context.descriptors.sets[render_context.frame_index]) +``` + +Or more fully: + +```rust,noplayground + vulkan_context.device.cmd_bind_descriptor_sets( + command_buffer, + vk::PipelineBindPoint::GRAPHICS, + self.pipeline_layout, + 0, + slice::from_ref(&render_context.descriptors.sets[render_context.frame_index]), + &[], + ); +``` + +# Resources: The Index and Vertex Buffers + +Assuming you are operating using a single set of vertex and index buffers, you'll want to bind these (or use your own, depending on the circumstances). Index and vertex buffers are located under the resources field of the render context, along with a lot of other useful data used in rendering. Here is an example of binding the index and vertex (and position) buffers: + +```rust,noplayground + vulkan_context.device.cmd_bind_vertex_buffers( + command_buffer, + 0, + &[render_context.resources.position_buffer.buffer, render_context.resources.vertex_buffer.buffer], + &[0,0]); + + vulkan_context.device.cmd_bind_index_buffer( + command_buffer, + render_context.resources.index_buffer.buffer, + 0, + vk::IndexType::UINT32); + +``` + +# Resources: Other Resources + +Other resources located under the render context's `resources` field are: +- Materials buffer: This is a proper storage buffer which is pushed into whenever adding a new material +- Skins buffer: The buffer which is bound in the descriptor set mentioned above to give access to skinning data. +- Mesh Data Arena: An arena to store the mesh data +- A texture sampler on repeat and a cube sampler. + +All of these fields are named with obvious names such as `skins_buffer`, `mesh_data` etc. + +# The Primitive Map + +This is a `HashMap` named `primitive_map` which is cleared at the end of each render pass, +and is only populated from objects in the world during the beginning of the render pass and indexed on a suitable +key. By default this key is the index buffer offset of the mesh indices of the primitive being referred to, however +this is a detail left up to the implementor. + +It would for example be suitable to set a key containing a bit or multiple bits set to act as a shader flag. 
In this way a list of primitives sorted by this key will be emitted in order of their shader, allowing all the primitives in
+one pipeline to be drawn before ending that render pass and switching to a separate render pass and pipeline for objects
+that need to be handled specially.
+
+Here is an example of the use of the primitive map:
+```rust,noplayground
+    let meshes = &render_context.resources.mesh_data;
+
+    // Create transformations to globally oriented stage space
+    let global_from_stage = stage::get_global_from_stage(world);
+
+    // `gos_from_global` is just the inverse of `global_from_stage`'s translation - rotation is ignored.
+    let gos_from_global =
+        Affine3A::from_translation(global_from_stage.translation.into()).inverse();
+
+    let gos_from_stage: Affine3A = gos_from_global * global_from_stage;
+
+    let mut shaderlist = HashMap::new();
+
+    for (_, (mesh, global_transform, skin)) in
+        world.query_mut::<With<(&Mesh, &GlobalTransform, Option<&Skin>), &SemiTransparent>>()
+    {
+        let mesh = meshes.get(mesh.handle).unwrap();
+        let skin_id = skin.map(|s| s.id).unwrap_or(NO_SKIN);
+        for primitive in &mesh.primitives {
+            let key = primitive.index_buffer_offset | SEMI_TRANSPARENT_BIT;
+
+            // Create a transform from this primitive's local space into gos space.
+            let gos_from_local = gos_from_global * global_transform.0;
+            render_context
+                .primitive_map
+                .entry(key)
+                .or_insert(InstancedPrimitive {
+                    primitive: primitive.clone(),
+                    instances: Default::default(),
+                })
+                .instances
+                .push(Instance {
+                    gos_from_local,
+                    bounding_sphere: primitive.get_bounding_sphere_in_gos(&gos_from_local),
+                    skin_id,
+                });
+            shaderlist.insert(primitive.index_buffer_offset, SEMI_TRANSPARENT_BIT);
+        }
+    }
+```
+
+Essentially, each mesh whose entity carries a particular marker struct (in this case `SemiTransparent`) is iterated;
+if one of its primitives does not already exist in the primitive map under its corresponding key (which in this case
+contains a bit set to identify the shader), a clone of the primitive is inserted with an empty instance list, and its
+instances are then populated with data from the primitive itself and its optional skin. In this example, I also
+populate a hashmap of shaders so the primitive id can be reconstructed when it is added to the primitive cull data buffer.
+
+This primitive map is later used to construct the data for sending to the culling shader, as well as to construct the
+draw data for the visible primitives identified by the compute shader.
+
+# Buffers and Frames
+
+The `frames` field is an array of `Frame` structs. Each `Frame` struct houses a number of buffers.
+
+- The Command Buffer (`command_buffer`): The command buffer for recording commands. Obviously there is one per frame,
+so that fences on drawing one frame do not affect the next.
+- The Draw Data Buffer (`draw_data_buffer`): This is pushed into for every indexed primitive drawn, to hold the data
+about the primitive's GOS position and skin id.
+- The Primitive Cull Data Buffer (`primitive_cull_data_buffer`): This buffer is pushed into from the generated primitive
+map mentioned above. It is used to construct a set of structs to be sent to the culling compute shader. Objects are
+typically pushed with their `visible` field set to `false`, and the compute shader sets to `true` those objects whose
+bounding spheres fall within the left and right clip planes passed to the compute shader.
+- The Cull Parameters Buffer (`cull_params_buffer`): A buffer of `CullParams`, used to pass uniform data to the compute
+shader, and not used for anything else.
+ +Here is an example of using the cull data buffer: + +```rust,noplayground + let frame = &mut render_context.frames[render_context.frame_index]; + let cull_data = &mut frame.primitive_cull_data_buffer; + cull_data.clear(); + + for instanced_primitive in render_context.primitive_map.values() { + let primitive = &instanced_primitive.primitive; + for (instance, i) in instanced_primitive.instances.iter().zip(0u32..) { + cull_data.push(&PrimitiveCullData { + bounding_sphere: instance.bounding_sphere, + index_instance: i, + primitive_id: primitive.index_buffer_offset | shaderlist.get(&primitive.index_buffer_offset).unwrap(), + visible: false, + }); + } + } + + // This is the VERY LATEST we can possibly update our views, as the compute shader will need them. + render_context.update_scene_data(views, &gos_from_global, &gos_from_stage); + + // Execute the culling shader on the GPU. + render_context.cull_objects(vulkan_context); + + // Begin the render pass, bind descriptor sets. + render_context.begin_pbr_render_pass(vulkan_context, swapchain_image_index); +``` + +You see here that the primitive cull data is populated with the primitive id which is the key which is used to look +up the object in the primitive map. Because this key may change in format depending on your implementation, you should +make sure it is correctly set to get the data back about the visible primitives when you begin iterating the data for +your render psas. + +The final call to `render_context.cull_objects` is executed after the scene data is updated. This is important, as the +scene data is used to calculate the left and right clip planes, which will determine object visibility. As the position +of the user's eyes may change from one millisecond to the next, we wait till the last minute to set this so that the true +scene visibility is as accurate as possible. + diff --git a/src/tutorial/systems.md b/src/tutorial/systems.md new file mode 100644 index 0000000..203e581 --- /dev/null +++ b/src/tutorial/systems.md @@ -0,0 +1,47 @@ +# Tick Function & Systems + +As described in the [Getting Started](getting_started.md) section, a hotham program updates the world each frame before rendering it within a tick function or tick loop, by calling a number of different systems which receive input and update the simulation. Here again is an example of *some of* the functions that might be called during the tick function. + +```rust,noplayground + if tick_data.current_state == xr::SessionState::FOCUSED { + hands_system(engine); + grabbing_system(engine); + physics_system(engine); + animation_system(engine); + let mover = Movement { + world: &engine.world, + stage_entity: engine.stage_entity, + hmd_entity: engine.hmd_entity, + }; + input::handle_input(&engine.input_context, &mover, _state); + update_dynscreen(engine, _state); + update_global_transform_system(engine); + update_global_transform_with_parent_system(engine); + skinning_system(engine); + //debug_system(engine); + } + + rendering_system(engine, tick_data.swapchain_image_index); +``` + +Each of these systems performs specific actions. Some of them may not be exactly as you would like them. To make changes to these systems, you can use the code from one of these systems as a basis for your own version to run in its place. Here is a description of what these various systems do: + +1. **animation system**: For all in the world, update the rotation, translation and scale of the target objects of this animation controller based on their currently set blend amounts. 
The values animated between are based on the `blend_from` and `blend_to` values read in from the glb file the models were loaded from.
+1. **audio system**: This queries any objects within the world that have a `SoundEmitter` component, which is a wrapper over the `oddio` library. It updates the position of audio based on the movement of the listener (i.e. it handles spatialisation), and updates the state of the audio emitters so they continue playing their sound. Where in the tick loop you run this system is up to you; it is best synchronised with any other systems you run that may update the state of sound emitters within the world.
+1. **debug system**: Reads input from the controllers and adjusts the value of the `scene_data.params` variable, which is used by the shaders to determine the base output color displayed for PBR models. This can be used to check that normal maps, occlusion, emission and so on have loaded correctly. You can consult the `shaders/pbr.frag` code at the end of its main function to see what is displayed for different values of `params.z`.
Most likely, you will want to turn this system off, capture the A, B, X and Y buttons yourself, and create your own action for updating the scene data params that is more in line with your program's input logic.
+1. **draw gui system**: This system lives in `systems::draw_gui`; it provides haptic feedback for any hover events and paints the GUI using the code in the `gui_context` implementation. In the crab-saber example it is called *before* the rendering system, but after all other systems which may have updated the UI panels. One point of note is that in the crab-saber example it runs *after* the haptics system, yet still triggers haptic feedback. You may wish to adjust the position of the haptics system in the loop to synchronise feedback from the GUI or other systems with the haptics system itself.
+1. **hands system:** This updates the global and local transforms of the left hand and right hand models loaded into the scene, as well as updating the animation value on each hand's animation controller so that the hand animations follow the value of each grip button. This system should be run before the animation system, which will update the models based on the adjusted blend value. If you want the hand animation to use a different method of determining the blend value, or a different final appearance, you can replace this call with your own custom system.
This system also updates the local and global transforms of the currently grabbed entity from the grip. If this is not the behavior you want, you can again adapt the code of this system to provide a more intuitive method of rotating and translating the object.
+1. **grabbing system**: This system iterates the objects with a `Grabbable` component that are currently colliding with the left or right hand while the grip button is depressed, and changes any rigid body they have to be kinematic position based. It also removes the `Parent` of the newly grabbed object, and sets `grabbed_entity` on the hand in question to the newly grabbed object. When the grip is released, it turns any previously grabbed rigid body back into one of type `Dynamic`, meaning the physics engine will then control its position, and adds a marker indicating the release of the object, which is removed on the next frame.
This may not be the behavior you want. For example, you might want to restore the parent or the rigid body type after the object has finished being manipulated. If so, copy the system from `systems/grabbing.rs` and modify the code to your taste. *You probably want to run the grabbing system **before** the physics system, to catch objects you want to grab before the physics system updates their position due to gravity or impulses.*
+1. **haptic system**: This applies the haptic feedback that has been requested so far this frame. It is best run *after* any system which may request haptic feedback.
+1. **navigation system**: In the hotham_examples shared namespace, an example navigation system for rotating, scaling and translating the entire scene is implemented by adjusting the transform of the stage, and consequently the camera/head-mounted display parented to it.
If you read input to adjust the position of objects within your scene like this, it is usually best done after the animation system, but before updating the global transforms or rendering the frame. You can use the example code or write your own. In the example above, I have replaced it with the call to `input::handle_input`, passing the input context, the state variable, and a helper object which makes it easier to move objects about. This is just an example of how you can write *whatever systems you want* and run them in *whatever logical order makes sense* for your application to function.
+1. **pointer system**: This system handles collisions between items with a `Pointer` marker in their dynamic bundle and the GUI panels in the scene. It should be run *before* calling the system which draws the GUI. You may wish to extend the ray casting logic included within this system to handle collisions of the pointers with other objects in the scene.
+1. **physics system**: This updates the position of objects based on the physics simulation run by rapier3d. It creates handles for any objects in the world which have a rigid body or collider but no handle in rapier yet, and updates the positions of all rigid bodies and colliders from their world positions. It then runs the physics simulation and updates the world based on the result. You'll usually want to run this *after* any input events have been handled (although this is design dependent), but *before* the rendering system or updating global transforms. It should probably also run *before* the skinning system to ensure that any joints stay in sync with updates the physics system makes to models.
+1. **rendering system**: This renders the content that was updated as a result of running the previous systems. It is best called near the end of the tick loop.
+1. **skinning system**: This updates the joint matrices from the `GlobalTransform` associated with the skin in question. It is best run *after* the global transforms have been updated using `update_global_transform_system(engine)` and related systems.
+1. **update global transform system**: This system, as the name suggests, updates the global transforms after previous systems have updated the corresponding `LocalTransform` objects. It simply updates all objects that have both a `LocalTransform` and a `GlobalTransform` component. To update the position of parented objects, the next system is used.
+1. **update global transform with parent system**: This recursively iterates all objects with a `Parent`, and updates each child's global transform by combining it with its parent's global transform. This ensures that objects such as the head-mounted display have a correct global transform relative to the stage object, and that any children whose parents are controlled by the physics engine move along with their respective groups. It should be run *before the rendering system is called* so that objects appear in their updated positions.
+
+# Recommendations
+
+Pay careful attention to the ordering of the systems you run within your tick loop to ensure that the state of the virtual world remains sane. Review the code of each system you are calling to ensure its actions are what you intend for your user interaction profile. If they differ, clone the system logic and adapt it to your own needs, as in the sketch below.
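+
+As a concrete illustration, here is a minimal sketch of what a hand-rolled system can look like. It is not part of hotham itself: the `Spin` component is hypothetical, and the sketch assumes hotham's `LocalTransform` component stores its rotation as a quaternion and that the engine re-exports `glam` (otherwise depend on `glam` directly).
+
+```rust,noplayground
+use hotham::components::LocalTransform;
+use hotham::glam::Quat;
+use hotham::Engine;
+
+/// Hypothetical marker component: anything carrying `Spin` rotates a little each tick.
+pub struct Spin {
+    pub radians_per_tick: f32,
+}
+
+/// A custom system written in the same shape as the built-in ones:
+/// take the engine, query the world, and mutate components.
+pub fn spin_system(engine: &mut Engine) {
+    let world = &mut engine.world;
+    for (_entity, (local_transform, spin)) in world.query_mut::<(&mut LocalTransform, &Spin)>() {
+        // Rotate the entity around its local Y axis; the global transform systems and the
+        // rendering system will pick up the change later in the same tick.
+        local_transform.rotation =
+            Quat::from_rotation_y(spin.radians_per_tick) * local_transform.rotation;
+    }
+}
+```
+
+You would then call `spin_system(engine);` from your tick function, somewhere before `update_global_transform_system(engine)`, just like the systems listed above.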
+
+
diff --git a/src/tutorial/vulkan_context.md b/src/tutorial/vulkan_context.md
new file mode 100644
index 0000000..6f69303
--- /dev/null
+++ b/src/tutorial/vulkan_context.md
@@ -0,0 +1,43 @@
+# The Vulkan Context
+
+`engine.vulkan_context` exposes the Vulkan context, of type `VulkanContext`, which provides access to the engine's Vulkan instance. It is one half of the equation when developing custom rendering solutions.
+
+Within the `VulkanContext`, the `device` attribute is the one you will likely use the most, providing access to the Vulkan device API for commands such as `create_pipeline_layout`, `create_render_pass`, and `create_graphics_pipelines`. It is also the home of commands such as `cmd_begin_render_pass`, `cmd_bind_pipeline`, and `cmd_bind_descriptor_sets`, along with `cmd_bind_vertex_buffers` and `cmd_bind_index_buffer`. When recording commands you will most likely use the `systems::rendering::draw_primitive` function, which calls the `cmd_push_constants` and `cmd_draw_indexed` functions on the device.
+
+While using the `VulkanContext`, you will most likely want to import `hotham::ash::vk` to ensure you are using the same version of ash that hotham was built against. The `vk` namespace contains the enums and builders for many of the structs you will need when constructing and managing a pipeline.
+
+The `VulkanContext` also has a number of key functions which will be useful. We'll examine some of those now.
+
+# Creating images
+
+`engine.vulkan_context.create_image` is used for creating a new image that can be used in a framebuffer. It does nothing other than create the image, but it takes care of the Vulkan technical details: choosing a sensible image format, sharing mode, number of samples and so on, creating an image view for the image, and allocating and binding the memory behind it.
+
+The returned object is a `Result` containing an `Image`, where `Image` is a `hotham::rendering::image::Image`. This is essentially a wrapper around the underlying `vk::Image` which gives you access to the image view, the format, extent, usage flags and any other information associated with the image's creation.
+
+Note that this function does nothing beyond what is described above. If you are creating an image for a particular purpose, such as for use as a texture, you probably want a more specific function like `Texture::empty`, which calls `create_texture_image` on the render context to add the newly created texture image to the descriptor set that is accessed by the graphics pipeline.
+
+# Loading/updating images
+
+`engine.vulkan_context.upload_image` takes an `&[u8]` as its first parameter: a buffer of image data owned by the caller. The second parameter, the mip count, usually remains 1 unless you have reason to change it. Similarly, the offsets array in the third parameter is usually `vec![0]`. The final parameter is the `&Image` itself. If you have created an empty texture, or want to modify an existing one, the `rendering::texture::Texture` struct contains a field `image` holding the created Vulkan image to which you can upload content. You will typically want to have created the image with the appropriate usage flags.
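+
+For example, a call might look like the following sketch. The `texture` variable and pixel data here are purely illustrative; the sketch assumes a 256x256 RGBA image created earlier (e.g. via `Texture::empty`) whose extent and format match the data being uploaded.
+
+```rust,noplayground
+// Hypothetical CPU-side pixel data matching the extent and format the image was created with.
+let pixels: Vec<u8> = vec![0xFF; 256 * 256 * 4];
+
+// One mip level, a single offset of 0, and the `Image` wrapped by the texture.
+engine
+    .vulkan_context
+    .upload_image(&pixels, 1, vec![0], &texture.image);
+```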
+
+Internally, this function creates a staging buffer from the data in the first parameter with `TRANSFER_SRC` usage flags, transitions the image layout to `TRANSFER_DST_OPTIMAL`, copies the buffer into the destination image, transitions the image back to `SHADER_READ_ONLY_OPTIMAL`, and destroys the staging buffer.
+
+I recommend using this function where possible to update textures, as it takes care of the details of layout transitions and of ensuring the image is in an optimal state to receive the data from your local buffer.
+
+The functions `copy_buffer_to_image` and `copy_image_to_buffer` are more specialized and should be used in conjunction with transitioning the image layout yourself and ensuring correct buffer usage flags. You should also ensure any buffer used with these functions is of the appropriate size.
+
+# Buffers
+
+Other than the previously mentioned `copy_buffer_to_image` and `copy_image_to_buffer`, there is also `create_buffer_with_data` and a deprecated function, `update_buffer`. However, in most cases, rather than using the `VulkanContext` itself to create buffers, you are better off using `hotham::rendering::buffer::Buffer`, which wraps a buffer while providing access to the underlying `vk::Buffer` handle, checking whether the buffer is empty, and updating descriptor sets with respect to the buffer.
+
+# Vulkan Validation Layers
+
+Although not part of the `vulkan_context` itself, many people developing with hotham will want to enable the Vulkan validation layers but may be unaware of how to do so. On the Oculus Quest / Meta Quest and several other devices, this can be done by executing the following command:
+
+```shell
+adb shell setprop debug.oculus.loadandinjectpackagedvvl.my.package.name 1
+```
+
+Here, `my.package.name` is the name of the installed package you are debugging. If you are unsure of your program's package name, it can be obtained by running `pm list packages -3` inside an `adb shell`.
+
+After setting this property, `adb logcat -s VALIDATION` will show all warnings and other messages from the validation layers.
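+
+Putting those commands together, an illustrative debugging session might look like the following. Note that the property is read when the application starts, so you will likely need to restart your app after setting it.
+
+```shell
+# Find the package name if you are unsure of it.
+adb shell pm list packages -3
+
+# Enable the packaged validation layers for your app, then restart it.
+adb shell setprop debug.oculus.loadandinjectpackagedvvl.my.package.name 1
+adb shell am force-stop my.package.name
+
+# Watch the validation output while the app runs.
+adb logcat -s VALIDATION
+
+# Turn the layers off again when you are done.
+adb shell setprop debug.oculus.loadandinjectpackagedvvl.my.package.name 0
+```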