-
-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GLTF loading and animation performance regression #17535
Comments
Are you able to bisect? |
The first bad commit is 21f1e30 |
I was able to repo. This is the result of propagating over a million transforms per frame, when prior to #17398 we were propagating 28,000 per frame. We have the same number of transforms on both branches. This is actually caused by Children containing duplicates. I suspect that this is caused by buggy "scene spawn" code that spawns both a fully populated Children collection and a ChildOf, resulting in duplicates. This isn't possible via "normal" Rust flows, as you can't spawn a populated Children collection. |
Looking at the source, |
If, like me, you're blocked because of this bug, the following diff will work around it: diff --git a/crates/bevy_scene/src/scene.rs b/crates/bevy_scene/src/scene.rs
index 232acf345..df59a3bc0 100644
--- a/crates/bevy_scene/src/scene.rs
+++ b/crates/bevy_scene/src/scene.rs
@@ -1,7 +1,10 @@
+use core::any::TypeId;
+
use crate::{DynamicScene, SceneSpawnError};
use bevy_asset::Asset;
use bevy_ecs::{
entity::{hash_map::EntityHashMap, Entity, SceneEntityMapper},
+ hierarchy::Children,
reflect::{AppTypeRegistry, ReflectComponent, ReflectMapEntities, ReflectResource},
world::World,
};
@@ -112,6 +115,10 @@ impl Scene {
.get_info(component_id)
.expect("component_ids in archetypes should have ComponentInfo");
+ if component_info.type_id() == Some(TypeId::of::<Children>()) {
+ continue;
+ }
+
let registration = type_registry
.get(component_info.type_id().unwrap())
.ok_or_else(|| SceneSpawnError::UnregisteredType { |
As a heads up I'm working on a full fix, which is in the same category of thing as @pcwalton's fix above, just more generic (relies on expanding the "entity cloning" system so we can read "ignore" behavior here for all relationship targets). However in doing this, I identified various cases of tech debt that I figured we should pay the bill on. In addition to solving the clear logic/performance bug, this will also cut down on the boilerplate required to map entities for scene spawning and entity cloning (by making entity visiting a first class component behavior), improve the usability (and possibly performance) of EntityCloner, and open the doors to using EntityCloner infrastructure when spawning |
Fixes #17535 Bevy's approach to handling "entity mapping" during spawning and cloning needs some work. The addition of [Relations](#17398) both [introduced a new "duplicate entities" bug when spawning scenes in the scene system](#17535) and made the weaknesses of the current mapping system exceedingly clear: 1. Entity mapping requires _a ton_ of boilerplate (implement or derive VisitEntities and VisitEntitesMut, then register / reflect MapEntities). Knowing the incantation is challenging and if you forget to do it in part or in whole, spawning subtly breaks. 2. Entity mapping a spawned component in scenes incurs unnecessary overhead: look up ReflectMapEntities, create a _brand new temporary instance_ of the component using FromReflect, map the entities in that instance, and then apply that on top of the actual component using reflection. We can do much better. Additionally, while our new [Entity cloning system](#16132) is already pretty great, it has some areas we can make better: * It doesn't expose semantic info about the clone (ex: ignore or "clone empty"), meaning we can't key off of that in places where it would be useful, such as scene spawning. Rather than duplicating this info across contexts, I think it makes more sense to add that info to the clone system, especially given that we'd like to use cloning code in some of our spawning scenarios. * EntityCloner is currently built in a way that prioritizes a single entity clone * EntityCloner's recursive cloning is built to be done "inside out" in a parallel context (queue commands that each have a clone of EntityCloner). By making EntityCloner the orchestrator of the clone we can remove internal arcs, improve the clarity of the code, make EntityCloner mutable again, and simplify the builder code. * EntityCloner does not currently take into account entity mapping. This is necessary to do true "bullet proof" cloning, would allow us to unify the per-component scene spawning and cloning UX, and ultimately would allow us to use EntityCloner in place of raw reflection for scenes like `Scene(World)` (which would give us a nice performance boost: fewer archetype moves, less reflection overhead). ## Solution ### Improved Entity Mapping First, components now have first-class "entity visiting and mapping" behavior: ```rust #[derive(Component, Reflect)] #[reflect(Component)] struct Inventory { size: usize, #[entities] items: Vec<Entity>, } ``` Any field with the `#[entities]` annotation will be viewable and mappable when cloning and spawning scenes. Compare that to what was required before! ```rust #[derive(Component, Reflect, VisitEntities, VisitEntitiesMut)] #[reflect(Component, MapEntities)] struct Inventory { #[visit_entities(ignore)] size: usize, items: Vec<Entity>, } ``` Additionally, for relationships `#[entities]` is implied, meaning this "just works" in scenes and cloning: ```rust #[derive(Component, Reflect)] #[relationship(relationship_target = Children)] #[reflect(Component)] struct ChildOf(pub Entity); ``` Note that Component _does not_ implement `VisitEntities` directly. Instead, it has `Component::visit_entities` and `Component::visit_entities_mut` methods. This is for a few reasons: 1. We cannot implement `VisitEntities for C: Component` because that would conflict with our impl of VisitEntities for anything that implements `IntoIterator<Item=Entity>`. Preserving that impl is more important from a UX perspective. 2. We should not implement `Component: VisitEntities` VisitEntities in the Component derive, as that would increase the burden of manual Component trait implementors. 3. Making VisitEntitiesMut directly callable for components would make it easy to invalidate invariants defined by a component author. By putting it in the `Component` impl, we can make it harder to call naturally / unavailable to autocomplete using `fn visit_entities_mut(this: &mut Self, ...)`. `ReflectComponent::apply_or_insert` is now `ReflectComponent::apply_or_insert_mapped`. By moving mapping inside this impl, we remove the need to go through the reflection system to do entity mapping, meaning we no longer need to create a clone of the target component, map the entities in that component, and patch those values on top. This will make spawning mapped entities _much_ faster (The default `Component::visit_entities_mut` impl is an inlined empty function, so it will incur no overhead for unmapped entities). ### The Bug Fix To solve #17535, spawning code now skips entities with the new `ComponentCloneBehavior::Ignore` and `ComponentCloneBehavior::RelationshipTarget` variants (note RelationshipTarget is a temporary "workaround" variant that allows scenes to skip these components. This is a temporary workaround that can be removed as these cases should _really_ be using EntityCloner logic, which should be done in a followup PR. When that is done, `ComponentCloneBehavior::RelationshipTarget` can be merged into the normal `ComponentCloneBehavior::Custom`). ### Improved Cloning * `Option<ComponentCloneHandler>` has been replaced by `ComponentCloneBehavior`, which encodes additional intent and context (ex: `Default`, `Ignore`, `Custom`, `RelationshipTarget` (this last one is temporary)). * Global per-world entity cloning configuration has been removed. This felt overly complicated, increased our API surface, and felt too generic. Each clone context can have different requirements (ex: what a user wants in a specific system, what a scene spawner wants, etc). I'd prefer to see how far context-specific EntityCloners get us first. * EntityCloner's internals have been reworked to remove Arcs and make it mutable. * EntityCloner is now directly stored on EntityClonerBuilder, simplifying the code somewhat * EntityCloner's "bundle scratch" pattern has been moved into the new BundleScratch type, improving its usability and making it usable in other contexts (such as future cross-world cloning code). Currently this is still private, but with some higher level safe APIs it could be used externally for making dynamic bundles * EntityCloner's recursive cloning behavior has been "externalized". It is now responsible for orchestrating recursive clones, meaning it no longer needs to be sharable/clone-able across threads / read-only. * EntityCloner now does entity mapping during clones, like scenes do. This gives behavior parity and also makes it more generically useful. * `RelatonshipTarget::RECURSIVE_SPAWN` is now `RelationshipTarget::LINKED_SPAWN`, and this field is used when cloning relationship targets to determine if cloning should happen recursively. The new `LINKED_SPAWN` term was picked to make it more generically applicable across spawning and cloning scenarios. ## Next Steps * I think we should adapt EntityCloner to support cross world cloning. I think this PR helps set the stage for that by making the internals slightly more generalized. We could have a CrossWorldEntityCloner that reuses a lot of this infrastructure. * Once we support cross world cloning, we should use EntityCloner to spawn `Scene(World)` scenes. This would yield significant performance benefits (no archetype moves, less reflection overhead). --------- Co-authored-by: eugineerd <[email protected]> Co-authored-by: Alice Cecile <[email protected]>
This PR makes Bevy keep entities in bins from frame to frame if they haven't changed. This reduces the time spent in `queue_material_meshes` and related functions to near zero for static geometry. This patch uses the same change tick technique that #17567 uses to detect when meshes have changed in such a way as to require re-binning. In order to quickly find the relevant bin for an entity when that entity has changed, we introduce a new type of cache, the *bin key cache*. This cache stores a mapping from main world entity ID to cached bin key, as well as the tick of the most recent change to the entity. As we iterate through the visible entities in `queue_material_meshes`, we check the cache to see whether the entity needs to be re-binned. If it doesn't, then we mark it as clean in the `valid_cached_entity_bin_keys` bit set. If it does, then we insert it into the correct bin, and then mark the entity as clean. At the end, all entities not marked as clean are removed from the bins. This patch has a dramatic effect on the rendering performance of most benchmarks, as it effectively eliminates `queue_material_meshes` from the profile. Note, however, that it generally simultaneously regresses `batch_and_prepare_binned_render_phase` by a bit (not by enough to outweigh the win, however). I believe that's because, before this patch, `queue_material_meshes` put the bins in the CPU cache for `batch_and_prepare_binned_render_phase` to use, while with this patch, `batch_and_prepare_binned_render_phase` must load the bins into the CPU cache itself. On Caldera, this reduces the time spent in `queue_material_meshes` from 5+ ms to 0.2ms-0.3ms. Note that benchmarking on that scene is very noisy right now because of #17535. ![Screenshot 2025-02-05 153458](https://github.com/user-attachments/assets/e55f8134-b7e3-4b78-a5af-8d83e1e213b7)
Bevy version
main
deb135c
Relevant system information
What you did
What went wrong
The gltf file never finishes loading when running with the dev profile (or at least it takes a very very long time - I've killed the process after a few minutes). It will eventually load when running it with
--release
but it's slow and the animation playback is very choppy.Additional information
The file loads nearly instantly with
v0.15.1
and animation playback is smooth, even with the dev profile.UndeadMale.zip
The text was updated successfully, but these errors were encountered: