Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Queue free on node with script running async coroutine with tree timer still lets coroutine run one last time on next frame #93608

Open
hsandt opened this issue Jun 25, 2024 · 2 comments

Comments

@hsandt
Copy link
Contributor

hsandt commented Jun 25, 2024

Tested versions

  • Reproducible in v4.2.1.stable.official [b09f793]

System information

Godot v4.2.1.stable - Ubuntu 22.04.4 LTS 22.04 - X11 - Vulkan (Mobile) - dedicated NVIDIA GeForce GTX 860M (nvidia; 535.161.07) - Intel(R) Core(TM) i7-4710HQ CPU @ 2.50GHz (8 Threads)

Issue description

Calling queue_free on node, or a parent of a node that is currently running a coroutine (async method) that awaits for get_tree().create_timer(...) with argument process_in_physics=true will, under a certain timing, still lets the coroutine run one last time on the next frame.

I expect queue_free to delay node destruction to the end of the frame, but not at late as the end of the next frame.

To repro the bug, I had to use await get_tree().create_timer(0, false, true).timeout inside the coroutine: I found await get_tree().physics_frame not to repro the bug, and that process_in_physics=true was needed to match process_in_physics=true in the create_timer mentioned above. Or, you can use create_timer with process_in_physics=false in both the main script and coroutine to match, it will also cause the bug.

Some timer durations work, but not others, I'm not sure why, maybe it's related to the main and sub coroutines having to get in sync at the end of the frame.

On main code side (calling queue_free), both await get_tree().physics_frame and create_timer worked to repro the bug.

Finally, free doesn't cause the bug but can obviously cause other issues when dealing with more complex nodes used elsewhere.

Real-world case

My enemy AI is run via an async method, which is very convenient to define an action flow over time, but has the disadvantage of not being easily cancellable. Fortunately, destroying a node, or a parent node of the one with the script running the coroutine (at least when not static) will stop any running coroutine. The issue is just that it's not frame-precise and therefore I can have remarkable leaks of the coroutine even after a manual game restart consisting in clearing all entities manually (such as a projectile being spawned out of the blue on the first frame after restart).

Using process_in_physics=false` should be the minimal fix but I really want to sync my AI logic and actions with physics process to be safe. So my workaround was to make sure to clear all projectiles at least one frame after clearing the enemies.

Steps to reproduce

  1. Open MRP demo scene, made of a Main node and an AsyncRun node
  2. Play the demo scene and observe output

You should see LOOP 3 times, the last one happening after queue_free:

ASyncRun: start LOOP
... despite node being already queued for deletion!
(not just at the end of frame but really on next frame - check the frame counter)

  1. Swap the order of Main and AsyncRun
  2. Repeat step 2: note how the order of physics_process frames counter debug changes in the Output log, but eventually the same problem appears
  3. Variant: in AsyncRun.gd, replace create_timer 3rd argument process_in_physics to false. In main.gd, replace the secondawait get_tree().physics_frame (after run_async call) with await get_tree().replace (0.017, false, false).timeout (if you uncomment the existing line, remember to replace the 3rd argument with false there too) => also reproduces the bug

Simplified code if you prefer running it in your own project:

# AsyncRun.gd
extends Node2D

var frames: int = 0

func run_async():
	# the loop is merely to make it easier to spam LOOP after every timer
	while true:
		print("ASyncRun: start LOOP")
		if is_queued_for_deletion():
			print("... despite node being already queued for deletion!")

		# 0, 1/60. (0.0166...) and 2/60. (0.0333...) seem to trigger the bug,
		# but not 3/60. nor 4/60.
		# you can also pass process_in_physics=false but then
		# you must also use create_timer in main.gd, also with
		# process_in_physics=false
		await get_tree().create_timer(0, false, true).timeout

func _physics_process(_delta):
	if frames < 5:
		print("AsyncRun: _physics_process frame %d" % frames)
		frames += 1
# main.gd
extends Node2D


@onready var async_run_node = $"../AsyncRun"

var frames: int = 0


func _ready():
	# just to be safe on start
	await get_tree().physics_frame

	async_run_node.run_async()

	await get_tree().physics_frame
	# or
	# await get_tree().create_timer(0.017, false, true).timeout

	async_run_node.queue_free()
	print("Main: called async_run_node.queue_free(), AsyncRun coroutine should stop by next frame")

func _physics_process(_delta):
	if frames < 5:
		print("Main: _physics_process frame %d" % frames)
		frames += 1

Tree hierarchy:

  • Root
    • Main (main.gd)
    • AsyncRun (AsyncRun.gd)

Minimal reproduction project (MRP)

v4.2.2 - Queue free on node fails to stop coroutine last run after tree timer.zip

@4X3L82
Copy link

4X3L82 commented Jun 25, 2024

Emphasis mine:

My enemy AI is run via an async method, which is very convenient to define an action flow over time, but has the disadvantage of not being easily cancellable.
I really want to sync my AI logic and actions with physics process to be safe

Don't you want it both ways? First you want async AI logic, but at the same time you want to sync AI logic? Comes across as wanting to eat your cake and have it too.

Plus, if you want things to be "physics process"-safe, shouldn't you put all your AI logic into _physics_process()? Personally, that sounds like the wrong place (overkill in frequency, not to mention enemy logic isn't a "physics" thing), but you are the one wanting it to be "physics process"-safe.

@ajreckof
Copy link
Member

ajreckof commented Jun 26, 2024

This is the log I have (cut to the usefull part), if you do not have something similar then you can ignore whatever I'm stating below:

AsyncRun: _physics_process frame 4
Main: _physics_process frame 4
ASyncRun: start LOOP
Main: called async_run_node.queue_free(), AsyncRun coroutine should stop by next frame
AsyncRun: _physics_process frame 5
Main: _physics_process frame 5
ASyncRun: start LOOP
... despite node being already queued for deletion!
Main: _physics_process frame 6

The signal physics_frame is called before the associated function _physic_process. This means where you thought this was happening at the end of the previous frame it was happening at the start of the next frame it then went on doing physic_process for all nodes and finally doing a turn of all timers tied to physic process (which is where you have the call being done while the node is queued for deletion) finally it would free all associated Node.

If you have a different log then it is either something that has already been fixed (tested on master) or something that is platform dependent (tested on MacOS)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants