Skip to content

perf(pathfinder): Reduce cost of long pathfinding by 50-66% by implementing a conditional reverse insertion sorting algorithm#2450

Open
Mauller wants to merge 2 commits intoTheSuperHackers:mainfrom
Mauller:Mauller/perf-reverse-insertionsort
Open

perf(pathfinder): Reduce cost of long pathfinding by 50-66% by implementing a conditional reverse insertion sorting algorithm#2450
Mauller wants to merge 2 commits intoTheSuperHackers:mainfrom
Mauller:Mauller/perf-reverse-insertionsort

Conversation

@Mauller
Copy link

@Mauller Mauller commented Mar 13, 2026

This PR adds conditionaly selected reverse insertion sorting.

putOnSortedOpenList() now acts as a dispatching function that chooses if a new node is inserted by traversing the list from the head or tail. The code determines if the new nodes total cost is closer to the head or tail to determine which list traversal direction to use.

This provides a significant performance improvement to long paths, allowing them to finish in half to a third of the time of just using forward insertion sorting alone.

The likely reason for this is that longer and more complex paths tend to append larger values to the list which requires a full list traversal when forward inserting. Or they are adding large path costs so will typically be inserting in the upper 50% of the list.

On average we would have expected at least a 50% performance improvement, due to the logical splitting in half of the list length, since we now approach any node position from either end of the list.

With the conditional reverse insertion sort we would have expected the algorithm to be O(N/2) in time, but for the above reasons, allowing quick appends dramatically improves the performance of some pathing sorts.

In other words, pathfinder go brrrrrr already at stage 1 tuning!


Some comparison values from a debug build, the pathfinding times are in second.
image

image

Shorter paths are not as significantly affected, but the followup skip list enhancement should provide a further boost for these.
image

Some variation is also down to CPU load as i often would see variation between runs but overall significantly lower times on the longer paths.

@Mauller Mauller self-assigned this Mar 13, 2026
@Mauller Mauller added Performance Is a performance concern Gen Relates to Generals ZH Relates to Zero Hour labels Mar 13, 2026
@greptile-apps
Copy link

greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR upgrades the A* open-list management from a unidirectional forward insertion sort to a bidirectional one: a new m_tail pointer is added to PathfindCellList, and putOnSortedOpenList now dispatches between forwardInsertionSort (traverses from head) and reverseInsertionSort (traverses from tail) based on which end of the list is closer in cost to the new node. All dependent operations (removeFromOpenList, releaseOpenList, releaseClosedList, list init paths) have been correctly updated to maintain the tail pointer, and the decision function canReverseSort is properly guarded against empty lists.

Key changes:

  • PathfindCellList gains m_tail and canReverseSort, making it a true doubly-terminated list
  • reverseInsertionSort correctly inserts in ascending cost order while traversing backwards; edge-cases (empty list, append-to-tail) match the forward path's early-exit structure
  • getTotalCostDifference uses abs() on an unsigned subtraction, which is semantically incorrect and implementation-defined for large cost values — the comparison should be guarded with an explicit >= check before subtracting (see inline comment)
  • forwardInsertionSort, removeFromOpenList, and the release functions are correctly extended to keep m_tail in sync

Confidence Score: 3/5

  • Safe to merge for correctness (all sort paths produce a valid ascending list), but the unsigned-subtraction bug in getTotalCostDifference should be fixed to guarantee reliable direction-selection and avoid implementation-defined behaviour.
  • The core linked-list logic — insertion order, tail tracking, removal, and release — is correct and the invariants are properly maintained. The only real defect is the abs() call on an unsigned subtraction in getTotalCostDifference, which produces implementation-defined results for cost differences exceeding INT_MAX. For current game path costs this is harmless in practice, but it undermines the precision of canReverseSort at scale and is the kind of latent bug that becomes hard to diagnose later.
  • Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp — specifically the getTotalCostDifference implementation around line 1362.

Important Files Changed

Filename Overview
Core/GameEngine/Include/GameLogic/AIPathfind.h Adds m_tail to PathfindCellList, canReverseSort, getPrevOpen, and getTotalCostDifference declarations; structure is clean, #pragma once guard is present, and licensing header is intact.
Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp Implements bidirectional sorted-open-list insertion and tail-tracking maintenance; core logic is correct but getTotalCostDifference uses abs() on an unsigned subtraction which is implementation-defined and semantically wrong for large cost differences.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["putOnSortedOpenList(list)"] --> B{"RETAIL_COMPATIBLE_PATHFINDING\n& !s_useFixedPathfinding?"}
    B -- Yes --> C["forwardInsertionSortRetailCompatible(list)"]
    B -- No --> D{"list.canReverseSort(*this)\n(head_diff > tail_diff?)"}
    D -- Yes --> E["reverseInsertionSort(list)\n← traverse from m_tail"]
    D -- No --> F["forwardInsertionSort(list)\n→ traverse from m_head"]

    E --> E1{"m_tail == nullptr?"}
    E1 -- Yes --> E2["Init: m_head = m_tail = this"]
    E1 -- No --> E3{"totalCost >= tail.totalCost?"}
    E3 -- Yes --> E4["Append after tail\nupdate m_tail"]
    E3 -- No --> E5["Traverse backward while\nprevOpen.totalCost > newCost\nInsert before current\nupdate m_head if needed"]

    F --> F1{"m_head == nullptr?"}
    F1 -- Yes --> F2["Init: m_head = m_tail = this"]
    F1 -- No --> F3{"totalCost < head.totalCost?"}
    F3 -- Yes --> F4["Prepend before head\nupdate m_head"]
    F3 -- No --> F5["Traverse forward while\nnextOpen.totalCost <= newCost\nInsert after current\nupdate m_tail if needed"]
Loading

Comments Outside Diff (1)

  1. Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp, line 1362-1363 (link)

    abs() on unsigned subtraction is incorrect

    m_totalCost is UnsignedInt (unsigned), so m_info->m_totalCost - other.m_info->m_totalCost uses unsigned arithmetic: when this->m_totalCost < other.m_totalCost, the subtraction wraps around to a very large value near UINT_MAX instead of producing a negative number. Passing that wrapped value to abs() (which takes a signed int) relies on implementation-defined behavior during the implicit unsigned → int conversion for values above INT_MAX.

    In practice this works for typical game path costs (well below INT_MAX) because two's complement round-trips through int, but it is still semantically incorrect and could silently misfire once cost values grow large, causing canReverseSort to always pick the reverse path even when forward traversal would be faster.

    The fix is to avoid the unsafe subtraction entirely:

Prompt To Fix All With AI
This is a comment left during a code review.
Path: Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp
Line: 1362-1363

Comment:
**`abs()` on unsigned subtraction is incorrect**

`m_totalCost` is `UnsignedInt` (unsigned), so `m_info->m_totalCost - other.m_info->m_totalCost` uses unsigned arithmetic: when `this->m_totalCost < other.m_totalCost`, the subtraction wraps around to a very large value near `UINT_MAX` instead of producing a negative number. Passing that wrapped value to `abs()` (which takes a signed `int`) relies on implementation-defined behavior during the implicit `unsigned → int` conversion for values above `INT_MAX`.

In practice this works for typical game path costs (well below `INT_MAX`) because two's complement round-trips through `int`, but it is still semantically incorrect and could silently misfire once cost values grow large, causing `canReverseSort` to always pick the reverse path even when forward traversal would be faster.

The fix is to avoid the unsafe subtraction entirely:

```suggestion
	if (!m_info || !other.m_info)
		return UINT_MAX;

	if (m_info->m_totalCost >= other.m_info->m_totalCost)
		return m_info->m_totalCost - other.m_info->m_totalCost;

	return other.m_info->m_totalCost - m_info->m_totalCost;
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "perf(pathfinder): Ex..."

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 12389b7 to 62457b8 Compare March 13, 2026 16:34
@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 62457b8 to 7f0fee2 Compare March 13, 2026 16:50
@Mauller
Copy link
Author

Mauller commented Mar 13, 2026

@greptile early initialising the tail makes no difference to performance when testing, keeping the code to initialise the tail when the list is appended is just as performant. Often hundreds of items are added to the list so an element is often appended to the list early on.

@greptile-apps
Copy link

greptile-apps bot commented Mar 13, 2026

Fair enough. If your testing shows no performance difference and the tail gets initialized naturally during the first append anyway, then keeping the lazy initialization is fine. The simpler code is better when there's no measurable benefit to the alternative.

Since putOnSortedOpenList is the only intended entry point and the list quickly gets populated in typical pathfinding scenarios, the current approach works well for your use case.

@Mauller
Copy link
Author

Mauller commented Mar 13, 2026

Fair enough. If your testing shows no performance difference and the tail gets initialized naturally during the first append anyway, then keeping the lazy initialization is fine. The simpler code is better when there's no measurable benefit to the alternative.

Since putOnSortedOpenList is the only intended entry point and the list quickly gets populated in typical pathfinding scenarios, the current approach works well for your use case.

Can you update your summary please.

@Mauller
Copy link
Author

Mauller commented Mar 14, 2026

@greptile can you update your summary to disregard your considered prepend head fix as it is unnecessary as discussed.

@greptile-apps
Copy link

greptile-apps bot commented Mar 14, 2026

I understand - the prepend head optimization isn't necessary. Here's the updated summary:

This PR optimizes the pathfinding open list insertion by implementing bidirectional insertion sorting. The algorithm now compares a new node's cost against the midpoint between the head and tail costs to determine whether to traverse from the head or tail, reducing average insertion time from O(N) to O(N/2).

The performance improvement is particularly dramatic for long paths (2-3x faster) because:

  1. Complex paths tend to generate nodes with higher costs that insert near the tail
  2. Approaching from the tail eliminates full list traversals for these common cases

The benchmarks demonstrate significant speedup on longer paths while maintaining equivalent performance on shorter paths.

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 7f0fee2 to 195696e Compare March 14, 2026 10:14
@Mauller
Copy link
Author

Mauller commented Mar 14, 2026

Tweaked forward and reverse insertion sort so if they are initialising the head or tail they also initialise their opposite if it is uninitialised.

It doesn't hurt performance when done here as it's only checked once after the head or tail initialisation check.

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 195696e to a478eda Compare March 14, 2026 21:22
@xezon xezon requested a review from Skyaero42 March 17, 2026 18:46
@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from a478eda to 852bb07 Compare March 17, 2026 21:09
@Mauller
Copy link
Author

Mauller commented Mar 17, 2026

updated based on feedback

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 852bb07 to 61742fd Compare March 18, 2026 20:53
@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 61742fd to 73a3e2d Compare March 19, 2026 20:21
@Mauller
Copy link
Author

Mauller commented Mar 19, 2026

That was just a quick rebase, going to tweak now

@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 73a3e2d to 9ce7b91 Compare March 19, 2026 20:33
Mauller added 2 commits March 19, 2026 20:35
…evOpen() and extend PathfindCellList with tail tracking and PathfindCellList::canReverseSort() (#2450)
@Mauller Mauller force-pushed the Mauller/perf-reverse-insertionsort branch from 9ce7b91 to 90d6f7a Compare March 19, 2026 20:36
@Mauller
Copy link
Author

Mauller commented Mar 19, 2026

Should be good now, i just realised i needed to tweak it again since even the retail compatible reset needs to be able to reset the tail if the fixed pathfinding path gets used in a retail compatible build.

@xezon xezon changed the title perf(pathfinder): Implement conditional reverse insertion sorting, improving long path performance by 2-3x perf(pathfinder): Reduce cost of long pathfinding by 50-66% by implementing a conditional reverse insertion sorting algorithm Mar 19, 2026
Copy link

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looking good.

Title optimized.

m_info->m_prevOpen = current->m_info;
}

// Reverse insertion sort, returns early if the list is being initialized or we are appending the list
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add TheSuperHackers @performance here or elsewhere suitable and explain how this gives performance and how much.

@xezon
Copy link

xezon commented Mar 19, 2026

Greptile is still complaining about "abs() on unsigned subtraction is incorrect".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Gen Relates to Generals Performance Is a performance concern ZH Relates to Zero Hour

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants