perf(pathfinder): Reduce cost of long pathfinding by 50-66% by implementing a conditional reverse insertion sorting algorithm#2450
Conversation
|
| Filename | Overview |
|---|---|
| Core/GameEngine/Include/GameLogic/AIPathfind.h | Adds m_tail to PathfindCellList, canReverseSort, getPrevOpen, and getTotalCostDifference declarations; structure is clean, #pragma once guard is present, and licensing header is intact. |
| Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp | Implements bidirectional sorted-open-list insertion and tail-tracking maintenance; core logic is correct but getTotalCostDifference uses abs() on an unsigned subtraction which is implementation-defined and semantically wrong for large cost differences. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["putOnSortedOpenList(list)"] --> B{"RETAIL_COMPATIBLE_PATHFINDING\n& !s_useFixedPathfinding?"}
B -- Yes --> C["forwardInsertionSortRetailCompatible(list)"]
B -- No --> D{"list.canReverseSort(*this)\n(head_diff > tail_diff?)"}
D -- Yes --> E["reverseInsertionSort(list)\n← traverse from m_tail"]
D -- No --> F["forwardInsertionSort(list)\n→ traverse from m_head"]
E --> E1{"m_tail == nullptr?"}
E1 -- Yes --> E2["Init: m_head = m_tail = this"]
E1 -- No --> E3{"totalCost >= tail.totalCost?"}
E3 -- Yes --> E4["Append after tail\nupdate m_tail"]
E3 -- No --> E5["Traverse backward while\nprevOpen.totalCost > newCost\nInsert before current\nupdate m_head if needed"]
F --> F1{"m_head == nullptr?"}
F1 -- Yes --> F2["Init: m_head = m_tail = this"]
F1 -- No --> F3{"totalCost < head.totalCost?"}
F3 -- Yes --> F4["Prepend before head\nupdate m_head"]
F3 -- No --> F5["Traverse forward while\nnextOpen.totalCost <= newCost\nInsert after current\nupdate m_tail if needed"]
Comments Outside Diff (1)
-
Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp, line 1362-1363 (link)abs()on unsigned subtraction is incorrectm_totalCostisUnsignedInt(unsigned), som_info->m_totalCost - other.m_info->m_totalCostuses unsigned arithmetic: whenthis->m_totalCost < other.m_totalCost, the subtraction wraps around to a very large value nearUINT_MAXinstead of producing a negative number. Passing that wrapped value toabs()(which takes a signedint) relies on implementation-defined behavior during the implicitunsigned → intconversion for values aboveINT_MAX.In practice this works for typical game path costs (well below
INT_MAX) because two's complement round-trips throughint, but it is still semantically incorrect and could silently misfire once cost values grow large, causingcanReverseSortto always pick the reverse path even when forward traversal would be faster.The fix is to avoid the unsafe subtraction entirely:
Prompt To Fix All With AI
This is a comment left during a code review.
Path: Core/GameEngine/Source/GameLogic/AI/AIPathfind.cpp
Line: 1362-1363
Comment:
**`abs()` on unsigned subtraction is incorrect**
`m_totalCost` is `UnsignedInt` (unsigned), so `m_info->m_totalCost - other.m_info->m_totalCost` uses unsigned arithmetic: when `this->m_totalCost < other.m_totalCost`, the subtraction wraps around to a very large value near `UINT_MAX` instead of producing a negative number. Passing that wrapped value to `abs()` (which takes a signed `int`) relies on implementation-defined behavior during the implicit `unsigned → int` conversion for values above `INT_MAX`.
In practice this works for typical game path costs (well below `INT_MAX`) because two's complement round-trips through `int`, but it is still semantically incorrect and could silently misfire once cost values grow large, causing `canReverseSort` to always pick the reverse path even when forward traversal would be faster.
The fix is to avoid the unsafe subtraction entirely:
```suggestion
if (!m_info || !other.m_info)
return UINT_MAX;
if (m_info->m_totalCost >= other.m_info->m_totalCost)
return m_info->m_totalCost - other.m_info->m_totalCost;
return other.m_info->m_totalCost - m_info->m_totalCost;
```
How can I resolve this? If you propose a fix, please make it concise.Last reviewed commit: "perf(pathfinder): Ex..."
12389b7 to
62457b8
Compare
62457b8 to
7f0fee2
Compare
|
@greptile early initialising the tail makes no difference to performance when testing, keeping the code to initialise the tail when the list is appended is just as performant. Often hundreds of items are added to the list so an element is often appended to the list early on. |
|
Fair enough. If your testing shows no performance difference and the tail gets initialized naturally during the first append anyway, then keeping the lazy initialization is fine. The simpler code is better when there's no measurable benefit to the alternative. Since |
Can you update your summary please. |
|
@greptile can you update your summary to disregard your considered prepend head fix as it is unnecessary as discussed. |
|
I understand - the prepend head optimization isn't necessary. Here's the updated summary: This PR optimizes the pathfinding open list insertion by implementing bidirectional insertion sorting. The algorithm now compares a new node's cost against the midpoint between the head and tail costs to determine whether to traverse from the head or tail, reducing average insertion time from O(N) to O(N/2). The performance improvement is particularly dramatic for long paths (2-3x faster) because:
The benchmarks demonstrate significant speedup on longer paths while maintaining equivalent performance on shorter paths. |
7f0fee2 to
195696e
Compare
|
Tweaked forward and reverse insertion sort so if they are initialising the head or tail they also initialise their opposite if it is uninitialised. It doesn't hurt performance when done here as it's only checked once after the head or tail initialisation check. |
195696e to
a478eda
Compare
a478eda to
852bb07
Compare
|
updated based on feedback |
852bb07 to
61742fd
Compare
61742fd to
73a3e2d
Compare
|
That was just a quick rebase, going to tweak now |
73a3e2d to
9ce7b91
Compare
…evOpen() and extend PathfindCellList with tail tracking and PathfindCellList::canReverseSort() (#2450)
…ditional reverse insertion sorting (#2450)
9ce7b91 to
90d6f7a
Compare
|
Should be good now, i just realised i needed to tweak it again since even the retail compatible reset needs to be able to reset the tail if the fixed pathfinding path gets used in a retail compatible build. |
| m_info->m_prevOpen = current->m_info; | ||
| } | ||
|
|
||
| // Reverse insertion sort, returns early if the list is being initialized or we are appending the list |
There was a problem hiding this comment.
Could add TheSuperHackers @performance here or elsewhere suitable and explain how this gives performance and how much.
|
Greptile is still complaining about "abs() on unsigned subtraction is incorrect". |
This PR adds conditionaly selected reverse insertion sorting.
putOnSortedOpenList()now acts as a dispatching function that chooses if a new node is inserted by traversing the list from the head or tail. The code determines if the new nodes total cost is closer to the head or tail to determine which list traversal direction to use.This provides a significant performance improvement to long paths, allowing them to finish in half to a third of the time of just using forward insertion sorting alone.
The likely reason for this is that longer and more complex paths tend to append larger values to the list which requires a full list traversal when forward inserting. Or they are adding large path costs so will typically be inserting in the upper 50% of the list.
On average we would have expected at least a 50% performance improvement, due to the logical splitting in half of the list length, since we now approach any node position from either end of the list.
With the conditional reverse insertion sort we would have expected the algorithm to be O(N/2) in time, but for the above reasons, allowing quick appends dramatically improves the performance of some pathing sorts.
In other words, pathfinder go brrrrrr already at stage 1 tuning!
Some comparison values from a debug build, the pathfinding times are in second.

Shorter paths are not as significantly affected, but the followup skip list enhancement should provide a further boost for these.

Some variation is also down to CPU load as i often would see variation between runs but overall significantly lower times on the longer paths.