
Optimise A* Search #175

Open
tetv opened this issue Feb 13, 2017 · 5 comments
@tetv commented Feb 13, 2017

During an A* search, every time a state/node is explored the following procedure is applied:

  1. call the action function to retrieve the list of actions applicable to that state;
  2. call the transition function for each action retrieved in 1;
  3. call the cost function (g) for each new state generated in 2;
  4. call the estimate function (h) for each new state generated in 2;
  5. insert the newly generated states into the queue of states to be explored, sorted by f = g + h;
  6. explore the next state in the sorted queue, i.e. the one with the smallest f.
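The steps above can be sketched on a toy problem. Everything here is made up for illustration (integer states, a hypothetical goal of 10, unit action costs and a distance heuristic); it is not the library's implementation:

```java
import java.util.*;

public class ExpandSketch {
    static final int GOAL = 10;  // hypothetical goal state

    // 1. action function: actions applicable in a state
    static List<Integer> actions(int state) { return Arrays.asList(1, 2); }
    // 2. transition function: apply an action to a state
    static int apply(int state, int action) { return state + action; }
    // 3. accumulated cost g after taking an action
    static double cost(double g, int action) { return g + action; }
    // 4. heuristic estimate h
    static double estimate(int state) { return Math.abs(GOAL - state); }

    // Runs the loop following steps 1-6; returns the number of explored nodes.
    static int search() {
        // entries are {state, g}; the queue is ordered by f = g + h (step 5)
        PriorityQueue<double[]> open = new PriorityQueue<>(
                Comparator.comparingDouble((double[] e) -> e[1] + estimate((int) e[0])));
        open.add(new double[]{0, 0});
        int explored = 0;
        while (!open.isEmpty()) {
            double[] node = open.poll();              // 6. smallest f first
            explored++;
            int state = (int) node[0];
            if (state == GOAL) return explored;       // goal reached
            for (int a : actions(state)) {            // 1.
                int next = apply(state, a);           // 2.
                double g = cost(node[1], a);          // 3. (h, step 4, is computed
                if (next <= GOAL)                     //     inside the comparator)
                    open.add(new double[]{next, g});  // 5.
            }
        }
        return explored;
    }

    public static void main(String[] args) {
        System.out.println("explored " + search() + " nodes");
    }
}
```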

However, I think a good optimisation would be to call the search predicate function (isGoal), before step 5, for each newly generated state. If one of them is a goal state, then, after inserting it into the sorted queue, every state in that queue whose f is greater than the inserted state's f could be removed: those states will never be explored, so dropping them saves a lot of memory during the search and keeps the garbage collector happier.
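A minimal sketch of that pruning idea (the names and the `removeIf` approach are mine, not the library's; `removeIf` walks the whole heap, so this is the simple O(n) variant):

```java
import java.util.*;

public class PruneSketch {
    // Each entry is {id, f}; the open queue is ordered by f.
    // Once a goal with cost fGoal is queued, every entry with f > fGoal
    // can never be explored before the goal and may be dropped.
    static int prune(PriorityQueue<double[]> open, double fGoal) {
        int before = open.size();
        open.removeIf(e -> e[1] > fGoal);  // O(n): inspects every queued entry
        return before - open.size();       // number of entries removed
    }

    public static void main(String[] args) {
        PriorityQueue<double[]> open =
                new PriorityQueue<>(Comparator.comparingDouble((double[] e) -> e[1]));
        open.add(new double[]{1, 3.0});
        open.add(new double[]{2, 7.0});
        open.add(new double[]{3, 9.0});
        int removed = prune(open, 5.0);  // a goal was found with f = 5
        System.out.println(removed + " nodes pruned, " + open.size() + " left");
    }
}
```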

By the way, during the search and before the result is returned, how can I access some basic stats, like the number of explored nodes and the number of nodes still in the queue waiting to be explored?

Thank you.

@gonzalezsieira (Contributor)

Hi @tetv , and first of all thanks for sharing your thoughts on this question.

Although what you say is true, it must be taken into account that exploring the queue to remove obsolete elements is not trivial, and executing this operation whenever a node changes its position in the queue might have a significant impact on the performance of the search algorithm.

So, we found that a better approach is the current implementation, at least regarding execution time. It is not perfect, though. For problems in which nodes are regularly re-visited, the consequence is the preservation of unused references, which are never cleaned up. This impacts the memory usage and the performance of the garbage collector, as you pointed out.

I agree with you on the need to clean up the priority queue, although this should be an operation triggered only under certain circumstances (e.g. memory wasted in storing useless elements, number of copies... something like that). However, the triggering condition you mentioned will only occur in the last iterations of the search. We haven't thought about this issue so far, but probably something more is required to find a generic solution.

Regarding your question, stats collection is not implemented so far. However, you can count the number of calls to next() within the search loop to get the number of visited nodes, or take the size of the collection returned by getOpen() to get the number of nodes yet to be explored.
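One way to count the next() calls without touching the search loop itself is a thin counting wrapper around the iterator; the class name and usage here are illustrative, not Hipster's actual API:

```java
import java.util.*;

public class CountingIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private int visited = 0;

    public CountingIterator(Iterator<T> inner) { this.inner = inner; }
    public boolean hasNext() { return inner.hasNext(); }
    public T next() { visited++; return inner.next(); }  // count each explored node
    public int visited() { return visited; }

    public static void main(String[] args) {
        // Stand-in for the iterator returned by the search algorithm.
        CountingIterator<String> it =
                new CountingIterator<>(Arrays.asList("a", "b", "c").iterator());
        while (it.hasNext()) it.next();
        System.out.println(it.visited() + " nodes visited");
    }
}
```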

Best regards,
Adrián

@tetv (Author) commented Feb 21, 2017

Thank you for your answer.

With regard to the suggestion, one idea is to change the priority queue implementation to Guava's MinMaxPriorityQueue, which also tracks the element at the opposite end of the queue. Once a goal is found, you can repeatedly compare the cost of the last node and remove it, until you reach a node whose cost is lower than or equal to the goal's (processing the priority queue from the opposite end).
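Without pulling Guava in, the tail-pruning loop can be sketched with a `TreeMap` (entry counts keyed by f) standing in for a double-ended priority queue; `lastKey()` plays the role MinMaxPriorityQueue's `peekLast()` would play:

```java
import java.util.*;

public class TailPrune {
    // open maps each f value to the number of queued entries with that f.
    // Walk from the worst end and stop at the first f <= fGoal, so the
    // cost is roughly O(x log n) for x removed entries.
    static int pruneTail(TreeMap<Double, Integer> open, double fGoal) {
        int removed = 0;
        while (!open.isEmpty() && open.lastKey() > fGoal) {
            removed += open.remove(open.lastKey());
        }
        return removed;
    }

    public static void main(String[] args) {
        TreeMap<Double, Integer> open = new TreeMap<>();
        open.put(3.0, 1);
        open.put(7.0, 2);
        open.put(9.0, 1);
        System.out.println(pruneTail(open, 5.0) + " entries pruned");
    }
}
```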

@gonzalezsieira (Contributor)

Thanks for your suggestion.

I am probably missing something, but what is the benefit of having access to the last node (apart from the fact that nodes with obsolete costs tend to appear at the end of the queue)? The main issue, in my opinion, is that the "to-remove" nodes can appear at any position in the queue, so a complete cleanup requires iterating over all of its elements. Thus, the benefits of MinMaxPriorityQueue are not quite clear, especially since the documentation of the class says: "If you only access one end of the queue, and don't use a maximum size, this class is functionally equivalent to PriorityQueue, but significantly slower."

Also, the JARs currently weigh about 0.5 MB, which is desirable if you use the library in Android projects. In previous versions we depended on Guava, but we decided to drop it since the size shot up to more than 8 MB. Reintroducing this dependency will only make sense if the benefits are strong enough.

@tetv (Author) commented Feb 22, 2017

Hi @gonzalezsieira, thank you for your feedback.

  1. In terms of removal cost, the more that is removed, the more memory is freed, which could be invaluable for some types of problems. At a minimum, it would be very nice to have the option to activate this behaviour.

  2. Even if touching the queue is not an option for cost reasons, at the very least, adding new elements to the queue that will never be explored should be avoided! :)

  3. In a double-ended priority queue, accessing the first and last elements costs O(1) and removing them costs O(log n), so removing the last x elements (those with a higher cost than the best solution found) costs around O(x · log n), which is much better than the O(x · n) needed to traverse the whole queue.

  4. If the Guava library is not good enough (e.g. because of its size), then maybe that class could be extracted, or another double-ended priority queue implementation could be used that offers O(1) access at both ends and at most O(log n) removal.

  5. I cloned your project and changed very few lines of code to emulate that behaviour, which wasn't painful. I didn't replace the priority queue, so removing the elements was O(x · n), but at least I could see its impact. The main bottleneck was memory consumption.

This is just my 2 cents for your great library collection.

@tetv (Author) commented Feb 23, 2017

In terms of the priority queue, the min-max priority queue is probably not the ideal structure for this problem.

Mathematically, it should be possible to have an implementation that removes all low-priority elements (those with a priority worse than a specific threshold) in one go, i.e. O(log n) instead of O(x · log n). Priority queues are typically implemented with heaps, but a solution based on binary search trees might be more efficient here: deleting a whole branch and rebalancing the tree is O(log n).
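As a rough JDK sketch of that interface, `TreeMap.tailMap(...).clear()` (a red-black tree underneath) expresses a range delete as a single view operation, though internally it still visits each entry, so a true O(log n) bulk delete would need a tree that can detach a whole subtree:

```java
import java.util.*;

public class RangePrune {
    // Drop every bucket whose f exceeds fGoal through a single range view;
    // returns the number of buckets removed. Note: clear() on the view
    // still removes entries one by one internally.
    static int pruneAbove(TreeMap<Double, List<Integer>> open, double fGoal) {
        NavigableMap<Double, List<Integer>> doomed = open.tailMap(fGoal, false);
        int removed = doomed.size();
        doomed.clear();
        return removed;
    }

    public static void main(String[] args) {
        // States bucketed by their f value in an ordered map.
        TreeMap<Double, List<Integer>> open = new TreeMap<>();
        open.put(3.0, Arrays.asList(1));
        open.put(7.0, Arrays.asList(2, 3));
        open.put(9.0, Arrays.asList(4));
        System.out.println(pruneAbove(open, 5.0) + " buckets pruned");
    }
}
```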
