
List intersection in Simplex_tree expansion #1077

Open
mglisse opened this issue Jun 19, 2024 · 0 comments
Labels
optimization It works, now make it faster

Comments


mglisse commented Jun 19, 2024

```cpp
static void intersection(std::vector<std::pair<Vertex_handle, Node> >& intersection,
                         Dictionary_it begin1, Dictionary_it end1,
                         Dictionary_it begin2, Dictionary_it end2,
```
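For context, the "simplest algorithm" discussed below is the classic two-pointer merge walk over both sorted ranges. A simplified sketch with plain ints instead of Dictionary iterators (`merge_intersection` is a hypothetical name, not GUDHI code):

```cpp
#include <vector>

// Simplified sketch of the current strategy: walk both sorted ranges with two
// pointers, advancing whichever is behind, costing len(L1) + len(L2)
// comparisons regardless of how unbalanced the lists are.
std::vector<int> merge_intersection(const std::vector<int>& a,
                                    const std::vector<int>& b) {
  std::vector<int> out;
  auto i = a.begin();
  auto j = b.begin();
  while (i != a.end() && j != b.end()) {
    if (*i < *j) ++i;            // a is behind: advance it
    else if (*j < *i) ++j;       // b is behind: advance it
    else { out.push_back(*i); ++i; ++j; }  // common element
  }
  return out;
}
```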

In Simplex_tree::expansion, the main operation is (roughly) computing the intersection of two sorted lists, and we currently do that using the simplest algorithm, with complexity len(L1)+len(L2). However, as https://arxiv.org/abs/2301.07191 observes (in particular for Erdős–Rényi graphs), these two lists are typically quite unbalanced: one is at depth k, while the other is always at depth 1. Assuming this is indeed what takes most of the time in expansion (and not allocations, for instance), some options could be:

  1. Replace the second list with a deeper (smaller) one. Currently, when looking at the children of 1234 (say 5, 6 and 7), we first pick 5, take L1 = {6, 7}, and take L2 to be the list of children of the vertex 5. This works because if 12345, 12346 and 56 are cliques, then 123456 is as well. However, we could also pick for L2 the children of the simplex 125 (this is just an example; there are other possible choices): the intersection remains the same and the list should be smaller. This requires ensuring that the children of 125 are computed before those of 12345, which might require reversing the iteration order.
  2. Use a different intersection algorithm. A fancy one could use a galloping (exponential search) strategy. A simpler approach, well suited to the unbalanced case, is to check, for every element of L1, whether it belongs to L2 (this assumes L1 is the shorter list; otherwise it is more efficient in the reverse direction). Using standard binary search, this gives a complexity of n₁·log(n₂). To speed this up even further, we could, in a preprocessing step, store the (oriented?) graph in a data structure with fast lookup, either a dense array (which would waste a huge amount of memory for very sparse graphs) or a hash table, and get the complexity down to n₁. We can also use the min/max values of L2 to limit the part of L1 over which we need to iterate.
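The galloping strategy from option 2 could be sketched like this (a simplified illustration over plain ints, assuming both inputs are sorted; `gallop_intersection` is a hypothetical name, not a proposed GUDHI API):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch of galloping intersection: for each element x of the short list
// `small`, advance in the long sorted list `big` with exponentially doubling
// steps, then finish with a binary search in the bracketed range.  The total
// cost is roughly n1 * log(n2 / n1) comparisons instead of n1 + n2.
std::vector<int> gallop_intersection(const std::vector<int>& small,
                                     const std::vector<int>& big) {
  std::vector<int> out;
  auto lo = big.begin();  // search resumes here: `small` is sorted too
  for (int x : small) {
    // Double the probe step while the probed element is still smaller than x.
    std::size_t step = 1;
    while (step < static_cast<std::size_t>(big.end() - lo) && *(lo + step) < x) {
      lo += step;
      step *= 2;
    }
    // If x is present, it now lies in [lo, lo + step]; clamp to the end.
    auto hi = lo + std::min<std::size_t>(step + 1, big.end() - lo);
    lo = std::lower_bound(lo, hi, x);
    if (lo != big.end() && *lo == x) out.push_back(x);
  }
  return out;
}
```

Because `lo` only moves forward, a run of consecutive hits still degrades gracefully toward the linear merge.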

With a bit of profiling, on some examples, I notice that intersection accounts for less than half of the time of expansion. Some other costly operations:

  • allocating Siblings with new/delete → we could use an allocator (a pool?)
  • copying the vector produced by intersection into a flat_map → not much we can do about that, it has to be allocated at some point
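For the first bullet, one possibility is C++17's std::pmr pool resources: carve node objects out of a pool with placement new instead of individual new/delete, and let the pool recycle freed blocks. A rough sketch with a stand-in struct (`SiblingsStub`, `make_siblings` and `destroy_siblings` are hypothetical names, not GUDHI code):

```cpp
#include <memory_resource>
#include <new>
#include <vector>

// Stand-in for Simplex_tree's Siblings, just to illustrate pool allocation.
struct SiblingsStub {
  int parent;
  std::pmr::vector<int> members;  // also draws from the pool
  SiblingsStub(int p, std::pmr::memory_resource* mr) : parent(p), members(mr) {}
};

// Allocate from the pool and construct in place (replaces `new Siblings(...)`).
SiblingsStub* make_siblings(std::pmr::memory_resource& pool, int parent) {
  void* mem = pool.allocate(sizeof(SiblingsStub), alignof(SiblingsStub));
  return new (mem) SiblingsStub(parent, &pool);
}

// Destroy and hand the block back to the pool (replaces `delete`).
void destroy_siblings(std::pmr::memory_resource& pool, SiblingsStub* s) {
  s->~SiblingsStub();
  pool.deallocate(s, sizeof(SiblingsStub), alignof(SiblingsStub));
}
```

Using a `std::pmr::unsynchronized_pool_resource` as the pool turns the many small allocations during expansion into cheap same-size-bucket allocations, at the cost of threading the resource pointer through the tree.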
mglisse added the optimization label on Jun 19, 2024