Determinism / Repeatability? #1177

Open

howardt opened this issue Mar 4, 2025 · 8 comments

@howardt (Contributor) commented Mar 4, 2025

This is mostly a question rather than an issue.

Does Manifold make any guarantees about the repeatability of the results? That is, if you run it again or run it on a different machine, will you get the same results (where "same" includes the order of the vertices and output faces, as well as the actual geometry of the output)?

I am guessing the answer is "no", but would be happily surprised to learn otherwise. If the answer is "no", a second question would be: if no parallelism is used to get the answer (either because it is turned off, or because the problem size is quite small), might we expect such determinism? I suspect this mostly comes down to: are the sorts used stable, and do you avoid hashing on pointers? If the answer is still "no" for small problems / parallelism turned off, then how hard would it be to make Manifold deterministic/repeatable/portable in those cases?

The reason I care is two-fold.

(1) It makes regression testing much easier, and Blender's regression tests typically rely on such a guarantee. If Manifold is repeatable on small cases, that would be sufficient for this.

(2) There are cases where non-repeatability can affect the visual appearance of a render or animation. E.g., if you get different results frame-to-frame in an animation, the animation can flicker. Or, if the user has relied on element indices when procedurally postprocessing the Boolean result (using our procedural system, "Geometry Nodes"), they will not get repeatable results. They are not supposed to do the latter, so it is less important; nevertheless, some users do.

@pca006132 (Collaborator)

We use stable sort, so in general, we expect things to be identical from run to run when parallelism is disabled.
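
A minimal sketch of why stable sorting gives run-to-run reproducibility when keys tie (the `Edge` struct here is illustrative, not Manifold's types):

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

struct Edge {
  int key;    // sort key, e.g. a quantized position
  int index;  // original index, identifies the element
};

int main() {
  std::vector<Edge> edges = {{1, 0}, {0, 1}, {1, 2}, {0, 3}};
  // std::stable_sort preserves the relative order of equal keys, so the two
  // key == 0 elements always come out as index 1 then index 3. An unstable
  // (especially parallel) sort is free to break the tie either way.
  std::stable_sort(edges.begin(), edges.end(),
                   [](const Edge& a, const Edge& b) { return a.key < b.key; });
  for (const Edge& e : edges) std::printf("(%d,%d) ", e.key, e.index);
  std::printf("\n");  // (0,1) (0,3) (1,0) (1,2) on every run
}
```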

When parallelism is enabled, we do have things that are racy and therefore non-deterministic. Disabling that would greatly affect performance, so we left it as is. It mainly concerns parallelizing multiple Boolean operations and scheduling them, where the schedule cannot be made deterministic.

For results on different machines when parallelization is disabled, we expect them to be the same, but we can't guarantee that because floating point is tricky. Things like FMA can have higher intermediate precision and produce different results on x86-64 compared with ARM, for example.
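
A minimal sketch of the FMA effect described above; the constants are chosen so the fused and unfused results visibly differ:

```cpp
#include <cmath>
#include <cstdio>

int main() {
  const double a = 1.0 + 2e-16;  // rounds to 1 + 2^-52 (one ulp above 1)
  const double b = 1.0 - 2e-16;  // rounds to 1 - 2^-52
  const double c = -1.0;
  // std::fma rounds once: the exact product 1 - 2^-104 survives the add,
  // giving about -4.9e-32. The unfused expression rounds a * b to exactly
  // 1.0 first, giving 0.0 -- unless the compiler contracts it into an FMA,
  // which is common on AArch64 but not guaranteed on x86-64.
  std::printf("fused:   %.17g\n", std::fma(a, b, c));
  std::printf("unfused: %.17g\n", a * b + c);
}
```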

For results across versions, things will probably be different due to mesh decimation and triangulator changes.

We tried supporting deterministic results with parallelization enabled via a config parameter, but it was rarely used (due to the performance penalty) and hard to test, so we removed it a few months ago.

@howardt (Contributor, Author) commented Mar 4, 2025

Thanks for the answer.

howardt closed this as completed Mar 4, 2025
howardt reopened this Mar 6, 2025
@howardt (Contributor, Author) commented Mar 6, 2025

Reopening because there is a discussion about this issue on our Blender forum: https://devtalk.blender.org/t/2025-03-03-modeling-module-meeting/39367/11

Would it be a big pain to bring back the determinism option? I wouldn't want it enabled all the time if disabling it gives a significant speedup (hence an option), but there are a number of use cases where determinism is strongly desired.

@elalish (Owner) commented Mar 6, 2025

I don't think we're bringing it back as an option, though we do strive to have as much determinism as is feasible. I'd be happy to discuss if you can file an issue with an actual repro: a situation where you're not getting deterministic results and it's causing a problem of some kind. That's especially important because there are several forms of non-determinism, and we need to be clear about which one we mean. We can also talk about other potential mitigations once we know how it's being used.

@pca006132 (Collaborator)

Crossposting here in case anyone not following the Blender discussion is interested in this:

It is pretty much impossible to get deterministic results across machines due to the use of things like trigonometric functions that are not covered by the IEEE standard. Manifold itself does no trigonometry if the user passes us a matrix, so technically, it is possible to get deterministic results. Realistically, however, mesh operations will involve trigonometry when you do things like rotation, so the end result for the user is still not deterministic across machines even when Manifold does not introduce additional non-determinism. Strictly speaking, users can't follow tutorials and get the same result because of this, unless they have the same machine or are lucky.
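
A minimal sketch of the "pass a matrix" idea; `Rotation2D` is a hypothetical helper, not a Manifold API. The point is that the matrix entries can be computed once and serialized, instead of redoing the trig on each machine:

```cpp
#include <array>
#include <cmath>

// std::sin / std::cos are not required by IEEE 754 to be correctly rounded,
// so different libm implementations can disagree in the last bit. Computing
// the rotation matrix once and shipping its entries (rather than the angle)
// keeps all downstream geometry bit-identical across machines.
std::array<double, 4> Rotation2D(double radians) {
  const double c = std::cos(radians), s = std::sin(radians);
  return {c, -s,   // row-major 2x2 rotation matrix; serialize these four
          s,  c};  // doubles instead of recomputing sin/cos per machine
}
```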

Non-determinism on the same machine: there are two reasons here: it prohibits some optimizations, and we have not tried hard to eliminate it. The second reason is simple: people don't care about this enough, so we don't prioritize the issue. Let me explain the first reason in a bit more detail:

Apart from optimizing individual boolean operations, Manifold also tries to optimize a group of boolean operations in the form of an expression DAG, where we apply simple transformations to the DAG and schedule operations to fully utilize all CPU cores when possible. Consider the classical problem of multiplying a chain of N matrices: one can use dynamic programming to group the matrices so as to minimize the number of arithmetic operations. The idea is similar in Manifold: we try to group the boolean operations that take the smallest meshes first. The problem is that we cannot predict the mesh size after an operation (the worst-case size is at least quadratic, though that is rare), so we have to wait for an operation to complete before starting a new one. Because inexact mesh booleans are not commutative and associative, different reordering decisions can produce different results.

Is the current scheme optimal? No: some optimal reorderings can be missed due to timing, and our code is just a greedy algorithm (sketched below). It may also be possible to estimate the resulting mesh sizes and plan the reordering up front instead of relying on this greedy, racy approach, but that requires experimentation and probably more data to test on.
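
A rough sequential sketch of that greedy, smallest-first grouping (illustrative names and a placeholder cost model, not Manifold's csg_tree code):

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Combine all meshes smallest-first, like building a Huffman tree. The
// result size is only known after each boolean finishes, which is why the
// real scheduler overlaps operations and the completion order becomes racy.
size_t CombineAll(const std::vector<size_t>& meshSizes,
                  const std::function<size_t(size_t, size_t)>& resultSize) {
  std::priority_queue<size_t, std::vector<size_t>, std::greater<size_t>> pq(
      meshSizes.begin(), meshSizes.end());
  while (pq.size() > 1) {
    const size_t a = pq.top(); pq.pop();
    const size_t b = pq.top(); pq.pop();
    pq.push(resultSize(a, b));  // only known once the operation completes
  }
  return pq.top();
}
```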

And if Blender doesn't care about optimizing multiple boolean operations, making things deterministic on the same machine will not add much overhead. The issue is testing. We mainly use GitHub Actions for our CI, but the CI doesn't provide much concurrency (2 threads?), which makes it hard to trigger non-deterministic behavior (say we managed to remove most of the non-determinism; there may still be something left that the CI would never catch).

@pca006132 (Collaborator) commented Mar 11, 2025

I'm looking at the sources of non-determinism here.

  1. AddNewEdgeVerts in boolean_result.cpp: This adds EdgePos non-deterministically to vectors, but because these vectors are later sorted, the result should be deterministic.
  2. csg_tree.cpp boolean reordering: I think we can come up with a good enough heuristic. For example, take the smallest 16 pairs, process them pair by pair, and then add the results back to the queue. Because we wait until all of them are done before taking another 16 pairs (or fewer), the processing order is independent of the scheduling behavior. Ideally, this should be fast enough yet deterministic (see the first sketch after this list).
  3. boolean3.cpp, currently in the new PR, is non-deterministic, but we will fix that by sorting the result. Hopefully, this will not add too much overhead.
  4. Things like AssignNormals in impl.cpp are non-deterministic because floating-point summation is not associative. We can make this deterministic by computing a vertex-to-halfedge map (to the smallest halfedge where the vertex is the starting vertex) and then computing the vertex normals in parallel. This duplicates work, but it makes sure the result is deterministic and avoids atomics. To further improve determinism, we may replace the acos needed to compute the angle-weighted pseudonormal with some known approximation, e.g. https://developer.download.nvidia.com/cg/acos.html (see the second sketch after this list).
  5. When CreateHalfedges and CreateFaces are parallelized in the future, the parallelization should be done in a way that keeps them deterministic.
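
For point 2, a sketch of how fixed-size batches could make the reordering independent of thread timing (assumed names and a placeholder boolean, not the actual csg_tree.cpp code):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Node { size_t size; /* mesh payload elided */ };

// Placeholder for the real boolean op; only the cost model matters here.
Node Boolean(const Node& a, const Node& b) { return {a.size + b.size}; }

Node Reduce(std::vector<Node> queue) {
  constexpr size_t kBatch = 16;
  while (queue.size() > 1) {
    // Deterministic choice: sort by size, take up to kBatch smallest pairs.
    std::sort(queue.begin(), queue.end(),
              [](const Node& a, const Node& b) { return a.size < b.size; });
    const size_t nPairs = std::min(kBatch, queue.size() / 2);
    std::vector<Node> results(nPairs);
    // This loop is where a parallel for could go: each result lands in a
    // fixed slot and we wait for the whole batch, so thread timing cannot
    // change the processing order.
    for (size_t i = 0; i < nPairs; ++i)
      results[i] = Boolean(queue[2 * i], queue[2 * i + 1]);
    queue.erase(queue.begin(), queue.begin() + 2 * nPairs);
    queue.insert(queue.end(), results.begin(), results.end());
  }
  return queue.front();
}
```

And for point 4, the approximation on the linked NVIDIA Cg page is a short polynomial; transcribed to C++ it looks roughly like this. It has absolute error on the order of 1e-4 and uses only +, *, and sqrt, all of which IEEE 754 does specify, so it reproduces bit-for-bit across conforming machines (as long as the compiler does not contract the arithmetic into FMAs):

```cpp
#include <cmath>

// Polynomial approximation of acos, transcribed from
// https://developer.download.nvidia.com/cg/acos.html
double AcosApprox(double x) {
  const bool negate = x < 0.0;
  x = std::fabs(x);
  double ret = -0.0187293;
  ret = ret * x + 0.0742610;
  ret = ret * x - 0.2121144;
  ret = ret * x + 1.5707288;
  ret *= std::sqrt(1.0 - x);          // sqrt is correctly rounded per IEEE
  return negate ? 3.14159265358979 - ret : ret;
}
```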

@brechtvl
Usage of tbb::parallel_reduce may also need to be changed to tbb::parallel_deterministic_reduce when summing floats or doubles.
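
A minimal sketch of that swap; `DeterministicSum` is a hypothetical helper wrapping the oneTBB call:

```cpp
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>

#include <cstddef>
#include <functional>
#include <vector>

// parallel_deterministic_reduce always splits the range and joins partial
// sums in the same tree shape, so a float/double sum is bit-identical from
// run to run for a fixed input. parallel_reduce's join order depends on
// work stealing, so the rounding (and thus the result) can vary.
double DeterministicSum(const std::vector<double>& v) {
  return tbb::parallel_deterministic_reduce(
      tbb::blocked_range<std::size_t>(0, v.size()), 0.0,
      [&](const tbb::blocked_range<std::size_t>& r, double acc) {
        for (std::size_t i = r.begin(); i != r.end(); ++i) acc += v[i];
        return acc;
      },
      std::plus<double>());
}
```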

@pca006132 (Collaborator) commented Mar 11, 2025

Yes, but so far I don't think we have that; we only have integer sums or min/max when using parallel reduce. Anyway, thanks for the tip.
