
Regularization in higher dimensions #354

Open · wants to merge 10 commits into base: main

Conversation

@YuanbinLiu (Collaborator) commented Mar 7, 2025

We have upgraded the functionality to support high-dimensional convex hull calculations with regularization. Previously, it was limited to 3D convex hulls, but we are now able to handle higher-dimensional cases, such as 4D convex hulls for three-element systems.

  • Test 1: Verify whether the new code can produce the same convex hull results as the old version for the 3D case.

  • Test 2: Evaluate whether the new code can handle high-dimensional (>3D) convex hulls.
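For context, here is a minimal sketch of what Test 2 might exercise, assuming scipy's `ConvexHull` (the data and names are hypothetical, not the PR's actual test). qhull handles arbitrary dimension, and each row of `hull.equations` stores a facet normal plus an offset:

```python
import numpy as np
from scipy.spatial import ConvexHull

# Hypothetical data: 50 random points in 4D, standing in for the
# (composition, energy) points of a three-element system.
rng = np.random.default_rng(0)
points = rng.random((50, 4))

hull = ConvexHull(points)

# Each facet equation is (normal..., offset): a point x lies inside the
# hull iff normal . x + offset <= 0 for every facet.
print(hull.equations.shape[1])  # 5 = 4 normal components + 1 offset

# Sanity check: the centroid of the hull vertices must be interior.
centroid = points[hull.vertices].mean(axis=0)
inside = np.all(hull.equations[:, :-1] @ centroid + hull.equations[:, -1] <= 1e-9)
print(inside)  # True
```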

@JaGeo (Collaborator) commented Mar 7, 2025

Thank you @YuanbinLiu ! Some tests are still failing.

Can we handle anything beyond 4D? If not, should we add this limitation to the documentation?

@YuanbinLiu (Collaborator, Author) replied:

> Thank you @YuanbinLiu ! Some tests are still failing.
>
> Can we handle anything beyond 4D? If not, should we add this limitation to the documentation?

It's not limited to 4D; higher dimensions are also feasible. I'll find time to check for errors, but the tests passed before I pushed the PR.

@JaGeo (Collaborator) commented Mar 7, 2025

@YuanbinLiu Thanks for the explanation!

Test results can depend on the operating system.

@YuanbinLiu (Collaborator, Author) replied:

> @YuanbinLiu Thanks for the explanation!
>
> Test results can depend on the operating system.

I have tried it several times, and all tests passed. I didn't find any bugs. Can anyone test it on a different system?

```python
        f"Point and points must have the same dimensionality. Got {pn.shape[0]} and {preg.shape[1]}."
    )
hull = ConvexHull(preg)
return np.all(np.dot(hull.equations[:, :-1], pn) + hull.equations[:, -1] <= 1e-12)
```
Review comment (Collaborator):

Is this floating-point comparison maybe brittle?

Reply (Collaborator, Author):

It shouldn't be caused by this. I tested it on different computers with new environments, and it worked fine.
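On the brittleness question: the check is the standard half-space test, and the constant absorbs round-off in the facet equations. A sketch with the tolerance made an explicit parameter (a hypothetical helper, not the PR's code):

```python
import numpy as np
from scipy.spatial import ConvexHull

def point_in_hull(point, hull, tol=1e-12):
    """Half-space test: a point is inside (or on) the hull iff it satisfies
    every facet inequality normal . x + offset <= tol.

    tol guards against round-off; loosening it (e.g. 1e-9) trades a little
    strictness for robustness across platforms.
    """
    return bool(np.all(hull.equations[:, :-1] @ point + hull.equations[:, -1] <= tol))

square = ConvexHull(np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]]))
print(point_in_hull(np.array([0.5, 0.5]), square))  # True: interior point
print(point_in_hull(np.array([2.0, 2.0]), square))  # False: outside
```

Making `tol` a parameter would at least let a failing platform be diagnosed by loosening it in one place.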

```python
fraction_list = [[1.0]] + [[0.0]] + [[0.5]] * 8
calc_hull_ND = calculate_hull_nd(label)

assert np.allclose(calc_hull_3D.equations, calc_hull_ND.equations, atol=1e-6)
```
Review comment (Collaborator):

Different tolerance needed?

```python
get_e_dist_hull_ND = get_e_distance_to_hull_nd(
    calc_hull_ND, atom, {3: -0.28649227, 17: -0.25638457}, "REF_energy"
)
assert np.allclose(get_e_dist_hull_3D, get_e_dist_hull_ND, atol=1e-6)
```
Review comment (Collaborator):

Same here?

@JaGeo (Collaborator) commented Mar 7, 2025

@YuanbinLiu could you check all numerical tolerances again, especially for the failing test? It is likely just the tolerance.

```diff
@@ -245,7 +245,7 @@ def test_regularization_for_three_element_system(test_dir, memory_jobstore, clea
         for at in atoms
     ]

-    assert all(d >= -1e-6 for d in des)
+    assert np.all(np.array(des) >= -1e6)
```
Review comment (Collaborator):

Shouldn't it be -1e-6?

Review comment from @naik-aakash (Collaborator), Mar 8, 2025.

Reply from @YuanbinLiu (Collaborator, Author), Mar 10, 2025:

If some configurations have a distance from the convex hull exceeding 1e6, they will be excluded from the training set afterwards.
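A sketch of the exclusion logic described above (the function and variable names are hypothetical, not the PR's code):

```python
import numpy as np

# Configurations further than this from the convex hull are dropped
# from the training set; the test's -1e6 lower bound mirrors this cutoff
# and only guards against numerically nonsensical (large negative) values.
MAX_HULL_DISTANCE = 1e6

def filter_by_hull_distance(distances, structures, cutoff=MAX_HULL_DISTANCE):
    """Keep only structures whose distance to the hull is within the cutoff."""
    keep = np.asarray(distances) <= cutoff
    return [s for s, k in zip(structures, keep) if k]

print(filter_by_hull_distance([0.1, 2e6, 0.0], ["a", "b", "c"]))  # ['a', 'c']
```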

@naik-aakash (Collaborator) commented Mar 8, 2025

> @YuanbinLiu could you check all numerical tolerances again, especially for the failing test? It is likely just the tolerance.

Are the values supposed to be nonzero, positive, or negative? Is there a specific range these values are supposed to fall in? Also, with this new function supporting n dimensions, do we still need the 3D method, or can it be replaced by the new function now? Can you comment on this, @YuanbinLiu?

@YuanbinLiu (Collaborator, Author) replied:

> @YuanbinLiu could you check all numerical tolerances again, especially for the failing test? It is likely just the tolerance.
>
> Are the values supposed to be nonzero, positive, or negative? Is there a specific range these values are supposed to fall in? Also, with this new function supporting n dimensions, do we still need the 3D method, or can it be replaced by the new function now? Can you comment on this, @YuanbinLiu?

They should be non-negative (i.e., >= 0). And we no longer need the 3D method, as that case is already handled by the new function.

```diff
@@ -699,6 +699,7 @@ def preprocess_data(
     regularization: bool = False,
     retain_existing_sigma: bool = False,
     scheme: str = "linear-hull",
+    element_order: list = None,
```
Review comment (Collaborator):

Suggested change:

```diff
-    element_order: list = None,
+    element_order: list | None = None,
```

Annotating a `None` default as plain `list` is a type mismatch; the annotation should allow `None`.
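The suggested annotation can be sketched as follows (a reduced, hypothetical signature, not the full `preprocess_data`):

```python
from __future__ import annotations  # makes `list | None` valid before Python 3.10

def preprocess_data(element_order: list | None = None) -> list:
    """Hypothetical reduced signature illustrating the `list | None` annotation."""
    if element_order is None:
        # Fall back to a fresh empty ordering; never use a mutable default.
        element_order = []
    return element_order

print(preprocess_data())              # []
print(preprocess_data(["Li", "Cl"]))  # ['Li', 'Cl']
```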

```python
    norm = np.cross(n_d[2] - n_d[0], n_d[1] - n_d[0])
    plane_norm = norm / np.linalg.norm(norm)
else:
    A = n_d[:-1] - n_d[0]
```
Review comment (Collaborator):

Maybe clearer variable names would be better?
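For reference, the general-dimension branch presumably recovers the hyperplane normal from the null space of the edge matrix `A`; one common way to do that is via the last right singular vector of an SVD. A sketch with more descriptive names (a hypothetical helper, not the PR's code):

```python
import numpy as np

def hyperplane_normal(vertices):
    """Unit normal of the hyperplane through d affinely independent points in R^d.

    For d == 3 a cross product suffices; in general the normal spans the
    null space of the (d-1) x d edge matrix, recovered here from the last
    right singular vector of its SVD.
    """
    vertices = np.asarray(vertices, dtype=float)
    edge_matrix = vertices[1:] - vertices[0]   # shape (d-1, d)
    _, _, vt = np.linalg.svd(edge_matrix)
    normal = vt[-1]                            # basis vector of the null space
    return normal / np.linalg.norm(normal)

# The xy-plane through three points: the normal is a unit vector along ±z.
n = hyperplane_normal([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
print(n)
```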
