
Conversation

@lmoresi lmoresi commented Aug 8, 2025

This branch is to provide the global evaluation of variables by using a swarm to scatter points from the local process to the places we wish to sample.

  1. Need to fix up the storage of data / evaluation of functions. The evaluate methods should return data that is of the same shape as the evaluated function. We also need to be able to store this on a swarm and pass it around, then retrieve it safely.

  2. Decide on the interface we present to the user. Everything needs to look like a sympy matrix. Scalars are (X, 1, 1) in shape, vectors are (X, 1, dim), tensors are (X, dim, dim) even if symmetric. We can squeeze arrays to remove extra dimensions, but there should be a consistent pattern behind the scenes (see the numpy sketch after this list).

  3. Simplify the access methods - there are many redundant operations: variables are switched between readable and writable, but the readable versions are never seen in user space. There is a lot of room to reduce the footprint of this awkward pattern.
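For illustration, here is a minimal numpy sketch of the shape convention in point 2; the array names and the squeeze step are an assumption about the intended pattern, not the library's current implementation:

import numpy as np

# Illustrative shapes for an evaluation at X sample points on a 2D (dim = 2) mesh.
X, dim = 100, 2
scalar_result = np.zeros((X, 1, 1))       # scalar: (X, 1, 1)
vector_result = np.zeros((X, 1, dim))     # vector: (X, 1, dim)
tensor_result = np.zeros((X, dim, dim))   # tensor: (X, dim, dim), even if symmetric

# User-facing arrays can be squeezed to drop the redundant axes,
# while the pattern above stays consistent behind the scenes.
print(scalar_result.squeeze().shape)      # (100,)
print(vector_result.squeeze().shape)      # (100, 2)
print(tensor_result.squeeze().shape)      # (100, 2, 2)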

lmoresi commented Aug 8, 2025

Don't merge - this branch is still for testing

@lmoresi lmoresi changed the base branch from main to development August 9, 2025 01:34
Attempting to fix up the evaluation to be internally consistent, adding the wrappers for the access manager (and calling it `array` for now).

Fix up tests to account for changes to evaluation array sizes on return (mostly squeezes)

Fix a couple of other evaluation bugs (like the need to add a zero vector to some nearly-constant matrices).

Repairing the quickstart guide here and there.
@jcgraciosa

I am testing out the changes in this pull request. I found the following issues so far:

  1. uw.function.evaluate() produces errors when evaluating a matrix with a constant inside:
import underworld3 as uw
import sympy

meshbox = uw.meshing.UnstructuredSimplexBox(minCoords  = (0., 0.),
                                            maxCoords  = (1., 1.),
                                            cellSize = 1. / 16,
                                            regular = False)

x, y = meshbox.N.x, meshbox.N.y
vector_fn2 = sympy.Matrix((x*x, 0))
uw.function.evaluate(vector_fn2, meshbox.data, rbf = False)
  2. The pack/unpack results in mismatches if an asymmetric tensor is used:
import underworld3 as uw
import sympy

meshbox = uw.meshing.UnstructuredSimplexBox(minCoords   = (0., 0.),
                                            maxCoords   = (1., 1.),
                                            cellSize    = 1. / 16,
                                            regular     = False)

swarm = uw.swarm.Swarm(meshbox)

t = uw.swarm.SwarmVariable(
    "T",
    swarm,
    vtype=uw.VarType.SYM_TENSOR,
    proxy_degree=1,
    proxy_continuous=True,
    varsymbol=r"\mathbf{T}",
)

swarm.populate(0)

x, y = meshbox.N.x, meshbox.N.y
tensor_fn = sympy.Matrix(((x*x, x*x*y), (x*y, y*y))) # FIXME: this produces mismatch between packed and unpacked 

with swarm.access(t):
    t_data = uw.function.evaluate(tensor_fn, swarm.particle_coordinates.data, rbf=False)

t.pack(t_data)

print(f"mismatches: {(t_data != t.unpack()).sum()}")

Perhaps this is due to the vtype of t being set to uw.VarType.SYM_TENSOR.
Changing the vtype to uw.VarType.TENSOR results in some array size mismatches.

Fix for the first comment by @jcgraciosa.

Also needed to update the JIT code to understand the MATRIX variable type and to remove the other types that are (currently) not used. These cause compilation errors if they are lit up by mistake (even if that variable is not in any expression).

Solvers: Navier Stokes / Advection-Diffusion - fix the projection terms and how they pack up their data.
@lmoresi commented Aug 15, 2025

@jcgraciosa - your point (1) above is (perhaps) fixed - go ahead and try to break it again! Your point (2) is intentional / expected. If you pass a non-symmetric tensor to set a symmetric tensor, the non-symmetric part is ignored and the lower-triangular section takes precedence. We probably need to introduce an operator to split the input into symmetric / anti-symmetric parts (see the sketch below), but here I am assuming that the input is a mistake. Sympy knows about symmetric matrices, so we could introduce a warning or error if the input is non-symmetric.
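A minimal sympy sketch of that symmetric / anti-symmetric split, applied to the tensor_fn from the test case above; this is the generic matrix decomposition, not an operator that exists in uw yet:

import sympy

x, y = sympy.symbols("x y")
M = sympy.Matrix(((x*x, x*x*y), (x*y, y*y)))   # the non-symmetric input

M_sym  = (M + M.T) / 2    # symmetric part
M_anti = (M - M.T) / 2    # anti-symmetric part - the information that cannot survive a symmetric store

print(sympy.simplify(M_sym + M_anti - M))      # zero matrix: the decomposition is exact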

lmoresi and others added 16 commits August 15, 2025 15:38
Adding global evaluation routines to compute values anywhere in the distributed domain (also with extrapolation using rbf)
Fixing broken test (array shape)
Not setting the initial rank causes problems when the particle is already located on the correct rank: it always gets migrated to rank 0 automatically and we start hunting for it again (but always in the wrong place).
Missed one !
Nodal Point Swarm also had the issue of uninitialised rank. This causes particles on rank > 1 to be migrated to rank 0 on the first call to migrate. They then get lost ...

Potential fix @jcgraciosa but maybe not the only issue here.
Fixing lack of rank initialisation in swarm.populate.

Also: some changes to the derivative expressions code:
  - where relevant, .sym returns the derivative of the inner expression with respect to the diff_variable in symbolic form (which is closer to the intent of .sym) rather than the evaluated form.
  - the .doit method returns the evaluated form (so this behaves like the sympy deferred derivative).
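For reference, a plain-sympy sketch of the deferred-derivative behaviour this mirrors (standard sympy only, not the uw expression classes):

import sympy

x = sympy.Symbol("x")
expr = x**2 * sympy.sin(x)

d = sympy.Derivative(expr, x)   # deferred derivative: stays symbolic, like .sym
print(d)                        # Derivative(x**2*sin(x), x)
print(d.doit())                 # evaluated form, like .doit: x**2*cos(x) + 2*x*sin(x)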
This code is *probably* no longer needed
And leaving NodalPointPICSwarm alone
Shortening some monster-sized files.
Imports etc should not change
@Copilot Copilot AI left a comment

Pull Request Overview

This PR implements a significant refactoring of the global evaluation system in Underworld3, focusing on how variables are evaluated across processes using swarms for distributed data handling. The main objectives are to improve data storage/evaluation consistency, standardize the user interface for matrix-like operations, and simplify variable access patterns.

Key changes include:

  • Standardizing function evaluation to return consistently shaped arrays with .squeeze() calls
  • Refactoring swarm variable data access patterns with new .array property and pack/unpack methods
  • Splitting PIC swarm functionality into separate module for better organization
  • Updating KDTree interface for consistent distance calculation options

Reviewed Changes

Copilot reviewed 38 out of 49 changed files in this pull request and generated 7 comments.

Summary per file:

  • tests/test_*.py: Updated test files to use .squeeze() on evaluation results and modernized data access patterns
  • src/underworld3/swarm.py: Major refactoring with new array access patterns, pack/unpack methods, and PIC swarm extraction
  • src/underworld3/pic_swarm.py: New module containing the extracted PIC swarm classes for better code organization
  • src/underworld3/kdtree.py: Enhanced KDTree interface with an sqr_dists parameter for distance calculation control
  • src/underworld3/visualisation.py: Updated to use .squeeze() for consistent array shapes
  • src/underworld3/systems/*.py: Modified to use new array access patterns and .squeeze() for consistent returns
  • src/underworld3/utilities/_jitextension.py: Updated variable type handling and error messaging

…t in a more user friendly form attached to the swarm
New data structure with callbacks - a sub-class of np.ndarray that synchronises with the PETSc data structure when required and calls swarm migration, proxy updating, etc. when values are changed.

+= and other in-place operators are disabled because they are difficult to synchronise properly / efficiently.

Various ways to suspend particle synchronisation are included in the array implementation.

This does override the swarm.data strategy for accessing points if the access manager is used - so more changes are still needed.
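A minimal, self-contained sketch of this kind of callback-aware ndarray subclass; the class name, the on_change hook and the suspend_sync context manager are illustrative assumptions, not the actual uw implementation:

import contextlib
import numpy as np

class CallbackArray(np.ndarray):
    """Hypothetical ndarray subclass that notifies its owner when values change."""

    def __new__(cls, input_array, on_change=None):
        obj = np.asarray(input_array).view(cls)
        obj._on_change = on_change
        obj._suspended = False
        return obj

    def __array_finalize__(self, obj):
        # Propagate the attributes through views and slices.
        self._on_change = getattr(obj, "_on_change", None)
        self._suspended = getattr(obj, "_suspended", False)

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        if self._on_change is not None and not self._suspended:
            self._on_change()   # e.g. trigger swarm migration / proxy updates

    def __iadd__(self, other):
        # In-place operators are disabled: hard to synchronise properly / efficiently.
        raise NotImplementedError("in-place operations are not supported; assign instead")

    @contextlib.contextmanager
    def suspend_sync(self):
        # One way to suspend synchronisation while making many edits.
        self._suspended = True
        try:
            yield self
        finally:
            self._suspended = False
            if self._on_change is not None:
                self._on_change()

# Usage sketch:
coords = CallbackArray(np.zeros((4, 2)), on_change=lambda: print("sync with PETSc"))
coords[0, :] = (0.5, 0.5)        # triggers the callback immediately
with coords.suspend_sync():
    coords[1, :] = (0.25, 0.75)  # callback deferred until the context exits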
Actually, this is a question of switching out all of the swarm stuff, which can now be replaced by the global evaluation. Note that in this version there is a numerical issue with the advection scheme, but it does now run correctly.
Semi-Lagrange time integrator no longer uses a nodal point swarm. For simplicity it uses global evaluation (this is now fully capable of handling complicated data structures).

Note: performance could be improved by caching the locations to evaluate multiple data structures simultaneously.

We have not removed all vestigial code yet.

Tests pass, but it is not clear whether they are actually checking the values produced by the advection scheme.
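For orientation, here is a minimal single-process sketch of the semi-Lagrangian step described above, using scipy's RBFInterpolator as a stand-in for the swarm-based global evaluation; the field, velocity and time step are made-up examples, not uw code:

import numpy as np
from scipy.interpolate import RBFInterpolator

# Nodal coordinates and a scalar field T on the unit square (structured here for simplicity).
xx, yy = np.meshgrid(np.linspace(0, 1, 17), np.linspace(0, 1, 17))
coords = np.column_stack([xx.ravel(), yy.ravel()])
T = np.sin(np.pi * coords[:, 0]) * np.sin(np.pi * coords[:, 1])

# A simple rotational velocity field and a time step.
v = np.column_stack([-(coords[:, 1] - 0.5), coords[:, 0] - 0.5])
dt = 0.05

# Semi-Lagrangian update: trace each node back along the flow, then evaluate T there
# (rbf interpolation also covers mildly extrapolated departure points).
departure_points = coords - dt * v
T_new = RBFInterpolator(coords, T, neighbors=16)(departure_points)

print(T_new.shape)   # (289,)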
We need to do this because there is a stokes.F0 and a poisson.F0, and it is actually useful for documentation etc. that they have the same name as each other. But sympy needs them to look different.
(Also helpful for humans if we get this correct!)
Goal here is to make the examples into an AI training set as well as useful to humans
 - Started the AI / doc process formally - refactored the examples and added a framework for documentation (mostly empty)
 - For quantitative tests (e.g. advection), added some boilerplate so they should run as notebooks with graphics to see why the test is failing (if it is a math regression).
 - Sketched out a multi-material constitutive model and showed how to solve SolCx with it.