ENH: cluster: JAX support (non-jitted) #22255

crusaderky · 2025-01-06T12:39:01Z

Part of ENH: tracking issue for JAX support #22246
See also ENH: special: JAX support (non-jitted) #22256

crusaderky · 2025-01-06T12:41:16Z

scipy/cluster/vq.py

@@ -137,8 +137,8 @@ def whiten(obs, check_finite=True):
    obs = _asarray(obs, check_finite=check_finite, xp=xp)
    std_dev = xp.std(obs, axis=0)
    zero_std_mask = std_dev == 0
-    if xp.any(zero_std_mask):
-        std_dev[zero_std_mask] = 1.0
+    std_dev = xpx.at(std_dev, zero_std_mask).set(1.0)


This fails on jax.jit. My current intention is to change jax.jit itself to special-case arr.at[idx].set(value) when idx is a boolean mask and value is a scalar, so that it can be rewritten as jnp.where(idx, value, arr). Failing that, I can implement the same special case in array-api-extra.
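For reference, the equivalence this rewrite relies on can be checked on a small array. The sketch below uses NumPy as a stand-in with made-up values; it does not show how jax.jit or array-api-extra would implement the special case. jnp.where takes the same three arguments as np.where.

```python
import numpy as np

# Made-up standard deviations; two columns have zero spread.
std_dev = np.array([0.0, 2.0, 0.0, 0.5])
zero_std_mask = std_dev == 0

# In-place boolean-mask assignment, as in the removed whiten() lines.
expected = std_dev.copy()
expected[zero_std_mask] = 1.0

# Functional form: take 1.0 where the mask is True, else keep std_dev.
result = np.where(zero_std_mask, 1.0, std_dev)

print(result)
```

Both forms give the same array, but the where form never writes into std_dev, which is why it also suits backends with immutable arrays.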

jakevdp · 2025-01-06

My current intention is to change jax.jit itself

I don't think changing jax.jit itself is a viable path here – I wouldn't suggest starting on that route.

It might be viable to make arr.at[idx].set(value) lower to lax.select rather than lax.scatter in the specific case of a boolean idx. I've tried that in the past, but it's really tricky to properly handle all corner cases of broadcasted and/or multi-dimensional indices, correctly implementing autodiff and batching rules, etc.

The easiest thing would probably be to do this at the level of xp.at, though boolean indices were specifically excluded from the initial discussions there.

crusaderky · 2025-01-06

I've tried that in the past

Do you have a (partial, non-functioning) PR I could start from?

crusaderky · 2025-01-06T12:42:23Z

scipy/cluster/vq.py

-    if xp.any(zero_std_mask):
-        std_dev[zero_std_mask] = 1.0
+    std_dev = xpx.at(std_dev, zero_std_mask).set(1.0)
+    if check_finite and xp.any(zero_std_mask):


Under jax.jit this fails with an error message that is not useful to SciPy end users, unless you explicitly pass check_finite=False.
data-apis/array-api-compat#225 is a blocker for a more robust fix.

crusaderky · 2025-01-06T13:48:21Z

All CI failures are unrelated

lucascolley

thanks @crusaderky, LGTM!

Are there any potential improvements to make to

scipy/scipy/cluster/hierarchy.py

Lines 1839 to 1843 in 7a81451

    
    if is_jax(xp):
        # calculate_cluster_sizes doesn't accept read-only arrays
        Zpart = np.array(Zpart, copy=True)
    else:
        Zpart = np.asarray(Zpart)

at this time?

lucascolley · 2025-01-06T15:29:34Z

also, have you run the full CUDA tests locally? I can, if not.

crusaderky · 2025-01-06T16:58:38Z

thanks @crusaderky, LGTM!

Are there any potential improvements to make to

scipy/scipy/cluster/hierarchy.py

Lines 1839 to 1843 in 7a81451

    if is_jax(xp):
        # calculate_cluster_sizes doesn't accept read-only arrays
        Zpart = np.array(Zpart, copy=True)
    else:
        Zpart = np.asarray(Zpart)

at this time?

Not yet. I intend to replace this and all other calls to np.asarray in the middle of xp-aware code with xpx.apply_numpy_function(func, arg), which internally will call jax.pure_callback(func, arg) for JAX and xp.asarray(func(np.asarray(arg))) for everything else.
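A minimal sketch of what such a helper might look like, for illustration only. The extra xp parameter, the name-based JAX check, and the assumption that func returns an array with the same shape and dtype as its input are all hypothetical simplifications; the planned array-api-extra function may differ. The one grounded detail is that jax.pure_callback needs the output shape and dtype declared up front.

```python
import numpy as np


def apply_numpy_function(func, arg, xp=np):
    """Hypothetical helper: run a NumPy-only `func` and return an `xp` array."""
    if xp.__name__.startswith("jax"):  # simplistic stand-in for is_jax(xp)
        import jax

        # Assumption: the output has the same shape and dtype as the input.
        out_spec = jax.ShapeDtypeStruct(arg.shape, arg.dtype)
        return jax.pure_callback(func, out_spec, arg)
    # Every other backend: convert to NumPy, call func, convert back.
    return xp.asarray(func(np.asarray(arg)))


out = apply_numpy_function(np.cumsum, np.array([1, 2, 3]))
print(out)
```

With the default xp=np only the generic branch runs; the JAX branch is imported lazily so the sketch works without JAX installed.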

crusaderky · 2025-01-06T17:19:02Z

also, have you run the full CUDA tests locally? I can, if not.

Yes. All green functionally. That said, there are points where arrays are yanked from device memory to host memory by np.asarray, run through a Cython kernel (search for _vq.), and sent back to JAX.

lucascolley · 2025-01-06T17:22:07Z

That said, there are points where arrays are yanked from device memory to host memory by np.asarray, run through a Cython kernel (search for _vq.), and sent back to JAX.

Ah, does JAX not raise/warn there? Do we want to error out there @rgommers as per gh-18286?

jakevdp · 2025-01-06T17:33:47Z

Ah, does JAX not raise/warn there?

JAX will raise there during traced execution (e.g. under JIT), but in eager execution np.asarray is just fine. If you want to prevent this, you can enable the transfer guard.

rgommers · 2025-01-06T17:36:14Z

Using the "disallow" transfer guard in scipy/conftest.py would seem reasonable to me.
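One way this could be wired up is sketched below as a configuration fragment. It is not the actual scipy/conftest.py, and the fixture name is made up; the only JAX API used is the jax.transfer_guard context manager.

```python
# conftest.py (sketch, not the actual scipy/conftest.py)
import pytest


@pytest.fixture(autouse=True)
def disallow_implicit_jax_transfers():
    """Run each test with JAX's transfer guard set to "disallow"."""
    try:
        import jax
    except ImportError:
        # JAX is not installed: run the test without any guard.
        yield
        return
    # Implicit transfers inside the test now raise instead of passing silently.
    with jax.transfer_guard("disallow"):
        yield
```

The "disallow" level guards implicit transfers in every direction, so a narrower setting such as jax.transfer_guard_device_to_host may be a better fit if only np.asarray round-trips are the concern.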

crusaderky · 2025-01-06T17:57:20Z

Using the "disallow" transfer guard in scipy/conftest.py would seem reasonable to me.

I'll investigate this but I expect it to result in a substantial amount of xfails. It would also make sense to figure out the equivalent setting for torch and cupy within the same scope. Either way, I'd rather leave it to a later PR...

lucascolley · 2025-01-06T18:01:37Z

I'll investigate this but I expect it to result in a substantial amount of xfails. It would also make sense to figure out the equivalent setting for torch and cupy within the same scope. Either way, I'd rather leave it to a later PR...

It should line up with existing cpu_only skips, no? CuPy errors on np.asarray

rgommers · 2025-01-06T18:37:06Z

It should line up with existing cpu_only skips, no? CuPy errors on np.asarray

That's what I expect as well. Same for PyTorch, it does not do implicit transfers.

lucascolley · 2025-01-06T21:03:09Z

Using the "disallow" transfer guard in scipy/conftest.py would seem reasonable to me.

Reminder to follow up on this - thanks!

github-actions bot added the scipy.cluster label Jan 6, 2025


lucascolley added the enhancement (A new feature or improvement) and array types (Items related to array API support and input array validation, see gh-18286) labels Jan 6, 2025

crusaderky mentioned this pull request Jan 6, 2025

ENH: special: JAX support (non-jitted) #22256

Merged

lucascolley approved these changes Jan 6, 2025

lucascolley added this to the 1.16.0 milestone Jan 6, 2025

EHN: cluster: JAX support (non-jitted)

7c69d17

crusaderky force-pushed the jax_cluster branch from 4c9c5b9 to 7c69d17 Compare January 6, 2025 20:10

lucascolley merged commit a32bc49 into scipy:main Jan 6, 2025
35 of 37 checks passed

crusaderky deleted the jax_cluster branch January 7, 2025 00:51
