Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix nodata retrieval logic to ensure compatibility with rioxarray #192

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

lbferreira
Copy link

Currently, if we set nodata using rioxarray, odc-geo fails to retrieve the nodata value.
This PR changes nodata retrieval logic to ensure compatibility with nodata values set using rioxarray.

Before the proposed changes, we have the problem shown in the screenshot below. After the proposed changes, no errors were observed.
Before fix
Screenshot 2024-12-09 144556
After fix
Screenshot 2024-12-09 144855

@Kirill888
Copy link
Member

@lbferreira thanks for this, but I'd prefer a different fix here, instead of nodata == () check it really should be nodata is (), this will deal with "nodata is set to a single-element array issue". I dislike if k in A: do things with A[k] pattern as it's not thread-safe.

I also think rioxarray should make sure to convert nodata value to a python float and not use numpy.float64.

Another nitpick: when adding commits please use following template:

Summary text no more than 76 chars (important)

expanded explanation if needed. Ideally formatted to not have long lines

Having commit message as a long single line text makes it much harder to review changelog.

@Kirill888
Copy link
Member

Kirill888 commented Dec 10, 2024

In your specific case you can work-around the current issue by using float("nan") instead of np.nan. Like:

da.odc.nodata = float("nan")
# OR
da.rio.write_nodata(float("nan"), encoded=False, inplace=True)

@lbferreira can you please report what version of numpy and rioxarray is this and what is dtype of da array in your example , I can't seem to be able to reproduce the issue locally.

EDIT: it's a numpy 2 issue triggered by use of np.nan instead of float("nan").

@lbferreira
Copy link
Author

lbferreira commented Dec 10, 2024

@Kirill888 Sorry for the long commit message.
I am using the following lib versions:
odc-geo: 0.4.8
numpy: 2.2.0
rioxarray: 0.18.1

Using float("nan") hasn`t solved the problem. Actually, any value that I set as nodata using rioxarray (int or float) is not correctly retrieved by odc-geo. I am providing a full example below. However, replacing the operator "==" with "is" solves the problem, as you have mentioned. Thus, I will make this change. Please let me know if it looks good to you.

Full example

import xarray as xr
import numpy as np
import rioxarray
import odc.geo.xr

# Create a fake dataarray
da = xr.DataArray(
    np.random.rand(10, 10).astype(np.float32),
    coords={"x": np.arange(10), "y": np.arange(10)},
    dims=["y", "x"],
)

da.rio.write_nodata(float('nan'), encoded=False, inplace=True)
print(da.rio.nodata)

print(da.odc.nodata)

Output
Screenshot 2024-12-10 084203

Copy link

codecov bot commented Dec 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.44%. Comparing base (796a99c) to head (fa4a84c).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop     #192   +/-   ##
========================================
  Coverage    95.44%   95.44%           
========================================
  Files           31       31           
  Lines         5534     5534           
========================================
  Hits          5282     5282           
  Misses         252      252           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Kirill888
Copy link
Member

@Kirill888 Sorry for the long commit message.

no worries, we can fix it. To replace last three commits with one please do:

git reset --soft HEAD~3

Then you can make new commit instead of the last three with whatever tool you use. You can then push it to github with:

git push --force-with-lease

to replace last three commits with one.

@rosepearson
Copy link

Glad to see this is being addressed! Causing me problems as well!

@Kirill888
Copy link
Member

@rosepearson this is numpy>=2 issue, it will take some time for numeric Python ecosystem to stabilize around new numpy I reckon. I would recommend adding numpy<2 constraint to your environments for now.

@lbferreira lbferreira force-pushed the rio_nodata_compatibility_fix branch from ac2ff81 to ab46b06 Compare December 11, 2024 14:28
@lbferreira
Copy link
Author

@Kirill888 All done!

@lbferreira lbferreira changed the title Refactor nodata retrieval logic in ODCExtensionDa class to ensure com… Fix nodata retrieval logic to ensure compatibility with rioxarray Dec 11, 2024
@lbferreira
Copy link
Author

@Kirill888 I have noted that after this fix, when you have errors, the stack trace comes with a warning... Although it is only a warning, it`s a little annoying. Do you have any suggestions to handle this? In the example below a forced an error, just to show the warning related to nodata...
Screenshot 2024-12-11 105534

@lbferreira lbferreira requested a review from Kirill888 December 11, 2024 17:09
@Kirill888
Copy link
Member

@lbferreira thanks, yep, that's a fair warning. We are using () as a sentinel value, so we rely on the fact that () is (). Let's replace () with numpy._NoValue instead.

@lbferreira lbferreira force-pushed the rio_nodata_compatibility_fix branch from ab46b06 to fa4a84c Compare December 12, 2024 14:29
@lbferreira
Copy link
Author

@Kirill888 Now everything is working fine without warnings.

@lbferreira
Copy link
Author

lbferreira commented Dec 13, 2024

@Kirill888 mypy and pylint tests are not passing because we are accessing a protected class/variable. Any thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants