Spurious NotGeoreferencedWarning during reproject #131

Open
Kirill888 opened this issue Feb 23, 2024 · 8 comments

Comments

@Kirill888
Member

rasterio emits a warning when performing a MEM -> MEM reproject, although there seem to be no visible issues in the output. It could be an issue inside rasterio, or it could be caused by the inputs odc-geo provides to rasterio. It seems to be more common when using Dask, so it could be related to chunking.

Reported in opendatacube/odc-stac#145

@robbibt
Contributor

robbibt commented Feb 28, 2024

Yep, this happens with datacube.load too. It's a really annoying warning, and seems to be quite random.

@rbavery

rbavery commented Oct 8, 2024

I'm getting this as well. Here's an MRE:

import pystac_client
import numpy as np
api_url = "https://earth-search.aws.element84.com/v1"
collection_id = "sentinel-2-c1-l2a"
bbox = np.array([27.68375 , 35.875969, 28.247358, 36.458195])
client = pystac_client.Client.open(api_url)
search = client.search(
    collections=collection_id,
    datetime="2023-07-01/2023-08-31",
    bbox=bbox
)

item_collection = search.item_collection()

import odc.stac
ds = odc.stac.load(
    item_collection,
    groupby='solar_day',
    chunks={'x': 2048, 'y': 2048},
    use_overviews=True,
    resolution=20,
    bbox=bbox,
)

ds

red = ds['red']
nir = ds['nir']
scl = ds['scl']

# generate mask ("True" for pixel being cloud or water)
mask = scl.isin([
    3,  # CLOUD_SHADOWS
    6,  # WATER
    8,  # CLOUD_MEDIUM_PROBABILITY
    9,  # CLOUD_HIGH_PROBABILITY
    10  # THIN_CIRRUS
])
red_masked = red.where(~mask)
nir_masked = nir.where(~mask)

ndvi = (nir_masked - red_masked) / (nir_masked + red_masked)

ndvi_before = ndvi.sel(time="2023-07-13")
ndvi_before.plot()
/Users/ryanavery/test-dask-on-ray/.venv/lib/python3.11/site-packages/rasterio/warp.py:387: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
  dest = _reproject(

@rbavery

rbavery commented Oct 8, 2024

Looking into this with a debugger, I see that eventually, in dask locals.py, only one of the tasks triggers the warning, and it seems to happen when NIR is processed after reading. So the MRE above can be simplified to:

import pystac_client
import numpy as np
api_url = "https://earth-search.aws.element84.com/v1"
collection_id = "sentinel-2-c1-l2a"
bbox = np.array([27.68375 , 35.875969, 28.247358, 36.458195])
client = pystac_client.Client.open(api_url)
search = client.search(
    collections=collection_id,
    datetime="2023-07-01/2023-08-31",
    bbox=bbox
)

item_collection = search.item_collection()

import odc.stac
ds = odc.stac.load(
    item_collection,
    groupby='solar_day',
    chunks={'x': 2048, 'y': 2048},
    use_overviews=True,
    resolution=20,
    bbox=bbox,
)

ds

red = ds['red']
nir = ds['nir']
scl = ds['scl']

nir.compute()

The warning does not occur if I compute the scl time series.

What's really strange is that if I run

scl.compute()
nir.compute()

without restarting the kernel, I don't get the warning. The warning only occurs when running the example with nir.compute() or red.compute() end to end in a fresh kernel. Those two lines need to be run in separate cells for neither to show the warning, I think because of some async behavior.

@Kirill888
Member Author

rasterio/rasterio#2497

Probably related. I think it comes down to rasterio not hiding some warnings in certain specific circumstances.

@mdsumner

mdsumner commented Oct 9, 2024

I think it's occurring in the dask COG writer when it opens some MEM datasets. I've been having fun exploring to find it.

Also, rasterio only warns once per session if you open a bare new dataset, which might explain the transience.
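One plausible mechanism for the "once per session" behavior is Python's default warning filter, which shows a given warning only the first time it is raised from a particular code location. The sketch below illustrates this with the standard library alone. It assumes the warning is emitted through the `warnings` module under default filters; the `NotGeoreferencedWarning` class and `open_bare_dataset` function here are stand-ins defined for illustration, not rasterio's own code.

```python
import warnings


class NotGeoreferencedWarning(UserWarning):
    """Stand-in for the rasterio warning category (defined here for illustration)."""


def open_bare_dataset():
    # Hypothetical stand-in for whatever call opens a dataset without a geotransform.
    warnings.warn(
        "Dataset has no geotransform, gcps, or rpcs. "
        "The identity matrix will be returned.",
        NotGeoreferencedWarning,
    )


def count_shown(n_calls):
    """Call open_bare_dataset n_calls times and count how many warnings get through."""
    with warnings.catch_warnings(record=True) as caught:
        # "default" shows the first occurrence per (message, category, location) only.
        warnings.simplefilter("default")
        for _ in range(n_calls):
            open_bare_dataset()
    return len(caught)


print(count_shown(5))
```

If this is what is happening, it would be consistent with the warning appearing only on the first compute in a fresh kernel and staying silent afterwards.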

@Kirill888
Member Author

Probably when creating the output dataset somewhere inside rasterio.
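One way to narrow down where the warning originates is to escalate that warning category to an exception and inspect the traceback. The following is a minimal sketch using only the standard library. The functions `create_output_dataset` and `reproject_chunk` and the `NotGeoreferencedWarning` class are hypothetical stand-ins; in a real session the category would presumably be imported from `rasterio.errors` and `func` would be the compute call under investigation.

```python
import traceback
import warnings


class NotGeoreferencedWarning(UserWarning):
    """Stand-in for the rasterio warning category (defined here for illustration)."""


def create_output_dataset():
    # Hypothetical stand-in for the code path suspected of emitting the warning.
    warnings.warn(
        "Dataset has no geotransform, gcps, or rpcs.", NotGeoreferencedWarning
    )


def reproject_chunk():
    # Hypothetical stand-in for a single reproject task.
    create_output_dataset()


def find_warning_origin(func, category):
    """Run func with `category` raised as an error; return the traceback text, or None."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", category=category)
        try:
            func()
        except category:
            return traceback.format_exc()
    return None


print(find_warning_origin(reproject_chunk, NotGeoreferencedWarning))
```

Note that `warnings.catch_warnings` changes global filter state and is not thread-safe, so with a threaded Dask scheduler it may be more reliable to set the filter once at the start of the session instead.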

@robbibt
Contributor

robbibt commented Oct 9, 2024

Is there any way we can catch/suppress this inside odc-* and datacube? Although it seems to have no impact, the spam it produces does impact user experience - particularly for new/beginner users who freak out when they see any kind of warning...
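As a user-side workaround in the meantime, the warning can be filtered by category around the call that triggers it. This is a minimal sketch, not something odc-* or datacube is known to do: `suppress_warning` and `noisy_reproject` are hypothetical names, and the `NotGeoreferencedWarning` class is a stand-in so the example runs without rasterio. In real code the category would be `rasterio.errors.NotGeoreferencedWarning`.

```python
import warnings
from contextlib import contextmanager


class NotGeoreferencedWarning(UserWarning):
    """Stand-in for the rasterio warning category (defined here for illustration)."""


@contextmanager
def suppress_warning(category):
    """Temporarily ignore warnings of `category`, restoring filters on exit."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=category)
        yield


def noisy_reproject():
    # Hypothetical stand-in for a load/compute call that triggers the warning.
    warnings.warn(
        "Dataset has no geotransform, gcps, or rpcs.", NotGeoreferencedWarning
    )
    return "ok"


with suppress_warning(NotGeoreferencedWarning):
    result = noisy_reproject()  # completes without showing the warning
```

Because `warnings.catch_warnings` is not thread-safe, this pattern is only a partial answer for Dask workloads. A session-wide `warnings.filterwarnings("ignore", category=...)` call is blunter but avoids that problem.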

@mdsumner

mdsumner commented Oct 9, 2024

I've been slowly pivoting while trying to find where it might happen. This is the latest clue I had, fwiw:

https://gist.github.com/mdsumner/55bb0708c3eeaeeb20a290c96fcc4ce6?permalink_comment_id=5219757#gistcomment-5219757

(It's not a reprex, but any input tif should do.) This has been a good focus for me to explore what's going on in odc 🙏 - but ultimately my guess is that rasterio should allow a "warning suppression" option.


4 participants