Ranges? #33
One issue that might occur is to define > 0, then you'd need to do something like So an alternative would be to allow a minimal subset of JSON Schema (minimum, maximum, exclusiveMinimum, exclusiveMaximum) and allow an object instead of an array, e.g. for > 0: {
"exclusiveMinimum": 0
} |
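As a rough illustration of the object form suggested in this comment, here is a minimal Python sketch that checks a value against a range object using only the four JSON Schema keywords named above. The function name `in_range` is hypothetical and not part of any extension or library.

```python
def in_range(value, rng):
    """Return True if value satisfies every bound present in rng.

    rng is a dict using a subset of JSON Schema keywords:
    minimum, maximum, exclusiveMinimum, exclusiveMaximum.
    Keys that are absent impose no constraint.
    """
    if "minimum" in rng and value < rng["minimum"]:
        return False
    if "maximum" in rng and value > rng["maximum"]:
        return False
    if "exclusiveMinimum" in rng and value <= rng["exclusiveMinimum"]:
        return False
    if "exclusiveMaximum" in rng and value >= rng["exclusiveMaximum"]:
        return False
    return True


# The "> 0" case from the comment above.
positive = {"exclusiveMinimum": 0}
```

With this form, `in_range(0, positive)` is false while any value above zero passes, which is the strict bound an inclusive two-value array cannot express.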
would it be terrible to be super explicit like
|
Yes, I think it is terrible ;-) How would you decide whether [1,2] is a range from 1 to 2 or the two categorical values 1 and 2? |
that's why the keys are explicit |
Ooooh, I didn't catch that difference. Sorry. I don't think that is necessary, it is more complicated to describe and read but doesn't give any obvious benefit to me? |
I just don't like that I do really like ranges that are json schema objects. and of course I still don't like putting ranges into classes ;), but I want to at least get somewhere with the concept. |
I did not consider lists of values in my proposal yet because you can emulate them just by having the classes multiple times, while you can't reasonably express continuous ranges. But yeah, the full-fledged solution would be:
Not sure whether we should cater for all, the use cases I've heard about so far were only continuous ranges. |
Alternatively, we could try something like the following although I'm not sure that would be valid in raster as we give "made-up" statistics / exclude no-data values from statistics. It also feels less intuitive.
Thoughts, @emmanuelmathot ? |
Any opinion on preferring
versus
|
I'd follow JSON Schema as we already use it in other places, which (except for the outdated draft-4) uses numbers instead of boolean flags: https://json-schema.org/understanding-json-schema/reference/numeric.html#range |
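To make the difference between the two styles concrete: in draft-4, `exclusiveMinimum` and `exclusiveMaximum` are boolean flags that modify `minimum` and `maximum`, whereas later drafts put the bound itself in the exclusive keyword. The sketch below converts the older form to the newer one. The helper name `upgrade_draft4_range` is made up for illustration.

```python
def upgrade_draft4_range(rng):
    """Convert draft-4 boolean exclusive flags to the numeric form.

    {"minimum": 0, "exclusiveMinimum": True}  ->  {"exclusiveMinimum": 0}
    Objects already in the numeric form are returned unchanged (as a copy).
    """
    out = dict(rng)
    pairs = (("minimum", "exclusiveMinimum"), ("maximum", "exclusiveMaximum"))
    for bound, flag in pairs:
        if out.get(flag) is True:
            if bound not in out:
                raise ValueError(f"{flag} is true but {bound} is missing")
            # Move the bound into the exclusive keyword.
            out[flag] = out.pop(bound)
        elif out.get(flag) is False:
            # A false flag adds nothing; the inclusive bound stays.
            del out[flag]
    return out
```

The identity checks (`is True`, `is False`) are deliberate: a numeric bound of `0` or `1` must not be mistaken for a boolean flag.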
ok, didn't realize I was looking at an older draft 👍 |
seems redundant or out of place when the dataset is describing rainfall in mm and a negative depth doesn't make sense. |
Yeah, I'm liking it more the more I think about it, but it's less flexible and covers only some use cases, I assume. Also, it doesn't seem so wrong to exclude no-data values from statistics because they are usually just made-up values for a file format that doesn't support encoding them properly. I guess we only need to clarify in raster that no-data values are invalid pixel values and as such should not be reflected in statistics etc. On the other hand, statistics are usually real min/max values, while what we want to describe here are theoretical min and max values. For example, if you have a raster with precipitation values, the values could be 1, 5 and 10, so min/max are 1 and 10, although the potential range is 0 to infinity (mostly). But maybe that's not an issue?! |
are we really saying that this is a continuous dataset with classed nodata and should have something roughly like:
with something else that clarifies that the data range of possible values does not include the full range of the datatype? |
Sorry, misunderstood you initially. But I'm still not sure; I think I like the proposal above more, because it just adds an additional field here instead of adding a new data type to an existing field. #33 (comment) |
yes, the question is more "is |
Well, nodata is already part of raster so would be a change in that extension. But I don't like putting classification:classes into so many different places. Also, if you have no-data values and categorical values in a file, do you really want to have them in two different places? |
|
The more I think about it, saying "this dataset uses classes but isn't classified" seems reasonable and simple. |
I created PR #34 to discuss a potential solution more closely. |
|
I like the "full-fledged solution". However, even if the array of integers doesn't make it in, I prefer the JSON Schema-like object for continuous ranges for its clarity; it also leaves the door open to adding arrays of integers without having to change how continuous ranges are expressed. Mocking up
Perhaps not necessary, but it is nice to be able to describe the valid range of vegetation index data (a defined subset of the possible int16 values). |
To me describing the valid range of a continuous dataset has nothing to do with classification. I'm not sure how a client can or should deal with that class when it isn't a class at all. |
@drwelby I see your point, I think. I suppose the same argument could be made for any continuous range? Or is it particular to the valid range? |
To me the valid range is akin to |
Yep, I see the connection to |
From the STAC call: No one screamed at me when I said "ranges" are not categories. ;-) I think we can leave this open for further feedback, but I won't push for a change here. If you only want to describe a single class of valid values (e.g. >= 0), then consider using the statistics or histogram in raster:bands. |
@pjhartzell How would you want to expose that exactly? 12-21, 23-32, 34-43, ...? or just 12-87? |
For this case, [12-21, 23-32, 34-43] would be ideal. [12-87] would be a fallback if multiple ranges can't be expressed. |
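A small sketch of the two options mentioned here, several separate ranges versus a single enclosing fallback range, using only the three ranges written out above. Representing each range as an inclusive `(low, high)` tuple is an assumption made for this illustration, not something the extension defines, and both function names are hypothetical.

```python
def in_any_range(value, ranges):
    """Return True if value falls inside at least one inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)


def envelope(ranges):
    """Collapse several ranges into the single range that encloses all of them."""
    lows = [low for low, _ in ranges]
    highs = [high for _, high in ranges]
    return (min(lows), max(highs))


ranges = [(12, 21), (23, 32), (34, 43)]
fallback = envelope(ranges)
```

The fallback is lossy: a value such as 22 lies inside the enclosing range but in none of the individual ranges, which is why the multi-range form is described as the ideal one.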
It comes up over and over again, the range values. Recently in #31. A common example seems to be something like:
Should we cater for this? I think the simplest solution would be to allow for value an array with two values, one side of which can be null (for an open-ended range), as also defined by the STAC Collection extents. Then you could have something like:
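The snippet below is not the example from the original post; it is a hedged Python sketch of how a client might interpret such a two-value array, with `None` standing in for JSON `null`. It assumes both bounds are inclusive, and the function name `in_interval` is hypothetical.

```python
def in_interval(value, interval):
    """Check value against a [lower, upper] pair; None means that side is open."""
    lower, upper = interval
    if lower is not None and value < lower:
        return False
    if upper is not None and value > upper:
        return False
    return True


# ">= 0" as an open-ended range: lower bound 0, no upper bound.
non_negative = [0, None]
```

Under the inclusive assumption, `in_interval(0, non_negative)` is true, so a strict bound such as "> 0" cannot be written in this form.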