Replies: 4 comments 8 replies
-
Reading more into #956... @TomNicholas, after reading #956 (reply in thread) I am very curious if you could elaborate on that. In a recent discussion with @paraseba I figured that there would be no good way to persist credentials (or arbitrary code that could retrieve them) in an icechunk store itself. That's why I thought earthaccess is an ideal layer to implement this quite specific but powerful logic. We can assume that users of earthaccess have some form of Earthdata credentials, and with those we can generate the refreshable credentials needed to open the icechunk repo, while hiding a lot of the details from users. But if there is a more general way to support this in icechunk, I would love to hear about it.
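To make the "refreshable credentials" idea concrete, here is a minimal sketch of the pattern in plain Python. Everything in it is an assumption for illustration: `TemporaryCredentials`, `make_refresher`, and the injected `fetch` callable are hypothetical names, not part of earthaccess or icechunk. In practice `fetch` would wrap whatever call hands out temporary S3 credentials for the logged-in Earthdata user.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class TemporaryCredentials:
    """Hypothetical container for short-lived S3 credentials."""
    access_key_id: str
    secret_access_key: str
    session_token: str
    expires_at: datetime


def make_refresher(
    fetch: Callable[[], TemporaryCredentials],
    margin: timedelta = timedelta(minutes=5),
    now: Callable[[], datetime] = lambda: datetime.now(timezone.utc),
) -> Callable[[], TemporaryCredentials]:
    """Return a zero-argument callable that caches credentials.

    `fetch` is only called again once the cached credentials are
    within `margin` of expiring, so callers can ask for credentials
    as often as they like.
    """
    cached: Optional[TemporaryCredentials] = None

    def get_credentials() -> TemporaryCredentials:
        nonlocal cached
        if cached is None or now() >= cached.expires_at - margin:
            cached = fetch()  # assumption: fetch() talks to the credential endpoint
        return cached

    return get_credentials
```

The point of the callable is that the store can ask for fresh credentials whenever it needs them, so nothing secret has to be persisted in the store itself.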
-
Thanks for working on this @jbusecke!! This seems very similar to what @TomNicholas presented at CNG, but with NASA data! Here things get a little intertwined, one could argue that What I have in mind with Icechunk is that NASA DAACs or missions (e.g. NISAR) will be the ones generating these stores, not necessarily end users. In any case, it will be great for earthaccess to provide a single line to open an icechunk store.

I think there are 2 questions. NASA files can be accessed in 2 ways: in-region we can use the S3 URLs, out of region we need to use HTTPS. It would be great if Icechunk could provide 2 distribution links in its chunk manifest metadata. Something like store = earthaccess.open_icechunk("s3://icechunk-store", distribution="s3")

I don't think this is a deal breaker. DAACs could generate 2 stores, one that points to S3 URLs and another that uses the HTTPS links, so the stores can also be used out of region.

@jbusecke if you're available, it would be great if you could join one of the earthaccess hacking hours to demo what you did with this!
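A rough sketch of how the two-store fallback could be resolved behind a distribution keyword, in plain Python. The names here are assumptions for illustration only: `pick_store_url` and `STORE_VARIANTS` are hypothetical, the HTTPS URL is a placeholder, and none of this is existing earthaccess or icechunk API.

```python
from typing import Mapping

# Hypothetical: one store per access path, as a DAAC might publish them.
STORE_VARIANTS = {
    "s3": "s3://icechunk-store",                   # in-region access
    "https": "https://example.org/icechunk-store"  # placeholder, out-of-region access
}


def pick_store_url(variants: Mapping[str, str], distribution: str = "s3") -> str:
    """Return the store URL matching the requested distribution."""
    key = distribution.lower()
    if key not in variants:
        options = ", ".join(sorted(variants))
        raise ValueError(f"unknown distribution {distribution!r}; expected one of: {options}")
    return variants[key]


# Example: an out-of-region user asks for the HTTPS variant.
url = pick_store_url(STORE_VARIANTS, distribution="https")
```

If Icechunk ever carried both links in the manifest, the same keyword could select between them without needing two stores.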
-
Lolz, I dunno how any of this works and I'm sure it's awesome, but I'll still ungraciously chime in with a user experience concern. The addition of these underscore methods (e.g. open_virtual_dataset) leaves a lot to be desired, in my view. Great tools, absolutely, but a poor experience for a typical earth scientist. Is there a case to be made for having
-
Started working on this during the hack hour today and got stuck with this. I'll see if I can make more time in the coming days, or for the next hack.
-
Over here I am experimenting with building virtual icechunk stores for data stored on NASA cloud buckets. This seems to tie in with a lot of the ideas in #956.
For this sort of setup we need to provide two credentials:
In the link above you can see that this results in a TON of boilerplate code, and I would like to reduce that, at least for this specific use case (data is all accessible via earthaccess; the general case is much thornier and might in fact never work due to security concerns).
What I am proposing is a new .open_icechunk(url_to_icechunk) method which does the following: