SuperCDMS DID Finder #2

Open
BenGalewsky opened this issue Aug 18, 2021 · 8 comments

BenGalewsky commented Aug 18, 2021

As an analyzer, I want to resolve SuperCDMS datasets so I can extract data from that experiment.

Assumptions

  1. Based on the ServiceX_DID_Finder_lib
  2. Will use the SuperCDMS:// DID Finder Scheme (https://gitlab.com/supercdms/slaclab-datacat)
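
To make the second assumption concrete, here is a minimal sketch of how a finder might recognize a DID that uses the `SuperCDMS://` scheme. The helper names `split_did` and `is_supercdms`, the case-insensitive comparison, and the example dataset name are all assumptions for illustration; the issue does not specify how datacat names datasets.

```python
from typing import Tuple

# Scheme named in the issue; comparing it case-insensitively is an assumption.
SCHEME = "supercdms"


def split_did(did: str) -> Tuple[str, str]:
    """Split '<scheme>://<name>' into (scheme, name). Hypothetical helper."""
    scheme, sep, name = did.partition("://")
    if not sep or not name:
        raise ValueError(f"not a scheme-qualified DID: {did!r}")
    return scheme.lower(), name


def is_supercdms(did: str) -> bool:
    """Return True when the DID uses the SuperCDMS scheme."""
    try:
        return split_did(did)[0] == SCHEME
    except ValueError:
        return False
```

A DID without a scheme is rejected here rather than guessed at, so a malformed request would fail early instead of querying the wrong catalog.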

Acceptance Criteria

zonca commented Nov 16, 2021

@BenGalewsky what is the simplest DID finder implementation that I can look at as an example?

@BenGalewsky

The CERN OpenData DID Finder is quite simple. @Michael-D-Johnson is working on a requests-based DID Finder for yt, which will be very similar to the SuperCDMS implementation.

zonca commented Nov 17, 2021

thanks @BenGalewsky

@Michael-D-Johnson do you already have some code for the yt DID finder I can look at? It would be really nice for me to understand how to implement one.

zonca commented Nov 18, 2021

@Michael-D-Johnson

@zonca I don't have code to share yet as it's untested, but I did start with the DID Finder Demo you found. My main addition was to the find_files function in src/demo_did.py, where I wrote some logic to pass the URLs I needed as the file_path variable in the yield statement. I'll share the repo once it's done.
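
For readers following along, the change described above can be sketched roughly as below. This is not the code from the repo: `lookup_urls` is a hypothetical stand-in for whatever query returns the file URLs, the `example.org` URLs are placeholders, and the extra record keys (`adler32`, `file_size`, `file_events`) are assumed from the demo's record shape and should be checked against the library version in use.

```python
import asyncio
from typing import Any, AsyncGenerator, Dict, List


def lookup_urls(did_name: str) -> List[str]:
    """Hypothetical stand-in for the real catalog query; returns file URLs."""
    return [f"https://example.org/{did_name}/file{i}.dat" for i in range(3)]


async def find_files(
    did_name: str, info: Dict[str, Any]
) -> AsyncGenerator[Dict[str, Any], None]:
    """Yield one record per file, passing each URL as file_path."""
    for url in lookup_urls(did_name):
        yield {
            "file_path": url,
            "adler32": 0,      # placeholder: checksum not known in this sketch
            "file_size": 0,    # placeholder
            "file_events": 0,  # placeholder
        }


async def collect(did_name: str) -> List[Dict[str, Any]]:
    """Drain the async generator into a list (handy for local testing)."""
    return [record async for record in find_files(did_name, {})]


if __name__ == "__main__":
    for record in asyncio.run(collect("my-dataset")):
        print(record["file_path"])
```

In a real finder the async generator would be handed to the DID finder library instead of being drained by `collect`; that wiring is left out here so the sketch runs without ServiceX installed.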

@Michael-D-Johnson

@zonca There is an initial DID finder for yt/girder here: https://github.com/pondd-project/ServiceX_DID_Finder_Girder. If you can't access it let me know.

zonca commented Nov 29, 2021

thanks @Michael-D-Johnson, is the selection attribute in https://github.com/pondd-project/ServiceX_DID_Finder_Girder/blob/main/tests/yt.json filtering the YT hub data and returning a subset of it?

Michael-D-Johnson commented Nov 29, 2021

@zonca the DID finder ignores the selection line. It only needs the first line (did: ), which it uses to find the dataset on YT hub. I am beginning work on a YT hub transformer, which will let you make a selection/subset/filter on the YT hub data. The other lines currently in the JSON are needed for ServiceX to work, but I believe they come from an example I got using opencerndata, so they will be irrelevant to the YT transformer. Once I have the YT hub transformer working, the selection, result-format, etc. will change, and I will update the JSON as needed. But for testing the DID finder, all you need is a valid collection id (did).
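
As a rough illustration of that point, the sketch below reads a request and keeps only the `did` field. The key names `did`, `selection`, and `result-format` come from the comment above; every value in `REQUEST_TEXT` is a placeholder rather than the contents of yt.json, and `did_from_request` is a hypothetical helper, not part of ServiceX.

```python
import json

# Illustrative request; all values are placeholders, not the real yt.json.
REQUEST_TEXT = """
{
  "did": "example-collection-id",
  "selection": "ignored by the DID finder",
  "result-format": "ignored by the DID finder"
}
"""


def did_from_request(request_text: str) -> str:
    """Return only the `did` field; other keys are left to ServiceX and the transformer."""
    request = json.loads(request_text)
    did = request.get("did")
    if not did:
        raise KeyError("request has no 'did' field")
    return did


if __name__ == "__main__":
    print(did_from_request(REQUEST_TEXT))
```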
