feat(records): add LHCb Ntuples example from the hackathon #3720
base: master
I gave a first pass of editing, but this is still a work in progress from my end. I will also start some threads on the changed file for finer details. I tried to push some changes so we could have a discussion on the updated
@@ -1,7 +1,7 @@
[
{
"abstract": {
- "description": "<p>Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}.</p><p>FIXME Here should come a long explanation of inputs, ntuples, etc.</p><p>FIXME If we can attach the <code>info.yaml</code> etc, this is the place wher they would be added."
+ "description": "<p>Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>. NOTE -- IN THE FUTURE, NEED TO FETCH FROM REQUEST DETAILS HERE https://gitlab.cern.ch/cernopendata/lhcb-ntupling-service-requests-dev/-/issues/142 OR A MORE SECURE PLACE! </p><p> Ntuples are created using DaVinci version {v46r9 - extract from info.yaml} and the Analysis Productions batch processing system. Quantities saved to the Ntuple are specified during the request phase and detailed in the following code configuration files.</p><p>FIXME If we can attach the <code>info.yaml</code> etc, this is the place where they would be added."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -54,7 +54,7 @@
"secondary": ["Collision"]
},
"usage": {
- "description": "<p>You can clone and amend this ntupling request here:",
+ "description": "<p>Ntuples and instructions are provided in the links under Related datasets. They are ready to be downloaded and used for analysis. If these do not suit your needs, you can clone and amend this ntupling request here:",
"links": [
{
"description": "LHCb Open Data Ntupling Service",
@@ -65,7 +65,7 @@
},
{
"abstract": {
- "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}."
+ "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -180,7 +180,7 @@
"stream": "BHADRON",
"version": "stripping21r1"
},
- "title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K-{piplus}pi+) {piminus}pi-]CC",
+ "title": "[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) pi-]CC -- 2011 Magnet Down",
"type": {
"primary": "Dataset",
"secondary": ["Collision"]
@@ -197,7 +197,7 @@
},
{
"abstract": {
- "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}."
+ "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -272,7 +272,7 @@
"stream": "BHADRON",
"version": "stripping21r1p2"
},
- "title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K- {piplus}pi+) {Kminus_0}K-]CC",
+ "title": "[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) K-]CC -- 2011 Magnet Down",
"type": {
"primary": "Dataset",
"secondary": ["Collision"]
@@ -289,7 +289,7 @@
},
{
"abstract": {
- "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}."
+ "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -379,7 +379,7 @@
"stream": "BHADRON",
"version": "stripping21r1"
},
- "title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K-{piplus}pi+) {piminus}pi-]CC",
+ "title": "[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) pi-]CC -- 2011 Magnet Up",
"type": {
"primary": "Dataset",
"secondary": ["Collision"]
@@ -396,7 +396,7 @@
},
{
"abstract": {
- "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}."
+ "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -461,7 +461,7 @@
"stream": "BHADRON",
"version": "stripping21r1p2"
},
- "title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K- {piplus}pi+) {Kminus_0}K-]CC",
+ "title": "[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) K-]CC -- 2011 Magnet Up",
"type": {
"primary": "Dataset",
"secondary": ["Collision"]
@@ -478,7 +478,7 @@
},
{
"abstract": {
- "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}."
+ "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -593,7 +593,7 @@
"stream": "BHADRON",
"version": "stripping21"
},
- "title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K-{piplus}pi+) {piminus}pi-]CC",
+ "title": "[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) pi-]CC -- 2012 Magnet Down",
"type": {
"primary": "Dataset",
"secondary": ["Collision"]
@@ -610,7 +610,7 @@
},
{
"abstract": {
- "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}."
+ "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -715,7 +715,7 @@
"stream": "BHADRON",
"version": "stripping21r0p2"
},
- "title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K- {piplus}pi+) {Kminus_0}K-]CC",
+ "title": "[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) K-]CC -- 2012 Magnet Down",
"type": {
"primary": "Dataset",
"secondary": ["Collision"]
@@ -732,7 +732,7 @@
},
{
"abstract": {
- "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}."
+ "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -882,7 +882,7 @@
"stream": "BHADRON",
"version": "stripping21"
},
- "title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K-{piplus}pi+) {piminus}pi-]CC",
+ "title": "[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) pi-]CC -- 2012 Magnet Up",
"type": {
"primary": "Dataset",
"secondary": ["Collision"]
@@ -899,7 +899,7 @@
},
{
"abstract": {
- "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}."
+ "description": "Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>."
},
"accelerator": "CERN-LHC",
"collaboration": {
@@ -1004,7 +1004,7 @@
"stream": "BHADRON",
"version": "stripping21r0p2"
},
- "title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K- {piplus}pi+) {Kminus_0}K-]CC",
+ "title": "[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) K-]CC -- 2012 Magnet Up",
"type": {
"primary": "Dataset",
"secondary": ["Collision"] |
All done with a first pass of review, though some points would be nice to iterate on. Since I couldn't push changes directly, I have included my diff in a different comment on the PR so you can see my suggestions so far. I can update this diff as I make more changes unless I am able to push directly in the future. Let me know what you prefer.
Thanks a lot for setting this up!
[
{
"abstract": {
"description": "<p>Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}.</p><p>FIXME Here should come a long explanation of inputs, ntuples, etc.</p><p>FIXME If we can attach the <code>info.yaml</code> etc, this is the place wher they would be added."
RE: Here should come a long explanation of inputs, ntuples, etc.
This could be pretty tricky depending on how detailed we are. If we endeavor to provide a similar level of detail on the inputs as https://opendata.cern.ch/record/28071, there is a bit of groundwork to do (e.g. steps for how DST files were produced). At the very least we can point to the proper DST files used to create the Ntuples, as well as software versions used in the Ntuple Making step.
The explanation can stay brief: just include anything you would like to highlight or make available for search.
[
{
"abstract": {
"description": "<p>Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}.</p><p>FIXME Here should come a long explanation of inputs, ntuples, etc.</p><p>FIXME If we can attach the <code>info.yaml</code> etc, this is the place wher they would be added."
RE: If we can attach the code...
I will have to check with DPA, but I agree this would be nice. Then we can mention that information about the tools used to save quantities to the Ntuples is specified in the config files.
"stream": "BHADRON",
"version": "stripping21r1"
},
"title": "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> {pplus}p+ {Kminus}K-{piplus}pi+) {piminus}pi-]CC",
For here and other occurrences: right now the same decay descriptor is repeated for 4 different "related datasets" (we have 8 related datasets -- 2 decays x 2 years x 2 magnet polarities). I have added the year and magnet polarity to the title, which can be reliably extracted from the info.yaml file (and will need to be for other fields in the json file, though I am not sure how those are currently used). I have also made the decay descriptor more readable by removing the Ntuple branch names, which could be implemented algorithmically by removing substrings enclosed in "{}".
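A minimal sketch of the brace-stripping idea, in case it helps the discussion. The function names `clean_descriptor` and `make_title` are hypothetical, and the year and polarity are passed in by hand here rather than read from info.yaml. One detail: each "{...}" group is replaced with a space rather than deleted outright, because the first descriptor has `K-{piplus}pi+` with no space before the brace.

```python
import re

# Matches one "{branch_name}" group; these are the Ntuple branch labels.
_BRANCH_NAME = re.compile(r"\{[^{}]*\}")


def clean_descriptor(raw: str) -> str:
    """Drop "{...}" branch names and normalise the spacing left behind."""
    # Substitute a space so that "K-{piplus}pi+" becomes "K- pi+", not "K-pi+".
    without_names = _BRANCH_NAME.sub(" ", raw)
    # Collapse runs of whitespace and trim the ends.
    return " ".join(without_names.split())


def make_title(raw: str, year: int, polarity: str) -> str:
    """Build a title of the form "<descriptor> -- <year> Magnet <polarity>"."""
    return f"{clean_descriptor(raw)} -- {year} Magnet {polarity}"


raw = (
    "{Lambda_b0}[Lambda_b0 -> {Lambda_cplus}(Lambda_c+ -> "
    "{pplus}p+ {Kminus}K-{piplus}pi+) {piminus}pi-]CC"
)
print(make_title(raw, 2011, "Down"))
# prints: [Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) pi-]CC -- 2011 Magnet Down
```

This assumes branch names never contain nested braces, which holds for the descriptors in this PR.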
The titles look good 👍
Shall we also amend the umbrella record title? Currently it says:
LHCb Ntuples from user request 142
Perhaps it should similarly say:
[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) K-]CC ntuples
or including request ID:
[Lambda_b0 -> (Lambda_c+ -> p+ K- pi+) K-]CC ntuples from user request 142
[
{
"abstract": {
"description": "<p>Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}.</p><p>FIXME Here should come a long explanation of inputs, ntuples, etc.</p><p>FIXME If we can attach the <code>info.yaml</code> etc, this is the place wher they would be added."
RE {insert reason for request} -- we should extract this from the request details here or perhaps an immutable source if it is stored elsewhere
I guess this needs to be written by the LHCb open data team, since free text supplied by a user could contain slang, internal information, or other language that would not be understandable outside of its original context.
(In the request reason, one user is addressing the N members of the LHCb open data team, whereas the promoted ntuples can be accessed by anybody in the world, without that original context.)
"secondary": ["Collision"]
},
"usage": {
"description": "<p>You can clone and amend this ntupling request here:",
I suggested a modification to encourage folks to check out the existing Ntuples and only clone and amend the request if these Ntuples do not suit their needs.
[
{
"abstract": {
"description": "<p>Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}.</p><p>FIXME Here should come a long explanation of inputs, ntuples, etc.</p><p>FIXME If we can attach the <code>info.yaml</code> etc, this is the place wher they would be added."
Let me summarize my previous comments on this line in light of our discussion this morning with a suggestion:
- "description": "<p>Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for {insert reason for request}.</p><p>FIXME Here should come a long explanation of inputs, ntuples, etc.</p><p>FIXME If we can attach the <code>info.yaml</code> etc, this is the place wher they would be added."
+ "description": "<p>Data from proton-proton (pp) collisions collected by the LHCb experiment filtered to produce Ntuples for exploring heavy baryon decays used as control channels for CP violation studies. This was used as an example during the <a href=https://indico.cern.ch/event/1429526/ target=_blank>First LHCb Open Data and Ntuple Wizard Workshop</a>.</p> <p>Ntuples are created using DaVinci version v46r11 and the Analysis Productions system to process the BHADRON.MDST datasets for both magnet polarities in years 2016, 2017, and 2018. Quantities saved to the Ntuple are specified during the request phase and detailed in the following code configuration files.</p><p>PLACEHOLDER FOR CODE</p>
I am not sure how to insert viewable code into the records, but the files are located here (https://gitlab.cern.ch/cernopendata/lhcb-ntupling-service-requests-dev/-/tree/opendata-ntupling-service-request-142-development/request/142/baryon_example_run1). We probably only need to include the yaml files, but I am open to suggestions. I recommend we just include the code and I can run it by DPA when I show them what we have been working on.
Note: I have not tested this, since I am not in the office and have not yet set up a local deployment of the ODP on my laptop. The changes are straightforward enough, though.
Final comment from this morning, before merging: I think we should add something like "Originated from request 142" to the description of the related datasets (not the parent) so that if another request is promoted with the same decay descriptor, the records can still be told apart visually when performing a search. We probably want to put it early in the description so it is always visible in the search results.
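To make the proposal concrete, here is a rough sketch of how the prefix could be applied when generating the records. It rests on assumptions that are not confirmed in this PR: that the JSON file is a list whose first entry is the parent (umbrella) record and whose remaining entries are the related datasets, and that each record keeps its text under `abstract.description`. The helper name `tag_daughter_records` is hypothetical.

```python
import copy


def tag_daughter_records(records: list, request_id: int) -> list:
    """Return a copy of `records` with the request ID leading each daughter description."""
    prefix = f"Originated from request {request_id}. "
    tagged = copy.deepcopy(records)  # leave the caller's data untouched
    # Assumption: records[0] is the parent record; the rest are related datasets.
    for record in tagged[1:]:
        abstract = record.setdefault("abstract", {})
        description = abstract.get("description", "")
        if not description.startswith(prefix):  # running twice adds the prefix only once
            abstract["description"] = prefix + description
    return tagged


records = [
    {"abstract": {"description": "<p>Parent record.</p>"}},
    {"abstract": {"description": "Data from proton-proton (pp) collisions ..."}},
]
tagged = tag_daughter_records(records, 142)
print(tagged[1]["abstract"]["description"])
# prints: Originated from request 142. Data from proton-proton (pp) collisions ...
```

Putting the request ID at the very start of the string keeps it inside whatever excerpt the search results show, however that excerpt is truncated.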
> We probably only need to include the yaml files, but I am open to suggestions. I recommend we just include the code and I can run it by DPA when I show them what we have been working on.
I'll amend the example attaching all these input files, and we can see how it looks.
> I think we should add something like "Originated from request 142" in the description of the related datasets (not the parent)
Yes, I'll enrich the "selection" part for daughter records in this sense.
@dillfitz BTW a terminology question. Do we spell it "Ntuples" with uppercase N in regular sentences? In ATLAS and CMS documents, we usually just say "ntuples" with lowercase in the regular text...
I'll modify the current PR to continue discussions here, but for the next PRs we can both edit the same upstream branch, so that edits are easier. (If you prefer, I could even close this PR and open a new one that is already in a shared-editable state.)
4b8937b to d0dd9b8
2eedfc2 to 80b0b25
80b0b25 to 1903699
This is a work in progress illustrating what the open data portal records resulting from an LHCb Ntupling Service request could look like.
Just for illustration in order to discuss and amend the content.
CC @dillfitz @pietnogga