Commit
Add task to find overlapping events between custom and central NANOAOD (#44)

* Add overlap checking task.
* Add a simple hash function that calculates a unique number from the 'event', 'luminosityBlock' and 'run' information of a NanoAOD.
* Linting.
* Complete the task that finds overlapping events between our custom NanoAOD and the centrally produced NANOAOD. Because our NANOAODs and the central ones use different compressions, we need to find out which events both contain and filter them out. The task finds the overlapping events and saves their unique event identifiers as tuples in a JSON file. The JSON also records the relative amount of overlap.
* Make it clearer why the value is padded to this specific value.
* Rearrange the order of the fields to match the order used in the hash.
* Add an assertion that checks whether the unique identifier column exceeds a specific value, which is given by data (and may be exceeded in the far future).
* Move the output of the unique overlap identifiers from JSON to a parquet file, and use this file to create the filter mask in the sync CSV task.
* Linting.
* Move the type cast into the hash function, refactor names.
* Move imports into the hash function.
* Linting, and add a maybe import for numpy.
* Overlapping events should not depend on the chunk.
* Add a new check so that people will not create an int64 overflow.
* The previous overlap search compared chunk by chunk, but chunks are not always the same size. Changed this to a global comparison.
* Change the default value of the file path to an empty string, since None is resolved by law to `/None`.
* Add padding handling for arrays of dim < 2. Some array variables, e.g. MET.covXX, are of type numpy and not ListNumpy, meaning slicing is not possible.
* Add more variables for the sync.
* Apply suggestions from code review.

---------

Co-authored-by: Marcel R. <[email protected]>
Co-authored-by: Marcel Rieger <[email protected]>
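The changed files are not shown on this page, so the following is only a minimal sketch of the idea described in the commit message, not the code that was committed. It packs 'event', 'luminosityBlock' and 'run' into a single int64 by decimal padding, guards against int64 overflow, and compares the two samples globally rather than chunk by chunk. The function names `event_hash` and `find_overlap`, the digit widths, and the choice to express the relative overlap with respect to the custom sample are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical digit widths (not taken from the commit). Their sum must stay
# at or below 18 so the packed number always fits into a signed int64.
EVENT_DIGITS, LUMI_DIGITS, RUN_DIGITS = 8, 4, 6


def event_hash(event, lumi, run):
    """Pack (event, luminosityBlock, run) into one int64 per entry."""
    assert EVENT_DIGITS + LUMI_DIGITS + RUN_DIGITS <= 18, "would overflow int64"
    # Cast inside the function so callers may pass any integer dtype.
    fields = [
        ("event", np.asarray(event, dtype=np.int64), EVENT_DIGITS),
        ("luminosityBlock", np.asarray(lumi, dtype=np.int64), LUMI_DIGITS),
        ("run", np.asarray(run, dtype=np.int64), RUN_DIGITS),
    ]
    # Reject values that would spill into a neighbouring field.
    for name, values, digits in fields:
        if values.size and (values.min() < 0 or values.max() >= 10**digits):
            raise ValueError(f"{name} does not fit into {digits} digits")
    ev, lu, rn = (values for _, values, _ in fields)
    # Field order in the packed number: event, luminosityBlock, run.
    return ev * 10 ** (LUMI_DIGITS + RUN_DIGITS) + lu * 10**RUN_DIGITS + rn


def find_overlap(custom_ids, central_ids):
    """Compare all identifiers at once, independent of any chunking."""
    overlap = np.intersect1d(custom_ids, central_ids)
    # Relative overlap, here defined with respect to the custom sample.
    fraction = overlap.size / custom_ids.size if custom_ids.size else 0.0
    # Mask that keeps only custom events that are absent from the central sample.
    keep_mask = ~np.isin(custom_ids, overlap)
    return overlap, fraction, keep_mask


# Toy identifiers; real values would be read from the NanoAOD files.
custom = event_hash([1, 2, 3], [10, 10, 11], [355872, 355872, 355872])
central = event_hash([3, 4], [11, 11], [355872, 355872])
overlap, fraction, keep_mask = find_overlap(custom, central)
print(overlap, fraction, keep_mask)
```

With these widths, the triple (event=3, luminosityBlock=11, run=355872) maps to 30011355872. Packing into a single integer allows the overlap to be found with one sorted set intersection over the full identifier arrays, which is why the result does not depend on how the input was split into chunks.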
1 parent f4a8a4e, commit 5ee7b03
Showing 2 changed files with 211 additions and 3 deletions.