Extend sources serial number functionality for different instruments and/or via meta dict #91

Open
4 tasks
spirrobe opened this issue Sep 15, 2023 · 1 comment

Comments

@spirrobe
Contributor

general idea

When operating several devices on the same site, it is useful to know which Cloudnet file (especially the cat/class/+ files) contains data from which device. (We operate 3 radars on one campaign site, and technically 2 MWRs as well as 2 LIDARs, each of the same type.)

serial number from raw files

So far, some cloudnet processing steps (e.g. ceilo2nc) with certain instrument types (CHM15k, for example) include the serial number of the device as taken from the netCDF files. The same would be possible for the RPG HATPRO (MWR) netCDF files, which contain the attribute "Serial_Number" (with a leading space in the string value that needs to be removed). Where available, this can be included (availability might depend on the device and the manufacturer software version as well). Notably, this information is not present in the RPG HATPRO binary files.

Where not available, this information could/should be passed via the site_meta dict, with the dict taking precedence over the ncfile attributes.
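
A minimal sketch of this precedence rule, for illustration only. `resolve_serial_number` is a hypothetical helper and not part of cloudnetpy; the `serial_number` key in `site_meta` is the one proposed in this issue, and `Serial_Number` is the HATPRO netCDF attribute mentioned above.

```python
from typing import Mapping, Optional


def resolve_serial_number(
    site_meta: Mapping[str, object],
    file_attrs: Mapping[str, object],
) -> Optional[str]:
    """Return the serial number, preferring site_meta over file attributes.

    Hypothetical helper: 'serial_number' is the proposed site_meta key,
    'Serial_Number' is the global attribute of RPG HATPRO netCDF files.
    """
    candidates = (
        (site_meta, "serial_number"),   # user input takes precedence
        (file_attrs, "Serial_Number"),  # fall back to the raw file
    )
    for mapping, key in candidates:
        value = mapping.get(key)
        if value is None:
            continue
        # HATPRO writes the value with a leading space, so strip whitespace.
        cleaned = str(value).strip()
        if cleaned:
            return cleaned
    return None
```

With `site_meta = {}` and a file attribute `" 12345"`, this returns `"12345"`; once `site_meta` contains `serial_number`, that value is used instead.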

sources attribute

The cat/class/+ files usually (at least as far as I've seen) contain a global sources attribute for the device combination (4 sources in general). This can/should be extended to the individual variables via their attributes in the files. After all, Z in the cat file comes from the radar, while the global source includes all instruments. Some variables already contain the source attribute in the current version. As such, the relevant part of the global source attribute can be added to each variable depending on the field type. Together with #90, this allows a plot to show exactly the information that led to the creation of the specific variable. For class files, the global and variable-specific sources would then be equal. This approach creates a certain redundancy, but the variables become more atomic and standalone.

serial numbers attribute and expected addition

Similarly, the serial numbers should be available on the global level as well as for each variable. Again, this creates redundancy but makes it clearer which variable originated from which device. E.g. ncdump -h categorize_file output:

Z:source_serial_number = "RADARSN" ;
Z:source = "RPG-Radiometer Physics RPG-FMCW-94" ;
v:source_serial_number = "RADARSN" ;
v:source = "RPG-Radiometer Physics RPG-FMCW-94" ;
sldr:source_serial_number = "RADARSN" ;
sldr:source = "RPG-Radiometer Physics RPG-FMCW-94" ;
beta:source_serial_number = "LIDARSN" ;
beta:source = "Lufft CHM15k" ;
lwp:source_serial_number = "MWRSN" ;
lwp:source = "RPG-Radiometer Physics HATPRO" ;

e.g. ncdump -h drizzle_file

        float mu(time, height) ;
                mu:_FillValue = 9.96921e+36f ;
                mu:units = "1" ;
                mu:source = "RPG-Radiometer Physics RPG-FMCW-94\n",
                        "Lufft CHM15k" ;
                mu:source_serial_number = "RADARSN\n",
                        "LIDARSN" ;
....

                :source_serial_numbers = "RADARSN\n",
                        "LIDARSN\n",
                        "MWRSN\n",
                        "" ;

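One way to derive such per-variable attributes is sketched below. The two lookup tables are assumptions made for illustration (the serial numbers are the placeholders used in the examples above, and `mu` is taken to come from radar and lidar, following its source attribute); `variable_attributes` is a hypothetical helper, not an existing cloudnetpy function.

```python
from typing import Dict, List

# Assumed mapping of instrument role to its source name and serial number.
INSTRUMENTS: Dict[str, Dict[str, str]] = {
    "radar": {"source": "RPG-Radiometer Physics RPG-FMCW-94", "serial_number": "RADARSN"},
    "lidar": {"source": "Lufft CHM15k", "serial_number": "LIDARSN"},
    "mwr": {"source": "RPG-Radiometer Physics HATPRO", "serial_number": "MWRSN"},
}

# Assumed mapping of variable name to the instruments it is derived from.
VARIABLE_INSTRUMENTS: Dict[str, List[str]] = {
    "Z": ["radar"],
    "beta": ["lidar"],
    "lwp": ["mwr"],
    "mu": ["radar", "lidar"],
}


def variable_attributes(name: str) -> Dict[str, str]:
    """Build the source and serial number attributes for one variable.

    Multiple instruments are joined with newlines, in the same order for
    both attributes, so that line N of one matches line N of the other.
    """
    roles = VARIABLE_INSTRUMENTS[name]
    return {
        "source": "\n".join(INSTRUMENTS[role]["source"] for role in roles),
        "source_serial_number": "\n".join(
            INSTRUMENTS[role]["serial_number"] for role in roles
        ),
    }
```

Keeping both attributes in the same instrument order means a reader can pair each source with its serial number without any extra bookkeeping.
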
looking for input/comments

  • site_meta key would be 'serial_number'
  • Would this create too much redundancy? In terms of file size, I think the few attrs are negligible
  • Serial number of model -> This would actually be the version of the model (or at least that is the closest information that would make sense) but I'm not sure this is readily available. So far I added an empty serial number for the model.
  • more tasks / clearer definition

roadmap / tasks (draft, unordered)

  • pass on global serial number if available
  • site_meta with kw input for each instrument
  • where available parse serial numbers from instrument files (several readers have this already)
  • helper function to add source / serial number via dict a posteriori (similar to gist below)
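
The last task could look roughly like the sketch below. It is not the code from the gist: `add_attributes` is a hypothetical name, and the function only assumes that `dataset` behaves like an open, writable `netCDF4.Dataset`, i.e. it provides `setncattr()` and a `variables` mapping whose values also provide `setncattr()`.

```python
from typing import Any, Mapping


def add_attributes(
    dataset: Any,
    global_attrs: Mapping[str, str],
    variable_attrs: Mapping[str, Mapping[str, str]],
) -> None:
    """Write global and per-variable attributes to an open dataset.

    Variables listed in variable_attrs but missing from the file are
    skipped, so one dict can be reused for different product files.
    """
    for key, value in global_attrs.items():
        dataset.setncattr(key, value)
    for name, attrs in variable_attrs.items():
        if name not in dataset.variables:
            continue
        variable = dataset.variables[name]
        for key, value in attrs.items():
            variable.setncattr(key, value)
```

With the netCDF4 package, this could be called on a file opened in append mode, for example `with netCDF4.Dataset(path, "a") as nc: add_attributes(nc, {"source_serial_numbers": "RADARSN\nLIDARSN"}, {"Z": {"source_serial_number": "RADARSN"}})`, where `path` points to an existing Cloudnet file.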

related

Plotting as of v1.55.0 / #90 supports both global and variable sources and serial numbers for figures (and checks for them)
The gist [cloudnet_add_serial_number_2_cloudnetfiles.py](https://gist.github.com/spirrobe/dde782662bda45feeaeacd15526f062f) is an example of how to postprocess some cloudnet nc files, adding global serial numbers, per-variable serial numbers and, where applicable, per-variable sources.

@siiptuo
Contributor

siiptuo commented Sep 19, 2023

site_meta key would be 'serial_number'

Sounds good!

Would this create too much redundancy? In terms of file size, I think the few attrs are negligible

I think it's worth having rich and unambiguous metadata even if it adds some redundancy.

Serial number of model -> This would actually be the version of the model (or at least that is the closest information that would make sense) but I'm not sure this is readily available. So far I added an empty serial number for the model.

We have thought about ways of identifying models in more detail, but for now it can be left empty.
