MAVEN l2_regex fails to match for some data types #891

Open
jameswilburlewis opened this issue Jun 21, 2024 · 2 comments
Labels
bug Something isn't working

jameswilburlewis commented Jun 21, 2024

The match failure occurs in get_year_month_day_from_sci_file (maven/download_file_utilities.py) for the rse, iuv, and ngi datatypes. Log output:

21-Jun-24 15:31:45: l2_regex match failed for filename mvn_rse_l2_w60_20160101T000000_v01_r00.tab
21-Jun-24 15:32:06: l2_regex match failed for filename mvn_iuv_l2_corona-orbit02450-fuv_20160102T201901.xml
21-Jun-24 15:32:35: l2_regex match failed for filename mvn_ngi_l2_ion-abund-18402_20160102T222549_v08_r01.csv

Here's an example of a successfully matched filename: mvn_mag_l2_2016002ss1s_20160102_v01_r01.xml

Apparently the rse, iuv, and ngi filenames don't follow the expected pattern:

    l2_pattern = (
        r"^mvn_(?P<{0}>[a-zA-Z0-9]+)_"
        r"(?P<{1}>l[a-zA-Z0-9]+)"
        r"(?P<{2}>|_[a-zA-Z0-9\-]+)_"
        r"(?P<{3}>[0-9]{{4}})"
        r"(?P<{4}>[0-9]{{2}})"
        r"(?P<{5}>[0-9]{{2}})"
        r"(?P<{6}>|T[0-9]{{6}}|t[0-9]{{6}})_"
        r"v(?P<{7}>[0-9]+)_"
        r"r(?P<{8}>[0-9]+)\."
        r"(?P<{9}>cdf|xml|sts|md5)"
        r"(?P<{10}>\.gz)*"
    ).format(
        "instrument",
        "level",
        "description",
        "year",
        "month",
        "day",
        "time",
        "version",
        "revision",
        "extension",
        "gz",
    )
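
To make the failure easy to reproduce outside the download code, here is a minimal sketch that compiles the pattern quoted above (with the group names written out directly instead of via `.format`) and tests it against the four filenames from this issue. The names `L2_PATTERN`, `l2_regex`, and `results` are local to this sketch, and it assumes the library applies the pattern with `re.match`.

```python
import re

# Pattern as quoted in this issue; group names are inlined rather than
# substituted through str.format, so the doubled braces become single ones.
L2_PATTERN = (
    r"^mvn_(?P<instrument>[a-zA-Z0-9]+)_"
    r"(?P<level>l[a-zA-Z0-9]+)"
    r"(?P<description>|_[a-zA-Z0-9\-]+)_"
    r"(?P<year>[0-9]{4})"
    r"(?P<month>[0-9]{2})"
    r"(?P<day>[0-9]{2})"
    r"(?P<time>|T[0-9]{6}|t[0-9]{6})_"
    r"v(?P<version>[0-9]+)_"
    r"r(?P<revision>[0-9]+)\."
    r"(?P<extension>cdf|xml|sts|md5)"
    r"(?P<gz>\.gz)*"
)
l2_regex = re.compile(L2_PATTERN)

# Filenames taken from the log lines and the working example in this issue.
filenames = [
    "mvn_mag_l2_2016002ss1s_20160102_v01_r01.xml",
    "mvn_rse_l2_w60_20160101T000000_v01_r00.tab",
    "mvn_iuv_l2_corona-orbit02450-fuv_20160102T201901.xml",
    "mvn_ngi_l2_ion-abund-18402_20160102T222549_v08_r01.csv",
]

# Map each filename to whether the pattern matches it.
results = {name: l2_regex.match(name) is not None for name in filenames}

for name, matched in results.items():
    print(f"{'match   ' if matched else 'NO MATCH'}  {name}")
```

Only the mag filename matches; the other three fail for the reasons discussed in the comments below.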


jameswilburlewis added the bug label Jun 21, 2024
@jameswilburlewis

I see that the rse datatype is using a ".tab" file extension that's not supported by the l2_regex -- I've seen that extension in the kp files, though, so maybe this is really kp data?

And ngi uses a ".csv" extension, which also doesn't appear in the l2_regex.

The iuv filename has no version or revision fields (_vNN_rNN), which the l2_regex requires before the extension.
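
One possible way to accommodate all three observations is sketched below. This is not the project's actual fix, only an illustration under two assumptions: that accepting "tab" and "csv" as extensions is acceptable, and that the version/revision block may be absent. `RELAXED_L2_PATTERN` and `parse_l2_filename` are hypothetical names introduced for this example.

```python
import re

# Hypothetical relaxed variant of the pattern quoted in this issue:
#  - the "_vNN_rNN" block is optional (iuv filename has none)
#  - "tab" and "csv" are added to the accepted extensions (rse, ngi)
RELAXED_L2_PATTERN = (
    r"^mvn_(?P<instrument>[a-zA-Z0-9]+)_"
    r"(?P<level>l[a-zA-Z0-9]+)"
    r"(?P<description>|_[a-zA-Z0-9\-]+)_"
    r"(?P<year>[0-9]{4})"
    r"(?P<month>[0-9]{2})"
    r"(?P<day>[0-9]{2})"
    r"(?P<time>|[Tt][0-9]{6})"
    r"(?:_v(?P<version>[0-9]+)_r(?P<revision>[0-9]+))?"
    r"\.(?P<extension>cdf|xml|sts|md5|tab|csv)"
    r"(?P<gz>\.gz)*"
)
relaxed_l2_regex = re.compile(RELAXED_L2_PATTERN)


def parse_l2_filename(filename):
    """Return the named groups as a dict, or None if the name does not match.

    Groups that are absent from the filename (for example the version and
    revision of the iuv file) come back as None.
    """
    match = relaxed_l2_regex.match(filename)
    return match.groupdict() if match else None


for name in (
    "mvn_rse_l2_w60_20160101T000000_v01_r00.tab",
    "mvn_iuv_l2_corona-orbit02450-fuv_20160102T201901.xml",
    "mvn_ngi_l2_ion-abund-18402_20160102T222549_v08_r01.csv",
):
    fields = parse_l2_filename(name)
    print(name, "->", fields["year"], fields["month"], fields["day"], fields["version"])
```

Any caller that reads the version or revision groups would need to handle None for files like the iuv one, so a change along these lines would have to be checked against the rest of the download code.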

nickssl commented Jun 27, 2024

There are more issues with the rse files, not just the regex. These are .tab files (TAB delimited text) that currently cannot be handled by the rest of the code and cannot be loaded into tplot. I attach an example of such a file. Since they cannot be loaded, I added code to skip these files.

An additional problem is that the rest of the maven code assumes that any .tab files are kp files, which is not true in this case. For loading .tab files, the code assumes a particular structure inside the .tab file, which is not valid for rse files. So, if we want to load these rse files in the future, a more extensive fix for the existing code will be needed.

mvn_rse_l2_w40_20160101T000000_v01_r00.tab.zip
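
To illustrate the ".tab does not imply kp" point, here is a rough sketch of a filename check that separates the two cases before anything tries to parse the file contents. It is not the code that was added to skip rse files. It assumes kp filenames start with "mvn_kp_", which is not confirmed in this thread, and the function names and the kp filename in the sample list are hypothetical.

```python
def is_kp_tab_file(filename: str) -> bool:
    """True for .tab files whose name marks them as kp data.

    Assumption: kp files use an 'mvn_kp_' filename prefix.
    """
    name = filename.lower()
    return name.startswith("mvn_kp_") and name.endswith(".tab")


def is_unsupported_tab_file(filename: str) -> bool:
    """True for .tab files that are not kp files (e.g. rse level 2 files)."""
    name = filename.lower()
    return name.endswith(".tab") and not is_kp_tab_file(name)


files = [
    "mvn_kp_insitu_20160101_v01_r01.tab",          # illustrative kp-style name
    "mvn_rse_l2_w60_20160101T000000_v01_r00.tab",  # from the log in this issue
    "mvn_mag_l2_2016002ss1s_20160102_v01_r01.xml",
]

# Drop .tab files that the kp reader would misinterpret.
loadable = [f for f in files if not is_unsupported_tab_file(f)]
print(loadable)
```

Checking the name rather than only the extension keeps the kp loader from being handed a file whose internal structure it does not expect.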
