Cannot round-trip explicitly set missing INFO values in VCF #1197
Sounds good - |
Do you have the script for generating it? |
jeromekelleher added a commit to jeromekelleher/sgkit that referenced this issue Feb 16, 2024
jeromekelleher added a commit to jeromekelleher/sgkit that referenced this issue Feb 16, 2024
jeromekelleher added a commit to jeromekelleher/sgkit that referenced this issue Mar 5, 2024
The all_fields.vcf file contains lots of examples where we explicitly state that an INFO key is missing, rather than omitting the key, e.g. II1=. and II2=.,. here. This was handled before #1190 because we were treating non-present INFO keys as PAD values and only these explicit "key=." values as missing.

I don't think it's a useful distinction, and it's likely to cause more problems downstream if we distinguish between these two types of missingness. I'm fairly clear that regarding missing keys as dimension padding isn't helpful, in any case.

However, it seems that bcftools at least does make this distinction, and losslessly roundtrips this VCF through BCF.

My suggestion here is that we just edit the all_fields.vcf file to remove all-missing values. This seems like a pretty niche problem, and probably something we'd need to deal with explicitly at the spec level rather than here. It's not worth getting bogged down on, I think.
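
To make the two kinds of missingness concrete, here is a minimal pure-Python sketch. It is not sgkit code: `parse_info` and `drop_all_missing` are hypothetical helpers written for illustration, and the `FLAG` key is made up. The first function shows how an explicit "key=." differs from an omitted key once the INFO column is parsed. The second shows roughly what the proposed edit to all_fields.vcf would do to each INFO string.

```python
def parse_info(info):
    """Parse a VCF INFO string into {key: list of raw string values}.

    Flag-style keys (no "=") map to True. Hypothetical helper, for illustration only.
    """
    if info == ".":
        return {}
    fields = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        # A key without "=" is a flag; otherwise keep the raw comma-separated values.
        fields[key] = value.split(",") if sep else True
    return fields


def drop_all_missing(info):
    """Remove INFO entries whose values are all ".", e.g. "II1=." or "II2=.,.".

    Entries with at least one non-missing value, and flags, are kept unchanged.
    """
    kept = []
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        if sep and all(v == "." for v in value.split(",")):
            continue  # explicitly all-missing: drop the key entirely
        kept.append(item)
    # An empty INFO column is written as "." in VCF.
    return ";".join(kept) if kept else "."


explicit = parse_info("II1=.;II2=.,.;FLAG")  # keys present, values missing
omitted = parse_info("FLAG")                 # keys not present at all
cleaned = drop_all_missing("II1=.;II2=.,.;FLAG")
```

Here `explicit` still contains the `II1` and `II2` keys while `omitted` does not, which is the distinction bcftools appears to preserve. After `drop_all_missing`, both records carry the same INFO string, so the distinction no longer needs to round-trip.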