core dump error on import #29
Comments
Thanks for reporting this, @peterdfields. The problem appears to be an assertion I've placed in the computation of Hardy-Weinberg equilibrium. For some reason the offending line has an allele that is not biallelic (encodings 0, 1, 4, or 5 in my internal format). This should not happen, as non-biallelic, non-diploid sites should be filtered out. It must be some edge case I have not covered. Could you find the offending line and report it to me? By email if it is not public. |
Hi @mklarqvist. Given that an allele remains non-biallelic, I have to assume it has somehow made it past GATK SelectVariants and vcftools filtering for biallelic SNPs. Is there a way to force tomahawk to output the line that has the error? I tried with a version of the program built with |
@peterdfields I'll update the error message to reflect the offending variant line number and offending allele encoding. This is something I should've done in the first place. |
@mklarqvist would there be an alternative method to localize the problem line? |
@peterdfields A crude way would be like a manual binary search:
I'm digging through the code to find the problem. |
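A rough sketch of what such a manual bisection could look like, written in Python. This is not the maintainer's own procedure. The `fails` callable is a hypothetical stand-in: in practice it would write the candidate lines to a temporary file, run the import on it, and report whether the run crashed. The sketch assumes that the header alone imports cleanly and that a prefix of the file fails as soon as it contains a bad record.

```python
from typing import Callable, List, Optional, Sequence


def first_failing_record(
    header: Sequence[str],
    records: Sequence[str],
    fails: Callable[[List[str]], bool],
) -> Optional[int]:
    """Return the index of the first record that makes the import fail.

    `fails` is a hypothetical check supplied by the caller; it receives
    header + a prefix of the records and returns True if the import fails.
    Returns None if the complete file passes.
    """
    if not records or not fails(list(header) + list(records)):
        return None
    # Invariant: the prefix of length `lo` passes, the prefix of length `hi` fails.
    lo, hi = 0, len(records)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(list(header) + list(records[:mid])):
            hi = mid
        else:
            lo = mid
    return hi - 1


# Toy data; the predicate below only imitates a crashing import.
header = ["##fileformat=VCFv4.2",
          "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2"]
records = [
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0/1\t1/1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/0\t0/1",
    "1\t300\t.\tG\tA\t.\tPASS\t.\tGT\t./.\t0/1",
    "1\t400\t.\tT\tC\t.\tPASS\t.\tGT\t1/1\t0/1",
    "1\t500\t.\tA\tC\t.\tPASS\t.\tGT\t0/1\t0/0",
]


def toy_fails(lines: List[str]) -> bool:
    # Stand-in for "write lines to a file, run the import, check for a crash".
    return any("./." in line for line in lines if not line.startswith("#"))


culprit = first_failing_record(header, records, toy_fails)
if culprit is not None:
    print("first offending record:", records[culprit])
```

Each round halves the remaining range, so even a file with millions of records needs only a couple of dozen import attempts. If several records are bad, this finds the first one; remove it and repeat to find the next.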
@mklarqvist Okay, I followed your advice about doing the manual binary search. The line from the vcf that is causing the error is as follows:
|
@peterdfields Thanks for helping me get to the bottom of this. Very helpful! I am investigating this. |
@mklarqvist no worries! I'm looking forward to exploring tomahawk. |
Hi @mklarqvist. Any news about this issue? Thank you again for your help. |
Hello @peterdfields. Sorry for the delay in resolving this. I returned today from a trip abroad and will pick up where I left off. Thanks for your patience! |
Hi @mklarqvist. Okay, great. Thank you again for your assistance! |
Hi @mklarqvist. Any luck on tracking down this issue? |
Hey @mklarqvist, I got the same issue. I think the problem is related to missing data, or at least to './.' in the GT field. Replacing missing data with random genotypes, or removing loci with any missing data, solves the problem with import in my case. |
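To check whether missing genotypes are what a given file has in common with this report, a small Python helper along these lines can flag the affected records. It is only a sketch: it assumes an uncompressed, tab-delimited VCF with the standard FORMAT column in position 9, and the function name is made up for illustration.

```python
def has_missing_genotype(record: str) -> bool:
    """Return True if any sample in a VCF data line has a missing GT allele."""
    fields = record.rstrip("\n").split("\t")
    if len(fields) < 10:
        return False  # no FORMAT/sample columns on this line
    fmt_keys = fields[8].split(":")
    if "GT" not in fmt_keys:
        return False
    gt_idx = fmt_keys.index("GT")
    for sample in fields[9:]:
        parts = sample.split(":")
        # Trailing FORMAT fields may be dropped; treat an absent GT as missing.
        gt = parts[gt_idx] if gt_idx < len(parts) else "."
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            return True
    return False


def report_missing(lines):
    """Yield (1-based line number, line) for data lines with a missing genotype."""
    for number, line in enumerate(lines, start=1):
        if not line.startswith("#") and has_missing_genotype(line):
            yield number, line
```

For a file on disk, `for n, rec in report_missing(open("calls.vcf")): print(n)` would list the affected line numbers (`calls.vcf` is a placeholder name). Half-missing calls such as `./1` are flagged as well.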
I have the same problem too. Yes, removing sites with ANY missing data will resolve the situation, but this is not really a practical approach for my dataset. Thanks |
Same problem here... Is there a different way we can encode missing data so that it can be captured? |
Same problem here ... with command line:
And, besides that,
And the examples I found all use the dash form, as in -m 0.xx -h 0.001 etc. |
Hi. Is there an update for this problem? I have the same problem as well. Thanks |
I am getting the same -m error on Ubuntu 20.
|
I think I figured out that for 'import' the -m has changed to -n and -h is now -H. But I am getting the same core dump.
|
I also get this
error. I don't know what is not working. |
Hi,
I'm trying to import a bcf file that was generated by first converting a GATK vcf to bcf with bcftools. I'm getting the following error:
The SNPs seem to meet the expectations of the program. I'm not entirely sure what's going wrong here. Please let me know if additional info would be useful.