Outputs from only *.vcf file #95
Comments
Hey, thank you for the request! If you still think it's a possibility that can help you, I would go forward with this on an extra branch. I can't promise anything, but if it works as expected I'd try to get a version working there.
Hi. Thanks for offering that. It's a great way to go because we do have broad QC numbers in some of the outputs that are provided. I will manually check a few items:
I suggest proceeding, and I'll ask that company for more detail. Thanks!
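Hello, here is a little update: I am currently working on enabling direct vcf input. However, there are some INFO fields that need to be present, namely Allele Frequency (AF) and Depth (DP). The information from both those fields is required by the downstream analysis. I tried running the pipeline on the vcf files you provided, but they are lacking that info. Also, when I try to work around this, no nucleotide info gets found by vep, which I am still investigating. So if you (still) want to use the pigx-sars-cov-2 pipeline to analyse your data, you would probably need to get your variants called with lofreq. There is a version that should be capable of producing variant reports from lofreq vcf output alone on branch predefine_file_io (https://github.com/jonasfreimuth/pigx_sars-cov-2/tree/predefine-rule-io) in my personal repo (not thoroughly tested at all).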
Hi, thanks Jonas. We were able to eventually obtain some raw fastq, but not for the majority of our weekly assessments. I’m going to try the predefine_file_io (https://github.com/jonasfreimuth/pigx_sars-cov-2/tree/predefine-rule-io) option that you detailed below.
From: Jonas Freimuth
Sent: Sunday, July 10, 2022 2:31 PM
Subject: Re: [BIMSBbioinfo/pigx_sars-cov-2] Outputs from only *.vcf file (Issue #95)
Hello,
here is a little update: I am currently working on enabling direct vcf input. However, there are some INFO fields that need to be present, namely Allele Frequency (AF) and Depth (DP). The information from both those fields is required by the downstream analysis. I tried running the pipeline on the vcf files you provided, but they are lacking that info. Also, when I try to work around this, no nucleotide info gets found by vep, which I am still investigating.
So if you (still) want to use the pigx-sars-cov-2 pipeline to analyse your data, you would probably need to get your variants called with lofreq. There is a version that should be capable of producing variant reports from lofreq vcf output alone on branch predefine_file_io (https://github.com/jonasfreimuth/pigx_sars-cov-2/tree/predefine-rule-io) in my personal repo (not thoroughly tested at all).
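For anyone wanting to check their own files against the requirement described above, here is a minimal sketch in Python that reports which records lack the AF or DP keys in the INFO column. It is not part of the pipeline; the function name `missing_info_keys` and the example records are made up for illustration, and it only inspects the eighth (INFO) column of a tab-separated VCF.

```python
REQUIRED_INFO_KEYS = ("AF", "DP")  # the two INFO fields named in the comment above


def missing_info_keys(vcf_lines, required=REQUIRED_INFO_KEYS):
    """Map (chrom, pos) to the required INFO keys that a record lacks.

    Hypothetical helper, not taken from pigx-sars-cov-2.
    """
    problems = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, meta lines and the column header
        fields = line.split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record; ignored in this sketch
        info = fields[7]
        # INFO is a ';'-separated list of KEY=VALUE pairs or bare flags
        keys = {
            entry.split("=", 1)[0]
            for entry in info.split(";")
            if entry and entry != "."
        }
        missing = [key for key in required if key not in keys]
        if missing:
            problems[(fields[0], int(fields[1]))] = missing
    return problems


# Made-up records for illustration only; they are not from the files in this issue.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "NC_045512.2\t241\t.\tC\tT\t100\tPASS\tDP=500;AF=0.98",
    "NC_045512.2\t3037\t.\tC\tT\t100\tPASS\tADP=300",
]

if __name__ == "__main__":
    for (chrom, pos), missing in missing_info_keys(example).items():
        print(f"{chrom}:{pos} is missing INFO keys: {', '.join(missing)}")
```

An empty result means every record carries both keys. It says nothing about whether the values are usable downstream.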
FYI, development of that branch will now take place on predef-rule-io-dev, due to git reasons
This commit has a lot of consequences:
* It changes how synonymous AA mutations are coded in the output. Previously the format was X123-, now it is X123X.
* The code that deals with deletions now gets executed reliably. The previous condition was misspecified and would almost never work (specified as `!(any(is.na(...)))` whereas `any(!is.na(...))` would be correct).
* The names in the df `full` created in `detectable_deletions()` don't match up with the colnames passed into the function (gene_mut is missing from `full`). This error is fixed because `detectable_deletions()` will no longer be called, and it will be removed in a future commit.

Note: There are no deletions (that I found) anywhere in the results section of the project dir. I only got some from running the pipeline on the files provided in #95.
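To illustrate why the two conditions in the commit message behave differently, here is a small Python analogue. The original expressions are R; `None` stands in for `NA`, and the function names and sample values are made up for this sketch.

```python
def old_condition(values):
    # Analogue of R's !(any(is.na(x))): true only when NO value is missing.
    return not any(v is None for v in values)


def new_condition(values):
    # Analogue of R's any(!is.na(x)): true when AT LEAST ONE value is present.
    return any(v is not None for v in values)


# Hypothetical column in which only some rows carry a deletion entry.
mixed = [None, "deletion_a", None]
all_missing = [None, None]
all_present = ["deletion_a", "deletion_b"]

if __name__ == "__main__":
    for name, column in [
        ("mixed", mixed),
        ("all_missing", all_missing),
        ("all_present", all_present),
    ]:
        print(name, old_condition(column), new_condition(column))
```

The two checks agree when every value is missing or every value is present. They differ only for a mixed column, where the old form is false and the new form is true, which fits the observation that the deletion code would almost never run.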
The changes are now merged into main in #142. But I have no updates on getting nucleotide info from the files @sinclairify provided.
Thank you @jonasfreimuth. Our governmental partners are working with the sequencing company (Fulgent) to provide this. They had some staff changes and lost track of our progress. We will keep trying.
I'd like to see how to generate mutation and lineage reports using only a *.vcf as input. The commercial lab that provides our county wastewater sequencing services provides a *.vcf, but doesn't provide the raw fastq file, in an effort to protect their company's proprietary primers. They don't provide wastewater sequencing reports, and they process the extracted RNA (from wastewater) as a clinical sample. The result is a few different files, and I'd like to use pigx to generate some lineage and mutation charts.
The *.vcf is generated from our commercial lab after they:
We have a few outputs from them <pangolin_##_trimmed.csv>, <##_ivar_consensus_trimmed_qual.fa>, <##ivar_consensus_trimmed_qual.txt>, and <VarScan##_trimmed.vcf>.
I'm providing some files that they returned to us in late November. I'm assuming the *.vcf is the best bet. Any help would be appreciated.
SH7951.zip