Question regarding align_trim.pl #99
Comments
Hi Alberto, align_trim is designed for full-amplicon, unpaired read data and relies on that type of data to function properly. There is a version of it that I have adapted to work with paired and fragmented Illumina data, which you can find here, but I'm still validating it, so it comes with absolutely zero guarantees. |
Hi @BioWilko, Based on your comment above, am I correct in assuming that artic's Thanks, |
That is correct; however, I also have a version of align_trim that I've modified to work with fragmented nanopore data, available here, if you utilise the |
Thanks for the reply. I gave the script a crack by (1) dropping it in as a replacement in a clone of the artic-ncov2019 conda environment, and (2) creating a new conda environment after cloning your fork of the fieldbioinformatics repo. The
I get a bit lost in some of the functions, so I'm not sure what the issue is. |
Would you be willing to share the fastqs for the data you used here? It would be super handy for getting to the bottom of the issue. Edit: Oh, you used conda; that will cause problems since I haven't packaged the fork. You should follow the compile-from-source instructions to use this fork properly. |
Apologies for my slow reply. I previously installed your fork with the following steps:
Forgive me, but I can't see any different instructions for compiling from source, have I missed something? I checked for the
I used the following command to trim the primers:
Same error as above. However, when I tried to replicate this problem today on a different computer, I could not do so. When I follow the link you provided above, there is no longer a In any case, the problematic reads and bamfile can be found here: https://unsw-my.sharepoint.com/:f:/g/personal/z3533036_ad_unsw_edu_au/ErSNhdfncMRDvddJFQhV0N4BPB4Y3LRp9jfsaCFIUagz6g?e=M4o0Q8 |
Sorry, that's on me entirely; I had to revert my repo for a version bump PR. However, Chris Wright at ONT has written an amplicon-overlap-based version of align_trim which should suit your purposes. You can find it here: https://github.com/epi2me-labs/fieldbioinformatics/tree/align_trim |
Hi, I think I might have the same problem. Tried to use the version you linked to from epi2me, but got the same problem. |
Fieldbioinformatics doesn't support circular genomes (yet), but I don't know what the specific issue would be in this case; my assumption would be that only amplicons crossing the reference boundaries would be an issue. If you're having trouble with the epi2me version of fieldbioinformatics, I suggest you raise the issue there, since it differs in a few key ways. If all you want is rapid barcoding support, I suggest you try the pre-release of fieldbioinformatics 1.4.0, which supports rapid barcoded data but not circular genomes. |
Thanks @BioWilko for the quick response! I managed to run the 1.4.0 release, but still have similar issues. More than 60000 reads map in the first round, but it seems that everything is removed by align_trim because the reads are not correctly paired. Could it be something wrong with my primer file? |
It sounds like it. If you can share your reads, reference, and bed file, I can have a look and give you a better answer. |
Thank you so much for the help! |
This is the command I used and the 1.4.0 release: |
Ah, here's your problem, the bedfile is malformed:
The number in the middle of the primer name is the amplicon number, and they need to match e.g. If you want to see the bedfile specifications, you can find them here: https://github.com/ChrisgKent/primal-page. Fieldbioinformatics supports any of these formats so long as one is used consistently. |
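For anyone else formatting a custom scheme by hand, a quick sanity check along these lines may help catch mismatched amplicon numbers before running the pipeline. This is only a minimal sketch and not part of fieldbioinformatics: the function name `unpaired_amplicons` is hypothetical, and it assumes tab-separated bed lines whose fourth column holds a primer name shaped like `<scheme>_<amplicon number>_<LEFT|RIGHT>` with an optional suffix. Check the specification linked above for the format your scheme actually uses.

```python
import re
from collections import defaultdict

# Assumed primer name layout: <scheme>_<amplicon number>_<LEFT|RIGHT>[_suffix]
PRIMER_NAME = re.compile(
    r"^(?P<scheme>.+)_(?P<amplicon>\d+)_(?P<side>LEFT|RIGHT)(?:_.+)?$"
)


def unpaired_amplicons(bed_lines):
    """Return {amplicon number: sides found} for amplicons lacking a LEFT or RIGHT primer."""
    sides = defaultdict(set)
    for line in bed_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < 4:
            raise ValueError(f"expected at least 4 tab-separated columns: {line!r}")
        match = PRIMER_NAME.match(fields[3])
        if match is None:
            raise ValueError(f"primer name does not fit the assumed layout: {fields[3]!r}")
        sides[int(match["amplicon"])].add(match["side"])
    # An amplicon is complete only when both sides share the same number.
    return {n: s for n, s in sorted(sides.items()) if s != {"LEFT", "RIGHT"}}


# Made-up example: the first RIGHT primer is numbered 2 instead of 1.
example = [
    "ref\t10\t30\tcustom_1_LEFT\t1\t+",
    "ref\t400\t420\tcustom_2_RIGHT\t1\t-",
    "ref\t380\t400\tcustom_3_LEFT\t2\t+",
    "ref\t780\t800\tcustom_3_RIGHT\t2\t-",
]
problems = unpaired_amplicons(example)
```

In the made-up example, amplicons 1 and 2 are each reported with only one side, which is the kind of mismatch described above.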
Ah... I see. Thanks!! These are custom primers we designed without primalscheme. I tried to format everything correctly, but didn't notice this 😳 |
The pipeline works now |
Unfortunately, it still won't be able to handle amplicons crossing the reference boundary; that functionality will hopefully be added soon! |
Hi all,
I am using the align_trim.py script to remove primers from the alignment. In my case, we amplified SARS-CoV-2 with the V3 primer set and then sequenced on Illumina (MiSeq).
I was expecting align_trim.py to trim primers from aligned reads only if a read starts or ends at the same position where a primer starts or ends, and the read and the primer are in the same orientation (I think it makes no sense to remove a reverse primer from a forward read, as this forward read can come from an overlapping amplicon). But after running the program, we see that every position in the alignment where a primer maps is removed, along with all bases up to the end of the read (when the read is forward and the primer is reverse, for instance), among other strange things that would take long to explain...
Here is an image of a couple of reads (in a very sad region, with very few mapped reads)
So, I do not know whether I am missing something in the program, or whether this behavior is expected for nanopore reads, with which I have no experience.
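To make the expectation above concrete, it can be written as a small predicate. This is only a sketch of the rule as described in this post, not of how align_trim.py actually behaves; the `Primer` class and `should_trim` function are hypothetical names, and coordinates are assumed to be 0-based with an exclusive end.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # 0-based, exclusive (assumed convention)
    strand: str  # "+" for a forward primer, "-" for a reverse primer


def should_trim(read_start, read_end, read_is_reverse, primer):
    """Expected rule: trim only when orientation matches and the read edge sits on the primer edge."""
    if primer.strand == "+":
        # A forward primer is trimmed only from a forward read starting at the primer start.
        return (not read_is_reverse) and read_start == primer.start
    # A reverse primer is trimmed only from a reverse read ending at the primer end.
    return read_is_reverse and read_end == primer.end


fwd_primer = Primer(start=30, end=54, strand="+")
rev_primer = Primer(start=385, end=410, strand="-")

forward_read_hits_rev = should_trim(30, 410, False, rev_primer)  # forward read, reverse primer
reverse_read_hits_rev = should_trim(260, 410, True, rev_primer)  # reverse read, reverse primer
```

Under this rule a forward read that merely overlaps a reverse primer would be left untouched, which is the behavior I expected to see.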
Best,
Alberto