Bam support #44

Open
endrebak opened this issue Oct 7, 2016 · 2 comments

Comments

endrebak commented Oct 7, 2016

I think supporting bam input directly is a bad idea, for the reasons below. Feel free to suggest solutions:

You will probably rerun the analyses many times. Having to run a time-consuming conversion step (the most time-consuming one in the algorithm) each time would be silly. It is also I/O-intensive, so parallel execution would not help much.

I am not just writing epic but also a lot of helper scripts for ChIP-Seq and differential ChIP-Seq. Adding a bam-to-bed conversion step to every one of these scripts would be a waste.

Also, where should I store the temporary bed files? Overflowing /tmp/ dirs is an eternal issue.

If I were to stream the data to bed using pipes, epic would not be fast anymore. I get a massive speedup from multiple cores if I use text files, presumably because the operating system already has the file cached in memory. This is not the case if I start the pipe with bamToBed blabla | ...

There are many things that can go wrong when converting bam to bed, due to wonky bam files. I would get a bunch of GitHub issues about "epic not being able to use my bam files" if I were to silently convert to bed within my programs.

endrebak commented Oct 7, 2016

I guess the best way of adding bam support would be to do the conversion before running the script, with a warning that I think using bams instead of beds is suboptimal. If the conversion fails, I'll throw an exception informing the user that the onus is on them to convert their wonky bam files to bed.
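A minimal sketch of what that could look like, not epic's actual code. The names `convert_bam_to_bed` and `BamConversionError` are hypothetical, and the default command assumes the bedtools `bamToBed -i file.bam` tool is on the PATH and writes bed lines to stdout:

```python
import subprocess
import warnings


class BamConversionError(Exception):
    """Hypothetical exception type; raised when the conversion fails."""


def convert_bam_to_bed(bam_path, bed_path, converter=("bamToBed", "-i")):
    """Convert bam_path to bed_path up front, warning that bam input is suboptimal.

    `converter` is the command prefix; the bam path is appended to it and the
    command's stdout is written to bed_path.
    """
    warnings.warn(
        f"{bam_path}: using bam instead of bed is suboptimal; converting first."
    )
    try:
        with open(bed_path, "wb") as out:
            # check=True turns a non-zero exit status into CalledProcessError.
            subprocess.run([*converter, str(bam_path)], stdout=out, check=True)
    except (OSError, subprocess.CalledProcessError) as err:
        # OSError also covers the converter not being installed.
        raise BamConversionError(
            f"Could not convert {bam_path} to bed. "
            "Please convert the file to bed yourself and rerun."
        ) from err
    return bed_path
```

Raising a dedicated exception keeps the failure loud, so a broken bam file is reported as a conversion problem rather than as a bug further down the pipeline.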

@endrebak

My solution: if an input file is called path/to/file.bam, create a file path/to/file.bed next to it. Do not delete it afterwards.
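The naming rule can be sketched as follows. This is an illustration under assumptions, not epic's implementation: `resolve_input` is a hypothetical helper, and `convert` stands for whatever callable performs the bam-to-bed conversion. Reusing an existing bed file is my reading of "do not delete it afterwards", since it lets reruns skip the conversion:

```python
from pathlib import Path


def resolve_input(path, convert):
    """Map path/to/file.bam to path/to/file.bed, converting only when needed.

    `convert(bam, bed)` is a caller-supplied function that writes the bed file.
    Non-bam inputs are returned unchanged.
    """
    path = Path(path)
    if path.suffix != ".bam":
        return path
    bed = path.with_suffix(".bed")
    if not bed.exists():
        # First run: create the bed file next to the bam file.
        convert(path, bed)
    # The bed file is never deleted, so later runs reuse it.
    return bed
```

Writing the bed file next to its bam file also sidesteps the question of where temporary files should live.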
