These ideas are not in conflict with scaling work out across processors and servers, and they apply even to messy data. There was a lot of effort in the Hadoop ecosystem to identify compression codecs that were splittable (our friend Tom White wrote about the topic in his book) and struck the right tradeoff between computation and storage efficiency (Snappy, for example, was an improvement at the time). Much of the work since then has gone into using instruction set extensions to build hardware-friendly codecs, and into algorithms that operate directly on compressed data, an idea discussed as far back as "Data compression and database performance" (1991).
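To make the splittability point concrete, here is a minimal Python sketch of the idea: compress records in independent blocks and keep an offset index, so each worker can seek to and decompress only its own blocks. The block size, index layout, and record framing here are all illustrative assumptions, not any particular codec's on-disk format.

```python
import zlib
from dataclasses import dataclass

BLOCK_RECORDS = 1000  # records per independently compressed block (illustrative)

@dataclass
class Block:
    offset: int   # byte offset of the compressed block in the file
    length: int   # compressed length in bytes

def write_blocks(records, path):
    """Compress records in independent blocks so each block can be read alone."""
    index = []
    with open(path, "wb") as out:
        for i in range(0, len(records), BLOCK_RECORDS):
            chunk = "\n".join(records[i:i + BLOCK_RECORDS]).encode()
            comp = zlib.compress(chunk)
            index.append(Block(out.tell(), len(comp)))
            out.write(comp)
    return index  # a real format would persist this index alongside the data

def read_block(path, block):
    """A worker touches only the bytes of its assigned block."""
    with open(path, "rb") as f:
        f.seek(block.offset)
        data = zlib.decompress(f.read(block.length))
    return data.decode().split("\n")

if __name__ == "__main__":
    recs = [f"chr1\t{pos}\tA\tG" for pos in range(1, 5001)]  # toy variant-like rows
    idx = write_blocks(recs, "variants.blocks")
    # each block can be handed to a different process or server and read independently
    print(len(idx), "blocks;", len(read_block("variants.blocks", idx[2])), "records in block 2")
```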
Loh et al. argue for the idea of compressive genomics and follow up with the idea of compressive acceleration.
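As a toy sketch of the general idea behind compressive acceleration, exploiting redundancy by searching cluster representatives first and only descending into clusters whose representative is close to the query, consider the following. The greedy clustering, Hamming distance, and thresholds are illustrative assumptions for this sketch, not Loh et al.'s actual algorithm.

```python
from collections import defaultdict

def hamming(a, b):
    """Toy distance: mismatches for equal-length strings, otherwise a large penalty."""
    return sum(x != y for x, y in zip(a, b)) if len(a) == len(b) else max(len(a), len(b))

def cluster(seqs, radius=2):
    """Greedy clustering: each sequence joins the first representative within `radius`."""
    reps, members = [], defaultdict(list)
    for s in seqs:
        for r in reps:
            if hamming(s, r) <= radius:
                members[r].append(s)
                break
        else:
            reps.append(s)
            members[s].append(s)
    return reps, members

def search(query, reps, members, radius=2, max_dist=1):
    """Coarse pass over representatives, fine pass only inside clusters that can contain a hit."""
    hits = []
    for r in reps:
        # triangle inequality: a member within max_dist of the query can be at most
        # radius + max_dist from its representative, so other clusters are skipped
        if hamming(query, r) <= radius + max_dist:
            hits.extend(s for s in members[r] if hamming(query, s) <= max_dist)
    return hits

if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "ACGTTCGT", "TTTTCCCC", "TTTTCCCA"]
    reps, members = cluster(reads)
    print(search("ACGTACGT", reps, members))
```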
These are attractive ideas, but they only work in certain situations and on cleaned-up data. We will always start out with messy variant calls, and we need a software stack and data structures to work with them.