Consider despecializing AbstractFormattedIOs #91

Open
jakobnissen opened this issue Jan 1, 2023 · 3 comments

Comments

@jakobnissen
Member

With Julia 1.9, package image caching significantly reduces latency, because the compiled code of entire functions can now be cached.
FASTX's readers and writers are parameterized by the underlying IO type. This means that each different underlying IO type causes the entire Automa-generated code to be recompiled, needlessly. This takes about half a second.

We might consider somehow despecializing the readers and writers (AbstractFormattedIOs, AFIOs) on their underlying IO. This will cause a dynamic dispatch whenever the AFIOs run out of buffer and need to query their underlying IOs, but these operations are already slow, so I suspect the impact would be minimal (earlier tests of mine showed an insignificant slowdown when removing the type parameter of FASTAReader completely). Access to the underlying IO would become type unstable, but the parsing code could then be precompiled once instead of once per IO type.
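To make the trade-off concrete, here is a minimal sketch of the two layouts. All type and function names below (`SpecializedReader`, `DespecializedReader`, `fill_buffer!`) are hypothetical and chosen for illustration; they are not FASTX's or Automa's actual definitions, and the 4096-byte chunk size is an arbitrary assumption.

```julia
# Hypothetical types for illustration only; not FASTX's real definitions.

# Specialized: each distinct IO type gives a distinct concrete reader type,
# so every method taking the reader is compiled once per IO type.
mutable struct SpecializedReader{T<:IO}
    io::T
    buffer::Vector{UInt8}
end

# Despecialized: the field is abstractly typed, so there is exactly one
# concrete reader type no matter which IO it wraps.
mutable struct DespecializedReader
    io::IO
    buffer::Vector{UInt8}
end

# The only place that touches `reader.io`. The call to `readbytes!` is a
# dynamic dispatch, but it runs once per buffer refill, not once per byte.
function fill_buffer!(reader::DespecializedReader)
    resize!(reader.buffer, 4096)               # assumed chunk size
    n = readbytes!(reader.io, reader.buffer)   # dispatches on the runtime IO type
    resize!(reader.buffer, n)
    return n
end

spec_a = SpecializedReader(IOBuffer("ACGT"), UInt8[])
spec_b = SpecializedReader(devnull, UInt8[])
desp_a = DespecializedReader(IOBuffer("ACGT"), UInt8[])
desp_b = DespecializedReader(devnull, UInt8[])

# The specialized readers have different types; the despecialized ones share one.
same_specialized = typeof(spec_a) === typeof(spec_b)
same_despecialized = typeof(desp_a) === typeof(desp_b)
nread = fill_buffer!(desp_a)
```

In this layout, code that only works on `reader.buffer` stays type stable and can be cached in the package image; only `fill_buffer!` pays for the abstractly typed field.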

@CiaranOMara
Member

CiaranOMara commented Jan 2, 2023

I think this is related. I was doing some work on XAM last year, following @jonathanBieler's PR to improve index handling. In that work I too found myself questioning the Reader parametrisation.

In the case of XAM.BAM, the Reader needs to know that it has a BGZFStream, but at the Reader level, being aware of or parametrised on the underlying stream is not useful when writing methods. I thought, as I think you are saying here in part, that it is the BGZFStream that should specialise on the underlying IO it wraps, not the Reader.

Similarly, for the XAM.SAM.Reader, following your suggestion here, I suppose it only needs to know it has an IO stream, and there is no need to parametrise the Reader on the IO stream type.

As an aside, I think the parametrisation that is useful at the Reader level is the type of indexer (BAI, FAI, etc.) used.
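A rough sketch of what parametrising on the indexer rather than on the IO could look like. Everything here is hypothetical: `AbstractIndex`, `NoIndex`, `ToyIndex`, `IndexedReader`, `has_index` and `seekrecord!` are made-up names for illustration, and `ToyIndex` is a stand-in that maps record names to byte offsets, not a real BAI or FAI implementation.

```julia
# Hypothetical names throughout; not the XAM or FASTX API.
abstract type AbstractIndex end

struct NoIndex <: AbstractIndex end

# Toy stand-in for a real index: record name => zero-based byte offset.
struct ToyIndex <: AbstractIndex
    offsets::Dict{String,Int}
end

# The reader is parametrised on the index type only; the IO field is abstract.
mutable struct IndexedReader{I<:AbstractIndex}
    io::IO
    index::I
end

# Methods can dispatch on the index type without knowing the IO type.
has_index(::IndexedReader{NoIndex}) = false
has_index(::IndexedReader) = true

function seekrecord!(reader::IndexedReader{ToyIndex}, name::AbstractString)
    seek(reader.io, reader.index.offsets[name])
    return reader
end

data = ">a\nAC\n>b\nGT\n"
plain = IndexedReader(IOBuffer(data), NoIndex())
indexed = IndexedReader(IOBuffer(data), ToyIndex(Dict("a" => 0, "b" => 6)))

seekrecord!(indexed, "b")
header = readline(indexed.io)
```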

@CiaranOMara
Member

@jakobnissen, could you lay out a few quick structs to show and confirm the implementation you have in mind?

@jakobnissen
Member Author

I'll need to look through the BioJulia stack to figure out where this change is suitable. Maybe in Automa - maybe in BioGenerics - or maybe in the individual parser packages, like FASTX. When I get some free time I'll look at it.
