Agree on outline #2
Comments
Yep, overall narrative sounds good to me. We should break things down into some sections, I guess, and then sketch out the outlines of each of these sections as separate issues? This is also determined by the journal and article format choice (#1), as some journals have pretty weird sectioning requirements.
Here's another very rough possible starting point for an outline, similar to Eric's: https://docs.google.com/document/d/1TKSS--28ErsGjdyH6AHXnAcB7ngwCMXm5JK-u0dA_3c/edit?usp=sharing
eLife recommends the following outline:
Pulling some previous discussions that could be useful here:
Here's an outline that @tomwhite, @benjeffery and I came up with last week:
Then within results we have
What do we think? The first section (pydata for genomics) gets directly to the point of discussing sgkit's design principles and data structures, letting the intro set the scene of the software infrastructure around us. In terms of display items, we would refer to the Scaling and compute (#7) in the pydata for genomics section, plus the . We probably don't need display items for the rest of the paper. The PopGen, StatGen, QuantGen (and PhyloGen?) sections are a way to allow readers interested in just those areas to skip in and see what sort of things sgkit can do, without having to trudge through API listings. We want to give one (or two) concrete examples showing useful things being done, giving indicative performance figures without getting bogged down in direct performance comparisons. It also gives us a space to quickly discuss the tools that people use and illustrate how fragmented the ecosystem is.
If we roughly agree on this outline I can make some more issues to track the different sections, and sketch out what we want to say in them.
Looks great to me, thanks for moving this forward!
I should have asked: how do we define stat, pop, and quant gen? I generally think of pop gen as variation without phenotype and stat gen as variation with phenotype. I’m not sure where that leaves quant, perhaps as the union of the two? If so, do we need to rename to
There probably isn't a good definition, but we can just do something pragmatic based on the user communities. PopGen people are mostly interested in evolutionary biology itself, StatGen mostly in applications to humans, and QuantGen mostly in applications to agriculture. The tools they use are mostly non-overlapping sets, I think.
Should this include a mention of trends in Python adoption? And/or why this is an important tailwind to ride given AI progress?
FWIW on the StatGen piece, I think #9 is a good template for that. I also think that would probably be a good place to touch on the potential power and relatively nascent state of pathway GWAS (gene-e), GWAS/ExWAS methods in general (e.g. REGENIE), some of the QC ops necessary to get there (HWE, pruning, filtering) and general purpose operations like those for creating LD matrices and kinship coefficients (pc-relate). @jeromekelleher I could outline some of those in more detail in a StatGen specific issue at some point if you or someone else (@hammer perhaps?) hasn't already done anything related to it. I'm not sure how this interacts with the
@eric-czech please do go ahead and create an issue to sketch out your thoughts on StatGen. Don't worry too much about how things fit into the overall structure, just get the key points that you think should get in there down in some form, and I'll bring it together into the document.
I'm going to close this as out-of-date now.
@eric-czech has an initial proposal at https://github.com/pystatgen/sgkit-publication/blob/main/content/01.outline.md.