2023-12-05 workshop debrief notes #18

Open
3 of 8 tasks
tavareshugo opened this issue Dec 13, 2023 · 1 comment

Comments

@tavareshugo
Collaborator

tavareshugo commented Dec 13, 2023

The timetable on the GDoc for this iteration is quite accurate (I adjusted it to match what actually happened, with the help of the recording).

Some ideas compiled here, which we can turn into issues at some point:

  • Add extra exercises for those that are quicker and/or more experienced
  • MaxQuant - add a section on this, as it is freely available. Conceptually it's similar to Proteome Discoverer (PD), but we can document its column names and their meanings, and give an example of how to import its output into R.
  • PD - document column meanings (can be done along with the task above, maybe a table matching the column names from the two packages and their meaning)
  • More visualisation examples
    • heatmaps
    • different dimred methods and/or clustering
    • emphasise MA plot more than volcano
  • GO analysis on non-model organisms
    • this is perhaps for another course, but we could look into adding some resources for alternative organisms
    • show how to download GO annotations from online sources into a local file
    • could also use the goseq package instead of clusterProfiler
    • another alternative is a Cytoscape-based analysis using the BiNGO extension
  • Experimental design
    • Extra set of slides at the end of the course talking more about experimental design and statistical power. Do this towards the end, so it comes after the whole workflow is covered and they understand what the analysis procedure looks like.
    • How to deal with more heterogeneous designs, e.g. batch effects
    • experimental design considerations specific to the type of technology being used
  • Missing values
    • how to deal with cases where some missing values are expected, so that the imputation accounts for them
    • currently there's no clean way of doing this, although Charlotte D may have some code to do this
  • need to clarify the correct direction of addAssayLinks()
    • possibly assayLinks can be added directly when using assayLink()
@TomSmithCGAT
Member

Following a meeting with @Charl-Hutchings, @lmsimp & @csdaw, we agreed the following:

  • MaxQuant: @csdaw will add a small section (box?), location TBD, on the MaxQuant columns and how to read them into QFeatures, noting any parts of the workflow for which there are no equivalent columns, e.g. filtering.
  • Visualisation: @TomSmithCGAT will add a (🦶)heatmap at the end of the stats section
  • GO:
    • Risks getting outside of scope
    • Maybe a short section of text to say that GO annotations for organisms other than human/mouse are available, usually from species-specific DBs.
    • Recommend goseq, with a box to justify why (@TomSmithCGAT)
  • Experimental design: Leave dedicated time for this at the end of day 2. Ask participants to put any design Qs in a separate GDoc, with a reminder on day 1. This helps us prepare, identify common Qs and generalities, and pick out Qs which are very specific and best answered individually after the course.
  • Missing values: Slim down the section to focus on TMT. See "Add comparison table for LFQ versus labelled" (#19) regarding a separate section on LFQ and where it would differ from the TMT workflow presented.

From memory, we didn't discuss extra exercises, PD column details or assayLinks.
