MRCToxBioinformatics · TomSmithCGAT · Jul 2, 2024 · Jul 2, 2024
diff --git a/Markdowns/TMT_phospho.Rmd b/Markdowns/TMT_phospho.Rmd
@@ -51,7 +51,7 @@ Alongside the published total data shown above, we also performed a yeast/human
 
 
 |Tag   | Total | Phopsho | Ratio |
-|------|-------|---------|-------|
+|:----:|:-----:|:-------:|:-----:|
 | 126  |1x     |1x       |1      |
 | 127N |1x     |1x       |1      |
 | 127C |1x     |1x       |1      |
@@ -214,7 +214,7 @@ psm_res <- list('Total'=psm.total, 'Phospho'=psm.phospho)
 
 
 Below, we assess the quantification distributions. There are no clear outlier samples or any other concerns.
-```{r}
+```{r, warning=FALSE, fig.height=5, fig.width=7, fig.fullwidth=TRUE}
 for(x in names(psm_res)){
 
   p <- psm_res[[x]] %>% log(base=2) %>% plot_quant() +
@@ -237,7 +237,7 @@ We want to remove low Signal:Noise (S:N) PSMs, since the quantification values w
 
 We use Note that where the signal:noise > 10, there are far fewer missing values.
 
-```{r, fig.height=5, fig.width=7, fig.fullwidth=TRUE, fig.cap="Missing values per PSM, in relation to the signal:noise ratio"}
+```{r, fig.height=5, fig.width=7, fig.fullwidth=TRUE}
 for(x in names(psm_res)){
 
     p <- psm_res[[x]] %>% plot_missing_SN(bins=40) +
@@ -269,7 +269,7 @@ psm_filt_sn_int <- psm_res %>% lapply(function(x){
 ```
 
 
-Below, we will also filter our PSMs using the Delta.Score, e.g the difference between the score for the top two hits. Where this is the difference between the top score in the MS2 spectrum matching and the second top score. For a mixed species sample, it's likely some peptides will be difficult to assign to the protein from the correct species, so we are being extra cautious here. For a typical phospho TMT proteomics experiment, it would likely be reasonable to use a more relaxed filter above for the interference (e.g 50%) and to not filter by Delta Score at all.
+Below, we will also filter our PSMs using the Delta.Score which is the difference between the top score in the MS2 spectrum matching and the second top score. For a mixed species sample, it's likely some peptides will be difficult to assign to the protein from the correct species, so we are being extra cautious here. For a typical TMT phosphoproteomics experiment, it would likely be reasonable to use a more relaxed filter for the interference in the code chunk above (e.g 50%)and to not filter by Delta Score at all.
 
 ```{r}
 

diff --git a/Markdowns/TMT_phospho.html b/Markdowns/TMT_phospho.html
diff --git a/Markdowns/TMT_phospho_stats.Rmd b/Markdowns/TMT_phospho_stats.Rmd
@@ -43,7 +43,7 @@ phospho_sites <- readRDS(here('results/phospho_sites.rds'))
 
 
 Below, we identify the total peptides that overlap each phosphosite. Note that each phosphosite may intersect multiple total peptides (due to missed cleavages) and each total peptide may intersect multiple phosphosites (due to multiple phosphosites per peptide).
-```{r}
+```{r, warning=FALSE}
 phospho_f <- phospho_sites %>% fData() %>% tibble::rownames_to_column('ID')
 total_f <- total %>% fData() %>% tibble::rownames_to_column('ID')
 
@@ -105,7 +105,7 @@ Inspect the combined quantification matrix.
 ```{r}
 head(all_combined_exprs) #
 ```
-
+We also need to create feature data and phenotype data (experimental information) for our combined quantification matrix. From these, we can then create an MSnSet to hold all the combined quantification data.
 ```{r}
 all_combined_fdata <- fData(phospho_sites[rownames(all_combined_exprs),])[
   ,c('Master.Protein.Accessions', 'ptm_position')]
@@ -136,7 +136,9 @@ pData(pairwise_comparison_combined_res)
 Below, we perform the limma analysis. First though, some clarification on the model we're using.
 
 We have two experimental variables:
+
 - type: phospho or total
+
 - condition: 1x or 2x spike in
 
 ```{r}
@@ -145,11 +147,17 @@ condition <- factor(pData(pairwise_comparison_combined_res)$condition, levels=1:
 print(paste(type, condition))
 ```
 
-When we use an model with the terms type, condition and type:condition, the interaction term captures the difference between the observed data and the simple additive model when the abundance is dependent upon just the separate effect of the type and condition variables and there is no combinatorial effect.
+A simple additive model would estimate independent effects for the `type` and `condition` variables
+
+
 ```{r}
 # model without interaction term
 print(model.matrix(~type+condition))
+```
 
+If we want to explore a possible combinatorial effect, we need to also include a `type:condition` interaction term. In our case, this captures the difference in abundance for phosphopeptides between the conditions, which is specific for phosphopeptides and not seen in the unmodified peptides.
+
+```{r}
 # model with an interaction term
 print(model.matrix(~type+condition+type:condition))
 

diff --git a/Markdowns/TMT_phospho_stats.html b/Markdowns/TMT_phospho_stats.html
@@ -496,9 +496,6 @@ <h3>Input data</h3>
 
   phospho_peptide_to_total_peptides[[row$ID]] &lt;- total_peptide_ids
 }</code></pre>
-<pre><code>## Warning: NAs introduced by coercion
-## Warning: NAs introduced by coercion
-## Warning: NAs introduced by coercion</code></pre>
 <p>Now, how many overlapping features. Note that for the phospho and
 total to be considered intersecting, there must be at least one total
 peptide which overlaps the phosphosite. Out of the 666 phosphosites,
@@ -572,6 +569,9 @@ <h3>Input data</h3>
 ## O60841_164                7.4      12.7
 ## O60841_214               63.7      56.5
 ## O75554_258|262           81.6     102.5</code></pre>
+<p>We also need to create feature data and phenotype data (experimental
+information) for our combined quantification matrix. From these, we can
+then create an MSnSet to hold all the combined quantification data.</p>
 <pre class="r"><code>all_combined_fdata &lt;- fData(phospho_sites[rownames(all_combined_exprs),])[
   ,c(&#39;Master.Protein.Accessions&#39;, &#39;ptm_position&#39;)]
 all_combined_fdata$n_total &lt;- sapply(rownames(all_combined_exprs),
@@ -644,19 +644,19 @@ <h3>Input data</h3>
 <h3>Run limma</h3>
 <p>Below, we perform the limma analysis. First though, some
 clarification on the model we’re using.</p>
-<p>We have two experimental variables: - type: phospho or total -
-condition: 1x or 2x spike in</p>
+<p>We have two experimental variables:</p>
+<ul>
+<li><p>type: phospho or total</p></li>
+<li><p>condition: 1x or 2x spike in</p></li>
+</ul>
 <pre class="r"><code>type &lt;- factor(pData(pairwise_comparison_combined_res)$type, level=c(&#39;total&#39;, &#39;phospho&#39;))
 condition &lt;- factor(pData(pairwise_comparison_combined_res)$condition, levels=1:2)
 print(paste(type, condition))</code></pre>
 <pre><code>##  [1] &quot;phospho 1&quot; &quot;phospho 1&quot; &quot;phospho 1&quot; &quot;phospho 1&quot; &quot;phospho 2&quot; &quot;phospho 2&quot;
 ##  [7] &quot;phospho 2&quot; &quot;total 1&quot;   &quot;total 1&quot;   &quot;total 1&quot;   &quot;total 1&quot;   &quot;total 2&quot;  
 ## [13] &quot;total 2&quot;   &quot;total 2&quot;</code></pre>
-<p>When we use an model with the terms type, condition and
-type:condition, the interaction term captures the difference between the
-observed data and the simple additive model when the abundance is
-dependent upon just the separate effect of the type and condition
-variables and there is no combinatorial effect.</p>
+<p>A simple additive model would estimate independent effects for the
+<code>type</code> and <code>condition</code> variables</p>
 <pre class="r"><code># model without interaction term
 print(model.matrix(~type+condition))</code></pre>
 <pre><code>##    (Intercept) typephospho condition2
@@ -682,6 +682,11 @@ <h3>Run limma</h3>
 ## 
 ## attr(,&quot;contrasts&quot;)$condition
 ## [1] &quot;contr.treatment&quot;</code></pre>
+<p>If we want to explore a possible combinatorial effect, we need to
+also include a <code>type:condition</code> interaction term. In our
+case, this captures the difference in abundance for phosphopeptides
+between the conditions, which is specific for phosphopeptides and not
+seen in the unmodified peptides.</p>
 <pre class="r"><code># model with an interaction term
 print(model.matrix(~type+condition+type:condition))</code></pre>
 <pre><code>##    (Intercept) typephospho condition2 typephospho:condition2