Change coordinate integration from raw intensity to area under the curve #24
What is the purpose of this PR?
When writing out the peak image, the raw peak intensity at each spot is currently used. The agreed-upon practice is to use the area under the curve, measured from the peak base, so this needs to be updated.
How did you implement your changes?
In order to calculate this area properly, `l_ips_r` (left bound of peak), `r_ips_r` (right bound of peak), and `peak_widths_height` (height at which the peak base begins) need to be passed into `coordinate_integration`.
Find the closest values in the spot spectrum to the m/z values associated with the corresponding `l_ips_r` and `r_ips_r`, then use those indices as the bounds for integrating the signal at the spot. Integration is done using `scipy.integrate.simpson`.
Because the peak only begins at 10% higher than the peak's prominence, the rectangle below the peak base must be subtracted out. This area can be computed from the corresponding `peak_widths_height` and the base length, defined as `r_ips_r` minus `l_ips_r`.
Remaining issues
The peaks are currently determined across all spots. However, integration needs to be done at the spot level. While it is assumed that peaks at the spot level correspond to peaks across all spots, we have no way of knowing for sure. If this isn't the case, integration for some peaks could be wrong because the actual peak intervals for individual spots aren't being captured.
This PR may also need to change if the raw glycan list is used instead of the existing peak finding/filtering algorithm.
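For reviewers, the integration step described under the implementation heading can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the function name `integrate_peak_area` and its parameters are hypothetical, and it assumes the spot spectrum is available as two equal-length arrays of m/z values and intensities, with the bounds and base height already expressed in the same units.

```python
import numpy as np
from scipy.integrate import simpson


def integrate_peak_area(mz, intensity, left_mz, right_mz, base_height):
    """Area of one peak above its base for a single spot (hypothetical helper).

    left_mz / right_mz play the role of the m/z values for l_ips_r / r_ips_r,
    and base_height plays the role of peak_widths_height.
    """
    mz = np.asarray(mz, dtype=float)
    intensity = np.asarray(intensity, dtype=float)

    # Snap each bound to the closest m/z value present in the spot spectrum.
    left_idx = int(np.argmin(np.abs(mz - left_mz)))
    right_idx = int(np.argmin(np.abs(mz - right_mz)))
    if right_idx <= left_idx:
        # Degenerate interval: nothing to integrate.
        return 0.0

    # Integrate the signal between the two bounds (inclusive) with Simpson's rule.
    x = mz[left_idx:right_idx + 1]
    y = intensity[left_idx:right_idx + 1]
    total_area = simpson(y, x=x)

    # Subtract the rectangle under the peak base: height times base length.
    base_area = base_height * (right_mz - left_mz)
    return float(total_area - base_area)


# Synthetic spectrum: a parabolic peak centred at m/z 5 on a flat baseline of 2.
mz = np.linspace(0.0, 10.0, 101)
intensity = 2.0 + np.maximum(0.0, 1.0 - (mz - 5.0) ** 2)
area = integrate_peak_area(mz, intensity, left_mz=4.0, right_mz=6.0, base_height=2.0)
print(area)
```

Note that the sketch integrates over the snapped indices but subtracts a rectangle whose width comes from the unsnapped bounds, as the description states; if the bounds fall between sampled m/z values, the two widths differ slightly, which may be worth checking in the real implementation.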