Merge branch 'main' into tcatley/dna-width
Showing 49 changed files with 5,934 additions and 436 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,30 @@ | ||
# TopoStats Pull Requests | ||
|
||
Please provide a descriptive summary of the changes your Pull Request introduces. | ||
|
||
The [Software Development](https://afm-spm.github.io/TopoStats/main/contributing.html#software-development) section of | ||
the Contributing Guidelines may be useful if you are unfamiliar with linting, pre-commit, docstrings and testing. | ||
|
||
**NB** - Replace this header with your description, but please complete the checklist below or give a short | ||
description of why a particular item is not relevant. | ||
|
||
--- | ||
|
||
Before submitting a Pull Request please check the following. | ||
|
||
- [ ] Existing tests pass. | ||
- [ ] Documentation has been updated and builds. Remember to update `configuration.md`, `usage.md`, and relevant | ||
processing sections under `advanced.md`. | ||
- [ ] Pre-commit checks pass. | ||
- [ ] New functions/methods have typehints and docstrings. | ||
- [ ] New functions/methods have tests which check the intended behaviour is correct. | ||
|
||
## Optional | ||
|
||
### `topostats/default_config.yaml` | ||
|
||
If adding options to `topostats/default_config.yaml` please ensure: | ||
|
||
- [ ] There is a comment adjacent to the option explaining what it is and the valid values. | ||
- [ ] A check is made in `topostats/validation.py` to ensure entries are valid. | ||
- [ ] The option has been added to the relevant sub-parser in `topostats/entry_point.py`. |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -39,6 +39,7 @@ pytest-debug.ini | |
|
||
# Documentation | ||
_build/ | ||
!docs/_static/images/** | ||
|
||
# MacOS | ||
.DS_Store | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -10,9 +10,10 @@ config: | |
- div | ||
- br | ||
|
||
# Globs | ||
globs: | ||
- "**/*.md" | ||
|
||
# Fix any fixable errors | ||
ignores: | ||
- "tmp/" | ||
|
||
fix: true |
Binary file added
BIN
+373 KB
docs/_static/images/flattening/flattening_tilt_removal_full_zrange.png
Binary file added
BIN
+896 KB
docs/_static/images/grain_finding/grain_finding_unet_multi_class_example.png
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,137 @@ | ||
# Flattening | ||
|
||
Flattening is the process of taking a raw AFM image and removing the image artefacts introduced by scanning probe | ||
microscopy (SPM) and AFM imaging. These include, but are not limited to, row misalignment from the raster scanning | ||
motion and bowing of the surface from the piezoelectric scanner, which are corrected by row alignment and polynomial | ||
flattening respectively. | ||
For surface-based samples, such as DNA on mica, this results in an image where the background mica is flat and the | ||
sample is clearly visible resting on the surface. | ||
|
||
Here is a raw, unprocessed AFM image: | ||
|
||
![raw AFM image](../_static/images/flattening/flattening_raw_afm_image.png) | ||
|
||
You can see there is a large tilt in the image from the bottom right to the top left, as well as lots of horizontal | ||
banding throughout the rows in the image. These artefacts are removed during the flattening process, known in | ||
TopoStats as `Filters`. | ||
|
||
## At a Glance - Removing AFM Imaging Artefacts | ||
|
||
Images are processed by: | ||
|
||
- Row alignment (make each row median the same height) | ||
- Tilt & polynomial removal (fit a plane and quadratic polynomial to the image and subtract) | ||
- Scar removal (remove long, thin, bright streaks in the data) | ||
- Zero the average height (lower the image by the mean height) to make the background roughly centred at zero nm | ||
- Masking (detect objects on the surface and flatten the image again, ignoring the data on the surface) | ||
- Secondary flattening (re-process the data using the mask to tell us where the background is, and zero the data using | ||
the mean of the background mask) | ||
- Gaussian filter (to smooth pixel differences / high-gain noise) | ||
|
||
![flattening pipeline](../_static/images/flattening/flattening_pipeline.png) | ||
|
||
## Row alignment | ||
|
||
The first step in the flattening process is **row alignment**. Row alignment adjusts the height of each row of the | ||
image so that all rows share the same median height value. The quantile used as this "median" is set by | ||
`row_alignment_quartile`: the default of 0.5 is the true median, but it can be adjusted depending on how much of the | ||
data is considered background. This gets rid of some of the horizontal banding and produces an image where the rows | ||
are aligned, but the image still has a clear tilt. | ||
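
As a rough illustration of the idea (not the TopoStats implementation), row alignment can be sketched with NumPy. The function name `align_rows` and its `quantile` argument are assumptions made for this example; `quantile` plays the role that `row_alignment_quartile` plays in the config.

```python
import numpy as np


def align_rows(image: np.ndarray, quantile: float = 0.5) -> np.ndarray:
    """Shift each row so that its chosen quantile sits at zero height."""
    # Height of the chosen quantile in every row (0.5 gives the row median).
    row_levels = np.quantile(image, quantile, axis=1, keepdims=True)
    # Subtracting it puts the chosen quantile of every row at the same height.
    return image - row_levels


rng = np.random.default_rng(0)
# Synthetic image: fine noise plus a different offset on every row (banding).
raw = rng.normal(0.0, 0.1, size=(64, 64)) + rng.normal(0.0, 2.0, size=(64, 1))
aligned = align_rows(raw, quantile=0.5)
```

After this step every row of `aligned` has a median of zero, so the row-to-row banding is gone, but any overall tilt in the image is left untouched.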
|
||
![row alignment](../_static/images/flattening/flattening_align_rows.png) | ||
|
||
## Tilt removal | ||
|
||
After row alignment, tilt removal is applied. This is a simple process of fitting a plane to the image and subtracting | ||
it, resulting in a mostly flat image. However, it's not perfect: as you can see in the following images, low regions | ||
or "shadows" still exist on rows with lots of non-background data. | ||
Two images are provided here, one with the full z-range and one with an adjusted height range (z-range) that shows | ||
these remaining artefacts more clearly. | ||
|
||
![tilt_removal_full_zrange](../_static/images/flattening/flattening_tilt_removal_full_zrange.png) | ||
|
||
![tilt removal_better_viewing](../_static/images/flattening/flattening_tilt_removal.png) | ||
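
The plane fit described above can be sketched as a least-squares problem. This is a minimal example assuming a plane of the form `z = a*x + b*y + c`; the function name `remove_tilt` is hypothetical and the TopoStats code may fit the plane differently.

```python
import numpy as np


def remove_tilt(image: np.ndarray) -> np.ndarray:
    """Fit z = a*x + b*y + c by least squares and subtract the fitted plane."""
    rows, cols = image.shape
    y, x = np.mgrid[0:rows, 0:cols]
    # Design matrix with one column per plane coefficient (a, b, c).
    design = np.column_stack([x.ravel(), y.ravel(), np.ones(image.size)])
    coeffs, *_ = np.linalg.lstsq(design, image.ravel(), rcond=None)
    plane = (design @ coeffs).reshape(image.shape)
    return image - plane


# A perfectly tilted surface should come out flat.
y, x = np.mgrid[0:32, 0:48]
tilted = 0.05 * x - 0.02 * y + 3.0
flat = remove_tilt(tilted)
```

Because every pixel contributes to the fit, tall objects pull the plane towards themselves, which is one reason the result is only "mostly" flat before masking is applied.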
|
||
## Polynomial removal | ||
|
||
After the tilt, we remove polynomial trends. Some images also have quadratic, or occasionally cubic, bowing. We | ||
remove this by fitting a two-dimensional quadratic polynomial to the image (in the horizontal direction) and | ||
subtracting it from the image. We then do the same for a nonlinear polynomial (`z = a*x*y`) to eliminate “saddle” | ||
trends in the data. We could do all of these at the same time, but we like to be able to see the iterative | ||
differences. | ||
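
The two fits can be sketched as follows. This is an illustration under stated assumptions rather than the TopoStats code: the quadratic is fitted to the median height of each column (one way of fitting "in the horizontal direction"), the saddle coefficient `a` is found by closed-form least squares, and both function names are hypothetical.

```python
import numpy as np


def remove_quadratic(image: np.ndarray) -> np.ndarray:
    """Subtract a quadratic trend along the horizontal (x) direction."""
    x = np.arange(image.shape[1])
    # Median height of each column gives a 1D profile along x.
    profile = np.median(image, axis=0)
    coeffs = np.polyfit(x, profile, deg=2)
    # Subtract the same fitted curve from every row.
    return image - np.polyval(coeffs, x)[np.newaxis, :]


def remove_saddle(image: np.ndarray) -> np.ndarray:
    """Subtract the least-squares fit of z = a * x * y."""
    y, x = np.mgrid[0 : image.shape[0], 0 : image.shape[1]]
    xy = (x * y).astype(float)
    # Closed-form least-squares solution for the single coefficient a.
    a = np.sum(xy * image) / np.sum(xy * xy)
    return image - a * xy


y, x = np.mgrid[0:16, 0:40]
bowed = 0.01 * (x - 20.0) ** 2  # bowing along x only
saddle = 0.001 * x * y          # pure saddle trend
unbowed = remove_quadratic(bowed)
unsaddled = remove_saddle(saddle)
```

Keeping the two steps as separate functions mirrors the choice described above: each intermediate image can be inspected on its own.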
|
||
## Scar removal (optional) | ||
|
||
We then optionally run scar removal on the image. This is a special function that detects scars: long, thin, bright or | ||
dark streaks in the data, caused by physical problems in the AFM process. Candidate scars are found using the | ||
`threshold_low` and `threshold_high` parameters, which identify large height changes between rows, and are then | ||
filtered by `max_scar_width` and `min_scar_length`, both measured in pixels. We are using a different image here as an | ||
example since our lovely minicircle.spm image doesn’t have any scars. | ||
|
||
![scarred image](../_static/images/flattening/flattening_scarred_image.png) | ||
|
||
![scar removed](../_static/images/flattening/flattening_scar_removed.png) | ||
|
||
**Note that scar removal can distort data, and it’s best to take data without scars if you can.** | ||
|
||
## Zero the average height | ||
|
||
![height zeroing](../_static/images/flattening/flattening_height_zeroing.png) | ||
|
||
We then lower the image by its mean height, which causes the background of the image to be roughly centred at zero nm. | ||
If this function is given a foreground mask, as in the second iteration of flattening, it zeros the data using only | ||
the mean of the background pixels. | ||
Zeroing is important because raw AFM heights are relative and these processing steps can shift the background height | ||
away from zero; re-centring makes it easier to obtain comparable height metrics. | ||
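
A minimal sketch of both behaviours, assuming a boolean mask in which `True` marks foreground pixels (the function name and the mask convention are assumptions for this example):

```python
from typing import Optional

import numpy as np


def zero_average_height(image: np.ndarray, mask: Optional[np.ndarray] = None) -> np.ndarray:
    """Lower the image by its mean height, or by the background mean if a mask is given."""
    if mask is None:
        # First pass: every pixel is treated as background.
        return image - np.mean(image)
    # Second pass: only pixels outside the foreground mask set the zero level.
    return image - np.mean(image[~mask])


# Flat background at 2 nm with one 3 nm tall object on top of it.
image = np.full((6, 6), 2.0)
image[2:4, 2:4] = 5.0
mask = image > 3.0

zeroed_all = zero_average_height(image)
zeroed_background = zero_average_height(image, mask)
```

Without the mask, the tall object drags the zero level upwards; with the mask, the background sits at zero and the object keeps its true height above it.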
|
||
## Masking | ||
|
||
Now consider that all the processing we have done so far has assumed that every pixel of the image is background. We | ||
assumed that there were no objects on the surface to disturb our fitting and row alignment. If there is a large amount | ||
of DNA on one side of the image, the fitted slope will be skewed by it, and the image will be flattened poorly. | ||
|
||
Because of this, once we have done our initial flattening, we detect the objects on the surface and then flatten the | ||
image again, this time ignoring the data on the surface and only considering the background. | ||
|
||
How do we do that? | ||
First, we need to find the data on the surface. We do this by thresholding. | ||
The type of threshold (standard deviation - `std_dev`, absolute - `absolute`, otsu - `otsu`) and the threshold values | ||
are set in the config file (have a look!). Any pixels below the threshold are considered background (sample | ||
surface). Any pixels above the threshold are considered to be data (useful sample objects). | ||
This binary classification lets us build a binary mask marking which pixels are foreground data and which are | ||
background. | ||
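
The thresholding step can be sketched like this. The function name is hypothetical, and the `std_dev` rule shown (mean plus a multiple of the standard deviation) is an assumption about how that option is defined; the `otsu` option is left out to keep the example dependency-free.

```python
import numpy as np


def foreground_mask(image: np.ndarray, method: str = "std_dev", value: float = 1.0) -> np.ndarray:
    """Return a boolean mask that is True where a pixel is above the threshold."""
    if method == "std_dev":
        # Assumed definition: `value` standard deviations above the mean height.
        threshold = image.mean() + value * image.std()
    elif method == "absolute":
        # `value` is used directly as a height.
        threshold = value
    else:
        raise ValueError(f"Unsupported threshold method: {method}")
    return image > threshold


# Flat surface at 0 nm with one 2 x 2 pixel object that is 5 nm tall.
image = np.zeros((10, 10))
image[4:6, 4:6] = 5.0

mask_absolute = foreground_mask(image, method="absolute", value=1.0)
mask_std_dev = foreground_mask(image, method="std_dev", value=1.0)
```

In both cases the four object pixels end up `True` (foreground) and everything else `False` (background).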
|
||
For more information on thresholding and how to set it, see the [thresholding](thresholding.md) page. | ||
|
||
Here is the binary mask for minicircle.spm: | ||
|
||
![tilt_removed_with_mask](../_static/images/flattening/tilt_removed_with_mask.png) | ||
|
||
So you can see how all the interesting foreground (high) regions are now masked in white, and the background is in | ||
black. | ||
|
||
This allows TopoStats to use only the background (black pixels) in its calculations for slope removal, row alignment | ||
etc. | ||
|
||
So we re-do all the previous processing, but with this new useful binary mask to guide us. | ||
|
||
## Secondary flattening | ||
|
||
After re-processing the data using the mask to tell us where the background is, we get a better, more accurately | ||
flattened image. We can see the "shadows" on rows with lots of data have now been flattened correctly. | ||
|
||
From here, we can go on to do things like finding our objects of interest (grains) and get stats about them. | ||
|
||
![secondary flattening](../_static/images/flattening/flattening_final_flattened_image.png) | ||
|
||
## Gaussian filter | ||
|
||
Finally, we apply a Gaussian filter to the image to smooth height differences and remove high-gain noise. This gives | ||
you smoother data, but will start to blur out important features if you apply it too strongly. The default strength is | ||
a sigma of 1.0, but you can adjust this in the config file under `gaussian_size`. The `gaussian_mode` parameter | ||
specifies how values at the border should be handled, see | ||
[skimage.filters.gaussian](https://scikit-image.org/docs/stable/api/skimage.filters.html#skimage.filters.gaussian) | ||
for more details. | ||
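
For reference, a direct call to `skimage.filters.gaussian` looks like the sketch below. It assumes that `gaussian_size` is passed straight through as `sigma` and `gaussian_mode` as `mode`; the mapping used inside TopoStats may differ, and the `"nearest"` mode is just an example value.

```python
import numpy as np
from skimage.filters import gaussian

rng = np.random.default_rng(0)
# Synthetic stand-in for a flattened image: pure pixel-to-pixel noise.
noisy = rng.normal(0.0, 1.0, size=(64, 64))

# sigma controls the strength of the blur; mode controls how the border is padded.
smoothed = gaussian(noisy, sigma=1.0, mode="nearest")
```

Increasing `sigma` suppresses more noise but also rounds off small features, which is the trade-off described above.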
|
||
Here are some examples of different gaussian sizes: | ||
|
||
![gaussian_sizes](../_static/images/flattening/gaussian_sizes.png) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,99 @@ | ||
# Grain finding | ||
|
||
## At a Glance - Identifying Objects of Interest | ||
|
||
TopoStats automatically tries to find grains (objects of interest) in your AFM images. There are several steps to this. | ||
|
||
- **Height thresholding**: We find grains based on their height in the image. | ||
- **Remove edge grains**: We remove grains that intersect the image border. | ||
- **Size thresholding**: We remove grains that are too small or too large. | ||
- **Optional: U-Net mask improvement**: We can use a U-Net to improve the mask of each grain. | ||
|
||
## Height thresholding | ||
|
||
Grain finding is the process of detecting useful objects in your AFM images. This might be DNA, proteins, holes in a | ||
surface or ridges on a surface. | ||
In the standard operation of TopoStats, the way we find objects is based on a height threshold. This means that we | ||
detect where things are based on how high up they are. | ||
|
||
For example, with our example minicircle.spm image, we have DNA that is poking up from the sample surface, represented | ||
by bright regions in the image, alongside impurities and proteins, also above the surface: | ||
|
||
![minicircles image](../_static/images/grain_finding/grain_finding_minicircles.png) | ||
|
||
If we want to select the DNA, then we can take only the regions of the image that are above a certain height | ||
threshold (standard deviation - `std_dev`, absolute - `absolute`, otsu - `otsu`). | ||
|
||
Here are several thresholds to show you what happens as we increase the absolute height threshold: | ||
|
||
![height thresholds](../_static/images/grain_finding/grain_finding_grain_thresholds.png) | ||
|
||
Notice that the amount of data decreases, until we are only left with the very highest points. | ||
|
||
The aim is to choose a threshold that keeps the data you want, while removing the background and other low objects | ||
that you don’t want to include. | ||
So in this example, a threshold of 0.5 would be best, since it keeps the DNA while removing the background. | ||
|
||
There are lots of objects in this mask that we don't want to analyse, but we can remove those using area thresholds in | ||
the next steps. These objects have been detected because, while they are small, they are still high up and above the | ||
background. | ||
|
||
For more information on the types of thresholding, and how to set them, see the [thresholding](thresholding.md) page. | ||
|
||
## Remove edge grains | ||
|
||
Some grains may intersect the image border. Accurate statistics cannot be calculated for these grains, since they | ||
are not fully in the image. Because of this, we have the option of removing any grains that intersect the image | ||
border with the `remove_edge_intersecting_grains` flag in the config file. | ||
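
The effect of this option can be reproduced with scikit-image. This sketch uses `skimage.segmentation.clear_border`, which is one way to do it and not necessarily what TopoStats calls internally.

```python
import numpy as np
from skimage.measure import label
from skimage.segmentation import clear_border

mask = np.zeros((8, 8), dtype=bool)
mask[0:2, 0:2] = True  # grain touching the top-left border
mask[3:6, 3:6] = True  # grain fully inside the image

# Give every connected grain its own integer label.
labelled = label(mask)
# Remove every labelled grain that touches the image border.
kept = clear_border(labelled) > 0
```

Only the 3 x 3 grain in the middle of the image survives in `kept`.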
|
||
Here is a before and after example of removing edge grains: | ||
|
||
![remove_edge_grains](../_static/images/grain_finding/grain_finding_tidy_borders.png) | ||
|
||
## Size thresholding | ||
|
||
In our thresholded image, you will notice that we have a lot of small grains that we do not want to analyse. We can | ||
get rid of those with size thresholding. This is where TopoStats removes grains based on their area, leaving only | ||
molecules of the right size. You will need to play around with the thresholds to get the right results. | ||
|
||
You can set the size threshold using the `absolute_area_threshold` in the config file. This sets the minimum and | ||
maximum area of the grains that you want to keep, in nanometres squared. For example, if you want to keep grains that | ||
are between 10 nm² and 100 nm², you would set `absolute_area_threshold` to `[10, 100]`. | ||
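
Area filtering can be sketched as follows. The function name and the `pixel_to_nm` argument (the width of one pixel in nanometres) are assumptions for this example; `area_range_nm2` plays the role of `absolute_area_threshold`.

```python
import numpy as np
from skimage.measure import label


def filter_by_area(mask: np.ndarray, pixel_to_nm: float, area_range_nm2: tuple) -> np.ndarray:
    """Keep only grains whose area in nm^2 lies inside the given (min, max) range."""
    labelled = label(mask)
    # Pixel count per label, converted to nm^2; index 0 is the background.
    areas_nm2 = np.bincount(labelled.ravel()) * pixel_to_nm**2
    low, high = area_range_nm2
    keep = (areas_nm2 >= low) & (areas_nm2 <= high)
    keep[0] = False  # never keep the background label
    # Look up, for every pixel, whether its label is kept.
    return keep[labelled]


mask = np.zeros((20, 20), dtype=bool)
mask[1, 1] = True          # 1 pixel: too small
mask[5:9, 5:9] = True      # 16 pixels: kept
mask[10:20, 10:20] = True  # 100 pixels: too large

filtered = filter_by_area(mask, pixel_to_nm=1.0, area_range_nm2=(10, 50))
```

Because the thresholds are in nm², the same config values select the same physical grain sizes regardless of the image resolution.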
|
||
![size_thresholding](../_static/images/grain_finding/grain_finding_size_thresholding.png) | ||
|
||
## Optional: U-Net mask improvement | ||
|
||
As an additional optional step, each grain that reaches this stage can be improved by using a U-Net to mask the grain | ||
again. This requires a U-Net model path to be supplied in the config file. | ||
|
||
The U-Net step takes the bounding box of each grain, makes it square, and passes it to a trained U-Net model, | ||
which predicts a better mask that then replaces the original mask. | ||
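
The "make it square" step can be illustrated with a small helper. This is a hypothetical sketch, not the TopoStats implementation: it grows the shorter side of the bounding box and shifts the box back inside the image, and it assumes the square box fits within the image.

```python
import numpy as np


def square_bounding_box(mask: np.ndarray) -> tuple[int, int, int, int]:
    """Return (row_min, col_min, row_max, col_max) of a square box around the True pixels."""
    rows, cols = np.nonzero(mask)
    row_min, row_max = rows.min(), rows.max() + 1
    col_min, col_max = cols.min(), cols.max() + 1
    height, width = row_max - row_min, col_max - col_min
    side = max(height, width)
    # Pad the shorter side evenly, then clip so the box starts inside the image.
    row_start = int(np.clip(row_min - (side - height) // 2, 0, max(mask.shape[0] - side, 0)))
    col_start = int(np.clip(col_min - (side - width) // 2, 0, max(mask.shape[1] - side, 0)))
    return row_start, col_start, row_start + int(side), col_start + int(side)


# A grain that is 3 pixels tall and 10 pixels wide.
grain = np.zeros((20, 20), dtype=bool)
grain[5:8, 2:12] = True

box = square_bounding_box(grain)
crop = grain[box[0] : box[2], box[1] : box[3]]
```

The square `crop` is what would be handed to the model; the predicted mask can then be written back into the same region of the full image.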
|
||
Here is an example comparing absolute height thresholding to U-Net masking for one of our projects. The white boxes | ||
indicate regions where the height threshold performs poorly and is improved by the U-Net mask. | ||
|
||
![unet_example](../_static/images/grain_finding/grain_finding_unet_example.png) | ||
|
||
### Multi-class masking | ||
|
||
TopoStats supports masking with multiple classes. This means that you could use a U-Net to mask DNA and proteins | ||
separately. | ||
|
||
This requires a U-Net that has been trained on multiple classes. | ||
|
||
Here is an example of multi-class masking using a U-Net which was used for one of our projects. | ||
|
||
![multi_class_unet_example](../_static/images/grain_finding/grain_finding_unet_multi_class_example.png) | ||
|
||
## Technical details | ||
|
||
### Details: Multi-class masking | ||
|
||
Multi-class masking is implemented by having each image be a tensor of shape N x N x C, where N is the image size, | ||
and C is the number of classes. Each class is a binary mask, where 1 is the class, and 0 is not the class. | ||
The first channel is background, where 1 is background, and 0 is not background. The rest of the channels | ||
are arbitrary, and defined by how the U-Net was trained; however, by convention we recommend that the first class | ||
be for DNA (if applicable) and the next classes for other objects. |
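
As a small illustration of this layout, the sketch below builds an N x N x C tensor from a hypothetical label map with three classes (background, DNA, protein). The class assignment is an example only; the real channels depend on how the U-Net was trained.

```python
import numpy as np

# Hypothetical 4 x 4 label map: 0 = background, 1 = DNA, 2 = protein.
labels = np.array(
    [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [2, 2, 0, 0],
        [2, 2, 0, 0],
    ]
)
num_classes = 3

# One binary channel per class, giving a tensor of shape N x N x C.
tensor = (labels[..., np.newaxis] == np.arange(num_classes)).astype(np.uint8)

background = tensor[..., 0]  # 1 where there is no object
dna = tensor[..., 1]         # 1 where the pixel belongs to DNA
```

Each pixel is 1 in exactly one channel here, so summing over the last axis gives an array of ones.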