Added ThresholdTiler #539

erich-r · 2023-03-21T19:53:00Z

Description

Added a new tiler that, given a scorer and a threshold, extracts all tiles whose score is above the threshold.
There may be instances when the user does not want the top n tiles from a slide, but rather every tile with a score over a specified threshold.
In this instance, the user does not know beforehand how many tiles must be extracted.
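
To illustrate the difference between top-n and threshold-based selection, here is a minimal sketch in plain Python. It does not use histolab's API: `select_top_n`, `select_above_threshold`, and the `(score, coords)` tuples are hypothetical stand-ins for the tiler internals.

```python
from typing import List, Tuple

# Hypothetical stand-in for a scored tile: (score, (x_ul, y_ul, x_br, y_br))
ScoredTile = Tuple[float, Tuple[int, int, int, int]]


def select_top_n(scored: List[ScoredTile], n: int) -> List[ScoredTile]:
    """Return the n highest-scoring tiles; the count is fixed in advance."""
    return sorted(scored, key=lambda item: item[0], reverse=True)[:n]


def select_above_threshold(
    scored: List[ScoredTile], threshold: float
) -> List[ScoredTile]:
    """Return every tile scoring strictly above the threshold.

    The number of returned tiles is not known beforehand.
    """
    return [item for item in scored if item[0] > threshold]


scored_tiles = [
    (0.9, (0, 0, 10, 10)),
    (0.4, (10, 0, 20, 10)),
    (0.7, (0, 10, 10, 20)),
]
```

With these sample values, a threshold of 0.5 keeps two tiles, whereas `select_top_n(scored_tiles, 1)` always keeps exactly one.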

Types of Changes

Issues Fixed or Closed by This PR

Checklist

My code follows the code style of this project.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have read the CONTRIBUTING document.
I have added tests to cover my changes.
I have tested the changes and verified that they work and don't break anything (as well as I can manage).

codecov · 2023-03-21T20:07:40Z

Codecov Report

Patch coverage: 98.36% and project coverage change: -0.07% ⚠️

Comparison is base (7fc5fcd) 100.00% compared to head (dc6cfd5) 99.93%.
Report is 64 commits behind head on master.

❗ Current head dc6cfd5 differs from pull request most recent head 2b8e315. Consider uploading reports for the commit 2b8e315 to get more accurate results

Additional details and impacted files

@@             Coverage Diff             @@
##            master     #539      +/-   ##
===========================================
- Coverage   100.00%   99.93%   -0.07%     
===========================================
  Files           19       19              
  Lines         1576     1637      +61     
  Branches       165      175      +10     
===========================================
+ Hits          1576     1636      +60     
- Partials         0        1       +1

Files Changed	Coverage Δ
histolab/tiler.py	`99.70% <98.36%> (-0.30%)`	⬇️


ernestoarbitrio

Hi @erich-r 👋

Thanks for your contribution, we really appreciate it. ❤️

I had a little time to do a first round of code review and left some comments here.

Please take a look at my comments. I also believe the integration tests for this new tiler are missing: unit tests alone are not enough to prevent regressions.

My comments are mostly about implementation and design ... waiting for @alessiamarcolini or @nicolebussola for a paired CR.

ernestoarbitrio · 2023-03-22T08:02:04Z

tests/unit/test_tiler.py

+        tile = Tile(image, coords)
+        _tiles_generator = method_mock(request, GridTiler, "_tiles_generator")
+        # it needs to be a generator
+        _tiles_generator.return_value = ((tile, coords) for i in range(3))


The _tiles_generator return value in the signature is Tuple[List[Tuple[float, CoordinatePair]], List[Tuple[float, CoordinatePair]]], and here you are assigning (tile, coords). I don't think that's quite correct.
So either the test is wrong or the signature is wrong. :D

The method being mocked is _tiles_generator of the GridTiler class, which I think has the signature Tuple[Tile, CoordinatePair], so I think both the test and the signature are correct (?)

Oh ok, but now the question is why are you mocking the GridTiler? You should mock the one from the ThresholdTiler

Oh ok, but now the question is why are you mocking the GridTiler?

Because the aim of the test is to check whether the tiler can extract the scores from tiles, so I think it does not matter if the tiles come from a GridTiler.

In the test it_can_calculate_filtered_tiles I am calling the _tiles_generator from a ThresholdTiler.

Yes, but if you are testing the ThresholdTiler, which has its own _tiles_generator, you should mock that one, not the super() one. This way it's more readable. Furthermore, as written in another message, I find the GridTiler inheritance weird. 🤷🏽‍♂️

I agree with @ernestoarbitrio: here we should mock the _tiles_generator from the ThresholdTiler to be more precise (and the tests for the ScoreTiler should do the same, oops) and assign the correct return value
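
For reference, a minimal sketch of mocking the method on the class under test rather than on its parent. It uses the standard library's `unittest.mock` instead of the project's `method_mock` helper, and `GridTilerSketch`, `ThresholdTilerSketch`, and `count_tiles` are hypothetical names that only mimic the inheritance discussed above.

```python
from unittest import mock


class GridTilerSketch:
    """Hypothetical parent class; yields no tiles by default."""

    def _tiles_generator(self, slide):
        yield from ()


class ThresholdTilerSketch(GridTilerSketch):
    """Hypothetical subclass with its own _tiles_generator."""

    def _tiles_generator(self, slide):
        yield from super()._tiles_generator(slide)

    def count_tiles(self, slide):
        # Consumes the generator, as the code under test would.
        return sum(1 for _ in self._tiles_generator(slide))


def count_with_mocked_generator():
    tile, coords = object(), (0, 0, 10, 10)
    # Patch the method on the subclass being tested, not on the parent.
    with mock.patch.object(ThresholdTilerSketch, "_tiles_generator") as gen:
        # The return value needs to be a generator, as in the real method.
        gen.return_value = ((tile, coords) for _ in range(3))
        tiler = ThresholdTilerSketch()
        n_tiles = tiler.count_tiles("slide")
        gen.assert_called_once_with("slide")
    return n_tiles
```

Patching the subclass keeps the test readable: it is clear which class is under test, and the test keeps working if the subclass later overrides the method differently.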

ernestoarbitrio · 2023-03-22T08:12:12Z

histolab/tiler.py

+            tile = slide.extract_tile(
+                tile_wsi_coords,
+                tile_size=self.final_tile_size,
+                mpp=self.mpp,
+                level=self.level if self.mpp is None else None,
+            )


IMHO the tile should be extracted in the _tiles_generator, and then we should iterate over the tiles as we do in the GridTiler.

I hope I understood what you meant, but I think this is the current behavior.
The ThresholdTiler object calls _scores, which calls the _tiles_generator of GridTiler and then assigns a score to each extracted tile.
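
A rough sketch of the flow described here, in which a generator extracts the tiles and a separate scoring step consumes them. The names `tiles_generator` and `scores` and the string placeholder for a tile are assumptions for illustration, not the actual histolab code.

```python
from typing import Callable, Iterator, List, Tuple

Coords = Tuple[int, int, int, int]


def tiles_generator(grid: List[Coords]) -> Iterator[Tuple[str, Coords]]:
    """Extract one tile per coordinate and yield it with its coordinates.

    A string stands in for the tile image that a real slide would return.
    """
    for coords in grid:
        tile = f"tile@{coords}"  # placeholder for the real extraction call
        yield tile, coords


def scores(
    grid: List[Coords], scorer: Callable[[str], float]
) -> List[Tuple[float, Coords]]:
    """Score tiles that the generator has already extracted."""
    return [(scorer(tile), coords) for tile, coords in tiles_generator(grid)]
```

Keeping extraction inside the generator means the scoring step never needs to know how a tile is read from the slide.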

nicolebussola · 2023-03-27T13:09:31Z

Hi Erich, thank you for your contribution! I think it could be more intuitive to select a threshold when the scores are scaled between 0 and 1 rather than on raw scores, what do you think?

nicolebussola · 2023-03-27T13:11:40Z

Hi Erich, thank you for your contribution! I think it could be more intuitive to select a threshold when the scores are scaled between 0 and 1 rather than on raw scores, what do you think?
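
As an illustration of the scaling idea, a minimal min-max normalization sketch. `min_max_scale` is a hypothetical helper, not necessarily how histolab scales scores, and mapping a constant list to 0.0 is an arbitrary choice made here.

```python
from typing import List


def min_max_scale(raw_scores: List[float]) -> List[float]:
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    if not raw_scores:
        return []
    low, high = min(raw_scores), max(raw_scores)
    if high == low:
        # Assumption: with no spread, every score is mapped to 0.0.
        return [0.0 for _ in raw_scores]
    return [(score - low) / (high - low) for score in raw_scores]
```

With scaled scores, a threshold such as 0.5 has the same meaning for every scorer, whereas a threshold on raw scores depends on each scorer's range.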

alessiamarcolini

Hi @erich-r thank you so much for your contribution ⭐

As @ernestoarbitrio pointed out, it's a bit weird that the ThresholdTiler inherits from the GridTiler and not the ScoreTiler, though I understand that the ScoreTiler imposes an n_tiles parameter, which is not acceptable in this case.

Anyway this is bringing a lot of code duplication, which makes me wonder whether we should move the common code to some other "support" Tiler, or move out the static methods to standalone functions... What do we think?
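
One possible shape for the "standalone functions" option, sketched with hypothetical names (`sort_by_score`, `TopNSelector`, `ThresholdSelector`) that do not exist in histolab. The shared logic lives in a module-level function, and each class adds only its own selection rule.

```python
from typing import List, Tuple

ScoredTile = Tuple[float, Tuple[int, int, int, int]]


def sort_by_score(scored: List[ScoredTile]) -> List[ScoredTile]:
    """Shared helper: order tiles from highest to lowest score."""
    return sorted(scored, key=lambda item: item[0], reverse=True)


class TopNSelector:
    """Keeps a fixed number of tiles."""

    def __init__(self, n_tiles: int) -> None:
        self.n_tiles = n_tiles

    def select(self, scored: List[ScoredTile]) -> List[ScoredTile]:
        return sort_by_score(scored)[: self.n_tiles]


class ThresholdSelector:
    """Keeps every tile scoring strictly above a threshold."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold

    def select(self, scored: List[ScoredTile]) -> List[ScoredTile]:
        return [item for item in sort_by_score(scored) if item[0] > self.threshold]
```

Neither class inherits from the other, so the n_tiles requirement stays local to the top-n variant while the sorting code is written once.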

alessiamarcolini · 2023-03-29T10:02:53Z

histolab/tiler.py

+        for i in range(len(all_scores)):
+            if all_scores[i][0] > self.threshold:
+                filtered_tiles_scores.append(all_scores[i])
+                filtered_tiles_scaled_scores.append(scaled_scores[i])


This could be refactored into this, to make it more Pythonic

Suggested change

-        for i in range(len(all_scores)):
-            if all_scores[i][0] > self.threshold:
-                filtered_tiles_scores.append(all_scores[i])
-                filtered_tiles_scaled_scores.append(scaled_scores[i])
+        for score in all_scores:
+            if score[0] > self.threshold:
+                filtered_tiles_scores.append(score)
+                filtered_tiles_scaled_scores.append(score)
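
Note that the suggested loop, as written, appends `score` to both lists, so the entries from `scaled_scores` would be lost. A minimal sketch that keeps the raw and scaled entries paired uses `zip`. It is written as a standalone function with a hypothetical name (`filter_by_threshold`) and assumes the two lists have the same length and order, as the original indexed loop does.

```python
def filter_by_threshold(all_scores, scaled_scores, threshold):
    """Keep the raw and scaled entries whose raw score exceeds the threshold."""
    filtered_tiles_scores = []
    filtered_tiles_scaled_scores = []
    # zip walks both lists in lockstep, replacing the index-based loop.
    for score, scaled_score in zip(all_scores, scaled_scores):
        if score[0] > threshold:
            filtered_tiles_scores.append(score)
            filtered_tiles_scaled_scores.append(scaled_score)
    return filtered_tiles_scores, filtered_tiles_scaled_scores
```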

alessiamarcolini · 2023-03-29T10:04:31Z

histolab/tiler.py

+    ) -> None:
+        """Save to ``filename`` the report of the saved tiles with the associated score.
+
+        The CSV file


oh, I see this line is cut (same as in the ScoreTiler). I'm not sure what we wanted to write here (?)
Could you remove this from both here and the ScoreTiler?

alessiamarcolini · 2023-03-29T10:07:20Z

tests/integration/test_tiler.py

+        expectation,
+    ):
+        slide = Slide(fixture_slide, "")
+        scored_tiles_extractor = ThresholdTiler(


maybe we want to call it thresholded_tiles_extractor to be more on the theme?

alessiamarcolini · 2023-03-29T10:16:05Z

tests/unit/test_tiler.py

+        tile = Tile(image, coords)
+        _tiles_generator = method_mock(request, GridTiler, "_tiles_generator")
+        # it needs to be a generator
+        _tiles_generator.return_value = ((tile, coords) for i in range(3))


I agree with @ernestoarbitrio: here we should mock the _tiles_generator from the ThresholdTiler to be more precise (and the tests for the ScoreTiler should do the same, oops) and assign the correct return value

alessiamarcolini · 2023-03-29T10:16:59Z

tests/unit/test_tiler.py

+        _tiles_generator = method_mock(request, GridTiler, "_tiles_generator")
+        # it needs to be an empty generator
+        _tiles_generator.return_value = (n for n in [])


alessiamarcolini · 2023-03-29T10:20:25Z

tests/unit/test_tiler.py

+        assert str(err.value) == "'threshold' cannot be negative (-1)"
+        _scores.assert_called_once_with(threshold_tiler, slide, binary_mask)
+
+    def it_can_extract_score_tiles(self, request, tmpdir):


Suggested change

-    def it_can_extract_score_tiles(self, request, tmpdir):
+    def it_can_extract_thresholded_tiles(self, request, tmpdir):

alessiamarcolini · 2023-03-29T10:21:15Z

tests/unit/test_tiler.py

+            [(0.8, coords), (0.7, coords)],
+            [(0.8, coords), (0.7, coords)],
+        )
+        _tile_filename = method_mock(request, GridTiler, "_tile_filename")


Suggested change

-        _tile_filename = method_mock(request, GridTiler, "_tile_filename")
+        _tile_filename = method_mock(request, ThresholdTiler, "_tile_filename")

ernestoarbitrio · 2023-04-07T09:16:33Z

Hi 👋,
just checking @erich-r do you have any news on this PR?

erich-r · 2023-04-07T09:28:15Z

Hi!

I wrote the integration tests
Now the threshold is compared with the normalized scores (as suggested by @nicolebussola)
Next I'm going to apply @alessiamarcolini's suggestions (as soon as possible)

ernestoarbitrio · 2023-04-12T12:34:20Z

awesome thanks a lot

alessiamarcolini · 2023-08-17T08:55:28Z

hey @erich-r do you have any updates about this PR?

erich-r and others added 2 commits March 21, 2023 20:43

added new tiler

2be138e

[pre-commit.ci] auto fixes from pre-commit.com hooks

c640ef3

for more information, see https://pre-commit.ci

alessiamarcolini self-requested a review March 21, 2023 19:59

ernestoarbitrio requested changes Mar 22, 2023

erich-r and others added 3 commits March 23, 2023 08:36

added integration tests, bug fixes

3879049

[pre-commit.ci] auto fixes from pre-commit.com hooks

dc6cfd5

for more information, see https://pre-commit.ci

trimmed too long lines

2b8e315

nicolebussola self-requested a review March 27, 2023 13:07

alessiamarcolini requested changes Mar 29, 2023

alessiamarcolini deleted the branch histolab:main August 17, 2023 09:04

alessiamarcolini closed this Aug 17, 2023

alessiamarcolini reopened this Aug 17, 2023

alessiamarcolini changed the base branch from master to main August 17, 2023 09:21
