-
-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added ThresholdTiler #539
base: main
Are you sure you want to change the base?
Added ThresholdTiler #539
Conversation
for more information, see https://pre-commit.ci
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## master #539 +/- ##
===========================================
- Coverage 100.00% 99.93% -0.07%
===========================================
Files 19 19
Lines 1576 1637 +61
Branches 165 175 +10
===========================================
+ Hits 1576 1636 +60
- Partials 0 1 +1
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @erich-r 👋
Thanks for your contribution, we really appreciate it. ❤️
I had a few time to do a first round of code review and left some comments here.
Please take a look at my comments and also I believe that you miss the integration tests for this new tiler. Unit tests are not enough to prevent regressions.
My comments are mostly implementation and design wise ... waiting for @alessiamarcolini or @nicolebussola for a paired CR.
tile = Tile(image, coords) | ||
_tiles_generator = method_mock(request, GridTiler, "_tiles_generator") | ||
# it needs to be a generator | ||
_tiles_generator.return_value = ((tile, coords) for i in range(3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _tiles_generator
return value in the signatures is Tuple[List[Tuple[float, CoordinatePair]], List[Tuple[float, CoordinatePair]]]
and here you are assigning (tile, coords)
I don't think it's quite correct.
So or the test is wrong or the signature is wrong. :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method being mocked is _tiles_generator
of the GridTiler
class, which I think has the signature Tuple[Tile, CoordinatePair]
, so I think both test and signatura are correct (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok, but now the question is why are you mocking the GridTiler
? You should mock the one from the ThresholdTiler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok, but now the question is why are you mocking the
GridTiler
?
Because the aim of the test is to check whether the tiler can extract the scores from tiles, so I think it does not matter if the tiles come from a GridTiler
.
In the test it_can_calculate_filtered_tiles
I am calling the _tiles_generator
from a ThresholdTiler
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes but if you are testing the ThresholdTiler
that has its own _tiles_generator
you should mock that one not the super() one. This way it's more readable. Furthermore I do have doubts (as written in another message) that GridTiler inheritance is weird. 🤷🏽♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @ernestoarbitrio: here we should mock the _tiles_generator
from the ThresholdTiler
to be more precise (and the tests for the ScoreTiler
should do the same, oops) and assign the correct return value
tile = slide.extract_tile( | ||
tile_wsi_coords, | ||
tile_size=self.final_tile_size, | ||
mpp=self.mpp, | ||
level=self.level if self.mpp is None else None, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMHO the tile should be extracted in the _tiles_generator
and then iterating over the tiles as we do in the GridTiler
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hope I did understand what you meant, but I think this is the current behavior.
This is because the ThresholdTiler
object calls _scores
, which calls the _tiles_generator
of GridTiler
, and lastly assign each extracted tile a score.
Hi Erich, thank you for your contribution! I think it could be more intuitive to select a threshold when the scores are scaled between 0 and 1 rather than on raw scores, what do you think? |
1 similar comment
Hi Erich, thank you for your contribution! I think it could be more intuitive to select a threshold when the scores are scaled between 0 and 1 rather than on raw scores, what do you think? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @erich-r thank you so much for your contribution ⭐
As @ernestoarbitrio pointed out, it's a bit weird that the ThresholdTiler
inherits from the GridTiler
and not the ScoreTiler
- though I understand that the fact that the ScoreTiler
imposes a n_tiles
parameter and that is not acceptable in this case.
Anyway this is bringing a lot of code duplication, which makes me wonder whether we should move the common code to some other "support" Tiler, or move out the static methods to standalone functions... What do we think?
for i in range(len(all_scores)): | ||
if all_scores[i][0] > self.threshold: | ||
filtered_tiles_scores.append(all_scores[i]) | ||
filtered_tiles_scaled_scores.append(scaled_scores[i]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be refactored into this, to make it more Pythonic
for i in range(len(all_scores)): | |
if all_scores[i][0] > self.threshold: | |
filtered_tiles_scores.append(all_scores[i]) | |
filtered_tiles_scaled_scores.append(scaled_scores[i]) | |
for score in all_scores: | |
if score[0] > self.threshold: | |
filtered_tiles_scores.append(score) | |
filtered_tiles_scaled_scores.append(score) |
) -> None: | ||
"""Save to ``filename`` the report of the saved tiles with the associated score. | ||
|
||
The CSV file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I see this line is cut (same as in the ScoreTiler
). I'm not sure what we wanted to write here (?)
Could you remove this from both here and the ScoreTiler
?
expectation, | ||
): | ||
slide = Slide(fixture_slide, "") | ||
scored_tiles_extractor = ThresholdTiler( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe we want to call it thresholded_tiles_extractor
to be more on the theme?
tile = Tile(image, coords) | ||
_tiles_generator = method_mock(request, GridTiler, "_tiles_generator") | ||
# it needs to be a generator | ||
_tiles_generator.return_value = ((tile, coords) for i in range(3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @ernestoarbitrio: here we should mock the _tiles_generator
from the ThresholdTiler
to be more precise (and the tests for the ScoreTiler
should do the same, oops) and assign the correct return value
_tiles_generator = method_mock(request, GridTiler, "_tiles_generator") | ||
# it needs to be an empty generator | ||
_tiles_generator.return_value = (n for n in []) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
assert str(err.value) == "'threshold' cannot be negative (-1)" | ||
_scores.assert_called_once_with(threshold_tiler, slide, binary_mask) | ||
|
||
def it_can_extract_score_tiles(self, request, tmpdir): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def it_can_extract_score_tiles(self, request, tmpdir): | |
def it_can_extract_thresholded_tiles(self, request, tmpdir): |
[(0.8, coords), (0.7, coords)], | ||
[(0.8, coords), (0.7, coords)], | ||
) | ||
_tile_filename = method_mock(request, GridTiler, "_tile_filename") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_tile_filename = method_mock(request, GridTiler, "_tile_filename") | |
_tile_filename = method_mock(request, ThresholdTiler, "_tile_filename") |
Hi 👋, |
Hi!
|
awesome thanks a lot |
hey @erich-r do you have any updates about this PR? |
Description
Added a new tiler that, given a scorer and a threshold, extracts all tiles whose score is above the threshold.
There may be instances when the user does not want the top n tiles from a slide, but rather every tile with a score over a specified threshold.
In this instance, the user does not know beforehand how many tiles must be extracted.
Types of Changes
Issues Fixed or Closed by This PR
Checklist