Connected components #54

Merged
merged 9 commits into from
Aug 22, 2018

hanslovsky
Member

This PR adds a new version of connected component analysis

  • Special case scan line implementation for DiamondShape with unit range
  • Use union find data structure to track connected components.
  • No ExecutorService required (useful for running within parallelization frameworks like Spark)
  • Use imglib2 Shape to allow for arbitrary neighborhoods.
  • Restrict to images with at most Integer.MAX_VALUE pixels because the union find is array-based. For larger images, we could use a sparse union find based on hash maps, but this is slower and would still be limited to images with no more than Integer.MAX_VALUE foreground pixels. For large images, the task should be divided and parallelized anyway.
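
As a rough illustration of the approach in the list above, here is a minimal sketch of an array-based union find driving a scan-line labeling pass over a 4-connected (unit-range diamond) neighborhood in 2D. It is not the code added in this PR: the class `UnionFindSketch`, the nested `IntUnionFind`, and the `label` method are hypothetical names, and a plain row-major `boolean[]` stands in for an imglib2 image.

```java
/** Hypothetical sketch; not the implementation added in this PR. */
final class UnionFindSketch {

    /** Array-based union find with path compression; ids are 0..size-1. */
    static final class IntUnionFind {
        private final int[] parent;

        IntUnionFind(final int size) {
            parent = new int[size];
            for (int i = 0; i < size; ++i)
                parent[i] = i; // every element starts as its own root
        }

        int find(int id) {
            int root = id;
            while (parent[root] != root)
                root = parent[root];
            // Path compression: point every element on the path at the root.
            while (parent[id] != root) {
                final int next = parent[id];
                parent[id] = root;
                id = next;
            }
            return root;
        }

        void join(final int a, final int b) {
            final int ra = find(a), rb = find(b);
            if (ra != rb)
                parent[Math.max(ra, rb)] = Math.min(ra, rb); // smaller root wins
        }
    }

    /** Labels 4-connected foreground pixels of a row-major 2D mask; background stays 0. */
    static int[] label(final boolean[] mask, final int width, final int height) {
        final IntUnionFind uf = new IntUnionFind(mask.length);
        // Scan line pass: only the left and upper neighbors need to be checked.
        for (int y = 0, i = 0; y < height; ++y)
            for (int x = 0; x < width; ++x, ++i) {
                if (!mask[i]) continue;
                if (x > 0 && mask[i - 1]) uf.join(i, i - 1);
                if (y > 0 && mask[i - width]) uf.join(i, i - width);
            }
        // Second pass: map each set root to a consecutive label starting at 1.
        final int[] labels = new int[mask.length];
        final int[] rootToLabel = new int[mask.length];
        int nextLabel = 1;
        for (int i = 0; i < mask.length; ++i) {
            if (!mask[i]) continue;
            final int root = uf.find(i);
            if (rootToLabel[root] == 0) rootToLabel[root] = nextLabel++;
            labels[i] = rootToLabel[root];
        }
        return labels;
    }
}
```

Because the `parent` array is indexed by `int`, this sketch shares the Integer.MAX_VALUE size limit mentioned in the last bullet.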

I suggest we wait for a new release of pom-imglib2 before merging so we can avoid overriding the managed version of the imglib2 dependency.

 - `ConnectedComponentAnalysis` implementation of connected component analysis
 - `UnionFind` implementation of union find (required for CCA)
 - `ConnectedComponentAnalysisTest` tests for 2D and 3D
Thanks to @tpietzsch for a prompt release!
@hanslovsky
Member Author

OK to merge @axtimwalde @tpietzsch @StephanPreibisch?

pom.xml Outdated
@@ -209,6 +209,7 @@ Jean-Yves Tinevez and Michael Zinsmaier.</license.copyrightOwners>
<dependency>
<groupId>net.imglib2</groupId>
<artifactId>imglib2</artifactId>
<version>4.6.1</version>
Member

This should be defined in a property, no?

<properties>
  <imglib2.version>4.6.1</imglib2.version>
</properties>

<dependency>
  <groupId>net.imglib2</groupId>
  <artifactId>imglib2</artifactId>
  <version>${imglib2.version}</version>
</dependency>

Besides this, it can probably be dropped entirely when switching to the latest pom-imglib2 parent.

Member Author

Good catch, @imagejan.
This was necessary when the PR was opened, but we no longer need it.

Member

@ctrueden ctrueden Aug 10, 2018

If you need to override the version of a component in the future, please use a version property as @imagejan described, rather than hardcoding it in the <version> itself. Otherwise, the melting-pot script and other tooling will be unable to override it and some kinds of version skew will not be caught by the unified builds.

I filed scijava/scijava-maven-plugin#16 to hopefully aid with avoiding this scenario in the future.

Member Author

Will do (actually, already doing that in most of my projects).

@hanslovsky
Member Author

hanslovsky commented Aug 15, 2018

@axtimwalde has a use case in which it would be helpful if connected component analysis on an arbitrary subvolume assigned every fully contained connected component the same id that it would receive if the analysis were run on the full volume. As an example, let

final RandomAccessibleInterval< B > mask = ...
final RandomAccessibleInterval< B > subVolume1 = Views.interval( mask, ... );
final RandomAccessibleInterval< B > subVolume2 = Views.interval( mask, ... );

Any connected component that is fully contained in subVolume1 should have the same label as it would have in mask. As a consequence, if that same connected component is fully contained in subVolume2, as well, it will be labeled with the same id in that sub-volume, too.

I implemented this extension in a separate branch:
https://github.com/hanslovsky/imglib2-algorithm/tree/connected-components-more-general
Example:
https://github.com/hanslovsky/sandbox-with-ij/blob/master/src/main/kotlin/org/janelia/saalfeldlab/labels/ConnectedComponentsExample.kt
If @axtimwalde thinks that this extension helps his use case, I will rebase that extension into this PR.
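
One way to obtain the property described above is to derive every label from positions in the full volume rather than from positions in the subvolume. The following 2D sketch assumes this strategy and labels each component with 1 plus the smallest full-image linear index among its pixels. It is an illustration only, not the code in the linked branch: `GlobalIdLabelingSketch` and its methods are hypothetical, and a row-major `boolean[]` stands in for the mask.

```java
/** Hypothetical sketch; not the implementation in the linked branch. */
final class GlobalIdLabelingSketch {

    /**
     * Labels 4-connected foreground pixels inside the window
     * [minX, maxX] x [minY, maxY] of a row-major mask of width fullWidth.
     * Each component gets 1 + the smallest full-image linear index of its
     * pixels; background stays 0. The result is row-major over the window.
     */
    static long[] label(final boolean[] mask, final int fullWidth,
            final int minX, final int minY, final int maxX, final int maxY) {
        final int w = maxX - minX + 1, h = maxY - minY + 1;
        final int[] parent = new int[w * h];
        for (int i = 0; i < parent.length; ++i) parent[i] = i;
        for (int y = 0, i = 0; y < h; ++y)
            for (int x = 0; x < w; ++x, ++i) {
                if (!mask[(y + minY) * fullWidth + x + minX]) continue;
                // Neighbors outside the window are ignored on purpose.
                if (x > 0 && mask[(y + minY) * fullWidth + x - 1 + minX]) join(parent, i, i - 1);
                if (y > 0 && mask[(y - 1 + minY) * fullWidth + x + minX]) join(parent, i, i - w);
            }
        final long[] labels = new long[w * h];
        for (int y = 0, i = 0; y < h; ++y)
            for (int x = 0; x < w; ++x, ++i) {
                if (!mask[(y + minY) * fullWidth + x + minX]) continue;
                // The root is the smallest window index of the component, which is
                // also its smallest full-image index (both orders are row-major).
                final int root = find(parent, i);
                final long globalIndex = (long) (root / w + minY) * fullWidth + (root % w + minX);
                labels[i] = globalIndex + 1;
            }
        return labels;
    }

    private static int find(final int[] parent, int i) {
        while (parent[i] != i) {
            parent[i] = parent[parent[i]]; // path halving
            i = parent[i];
        }
        return i;
    }

    private static void join(final int[] parent, final int a, final int b) {
        final int ra = find(parent, a), rb = find(parent, b);
        if (ra != rb) parent[Math.max(ra, rb)] = Math.min(ra, rb); // smaller root wins
    }
}
```

A component that is cut by the window boundary still receives a label here, but that label only reflects the part inside the window, so only fully contained components are guaranteed to agree with the full-volume result.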

 - union find interface to allow for sparse labels
 - add interface for caller to specify id for each voxel (instead of using IntervalIndexer)

Avoid use of streams

Add IntArrayUnionFind

Add union find for sparse labels with TLongMap as store

Add test with offset

Expose mapping from set root to id to caller and add IdFromIntervalIndexerWithInterval

Restructure CCA
Move test into appropriate package
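
The commits above mention a union find for sparse labels backed by a `TLongMap`. A minimal sketch of that idea follows, using `java.util.HashMap` as a stand-in for the Trove map so that the example has no dependencies. The class `SparseUnionFindSketch` and its methods are hypothetical and do not mirror the interface added in this PR.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; HashMap stands in for the Trove map used in the PR. */
final class SparseUnionFindSketch {

    // Only non-root ids are stored; an id without an entry is its own root.
    private final Map<Long, Long> parents = new HashMap<>();

    long find(final long id) {
        long root = id;
        for (Long p = parents.get(root); p != null; p = parents.get(root))
            root = p;
        // Path compression: point every id on the path directly at the root.
        long current = id;
        while (current != root) {
            final long next = parents.get(current);
            parents.put(current, root);
            current = next;
        }
        return root;
    }

    long join(final long a, final long b) {
        final long ra = find(a), rb = find(b);
        if (ra == rb) return ra;
        final long newRoot = Math.min(ra, rb); // smaller root wins
        parents.put(Math.max(ra, rb), newRoot);
        return newRoot;
    }
}
```

Memory use grows with the number of ids that were ever joined rather than with the size of the id range, which is what makes arbitrary `long` ids practical, at the cost of boxing and hashing on every lookup.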
@hanslovsky
Member Author

@axtimwalde I rebased that branch as mentioned above.

@hanslovsky hanslovsky merged commit c747a27 into imglib:master Aug 22, 2018