Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SortedSet#intersect does not always return elements from this #1666

Open
duponter opened this issue Aug 9, 2024 · 3 comments
Open

SortedSet#intersect does not always return elements from this #1666

duponter opened this issue Aug 9, 2024 · 3 comments

Comments

@duponter
Copy link

duponter commented Aug 9, 2024

Context

According to the JavaDocs SortedSet#intersect should always return members from this:

https://eclipse.dev/collections/javadoc/11.1.0/org/eclipse/collections/api/set/sorted/ImmutableSortedSet.html#intersect(org.eclipse.collections.api.set.SetIterable)

Description copied from interface: SortedSetIterable
Returns the set of all objects that are members of both this and set. The intersection of [1, 2, 3] and [2, 3, 4] is the set [2, 3]. The output will contain instances from this, not set.

Issue

However, testing reveals that elements of the smallest of the 2 sets are returned instead. Not only contradicts this the API documentation, but this is counterintuitive as well.
In addition, when replacing intersect with the select(Set::contains) combination, the same functionality works as expected.

Reproduction

Following test suite contains 4 test cases:

  • SetA.intersect(SetB) where SetA > SetB
  • SetA.intersect(SetB) where SetA < SetB
  • SetA.select(SetB::contains) where SetA > SetB
  • SetA.select(SetB::contains) where SetA < SetB

Technologies used:

  • JDK 21
  • Eclipse Collections 11.1
  • JUnit 5.10
  • AssertJ 3.25
class SortedSetIntersectTest {

    @Nested
    class SelectContains {
        @Test
        void selectContainsReturnsElementsFromTheInvokingSetWhenFirstSetIsLargerThanSecond() {
            ImmutableList<Person> persons = Person.generate(20);
            ImmutableSortedSet<Person> first = SortedSets.immutable.withAll(Person.NAME_ONLY_COMPARATOR, persons.take(15));
            ImmutableSortedSet<Person> second = SortedSets.immutable.withAll(Person.NAME_ONLY_COMPARATOR, persons.drop(10).collect(Person::changeAge));

            System.out.printf("Set 1 with %d elements ...%n", first.size());
            first.forEachWithIndex((p, i) -> System.out.printf("%02d: %s%n", i, p));
            System.out.printf("%n...vs Set 2 with %d elements%n", second.size());
            second.forEachWithIndex((p, i) -> System.out.printf("%02d: %s%n", i, p));

            assertThat(selectContains(first, second)).containsExactlyElementsOf(first.drop(10));
            assertThat(selectContains(second, first)).containsExactlyElementsOf(second.take(5));
        }

        @Test
        void selectContainsReturnsElementsFromTheInvokingSetWhenFirstSetIsSmallerThanSecond() {
            ImmutableList<Person> persons = Person.generate(20);
            ImmutableSortedSet<Person> first = SortedSets.immutable.withAll(Person.NAME_ONLY_COMPARATOR, persons.take(15)).drop(10);
            ImmutableSortedSet<Person> second = SortedSets.immutable.withAll(Person.NAME_ONLY_COMPARATOR, persons.drop(10).collect(Person::changeAge));

            System.out.printf("Set 1 with %d elements ...%n", first.size());
            first.forEachWithIndex((p, i) -> System.out.printf("%02d: %s%n", i, p));
            System.out.printf("%n...vs Set 2 with %d elements%n", second.size());
            second.forEachWithIndex((p, i) -> System.out.printf("%02d: %s%n", i, p));

            assertThat(selectContains(first, second)).containsExactlyElementsOf(first);
            assertThat(selectContains(second, first)).containsExactlyElementsOf(second.take(5));
        }

        private SortedSet<Person> selectContains(ImmutableSortedSet<Person> first, ImmutableSortedSet<Person> second) {
            return SortedSets.immutable.withAll(Person.NATURAL_COMPARATOR, first.select(second::contains)).castToSortedSet();
        }
    }

    @Nested
    class Intersect {

        @Test
        void intersectReturnsElementsFromTheSmallestSetWhenFirstSetIsLargerThanSecond() {
            ImmutableList<Person> persons = Person.generate(20);
            ImmutableSortedSet<Person> first = SortedSets.immutable.withAll(Person.NAME_ONLY_COMPARATOR, persons.take(15));
            ImmutableSortedSet<Person> second = SortedSets.immutable.withAll(Person.NAME_ONLY_COMPARATOR, persons.drop(10).collect(Person::changeAge));

            System.out.printf("Set 1 with %d elements ...%n", first.size());
            first.forEachWithIndex((p, i) -> System.out.printf("%02d: %s%n", i, p));
            System.out.printf("%n...vs Set 2 with %d elements%n", second.size());
            second.forEachWithIndex((p, i) -> System.out.printf("%02d: %s%n", i, p));

            assertThat(intersect(first, second)).containsExactlyElementsOf(second.take(5));   // succeeds, but should return elements from 'first'
            assertThat(intersect(second, first)).containsExactlyElementsOf(second.take(5));
//        assertThat(intersect(first, second)).containsExactlyElementsOf(first.drop(10));   // fails, but should work imho
        }
        
        @Test
        void intersectReturnsElementsFromTheSmallestSetWhenFirstSetIsSmallerThanSecond() {
            ImmutableList<Person> persons = Person.generate(20);
            ImmutableSortedSet<Person> first = SortedSets.immutable.withAll(Person.NAME_ONLY_COMPARATOR, persons.take(15)).drop(10);
            ImmutableSortedSet<Person> second = SortedSets.immutable.withAll(Person.NAME_ONLY_COMPARATOR, persons.drop(10).collect(Person::changeAge));

            System.out.printf("Set 1 with %d elements ...%n", first.size());
            first.forEachWithIndex((p, i) -> System.out.printf("%02d: %s%n", i, p));
            System.out.printf("%n...vs Set 2 with %d elements%n", second.size());
            second.forEachWithIndex((p, i) -> System.out.printf("%02d: %s%n", i, p));

            assertThat(intersect(first, second)).containsExactlyElementsOf(first);
            assertThat(intersect(second, first)).containsExactlyElementsOf(first);            // succeeds, but should return elements from 'second'
//        assertThat(intersect(second, first)).containsExactlyElementsOf(second.take(5));   // fails, but should work imho
        }

        private SortedSet<Person> intersect(ImmutableSortedSet<Person> first, ImmutableSortedSet<Person> second) {
            return SortedSets.immutable.withAll(Person.NATURAL_COMPARATOR, first.intersect(second)).castToSortedSet();
        }
    }

    private record Person(String name, int age) {
        public static final Comparator<Person> NATURAL_COMPARATOR = Comparator.comparing(Person::name).thenComparingInt(Person::age);
        public static final Comparator<Person> NAME_ONLY_COMPARATOR = Comparator.comparing(Person::name);

        private static final ThreadLocalRandom RANDOM = ThreadLocalRandom.current();
        private static final String[] NAMES = new String[]{
            "Hye", "Robbie", "Daryl", "Ulrike", "Marivel", "Huey", "Lindsay", "Avery", "Brock", "Patti", "Makeda", "Rudolf", "Burma", "Slyvia", "Mike", "Oscar", "Cher", "Willard", "Buford", "Hobert", "Brain", "Ricky", "Jeraldine", "Lindsey", "Mia", "Carolee", "Barney", "Emmett", "Lezlie", "Cristi"
        };

        public static ImmutableList<Person> generate(int size) {
            MutableSortedSet<Person> persons = SortedSets.mutable.of(NAME_ONLY_COMPARATOR);
            while (persons.size() < size) {
                persons.with(new Person());
            }
            return persons.toImmutableList();
        }

        private static int randomAge() {
            return RANDOM.nextInt(100);
        }

        private static String randomName() {
            return NAMES[RANDOM.nextInt(NAMES.length)];
        }

        public Person() {
            this(randomName(), randomAge());
        }

        Person changeAge() {
            int newAge = IntStream.generate(Person::randomAge).filter(age -> age != this.age).findFirst().orElseThrow();
            return new Person(name, newAge);
        }
    }
}
@duponter duponter changed the title SortedSet.html#intersect does not always return elements from this SortedSet#intersect does not always return elements from this Aug 11, 2024
@lzcyx
Copy link
Contributor

lzcyx commented Aug 17, 2024

Hi! I think we might be able to fix this issue by gently removing this part of the optimization code in https://github.com/eclipse/eclipse-collections/blob/master/eclipse-collections/src/main/java/org/eclipse/collections/impl/utility/internal/SetIterables.java#L77-L82

@motlin
Copy link
Contributor

motlin commented Aug 17, 2024

It makes sense to me. I'd like to get @mohrezaei's opinion too since there was a similar change in #1661 that raised performance concerns.

@mohrezaei
Copy link
Member

So clearly the behavior and the doc are inconsistent. It should be noted that this can only happen if the set's sense of equality is different from whatever way the returned set's elements are being compared to elements from this.

So there are two ways to fix this:

  1. Keep the doc, change the behavior. This would get rid of the optimization.
  2. Keep the behavior, fix the doc.

I would generally favor (2), for the following reasons:

  • the only real benefit of this funciton is that bit of optimization. It's otherwise just an alias for a one-liner.
  • these functions (intersect, difference, etc) are borrowed from set-theory, where set elements don't have different senses of equality.

We could even be very explicit in the doc and suggest using this.select(that.contains) if someone has the requirement to preserve the memory references from this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants