Fixed and tidied `combinatorics.list_permutation` function #9

DanielSherlock · 2023-10-24T23:02:15Z

Hi, hope this is okay.

The problem

I noticed an error in the implementation of the combinatorics.list_permutation function, when provided with lists containing duplicate items. For example (printing each permutation on a new line):

// (Old behaviour)
// [1, 1, 2] |> list_permutation 
[1, 2]
[1, 2]
[2, 1]
[2, 1]

Clearly, none of those are permutations of [1, 1, 2]! It did work correctly when given lists with unique elements:

// [1, 1, 2] |> list.unique |> list_permutation 
[1, 2]
[2, 1]

The issue was the use of filter on line 357. This removed from the remaining list all duplicates of x, when it shouldn't have.

New behaviour

When fixing this, I considered two reasonable behaviours for list_permutations:

A: Treat duplicate elements as distinct for the purposes of permutations. For example, the list [1, 1] has two permutations... [1, 1] (the same way round) and [1, 1] (the other way round). With this behaviour all lists of the same length will have the same number of permutations, namely factorial(list.length(whichever_list)).
B: Treat duplicate elements as identical for the purposes of permutations. In this case, the list [1, 1] has only one permutation... itself. With this behaviour lists with duplicate elements will have fewer permutations than other lists of the same length.

Both behaviours are plausibly useful. I have decided, for this pull request, to implement behaviour A. Partly because it was a simpler change to write, but also because I think it is slightly more versatile: if you need behaviour B it is as simple as calling the function with behaviour A, and then piping the result to list.unique:

// (New behaviour)
// [1, 1, 2] |> list_permutation
[1, 1, 2]
[1, 2, 1]
[1, 1, 2]
[1, 2, 1]
[2, 1, 1]
[2, 1, 1]

// (Simulating behaviour B)
// [1, 1, 2] |> list_permutation |> list.unique
[1, 1, 2]
[1, 2, 1]
[2, 1, 1]

To implement this new behaviour, I switched from filter to pop. This always removes exactly one instance of x, leaving its duplicates in the remainder of the list. One difficulty is that pop returns a result in case the element it's searching for isn't in the list. But it always will be (as x is drawn from the list in the line above), so we can be sure to get an Ok(...) and I use a let assert expression to destructure this.

This is not exactly the most efficient way to write this function, but it has the same performance as what was there before, and is fairly readable. I would be happy to write a more efficient version if needed.

Other changes

I made a few other code changes at the same time. These do not have any effect on the behaviour of the function.

I noticed we were using local definitions of flat_map and concat for the implementation of list_permutation (and nothing else). As these are standard library functions (in the list module, which we already import) we don't need to duplicate them locally, so I got rid of them.
I rewrote the call to flat_map on line 354 to use use. Personally, I find this more readable.

To do

I have not yet:

Added the above examples (or similar) as tests.
Updated the documentation to explain the behaviour with duplicates.

…r mistakenly shortens lists with duplicates

hayleigh-dot-dev · 2023-10-25T10:19:22Z

Hey Daniel thanks for this! I've converted the PR to draft, please can you address the todos (tests and docs).

I'll defer to @NicklasXYZ on whether this change is sensible but it seems reasonable to me. We could also consider making the behaviour configurable with a second parameter:

combinatorics.list_permutation([1, 1, 2], ignore_duplicates: False)
// [1, 1, 2]
// [1, 2, 1]
// [1, 1, 2]
// [1, 2, 1]
// [2, 1, 1]
// [2, 1, 1]

combinatorics.list_permutation([1, 1, 2], ignore_duplicates: True)
// [1, 1, 2]
// [1, 2, 1]
// [2, 1, 1]

This might make the implementation a bit clunkier but it would definitely be more performant than suggesting folks iterate the list again with list.unique. I don't feel super strongly about this - maybe we could wait and see if/when Gleam gets optional args.

This is not exactly the most efficient way to write this function, but it has the same performance as what was there before, and is fairly readable. I would be happy to write a more efficient version if needed.

If a more efficient version is fairly simple to implement I think we should favour it over readability: perf wins for number-crunching stuff is likely to be quite high impact!

hayleigh-dot-dev · 2023-10-25T10:21:48Z

To implement this new behaviour, I switched from filter to pop. This always removes exactly one instance of x, leaving its duplicates in the remainder of the list. One difficulty is that pop returns a result in case the element it's searching for isn't in the list. But it always will be (as x is drawn from the list in the line above), so we can be sure to get an Ok(...) and I use a let assert expression to destructure this.

Might be good to just leave this as a comment above the assertion too so folks in the future know this assertion is safe!

NicklasXYZ · 2023-10-30T22:29:26Z

I am sorry for keeping you waiting, I will get to this very soon 😅

NicklasXYZ · 2023-11-04T11:37:02Z

Thank you @DanielSherlock, thank you for fixing this and @hayleigh-dot-dev for already having a look.

I think behavior A that you are describing is a reasonable default choice to go with, but in the future it could be cool to have both behaviors implemented as @hayleigh-dot-dev also points out.

If you could update the tests and documentation I'd be happy to approve this.

With respect to the performance of the implementations, of course if things can be improved in that department as well, I would not turn down such improvements ;)

DanielSherlock · 2023-11-14T21:43:33Z

No worries, I'm not in any rush (and I hope you aren't too)!

Actually, I thought after posting this pull request: Oh before I start writing tests and docs I should go and look at the standard (or standard-ish) libraries of other languages to see how they handle this case. Then I got distracted. And now it is now.

As far as I can tell, it's like this:

Language	Behaviour A	Behaviour B	Iterator
Javascript (Deno)	`permutations`
Ruby	`permutation`		✅
Scala		`permutations`	✅
Swift	`permutations`	`uniquePermutations`	✅ (`Sequence`)
Julia	`permutations`	`multiset_permutations`	✅
Rust	`permutations`		✅
Nim		`nextPermutation`	✅ (mutating)
C++		`next_permutation`	✅ (mutating)
Haskell	`permutations`		✅ (everything is lazy)
Python	`permutations`		✅
Clojure		`permutations`	✅ (`lazy-seq`)

(I had hoped to find a list-permuting function in the standard (or standard-ish) libraries of Erlang or Elixir, and prioritise similarity to that. But no luck, not even Elixir's Math library has this. Other languages that seem to have nothing (semi-)official include Java, Elm, and Lobster. Though perhaps I wasn't looking hard enough.)

Personally, I like the way Swift handles it. Plus, their documentation is clear (with examples) and also has some hints for possible efficient implementations. The Clojure implementation seem nice and clean too, and might be easy to port. Not all other documentation was similarly clear...

Another thing to note: Almost all of the languages I surveyed returned an iterator (or equivalent thing in that language), even when given a list/array/etc. If we are going to be worried about performance, we might consider doing the same. This after all exactly the sort of function where iterators make a lot of sense:

It does not take a particularly large input list for the output list to be very very large indeed.
The state needed to generate the next permutation is small.

hayleigh-dot-dev · 2023-11-14T23:47:21Z

I think we could discuss returning an iterator in a separate PR. It'd be a breaking change and just generally beyond the scope of what's already here. Better to merge incremental improvements to the library than spend a lot of back-and-forth making a single PR cover everything all at once!

NicklasXYZ · 2023-11-16T22:46:39Z

Thanks a lot for the Table Daniel. It's a really nice overview and you are right. Returning an iterator would be the way to go for sure :)
I think what @hayleigh-dot-dev suggests sounds very sensible. It would be a larger change and thus probably more suitable to discuss in another PR.

…viour on duplicate elements

…tion

DanielSherlock · 2023-11-19T23:08:23Z

Very practical suggestions, thanks.

I've added tests and docs and an explanatory comment to the current implementation.

I ran the tests and it seems to work fine.
I built the docs locally and nothing looks wrong with them.
I ran gleam format but it didn't do anything, so I hope that means it's all fine.

When running this, I do get loads of warnings - it seems from Gleam's recent type import syntax change. I've not gone and fixed all of those in this PR.

I'm happy to create separate issues for:

Another function implementing behaviour B.
A perhaps slightly more efficient implementation.
The possibility of returning an iterator.

NicklasXYZ · 2023-11-27T10:39:32Z

Looks good to me. I'd be happy, if you could add the other issues :)!

Fixed and tidied combinatorics.list_permutation function: it no longe…

788e142

…r mistakenly shortens lists with duplicates

DanielSherlock requested a review from NicklasXYZ as a code owner October 24, 2023 23:02

hayleigh-dot-dev marked this pull request as draft October 25, 2023 10:06

DanielSherlock added 2 commits November 19, 2023 22:40

Added tests and docs for combinatorics.list_permutation function beha…

52281ce

…viour on duplicate elements

Add explanatory comment for safe assert in combinatorics.list_permuta…

d696734

…tion

NicklasXYZ marked this pull request as ready for review November 27, 2023 10:36

NicklasXYZ merged commit ad11b96 into gleam-community:main Nov 27, 2023
1 check passed

NicklasXYZ mentioned this pull request Jan 23, 2024

Make functions in the combinatoric/sequences modules iterator-based instead of list-based #12

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fixed and tidied `combinatorics.list_permutation` function #9

Fixed and tidied `combinatorics.list_permutation` function #9

DanielSherlock commented Oct 24, 2023

hayleigh-dot-dev commented Oct 25, 2023

hayleigh-dot-dev commented Oct 25, 2023

NicklasXYZ commented Oct 30, 2023

NicklasXYZ commented Nov 4, 2023

DanielSherlock commented Nov 14, 2023 •

edited

Loading

hayleigh-dot-dev commented Nov 14, 2023

NicklasXYZ commented Nov 16, 2023

DanielSherlock commented Nov 19, 2023

NicklasXYZ commented Nov 27, 2023

Fixed and tidied combinatorics.list_permutation function #9

Fixed and tidied combinatorics.list_permutation function #9

Conversation

DanielSherlock commented Oct 24, 2023

The problem

New behaviour

Other changes

To do

hayleigh-dot-dev commented Oct 25, 2023

hayleigh-dot-dev commented Oct 25, 2023

NicklasXYZ commented Oct 30, 2023

NicklasXYZ commented Nov 4, 2023

DanielSherlock commented Nov 14, 2023 • edited Loading

hayleigh-dot-dev commented Nov 14, 2023

NicklasXYZ commented Nov 16, 2023

DanielSherlock commented Nov 19, 2023

NicklasXYZ commented Nov 27, 2023

Fixed and tidied `combinatorics.list_permutation` function #9

Fixed and tidied `combinatorics.list_permutation` function #9

DanielSherlock commented Nov 14, 2023 •

edited

Loading