Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ulam distance is wrong? #417

Closed
osorensen opened this issue Jul 1, 2024 · 2 comments
Closed

Ulam distance is wrong? #417

osorensen opened this issue Jul 1, 2024 · 2 comments
Labels
bug Something isn't working

Comments

@osorensen
Copy link
Collaborator

No description provided.

@osorensen osorensen added the bug Something isn't working label Jul 1, 2024
@osorensen osorensen mentioned this issue Jul 2, 2024
@osorensen
Copy link
Collaborator Author

osorensen commented Jul 2, 2024

Thanks to Xavier Benavides for pointing out. Here is some reasoning related to the correction, which is in #418.

Definition of Ulam distance

For two rankings r1 and r2 with length n, the Ulam distance distance equals n minus the longest common subsequence of the orderings corresponding to the rankings.

Details are in the following two screenshots from Irurozki, Calvo, and Lozano (2016).

First, note that sigma is an ordering.

image

Next follows the definition of Ulam distance in terms of sigma. Note that there is an error in the example, since the longest increasing subsequence in sigma is 13457, of size 5, so the Ulam distance i 7-5=2.

image

Testing the implementation

The following two orderings were used when discovering the bug: sigma = [51432] and pi = [31245]. Their longest common subsequence is 2, which is either 32 or 12 or 14. Hence, the Ulam distance should be 5 - 2 = 3.

In the previous implementation we would get 2, but now we correctly get 3.

library(BayesMallows)
sigma <- c(5, 1, 4, 3, 2)
pi <- c(3, 1, 2, 4, 5)

ranking1 <- create_ranking(sigma)
ranking2 <- create_ranking(pi)

compute_rank_distance(ranking1, ranking2, "ulam")
#> [1] 3

Created on 2024-07-02 with reprex v2.1.0

Corrected version of the example from Irurozki et al. (2016):

library(BayesMallows)

sigma <- c(2, 1, 3, 6, 4, 5, 7)
pi <- c(1, 2, 3, 4, 5, 6, 7)
ranking1 <- create_ranking(sigma)
ranking2 <- create_ranking(pi)

compute_rank_distance(ranking1, ranking2, "ulam")
#> [1] 2

Created on 2024-07-02 with reprex v2.1.0

@osorensen osorensen changed the title Ulam distance is wrong Ulam distance is wrong? Jul 2, 2024
@osorensen
Copy link
Collaborator Author

Update, the original implementation was indeed correct. We must just be careful to keep in mind that the distance is defined for orderings, not rankings.

Here for the current CRAN version:

library(BayesMallows)
packageVersion("BayesMallows")
#> [1] '2.2.1'
sigma <- c(5, 1, 4, 3, 2)
pi <- c(3, 1, 2, 4, 5)

ranking1 <- create_ranking(sigma)
ranking2 <- create_ranking(pi)

compute_rank_distance(ranking1, ranking2, "ulam")
#> [1] 3

Created on 2024-07-02 with reprex v2.1.0

library(BayesMallows)
packageVersion("BayesMallows")
#> [1] '2.2.1'
sigma <- c(2, 1, 3, 6, 4, 5, 7)
pi <- c(1, 2, 3, 4, 5, 6, 7)
ranking1 <- create_ranking(sigma)
ranking2 <- create_ranking(pi)

compute_rank_distance(ranking1, ranking2, "ulam")
#> [1] 2

Created on 2024-07-02 with reprex v2.1.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant