Weird ordering of elements in Parallel Categories Plot #6803

@malteschwerin

Description

image
I'm currently using the parallel categories plot in Plotly Express to display and compare different rankings of elements. I know I'm slightly misusing the plot here, but I think my problem is relevant to others as well. In the attached example, for simplicity I'm only showing one ranking ("Ranking 1"), where competitor 0 is in first place and all other competitors share second place. In practice, I would show multiple rankings side by side as parallel dimensions, which usually have fewer and fewer distinct ranks going from left to right. As you can see in the image, the competitors on rank 2 are ordered strangely, which is why I'm writing here. The dataframe I'm using looks like this:
index  Name  Ranking 1
0      0     1
1      1     2
2      2     2
3      3     2
4      4     2
5      5     2
6      6     2
7      7     2
8      8     2
9      9     2
10     10    2
11     11    2
12     12    2
13     13    2
14     14    2
15     15    2
16     16    2
17     17    2
18     18    2
19     19    2
20     20    2
21     21    2

While I usually change some parameters, the plot looks exactly like this when just calling px.parallel_categories(df). So I really don't see why any of the connecting lines cross. It would be great if I could use this plot without the extra confusion caused by crossing lines.

I used version 5.9 before, but just updated to the latest version (5.18.0) and the issue remains.
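
To make "crossing lines" concrete: two connecting lines cross exactly when their items appear in one order on the left axis and in the opposite order on the right axis. The following is a minimal sketch of that idea, not Plotly's actual layout logic; `count_crossings` is a hypothetical helper name and the orderings passed to it are purely illustrative.

```python
from itertools import combinations


def count_crossings(left_order, right_order):
    """Count pairs of items whose connecting lines cross between two axes.

    Both arguments list the same unique items, top to bottom, as they
    appear on the left and right axis respectively.
    """
    # Position of every item on the right axis
    right_pos = {item: i for i, item in enumerate(right_order)}
    # Walk the left axis top to bottom and record where each line ends up
    ranks = [right_pos[item] for item in left_order]
    # Two lines cross when the upper one on the left ends lower on the right
    return sum(1 for a, b in combinations(ranks, 2) if a > b)


same = count_crossings([0, 1, 2, 3], [0, 1, 2, 3])      # identical order
swapped = count_crossings([0, 1, 2, 3], [0, 2, 1, 3])   # one adjacent swap
reverse = count_crossings([0, 1, 2, 3], [3, 2, 1, 0])   # fully reversed
print(same, swapped, reverse)
```

A plot without crossings corresponds to a count of 0, which is what one would expect for the data above if both axes kept the same item order.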

Activity

Coding-with-Adam (Contributor) commented on Nov 24, 2023

Hi @malteschwerin, can you please share a minimal reproducible example so we can replicate this issue locally?

malteschwerin (Author) commented on Nov 26, 2023

Hi @Coding-with-Adam, thanks for the quick reply. Sure, here is a minimal example:

import pandas as pd
import plotly.express as px
df = pd.DataFrame({"Names": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10], "Ranking 1": [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]})
px.parallel_categories(df)

and this is what the result looks like:
image

Now that I look at it, it seems like a lexicographical sorting might be used in the second column ("Ranking 1") while the first column ("Names") is sorted numerically. Changing all values to strings, however, doesn't change anything. Ideally, I would want to have the same ordering in all columns, reducing the number of crossing lines. In this simple example, I think it's clear that no lines should cross at all.
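
For readers unfamiliar with the distinction, here is a small sketch of how numeric and lexicographic ordering differ for the values 0 to 10. It only illustrates the hypothesis above; nothing in this thread confirms that Plotly sorts this way internally, and as noted, converting the values to strings did not change the plot.

```python
values = list(range(11))

# Numeric ordering compares the integers themselves
numeric = sorted(values)

# Lexicographic ordering compares the string forms character by character,
# so "10" sorts directly after "1" and before "2"
lexicographic = sorted(values, key=str)

print(numeric)        # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(lexicographic)  # [0, 1, 10, 2, 3, 4, 5, 6, 7, 8, 9]
```

If one axis used the first ordering and the other axis the second, the line for item 10 would have to cross the lines for items 2 through 9.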

Coding-with-Adam (Contributor) commented on Nov 28, 2023

I agree, @malteschwerin.
Ideally, these lines would not cross. I brought your question to the Plotly community, and one of the community members pointed out that by reversing the lists, the graph is created without crossing lines.

import pandas as pd
import plotly.express as px

names = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ranking = [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]

# Workaround: reverse both lists so the rows reach the plot in reverse order
ranking = ranking[::-1]
names = names[::-1]

df = pd.DataFrame({"Names": names, "Ranking 1": ranking})
fig = px.parallel_categories(df)
fig.update_layout(height=700)
fig.show()

image

However, we should be able to prevent the lines from crossing without reversing the lists.

@LiamConnors do you have any idea why this might be happening?

LiamConnors (Member) commented on Nov 28, 2023

Looks like this issue exists in Plotly.js - see the following codepen: https://codepen.io/Liam-Connors/pen/ZEwMQRr
Transferring it to the Plotly.js repo as a bug. cc @archmoj

malteschwerin (Author) commented on Dec 1, 2023

Hey @Coding-with-Adam @LiamConnors, thanks for your work so far! Do you have any guess on when someone will look at this in detail?

Coding-with-Adam (Contributor) commented on Dec 1, 2023

Hi @malteschwerin,
We should be able to take a deeper look at this bug in December. We'll keep you updated.

malteschwerin (Author) commented on Dec 18, 2023

Hey @Coding-with-Adam, is there any update on this yet?

Coding-with-Adam (Contributor) commented on Dec 21, 2023

No update yet, @malteschwerin. We'll take a deeper look after the holidays.
cc @LiamConnors @archmoj

malteschwerin (Author) commented on Jan 17, 2024

Hey @Coding-with-Adam, is there an update on this? I'm currently writing a paper using these plots, and it would be really helpful to be able to use them without these crossing lines.

Coding-with-Adam (Contributor) commented on Jan 18, 2024

Thanks for the reminder, @malteschwerin.
I'll follow up with the team today.

Coding-with-Adam (Contributor) commented on Jan 18, 2024

@malteschwerin
Unfortunately, the resources needed to tackle this issue are limited given higher priority issues.

Would you be able to use the workaround mentioned here?
If not, would you be able to submit a Pull Request for this issue?

malteschwerin

malteschwerin commented on Jan 22, 2024

@malteschwerin
Author

@Coding-with-Adam I'm afraid the workaround you're mentioning doesn't work for me, since it would look very unintuitive to have the rankings upside-down.
Unfortunately, I'm also too busy right now to have a look at this myself. Sorry!

self-assigned this on Jul 12, 2024
removed their assignment on Aug 2, 2024

mosely1 commented on Oct 17, 2024

Adding another example here because the workaround provided doesn't resolve the bug in my instance.

import pandas as pd
import plotly.express as px

# Create a sample dataframe
df = pd.DataFrame({
    'Category1': ['Black', 'Black', 'Black', 'Grey', 'Grey', 'Grey', 'Orange'],
    'Category2': ['Green', 'Blue', 'Red', 'Red', 'Purple', 'Yellow', 'Orange'],
    'Color': ['#1bb066', '#5781c1', '#ee2a2c', '#ee2a2c', '#7951a1', '#ecde13', '#f8981d'],
    'Order': [0, 1, 2, 3, 4, 5, 6]
})

# Create the parallel categories plot without color mapping
fig = px.parallel_categories(df, dimensions=['Category1', 'Category2'])
fig.show()

image

As you can see, none of the lines cross. This is good.

However, my color assignments are important to me (I am categorizing items based on their actual real-world color). When I add my color map, the plot reorders the items in Category1, causing a collision with red.

# Create the parallel categories plot with color mapping
fig = px.parallel_categories(df, dimensions=['Category1','Category2'], color='Color')
fig.show()

image

Is there a way to force the order? The following does not work!

# Try to force the order by sorting the rows before plotting (does not help)
fig = px.parallel_categories(df.sort_values('Order'), dimensions=['Category1','Category2'], color="Color")
fig.show()

Edit: I solved the problem by creating a custom gradient, as shown below. A pretty annoying workaround, but it does work.

# Create a sample dataframe
df = pd.DataFrame({
    'Category1': ['Black', 'Black', 'Black', 'Grey', 'Grey', 'Grey', 'Orange'],
    'Category2': ['Green', 'Blue', 'Red', 'Red', 'Purple', 'Yellow', 'Orange'],
    'Order': [0,1,2,2,3,4,5]
})
color_scale = ['#1bb066', '#5781c1', '#ee2a2c', '#7951a1', '#ecde13', '#f8981d']

# Create the parallel categories plot with color mapping
fig = px.parallel_categories(
    df.sort_values('Order'),
    dimensions=['Category1', 'Category2'],
    color="Order",
    color_continuous_scale=color_scale,
)
fig.show()

Image
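
The 'Order' column and `color_scale` list in the workaround above were written by hand. As a hedged sketch, they could also be derived from the original per-row color list; `build_order_and_scale` is a hypothetical helper name, not part of Plotly or pandas, and the approach assumes each distinct color should get one consecutive integer code in first-seen order.

```python
def build_order_and_scale(colors):
    """Derive integer codes and a de-duplicated color scale from row colors.

    Returns (order, scale) where scale holds each distinct color once, in
    first-seen order, and order[i] is the index of colors[i] within scale.
    """
    # dict.fromkeys keeps the first occurrence of each color, in order
    scale = list(dict.fromkeys(colors))
    index = {color: i for i, color in enumerate(scale)}
    order = [index[color] for color in colors]
    return order, scale


row_colors = ['#1bb066', '#5781c1', '#ee2a2c', '#ee2a2c',
              '#7951a1', '#ecde13', '#f8981d']
order, color_scale = build_order_and_scale(row_colors)
print(order)
print(color_scale)
```

The resulting `order` list would take the place of the 'Order' column and `color_scale` would be passed as `color_continuous_scale`, as in the snippet above. Because the codes are consecutive integers starting at 0, each code should line up with one entry of the scale.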

Metadata

Assignees: No one assigned
Labels: P3 (backlog), bug (something broken), sev-3 (annoyance with workaround)
Type: No type
Projects: No projects
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests
Participants: @gvwilson, @LiamConnors, @Coding-with-Adam, @malteschwerin, @mosely1

Weird ordering of elements in Parallel Categories Plot · Issue #6803 · plotly/plotly.js