Plotting categorical values as colors #351
cc: @daschw |
Very practically, my question is the following: How can I plot a scatter of points where colors are levels without depending on CategoricalArrays.jl?
using Plots
using CategoricalArrays
c = categorical([1,2,3])
scatter([1,2,3], [1,2,3], marker_z=c) |
Appreciate any help regarding this issue. Maybe it should be moved to Plots.jl? |
I think you need to wait for @nalimilan to have time to look at the issue, as he implemented the recipe AFAICT. |
I have no idea, I just copied the definition provided by @daschw. Maybe @mkborregaard could help too? |
Any help would be great. The issue is specific to the |
I guess I will have to revert the changes in downstream projects in order to release? Who is leading Plots.jl nowadays? Should the Julia community pay a software engineer to maintain and fix these issues? It is really hard to progress otherwise. I will start a thread on Discourse to see what people think about starting a group to split the payment of a salary to a freelancer. |
Unfortunately, Recipes only apply to input data and not to attributes like
If I try to run your example I get the following error:
julia> scatter([1,2,3], [1,2,3], marker_z=c)
Error showing value of type Plots.Plot{Plots.GRBackend}:
ERROR: MethodError: no method matching get(::ColorSchemes.ColorScheme{Vector{RGBA{Float64}}, String, String}, ::CategoricalValue{Int64, UInt32}, ::Tuple{Float64, Float64})
Closest candidates are:
get(::CategoricalPool, ::Any, ::Any) at /home/dani/.julia/packages/CategoricalArrays/rDwMt/src/pool.jl:55
get(::DataStructures.RobinDict{K, V}, ::Any, ::Any) where {K, V} at /home/dani/.julia/packages/DataStructures/ixwFs/src/robin_dict.jl:384
get(::Test.GenericDict, ::Any, ::Any) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1663
...
Stacktrace:
[1] get(::PlotUtils.ContinuousColorGradient, ::CategoricalValue{Int64, UInt32}, ::Tuple{Float64, Float64})
@ PlotUtils ~/.julia/packages/PlotUtils/es5pb/src/colorschemes.jl:18
So it seems like there is no method for
I tried to overcome this by loosening the type restrictions for
julia> scatter([1,2,3], [1,2,3], marker_z=c)
Error showing value of type Plots.Plot{Plots.GRBackend}:
ERROR: ArgumentError: cannot compare a `CategoricalValue` to value `v` of type `CategoricalValue{Int64, UInt32}`: wrap `v` using `CategoricalValue(v, catvalue)` or `CategoricalValue(v, catarray)` first
Stacktrace:
[1] <(x::CategoricalValue{Int64, UInt32}, y::Float64)
@ CategoricalArrays ~/.julia/packages/CategoricalArrays/rDwMt/src/value.jl:176
[2] <(y::Float64, x::CategoricalValue{Int64, UInt32})
@ CategoricalArrays ~/.julia/packages/CategoricalArrays/rDwMt/src/value.jl:180
[3] >(x::CategoricalValue{Int64, UInt32}, y::Float64)
@ Base ./operators.jl:305
[4] clamp(x::CategoricalValue{Int64, UInt32}, lo::Float64, hi::Float64)
@ Base.Math ./math.jl:65
[5] _broadcast_getindex_evalf
@ ./broadcast.jl:648 [inlined]
[6] _broadcast_getindex
@ ./broadcast.jl:621 [inlined]
[7] getindex
@ ./broadcast.jl:575 [inlined]
[8] copy
@ ./broadcast.jl:898 [inlined]
[9] materialize
@ ./broadcast.jl:883 [inlined]
[10] get(cscheme::ColorSchemes.ColorScheme{Vector{RGBA{Float64}}, String, String}, x::CategoricalValue{Int64, UInt32}, rangescale::Tuple{Float64, Float64})
@ ColorSchemes ~/.julia/dev/ColorSchemes/src/ColorSchemes.jl:240
I'm not really sure what we in Plots can do here without depending on CategoricalArrays. |
Would you reconsider the dependency on CategoricalArrays.jl? Without an explicit treatment of categorical variables we won't be able to generate correct legend elements for nominal/ordered variables for example. |
Agreed. Also given that CategoricalArrays.jl is now compiler friendly and Plots.jl is compiler-heavy anyway I would vote to add this integration. An alternative would be to add appropriate methods to DataAPI.jl and use the interface. |
The methods that error really look like they need a @juliohm How do you expect categorical values to be translated to colors? Should that be equivalent to passing the result of |
I was referring to the general approach. In this case (the mapping @juliohm wants) - I do not know enough about the problem to informatively comment. |
Or a method that maximizes the visual discrepancy between the categories. When the notion of order is important, it could produce a sequential colormap for example. Think of a choropleth map like this one: https://en.wikipedia.org/wiki/Choropleth_map#/media/File:Countries_by_mean_wealth_per_adult_in_2018.png |
The approach I mentioned would definitely work if you ignore the order and just choose a qualitative palette. To take into account order, the implementation would have to find a reliable way of checking whether the input values are ordered or not (see JuliaData/DataAPI.jl#26). I'm not sure whether it should be Plots or ColorSchemes' job to do this, but probably worth filing an issue in one of these packages? |
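A minimal sketch of the "ignore the order and choose a qualitative palette" idea, written in plain Julia without any plotting or categorical package. The helper `level_codes` and the `palette` of color names are hypothetical and only illustrate the mapping; they are not part of Plots.jl, ColorSchemes.jl or CategoricalArrays.jl.

```julia
# Hypothetical helper (not part of Plots.jl or CategoricalArrays.jl):
# returns the integer code of each value and the list of levels.
# By default the levels are the sorted unique values; a caller that
# cares about order can pass its own `levels`.
function level_codes(vals; levels = sort(unique(vals)))
    lookup = Dict(lvl => i for (i, lvl) in enumerate(levels))
    codes = [lookup[v] for v in vals]
    return codes, levels
end

# Assumed qualitative palette, cycled if there are more levels than colors.
palette = [:steelblue, :orange, :green, :purple]

crops = ["wheat", "corn", "wheat", "soy"]
codes, lvls = level_codes(crops)
colors = [palette[mod1(c, length(palette))] for c in codes]
```

The resulting `colors` vector has one entry per point, so it could presumably be handed to a per-point color attribute; handling ordered levels would still need a reliable way to detect order, as noted above.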
The |
Actually, I think |
I assumed that the plot recipes provided in CategoricalArrays.jl + Plots.jl would be the official way moving forward. So any package interested in plotting categorical variables would just assume it works out of the box and would forward a CategoricalArray to the Plots.jl pipeline. |
If you use
scatter([1,2,3], [1,2,3], marker_z=[1, 2, 3])
If you use
scatter([1,2,3], [1,2,3], markercolor=[1, 2, 3])
I suppose (I might be wrong here) you want different legend entries for different categorical values. For this you have to group your input into multiple series. This already works for CategoricalArrays with
scatter([1,2,3], [1,2,3], group=categorical(["A", "B", "C"]))
Anyway, I think this is not at all a CategoricalArrays issue and can be closed here. @juliohm if you want we can continue the discussion in a Plots issue. |
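For illustration, grouping the input into multiple series can also be done by hand, which gives one legend entry per level without a categorical type. This is only a sketch under the assumption that the labels are plain strings; the variable names are made up for the example.

```julia
using Plots

x = [1, 2, 3, 4]
y = [1, 2, 3, 4]
labels = ["A", "B", "A", "C"]   # plain strings, no CategoricalArrays needed

plt = plot()                    # start from an empty plot
for lvl in sort(unique(labels))
    idx = findall(==(lvl), labels)               # points belonging to this level
    scatter!(plt, x[idx], y[idx]; label = lvl)   # one series (and legend entry) per level
end
```

Each call to `scatter!` adds a separate series, so the legend lists "A", "B" and "C" with the default series colors.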
From what I understand the problem now is about figuring out automatically which attribute to set depending on the vector type. If the vector of values is a vector of Number then we should use The problem remains because in order to differentiate between the two cases we need access to the categorical array type. So if Plots.jl could take CategoricalArrays.jl as a dependency, we could have a single attribute type for "color of markers" that would do the correct thing internally. Does it make sense? I like that the issue is discussed here because then core maintainers of CategoricalArrays.jl can share their perspectives on a good design. Right now even with the group option (which I am gonna try soon), we need to be able to differentiate normal arrays from categorical arrays in user code. |
Actually, I'd argue we (Plots) should not automatically figure out which attribute to set, but use the attribute that is provided by the user. I think the combination of
plot(categorical([1, 3, 7]), categorical([19, -4, 100])) |
Perhaps I didn't explain myself clearly. I am talking about automatic detection between non-categorical arrays and categorical arrays. End users will want to pass colors no matter if their variables are continuous or categorical. Currently they have to figure out by themselves that Package writers like myself could add a dependency on CategoricalArrays.jl to implement this basic choice for the user, but I think this would be much more useful in Plots.jl already (specifically plot recipes). Anyone wanting to plot categorical or continuous values could just pass a vector and internally Plots.jl would use the correct attribute. |
Why not check whether the value is a |
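Unfortunately the group option doesn't work within plot recipes:
using Plots
using RecipesBase
using CategoricalArrays
struct Foo end
@recipe function f(foo::Foo, data)
    seriestype --> :scatter
    if eltype(data) <: Number
        # continuous values: color the markers and show a colorbar
        marker_z --> data
        colorbar --> true
    else
        # anything else (e.g. categorical values): try to split into groups
        group --> data
    end
    # random 2D points, one per data value
    [Tuple(rand(2)) for i in 1:length(data)]
end
plot(Foo(), categorical([1,2,3])) |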
I will go ahead and submit a release with this bug because of pressing deadlines, but it would be nice to see a workaround. |
@daschw do you have a solution for the plot recipe situation above? |
That is no longer needed, and can be handled in downstream recipes with post-processing. |
I have plot recipes that try to plot the categorical values as colors in a geographic map. For example, the crop type in this plot: https://juliaearth.github.io/GeoStats.jl/stable/workflow.html#Plotting-solutions
I was doing some manual pre-processing by depending on CategoricalArrays.jl and converting to a vector of level codes manually. Given that the latest CategoricalArrays.jl supports plot recipes already, I wanted to stop doing this manual fix. What is the appropriate method to pass categorical arrays to be interpreted as colors with an appropriate legend containing the levels?
I tried to pass the categorical array as
marker_z --> array
but it didn't work. Appreciate any help as this is the last issue I need to solve before releasing a new version of the project.
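For reference, the manual pre-processing described above (converting to a vector of level codes) might look roughly like the following. It is a sketch that assumes a CategoricalArrays.jl version exporting `levelcode`; the `crop` data is made up for the example.

```julia
using CategoricalArrays

crop = categorical(["wheat", "corn", "wheat", "soy"])

codes = levelcode.(crop)   # integer position of each value within the levels
lvls  = levels(crop)       # the levels themselves, usable as legend labels
```

The numeric `codes` can then be passed where a vector of numbers is expected, at the cost of keeping the explicit dependency on CategoricalArrays.jl in the downstream package.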