Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What do you think about @groupby ? #302

Closed
flyaflya opened this issue Sep 28, 2021 · 9 comments
Closed

What do you think about @groupby ? #302

flyaflya opened this issue Sep 28, 2021 · 9 comments

Comments

@flyaflya
Copy link

I am a big fan of Hadley Wickham's pit of success where it is easy to remember how to do things. I find the fact that groupby is the only non-macro getting used in a typical DataFramesMeta workflow creates mental friction for new users (and me). Wouldn't it be nice if the below table could have the @ symbol in front of group by? Can an alias be made?
image

@pdeffebach
Copy link
Collaborator

Thanks for filing the issue!

I don't think it's worth adding a macro just for the sake of consistency with @. It seems redundant to have a @ which doesn't do anything different. But if there is large community interest for this I may reconsider the importance of easy-to-rememberness.

This was actually a major benefit of the now-deprecated @linq, which allowed omitting the @ for macro names.

DataFrameMacros.jl has a @groupby macro which is used to construct groups on the fly, which also copies a data frame.

I want to emulate this behavior but I want to call it something different. DataFrames.groupby does not allocate a new data frame, wheras @groupby from DataFramesMeta.jl does, and I don't like deviating from DataFrames.jl too much.

One alternative would be to have a @makegroups macro which performs a transformation and returns a grouped data frame. I thought I had an issue mentioning this, but I can't find it.

@flyaflya
Copy link
Author

Thx for the quick reply and the consideration. I am coming from rstats tidyverse where it seems everything is conceptually so much easier to accomplish than their counterparts in Julia; the Julia syntax and typing is a lot of mental overhead. Your package is an exception though - it is exceptionally easy to use! Thank you! For example, the @rfoo macros are just a tremendous addition to easy-to-code data manipulation pipelines.

Anyway, anything leading to easy-to-rememberness is very welcomed by me and this pattern-breaking by groupby just makes my brain twitch a little bit. In the R tidyverse-world, I am confident they would add this alias. Functions like is.character() in base get rewritten as is_character() in the tidyverse so that the syntax is the same as tidyverse-only functions like is_dictionaryish() or is_scalar_vector().

I hope some community members might chime in and agree, but either way thx for the great package.

@bkamins
Copy link
Member

bkamins commented Sep 28, 2021

I would have no problem with adding @groupby. The issue is - as @pdeffebach commented that ideally it should allow transforms, but then it would have to make a copy.

@nalimilan
Copy link
Member

I agree that it could make sense to add the macro just for consistency, even if it's identical to the function.

@pdeffebach
Copy link
Collaborator

Sounds good. I can start a PR adding this.

@flyaflya
Copy link
Author

Great! Please know that I am not clear on the "transforms" issue above, so can't comment on that. That being said, identical functionality to groupby sounds good to me.

@bkamins
Copy link
Member

bkamins commented Sep 29, 2021

"transforms" issue above

You might want to groupby not by columns directly but by their transformations (e.g. you have a columns of integers and you want to groupby the fact if the value is even or odd)

@pdeffebach
Copy link
Collaborator

I had a good idea this week for more features, something like

df = @groubby df :a begin 
    @transform :y = :x .- mean(:x)
    @combine :z = first(:y)
end

where the operations inside the @groupby block are transform(df, ...; ungroup = false), so you can do a lot of grouped transformations together.

This is partly a hack because we still don't have a syntax for keyword arguments, so you can't have ungroup = false in the @transform call.

I don't think this means we can't introduce a @groupby which only accepts a few things right now though.

@pdeffebach
Copy link
Collaborator

Added @groupby in [#373] (did not add the feature above though. We now have keyword arguments in #323)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants