Checks fail with duckplyr #43

Open
krlmlr opened this issue Oct 28, 2024 · 4 comments

Comments

@krlmlr

krlmlr commented Oct 28, 2024

The duckplyr package aims to be a drop-in replacement for dplyr, with fully compatible behavior. To verify that, I'm running checks with a modified version of dplyr. This package fails its checks in this scenario.

Details: https://github.com/krlmlr/dplyr/blob/6ef6df78190c3c05f3ac63b97584f1ca2c3f49b3/revdep/problems.md .

Learn more about duckplyr: https://duckplyr.tidyverse.org/ .

From the error message, I can't immediately tell what causes the failure. I'd appreciate your help: could you please help distill a reproducible example that shows how duckplyr behaves differently from dplyr in your use case?

The modified dplyr version can be installed with any of:

pak::pak("krlmlr/dplyr@f-revdep-duckplyr")
# remotes::install_github("krlmlr/dplyr@f-revdep-duckplyr")
# devtools::install_github("krlmlr/dplyr@f-revdep-duckplyr")

Thanks a lot for your help! Please let me know if you have any questions.

Tracker: tidyverse/duckplyr#297.

@malcolmbarrett
Collaborator

Before I look into it, is the idea here that we do library(duckplyr) and we get a free speedup via the dplyr parts of the codebase?

@krlmlr
Author

krlmlr commented Oct 28, 2024

Precisely, that's the idea. Packages could use as_duckplyr_tibble().

There is a translation layer, but it either translates perfectly-ish, or falls back to original dplyr -- in which case there's no speed-up, but still full compatibility.
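The fallback idea can be illustrated with a minimal base R sketch. This is not duckplyr's actual implementation: `with_fallback()`, `fast_sum()`, and `reference_sum()` are hypothetical names invented for the example. It only shows the general pattern of trying a fast path and reverting to a reference implementation when the fast path cannot handle the input.

```r
# Hypothetical helper, not part of duckplyr: wrap a fast implementation so
# that any error it signals triggers a call to the reference implementation.
with_fallback <- function(fast, reference) {
  function(...) {
    tryCatch(
      fast(...),
      error = function(cnd) reference(...)
    )
  }
}

# Toy fast path that only supports numeric input.
fast_sum <- function(x) {
  if (!is.numeric(x)) stop("unsupported input type")
  sum(x)
}

# Toy reference path that also accepts numbers stored as strings.
reference_sum <- function(x) sum(as.numeric(x))

safe_sum <- with_fallback(fast_sum, reference_sum)

safe_sum(c(1, 2, 3))        # handled by the fast path
safe_sum(c("1", "2", "3"))  # fast path errors, reference path is used
```

Either way the caller gets the same result; only the speed differs depending on which path ran.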

@malcolmbarrett
Collaborator

Thanks! I'll take a look this week. Already a big fan of duckplyr, so thanks for your work on that.

@malcolmbarrett
Collaborator

Ok, as far as I can tell, there's no error per se, but the package vignettes take considerably more time to run. Perhaps they are timing out in your revdep check?

Here's an example from one vignette where you can see the discrepancy in computation time:

library(partition)
prt <- partition(baxter_otu, threshold = .5)

When I try with the current CRAN version of dplyr, it completes in a few seconds. When I try with the version installed by pak::pak("krlmlr/dplyr@f-revdep-duckplyr"), it takes a lot longer.
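To put a number on the difference, the call can be wrapped in `system.time()` and run once under each dplyr installation. The sketch below is one way to do that; `time_once()` is a hypothetical helper written for this example, and it assumes the partition package is installed.

```r
# Hypothetical helper: run a zero-argument function once and return the
# elapsed wall-clock time in seconds.
time_once <- function(f) {
  system.time(f())[["elapsed"]]
}

# Only run the comparison if partition is available.
if (requireNamespace("partition", quietly = TRUE)) {
  library(partition)
  elapsed <- time_once(function() partition(baxter_otu, threshold = .5))
  # Report the installed dplyr version next to the timing, so runs under
  # the CRAN build and the modified build can be told apart.
  message(
    "dplyr ", as.character(packageVersion("dplyr")),
    ": ", round(elapsed, 1), " s"
  )
}
```

Running this in a fresh R session after installing each dplyr version gives two timings that can be compared directly.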
