Write DataFrameLike interface (either pandas or polars) #61

Open
t-ober opened this issue Jun 1, 2023 · 0 comments

t-ober commented Jun 1, 2023

We currently rely on pd.DataFrame and pd.Series as the base data model. Rather than relying on these explicitly, we could abstract them behind an interface, e.g. DataFrameLike and SeriesLike, which offer the same public methods (or a subset of the important ones).
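To make the idea concrete, here is a minimal sketch of what such interfaces could look like using typing.Protocol. The method subset, the ListSeries and DictFrame stand-ins, the column_mean helper and the column name "p_mw" are all assumptions chosen for illustration, not the project's actual API.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SeriesLike(Protocol):
    """Hypothetical subset of the pd.Series methods that callers rely on."""

    def sum(self) -> float: ...

    def __len__(self) -> int: ...


@runtime_checkable
class DataFrameLike(Protocol):
    """Hypothetical subset of the pd.DataFrame methods that callers rely on."""

    def __getitem__(self, column: str) -> SeriesLike: ...

    def __len__(self) -> int: ...


class ListSeries:
    """Toy stand-in for a column; satisfies SeriesLike structurally."""

    def __init__(self, values: list[float]) -> None:
        self._values = list(values)

    def sum(self) -> float:
        return float(sum(self._values))

    def __len__(self) -> int:
        return len(self._values)


class DictFrame:
    """Toy stand-in for a table; satisfies DataFrameLike structurally."""

    def __init__(self, data: dict[str, list[float]]) -> None:
        self._data = {name: ListSeries(values) for name, values in data.items()}

    def __getitem__(self, column: str) -> ListSeries:
        return self._data[column]

    def __len__(self) -> int:
        # Number of rows, taken from the first column (0 if there are none).
        return len(next(iter(self._data.values()), []))


def column_mean(frame: DataFrameLike, column: str) -> float:
    """Business logic written against the interface, not a concrete library."""
    series = frame[column]
    return series.sum() / len(series)


frame = DictFrame({"p_mw": [1.0, 2.0, 3.0]})  # "p_mw" is a made-up column name
mean_p = column_mean(frame, "p_mw")
```

Because the protocols are structural, any object that provides these members could be passed to column_mean without inheriting from anything, which is what would make swapping the backing data source cheap.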

This would allow us to easily switch out the actual data source for

  1. a database interface (see #31, "Implement SQL as possible data source")
  2. a different data library, e.g. Polars (see #28, "Exchange pandas for polars for much better performance")

With respect to 2, we would need to consider whether we can actually leverage the performance boosts of polars, because some of them rely on a different way of dealing with data, e.g. collecting tasks and then executing them in a combined fashion. Therefore the first task would be to find out whether we should rather switch to polars first and then model the DataFrameLike interface on polars. Starting from pandas would be much easier, as we wouldn't have to change any of the implemented logic.
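The "collect tasks, then execute them together" style can be sketched in plain Python as below. This is only a toy model of deferred execution, not polars itself; LazyRows, its methods and the column names are hypothetical. In polars, the analogous pattern is building a query on a LazyFrame and calling collect() at the end.

```python
from typing import Any, Callable

Row = dict[str, Any]


class LazyRows:
    """Toy lazy table: records operations and only runs them on collect()."""

    def __init__(self, rows: list[Row], steps: tuple = ()) -> None:
        self._rows = rows
        self._steps = steps

    def filter(self, predicate: Callable[[Row], bool]) -> "LazyRows":
        # Nothing is evaluated here; the step is only recorded.
        return LazyRows(self._rows, self._steps + (("filter", predicate),))

    def with_column(self, name: str, func: Callable[[Row], Any]) -> "LazyRows":
        # Nothing is evaluated here either.
        return LazyRows(self._rows, self._steps + (("with_column", (name, func)),))

    def collect(self) -> list[Row]:
        """Run all recorded steps in a single pass over the data."""
        result = []
        for row in self._rows:
            current = dict(row)
            keep = True
            for kind, arg in self._steps:
                if kind == "filter":
                    if not arg(current):
                        keep = False
                        break
                else:
                    name, func = arg
                    current[name] = func(current)
            if keep:
                result.append(current)
        return result


rows = [{"p": 1.0}, {"p": -2.0}, {"p": 3.0}]  # made-up sample data
query = (
    LazyRows(rows)
    .filter(lambda r: r["p"] > 0)
    .with_column("p_kw", lambda r: r["p"] * 1000)
)
result = query.collect()
```

An eager, pandas-style interface would evaluate each call immediately, so a DataFrameLike modelled purely on pandas methods might hide exactly the combined execution that makes the lazy style fast. That is the trade-off the paragraph above raises.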

@t-ober t-ober added the enhancement New feature or request label Jun 1, 2023
@t-ober t-ober self-assigned this Jun 1, 2023