Skip to content

Conversation

@FBruzzesi
Copy link
Collaborator

Description

In #604 there was a tiny hint at this 😂

In this PR I tried to remove the use of pandas from GroupedTimeSeriesSplit by moving on numpy backend. The only method that still needs a "dataframe"-like object is .summary() (which it's even a nice to have).

In a way this is an alternative way to achieve #597 when a dataframe is just a nice abstraction but not mandatory. The only function that has to work to proceed forward and run .split() method is indexable (from sklearn.utils.validation).

yield np.where(groups == i)[0], np.where(groups == i + 1)[0]
yield group_indices[i], group_indices[i + 1]

def _calc_first_and_last_split_index(self, X=None, y=None, groups=None):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a way, the changes in _calc_first_and_last_split_index are the only hard core difference

@FBruzzesi
Copy link
Collaborator Author

Following all the effort put into adopting Narwhals, it would be nice to revamp this as well. I will close the PR and come back to the topic!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants