Date-agnostic way to define training/validation split of dataset #142

Open
cathalobrien opened this issue Nov 13, 2024 · 0 comments
Labels
enhancement New feature or request

cathalobrien commented Nov 13, 2024

Is your feature request related to a problem? Please describe.

Currently I define my training/validation split like this:

  training:
    start: null
    end: 2020
  validation:
    start: 2021
    end: 2021

This breaks when I change to a different dataset which covers a different date range. I am then forced to use my brain to remember the syntax and think up a different set of dates that fall within the new dataset, when all I really want is an 80% training, 20% validation split.

Describe the solution you'd like

It would be nice if there were a way to select fractions of the dataset without having to mention dates. This would make the same config portable across datasets covering different date ranges. An example of an 80% training, 20% validation split is below, where I give the fractions as floats (and ideally Anemoi would work out the date ranges itself).

  training:
    start: 0.0
    end: 0.8
  validation:
    start: 0.8
    end: 1.0
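
To make the request concrete, here is a minimal sketch of how fractional bounds could be resolved into dates for the intended 80%/20% split (training 0.0 to 0.8, validation 0.8 to 1.0). It is not Anemoi code: `fraction_to_index_range` and `resolve_split` are hypothetical names, and it assumes the dataset exposes a sorted list of sample dates and that fractions map to a half-open index range, so that adjacent splits sharing a boundary such as 0.8 do not overlap.

```python
from datetime import datetime, timedelta


def fraction_to_index_range(n_samples, start, end):
    """Map fractional bounds in [0, 1] to a half-open index range [lo, hi).

    Hypothetical helper, not part of Anemoi.
    """
    if not 0.0 <= start <= end <= 1.0:
        raise ValueError(f"expected 0 <= start <= end <= 1, got {start}, {end}")
    lo = int(round(start * n_samples))
    hi = int(round(end * n_samples))
    return lo, hi


def resolve_split(dates, start, end):
    """Return the (first_date, last_date) covered by a fractional split.

    `dates` is assumed to be sorted in ascending order.
    """
    lo, hi = fraction_to_index_range(len(dates), start, end)
    if lo == hi:
        raise ValueError("the requested fraction selects no samples")
    # hi is exclusive, so the last selected sample is at index hi - 1.
    return dates[lo], dates[hi - 1]


if __name__ == "__main__":
    # Ten daily samples standing in for a real dataset's date axis.
    dates = [datetime(2020, 1, 1) + timedelta(days=i) for i in range(10)]
    print("training:  ", resolve_split(dates, 0.0, 0.8))
    print("validation:", resolve_split(dates, 0.8, 1.0))
```

Because the index range is half-open, the same config resolves to non-overlapping date ranges on any dataset, whatever period it covers.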

Describe alternatives you've considered

No response

Additional context

No response

Organisation

ECMWF

cathalobrien added the enhancement (New feature or request) label Nov 13, 2024