KFoldSplitter: instead of returning k folds, return k pairs train/test #10

Open
andreap-bh opened this issue Jun 12, 2023 · 0 comments
Is your feature request related to a problem? Please describe.
Currently, cocohelper.splitters.kfold.KFoldSplitter is not very useful for training models with common frameworks, because it returns only the k folds instead of k pairs [train_k.json, test_k.json]. In most frameworks, training a model for each iteration of k-fold CV requires separate annotation files for the train set and the test set. This means that the user has to do the following:

  • create the k folds using cocohelper.splitters.kfold.KFoldSplitter
  • for each iteration, hold one fold out (test) and merge the remaining k-1 folds into a train set

This is boring and possibly error-prone.

Describe the solution you'd like
It would be better to return the k pairs directly. For example, something like this should work:

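from cocohelper.splitters.kfold import KFoldSplitter

def generate_train_test_splits(ch, n_fold: int):
    # ch: the COCO helper object to split; merge_coco: assumed helper that merges folds
    splitter = KFoldSplitter(n_fold=n_fold)
    folds = splitter.apply(ch)
    train_test_splits = []
    for i, fold in enumerate(folds):
        test = fold  # hold out the i-th fold as the test set
        train_folds = folds[:i] + folds[(i+1):]
        train = merge_coco(*train_folds)
        train_test_splits.append([train, test])
    return train_test_splits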

Alternatively, one could add a boolean argument train_test_pairs that lets the user decide whether KFoldSplitter.apply() returns the k folds or k pairs [train_k.json, test_k.json]. However, this has the side effect that KFoldSplitter.apply() can then return objects of different structure (either a list or a list of lists). To avoid this, a separate method that returns the pairs could be added instead of adding an argument to KFoldSplitter.apply().
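
To make the "separate method" option concrete, here is a minimal sketch on plain Python lists. Everything in it is hypothetical: ToyKFoldSplitter, apply_train_test and folds_to_train_test_pairs are illustrative names, not part of cocohelper, and the round-robin fold assignment is only a stand-in for whatever the real splitter does.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def folds_to_train_test_pairs(
    folds: Sequence[List[T]],
) -> List[Tuple[List[T], List[T]]]:
    """Turn k folds into k (train, test) pairs, holding out one fold at a time."""
    pairs = []
    for i, test in enumerate(folds):
        # Concatenate every fold except the i-th one to form the training set.
        train = [item for j, fold in enumerate(folds) if j != i for item in fold]
        pairs.append((train, list(test)))
    return pairs


class ToyKFoldSplitter:
    """Hypothetical stand-in for a k-fold splitter; NOT the cocohelper class."""

    def __init__(self, n_fold: int) -> None:
        if n_fold < 2:
            raise ValueError("n_fold must be at least 2")
        self.n_fold = n_fold

    def apply(self, items: Sequence[T]) -> List[List[T]]:
        # Always returns a flat list of k folds (round-robin assignment).
        return [list(items[i::self.n_fold]) for i in range(self.n_fold)]

    def apply_train_test(self, items: Sequence[T]) -> List[Tuple[List[T], List[T]]]:
        # Separate method, so apply() keeps a single return structure.
        return folds_to_train_test_pairs(self.apply(items))


splitter = ToyKFoldSplitter(n_fold=3)
pairs = splitter.apply_train_test(list(range(6)))
for train, test in pairs:
    print(train, test)
```

With this layout, apply() always returns a list of folds and apply_train_test() always returns a list of (train, test) pairs, so callers never have to inspect the return value to find out which structure they got.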

Describe alternatives you've considered
N/A

Additional context
N/A
