KFoldSplitter: instead of returning k folds, return k pairs train/test #10

andreap-bh · 2023-06-12T17:31:00Z

Is your feature request related to a problem? Please describe.
Currently, cocohelper.splitters.kfold.KFoldSplitter is not very convenient for training models with common frameworks, because it returns only the k folds rather than k pairs [train_k.json, test_k.json]. Most frameworks, however, need separate annotation files for the train set and the test set at each iteration of k-fold CV. This means that the user has to do the following:

create the k folds using cocohelper.splitters.kfold.KFoldSplitter
for each iteration, hold one fold out (test) and merge the remaining k-1 folds into a train set

This is tedious and error-prone.

Describe the solution you'd like
It would be better to return the k pairs directly. For example, something like this should work:

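from cocohelper.splitters.kfold import KFoldSplitter

def generate_train_test_splits(ch, n_fold: int):
    # ch: the dataset object passed to KFoldSplitter.apply()
    # merge_coco: placeholder for a helper that merges several folds into one dataset
    splitter = KFoldSplitter(n_fold=n_fold)
    folds = splitter.apply(ch)
    train_test_splits = []
    for i, fold in enumerate(folds):
        test = fold  # hold fold i out as the test set
        train_folds = folds[:i] + folds[(i+1):]  # the remaining k-1 folds
        train = merge_coco(*train_folds)
        train_test_splits.append([train, test])
    return train_test_splits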

Alternatively, one could add a boolean argument train_test_pairs that lets the user decide whether KFoldSplitter.apply() returns the k folds or the k pairs [train_k.json, test_k.json]. However, this has the side effect that KFoldSplitter.apply() could then return objects of different structure (either a list or a list of lists). To avoid this, a separate method that returns the pairs could be added instead of a new argument to KFoldSplitter.apply().
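
To make the "separate method" option concrete, here is a minimal sketch on plain Python lists. ToyKFoldSplitter and apply_pairs are hypothetical names used only for illustration; they are not part of cocohelper, and the round-robin fold assignment is an assumption, not how KFoldSplitter actually splits. The point is only that apply() keeps a single return structure while the pairs come from a second method.

```python
from itertools import chain
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


class ToyKFoldSplitter:
    """Hypothetical stand-in for a k-fold splitter; NOT cocohelper's API."""

    def __init__(self, n_fold: int) -> None:
        if n_fold < 2:
            raise ValueError("n_fold must be at least 2")
        self.n_fold = n_fold

    def apply(self, items: Sequence[T]) -> List[List[T]]:
        """Always return a flat list of k folds (round-robin assignment)."""
        return [list(items[i::self.n_fold]) for i in range(self.n_fold)]

    def apply_pairs(self, items: Sequence[T]) -> List[Tuple[List[T], List[T]]]:
        """Always return k (train, test) pairs built from the same folds."""
        folds = self.apply(items)
        pairs = []
        for i, test in enumerate(folds):
            # Merge every fold except the held-out one into the train set.
            train = list(chain.from_iterable(folds[:i] + folds[i + 1:]))
            pairs.append((train, test))
        return pairs


image_ids = list(range(10))
splitter = ToyKFoldSplitter(n_fold=5)
pairs = splitter.apply_pairs(image_ids)
```

With this layout, existing callers of apply() are unaffected, and each element of pairs can be written out as one train/test annotation pair per CV iteration.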

Describe alternatives you've considered
N/A

Additional context
N/A
