merge_coco doesn't work with default setting drop_duplicates=True #11

Open
AndreaPi opened this issue Jun 12, 2023 · 0 comments

Describe the bug
merge_coco fails with the default drop_duplicates=True because the duplicate-dropping code is buggy. Calling it with an unpacked list of datasets, as in merge_coco(*train_folds), raises the following error:

train = merge_coco(*train_folds)
[..]/lib/python3.10/site-packages/cocohelper/dataframe.py:37: UserWarning: COCODataFrame created by a COCODataFrame without using a copy constructor.
  warnings.warn("COCODataFrame created by a COCODataFrame without using a copy constructor.")
[..]/lib/python3.10/site-packages/cocohelper/dataframe.py:37: UserWarning: COCODataFrame created by a COCODataFrame without using a copy constructor.
  warnings.warn("COCODataFrame created by a COCODataFrame without using a copy constructor.")
Traceback (most recent call last):
  File "[..]/pipeline/6a_split_dataset_with_kfold_CV.py", line 29, in <module>
    train = merge_coco(*train_folds)
  File "[..]/lib/python3.10/site-packages/cocohelper/merge.py", line 38, in merge_coco
    return merged.drop_duplicate_cats().drop_duplicate_imgs().drop_duplicate_anns().drop_duplicate_licenses()
  File "[..]/lib/python3.10/site-packages/cocohelper/helper.py", line 743, in drop_duplicate_licenses
    lic_df, id_mapping = drop_duplicate_rows(self.licenses)
  File "[..]/lib/python3.10/site-packages/cocohelper/utils/dataframe.py", line 74, in drop_duplicate_rows
    raise ValueError("There are no columns that can be used to check for duplicates.")
ValueError: There are no columns that can be used to check for duplicates.

To Reproduce
Steps to reproduce the behavior:

  1. Create a list of COCOHelper objects, for example by splitting a COCOHelper object ch with KFoldSplitter:
     splitter = KFoldSplitter(n_fold=5)
     splits = splitter.apply(ch)
  2. Try to merge the objects back with merge_coco:
     train = merge_coco(*splits)
  3. Observe the error above.

Expected behavior
merge_coco should merge the COCOHelper objects without erroring out.
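
To illustrate one way this could behave, here is a minimal, self-contained sketch. It is not cocohelper's actual code: the function below is a hypothetical stand-in that only borrows the name drop_duplicate_rows, the (rows, id_mapping) return shape, and the error message from the traceback. It assumes the failure occurs when a table (such as licenses) is empty or has no columns besides its id, and the strict flag is an invented parameter showing a lenient fallback that returns such a table unchanged.

```python
from typing import Any

Row = dict[str, Any]


def drop_duplicate_rows(
    rows: list[Row], id_key: str = "id", strict: bool = True
) -> tuple[list[Row], dict[Any, Any]]:
    """Hypothetical stand-in, NOT cocohelper's implementation.

    Returns the de-duplicated rows plus a mapping from each old id
    to the id of the row that was kept in its place.
    """
    # Columns usable for comparison: everything except the id column.
    columns = sorted({key for row in rows for key in row if key != id_key})
    if not columns:
        if strict:
            # Same message as the one reported in the traceback.
            raise ValueError(
                "There are no columns that can be used to check for duplicates."
            )
        # Lenient mode (assumed fix): nothing to compare, keep every row.
        return list(rows), {row[id_key]: row[id_key] for row in rows}

    kept: list[Row] = []
    seen: dict[tuple, Any] = {}
    id_mapping: dict[Any, Any] = {}
    for row in rows:
        # Rows are duplicates when all non-id columns match.
        signature = tuple(repr(row.get(col)) for col in columns)
        if signature not in seen:
            seen[signature] = row[id_key]
            kept.append(row)
        id_mapping[row[id_key]] = seen[signature]
    return kept, id_mapping


# A licenses table that carries ids only, so there is nothing to compare.
licenses = [{"id": 1}, {"id": 2}]
try:
    drop_duplicate_rows(licenses)
except ValueError as err:
    print(f"strict mode: {err}")
print(drop_duplicate_rows(licenses, strict=False))
```

With a fallback of this kind, a merge over datasets without license details could skip the license de-duplication step instead of aborting the whole merge.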

Screenshots
N/A

Desktop:

  System Version: macOS 13.4 (22F66)
  Kernel Version: Darwin 22.5.0

Additional context
N/A
