DetectionDataset.as_coco having annotation id collision between splits #768

Open
1 of 2 tasks
adbcode opened this issue Jan 23, 2024 · 4 comments

adbcode commented Jan 23, 2024

Search before asking

  • I have searched the Supervision issues and found no similar bug report.

Bug

When exporting a YOLOv8-formatted dataset to COCO format (using the code below), the JSON file for each split uses its own sequence for assigning annotation IDs.

This causes problems when importing the exported dataset with other libraries that expect every annotation ID to be unique across all splits.

Example from a dataset with train, valid and test splits:

  • test: [image]
  • train: [image]
  • valid: [image]

Please consider using a common sequence when generating annotation IDs across all splits of a dataset.
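
To make the reported collision concrete, here is a minimal sketch that checks whether the same annotation ID appears in more than one split. It works on plain COCO-style dictionaries and does not use supervision; the function names, the file-loading helper and the tiny example data are illustrative assumptions, not part of the original report.

```python
import json
from collections import defaultdict
from pathlib import Path


def find_annotation_id_collisions(coco_by_split):
    """Return {annotation_id: [split names]} for IDs that occur in more than one split."""
    seen = defaultdict(set)
    for split_name, coco in coco_by_split.items():
        for annotation in coco.get("annotations", []):
            seen[annotation["id"]].add(split_name)
    return {
        ann_id: sorted(splits) for ann_id, splits in seen.items() if len(splits) > 1
    }


def load_coco_splits(paths_by_split):
    """Load one COCO JSON file per split (paths are hypothetical, supplied by the caller)."""
    return {
        name: json.loads(Path(path).read_text())
        for name, path in paths_by_split.items()
    }


# Tiny in-memory example: each split restarts its own ID sequence
# (the start value used here is illustrative).
example = {
    "train": {"annotations": [{"id": 1, "image_id": 1}, {"id": 2, "image_id": 1}]},
    "valid": {"annotations": [{"id": 1, "image_id": 1}]},
}
collisions = find_annotation_id_collisions(example)
print(collisions)  # {1: ['train', 'valid']}
```

An empty result would mean that annotation IDs are already unique across the checked splits.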

Environment

Supervision 0.16.0

Minimal Reproducible Example

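import supervision as sv

# dataset_root and target are paths to the YOLO dataset and the output
# directory; they are not defined in this snippet.
yolo = sv.DetectionDataset.from_yolo(
    images_directory_path=f"{dataset_root}/images",
    annotations_directory_path=f"{dataset_root}/labels",
    data_yaml_path=f"{dataset_root}/data.yaml",
    force_masks=True,
)

# Export the loaded dataset to COCO format.
yolo.as_coco(
    images_directory_path=f"{target}/images",
    annotations_path=f"{target}/annotations.json",
)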

Additional

No response

Are you willing to submit a PR?

  • Yes I'd like to help by submitting a PR!

adbcode added the bug label Jan 23, 2024

SkalskiP (Collaborator) commented

Hi @adbcode 👋🏻 Thanks a lot for your interest in supervision. How does that influence your workflow?

SkalskiP self-assigned this Jan 23, 2024

adbcode commented Jan 24, 2024

> Hi @adbcode 👋🏻 Thanks a lot for your interest in supervision. How does that influence your workflow?

Hello! When the exported dataset is used with other libraries, especially those that expect a unique ID for each annotation, it gets corrupted on import.

My current workaround is to regenerate the IDs during import, but this loses the original order.
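
One way such a renumbering workaround could keep the original order is sketched below: the splits are walked in a fixed order and, within each split, annotations are renumbered in ascending order of their old IDs using one shared counter. This is only an illustration on plain COCO-style dictionaries; `renumber_annotations`, its parameters and the sample data are hypothetical and are not part of supervision. Only annotation IDs are touched; image IDs are left as they are.

```python
import copy


def renumber_annotations(coco_by_split, split_order=None, start_id=1):
    """Return copies of the splits whose annotation IDs form one continuous sequence."""
    order = list(split_order) if split_order is not None else list(coco_by_split)
    renumbered = {}
    next_id = start_id
    for split_name in order:
        # Work on a copy so the caller's data is not modified.
        coco = copy.deepcopy(coco_by_split[split_name])
        # Visit annotations by ascending old ID so the order within a split is kept.
        for annotation in sorted(coco.get("annotations", []), key=lambda a: a["id"]):
            annotation["id"] = next_id
            next_id += 1
        renumbered[split_name] = coco
    return renumbered


splits = {
    "train": {"annotations": [{"id": 1, "image_id": 1}, {"id": 2, "image_id": 1}]},
    "valid": {"annotations": [{"id": 1, "image_id": 1}]},
}
fixed = renumber_annotations(splits, split_order=["train", "valid"])
print([a["id"] for a in fixed["train"]["annotations"]])  # [1, 2]
print([a["id"] for a in fixed["valid"]["annotations"]])  # [3]
```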


SkalskiP commented Jan 26, 2024

@adbcode, would something like this satisfy you:

  • Merging YOLO splits.
  • Converting YOLO to COCO.
  • Splitting COCO into subsets while preserving ID continuity.
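
The three steps above can be sketched roughly as follows. This is not supervision's API: it is a hypothetical illustration on plain COCO-style dictionaries, with `merge_splits` and `split_merged` as made-up helper names. It assumes all splits share the same categories, and it records which split each image came from so that the original splits can be recreated after the merge.

```python
def merge_splits(coco_by_split):
    """Merge per-split COCO dicts into one, assigning global image and annotation IDs."""
    merged = {"images": [], "annotations": [], "categories": []}
    image_split = {}  # new image ID -> split name, used to undo the merge later
    next_image_id, next_annotation_id = 1, 1
    for split_name, coco in coco_by_split.items():
        # Assumption: categories are identical in every split, so take the first set.
        if not merged["categories"]:
            merged["categories"] = list(coco.get("categories", []))
        old_to_new = {}
        for image in coco.get("images", []):
            old_to_new[image["id"]] = next_image_id
            merged["images"].append({**image, "id": next_image_id})
            image_split[next_image_id] = split_name
            next_image_id += 1
        for annotation in coco.get("annotations", []):
            merged["annotations"].append({
                **annotation,
                "id": next_annotation_id,
                "image_id": old_to_new[annotation["image_id"]],
            })
            next_annotation_id += 1
    return merged, image_split


def split_merged(merged, image_split):
    """Recreate the original splits from the merged dict, keeping the global IDs."""
    result = {}
    for image in merged["images"]:
        split = result.setdefault(
            image_split[image["id"]],
            {"images": [], "annotations": [], "categories": merged["categories"]},
        )
        split["images"].append(image)
    for annotation in merged["annotations"]:
        result[image_split[annotation["image_id"]]]["annotations"].append(annotation)
    return result


categories = [{"id": 0, "name": "object"}]
splits = {
    "train": {
        "images": [{"id": 1, "file_name": "a.jpg"}],
        "annotations": [
            {"id": 1, "image_id": 1, "category_id": 0},
            {"id": 2, "image_id": 1, "category_id": 0},
        ],
        "categories": categories,
    },
    "valid": {
        "images": [{"id": 1, "file_name": "b.jpg"}],
        "annotations": [{"id": 1, "image_id": 1, "category_id": 0}],
        "categories": categories,
    },
}
merged, image_split = merge_splits(splits)
restored = split_merged(merged, image_split)
print([a["id"] for a in restored["train"]["annotations"]])  # [1, 2]
print([a["id"] for a in restored["valid"]["annotations"]])  # [3]
```

In this sketch a split that contains no images does not reappear after splitting, which a real implementation would probably want to handle.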

@adbcode
Copy link
Author

adbcode commented Jan 26, 2024

> @adbcode, would something like this satisfy you:
>
>   • Merging YOLO splits.
>   • Converting YOLO to COCO.
>   • Splitting COCO into subsets while preserving ID continuity.

This will be fine as long as we can recreate the original splits at the end.
