Annotation quality issues #2

Open
bkkm78 opened this issue Mar 17, 2023 · 5 comments

Comments

@bkkm78

bkkm78 commented Mar 17, 2023

Thanks for the effort to create this dataset!

When inspecting the annotations, I found some quality issues. For some images, the annotations do not seem to be exhaustive. Some clearly visible persons in the foreground are missing from the annotations, such as the one on the left in the following image (105520.jpg):

[Image: 105520.jpg]

There are also cases where the full-body box annotation covers only part of the person, even though other parts are clearly visible, such as the old man riding a horse in this image (100134.jpg):

[Image: 100134.jpg]

There also seems to be overlap between the training set and the eval/test set. For example, 109136.jpg in the validation set appears to be a resized version of 000154.jpg in the training set:

[Image: duplicate]

Would the authors mind looking into these issues? Thanks!

@bkkm78
Author

bkkm78 commented Mar 19, 2023

Here is a list of potential duplicate images I found in the dataset: https://gist.github.com/bkkm78/95fb4faf9ca8303005349a5c396af3c0
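
For anyone who wants to run a similar check on their own copy of the data, here is a minimal sketch of one common way to flag resized duplicates: an average hash compared by Hamming distance. This is an assumption about a workable approach, not necessarily how the list above was produced. The function names (`average_hash`, `hamming`, `upscale2x`) are hypothetical, and the images are represented as plain 2D lists of grayscale values so the example has no dependencies; in practice the pixels would be loaded with an image library.

```python
def average_hash(pixels, size=8):
    """Return a size*size list of bits describing the coarse brightness layout.

    `pixels` is a 2D list of grayscale values. For simplicity this sketch
    assumes the height and width are multiples of `size`.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // size, width // size
    cells = []
    for by in range(size):
        for bx in range(size):
            # Average every pixel that falls into this grid cell.
            block = [
                pixels[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            ]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: brighter than the overall mean or not.
    return [1 if cell > mean else 0 for cell in cells]


def hamming(hash_a, hash_b):
    """Count the positions where two equal-length hashes differ."""
    return sum(a != b for a, b in zip(hash_a, hash_b))


def upscale2x(pixels):
    """Nearest-neighbour 2x upscaling, used here only to simulate a resize."""
    out = []
    for row in pixels:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# Synthetic 16x16 "image" and a resized copy of it.
small = [[(x * 7 + y * 13) % 256 for x in range(16)] for y in range(16)]
resized = upscale2x(small)
distance = hamming(average_hash(small), average_hash(resized))
```

A small `distance` between an image in one split and an image in another marks the pair as a candidate duplicate that still needs a visual check. The threshold to use is a judgment call and is not something stated in this thread.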

@Arthur151
Owner

@bkkm78 Thanks a lot for reporting this.
The reported issues have been added to my schedule and I need some time to fix them.
About the issues:

  1. Duplicated images: These images are collected from some existing datasets, such as CrowdPose, using their original name. I will remove the duplicated images and update the evaluation results if necessary.
  2. Missing 2D poses & incomplete full-body bounding boxes: Similarly, the 2D poses and bounding boxes of some images are inherited from existing datasets. I have tried to manually fix some errors, but there might still be some errors out there. I will double-check the inherited annotations again. You are welcome to explicitly name the images with errors.

Thanks a lot for being helpful.
Let me know if you'd like to report anything further or discuss this more.

Best,
Yu

@bkkm78
Author

bkkm78 commented Mar 20, 2023

@Arthur151 Thank you for your reply! Here is a list of images that may contain incomplete annotations.
https://gist.github.com/bkkm78/e38d089a0cd833bf793c4fb2da7102c1

This list may not be complete, but may be helpful as a starting point. (Being exhaustive is indeed difficult. :))

@bkkm78
Author

bkkm78 commented Mar 20, 2023

It may also be helpful if you could release the metadata for each image, such as the source dataset from which the image was collected.

@Arthur151
Owner

Arthur151 commented Mar 22, 2023

@bkkm78 Thanks for your efforts! The image list would be very helpful!

The source dataset of each image is quite easy to tell from its file name. For example, CrowdPose image names are 6 digits starting with 1, like 1xxxxx.jpg. The names of images we collected from the Internet are 7 digits. OCHuman image names are 6 digits starting with 0. Something like that.
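
The naming convention above can be turned into a small lookup. The following is a minimal sketch that encodes only the three rules stated in this comment; the function name `guess_source` and the `"unknown"` fallback are hypothetical, and since the comment ends with "something like that", the rules may not cover every image in the dataset.

```python
import re
from pathlib import PurePath


def guess_source(image_name: str) -> str:
    """Guess the source dataset from a file name such as '105520.jpg'."""
    stem = PurePath(image_name).stem  # file name without the extension
    if not re.fullmatch(r"[0-9]+", stem):
        return "unknown"
    if len(stem) == 7:
        return "Internet"   # 7 digits: collected from the Internet
    if len(stem) == 6 and stem[0] == "1":
        return "CrowdPose"  # 6 digits starting with 1
    if len(stem) == 6 and stem[0] == "0":
        return "OCHuman"    # 6 digits starting with 0
    return "unknown"        # anything the stated rules do not cover


# The duplicate pair reported earlier in this thread.
pair = (guess_source("109136.jpg"), guess_source("000154.jpg"))
```

Under these rules, the reported pair 109136.jpg and 000154.jpg maps to CrowdPose and OCHuman respectively.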
