Some questions about training on my custom datasets? #36

Open
Hezhexi2002 opened this issue Nov 16, 2021 · 6 comments

Comments

@Hezhexi2002

Hezhexi2002 commented Nov 16, 2021

@ternaus Hi, I'm an undergraduate at North University of China, and I'm currently working on using a neural network to detect the armor plates in the RoboMaster competitions.
![3_demo.jpg](https://user-images.githubusercontent.com/53631206/141986706-3446a23c-16d6-437e-8f48-3d54e4563497.jpg)
My task is the same as in the image above. I tried YOLOv5, but the bbox doesn't fit the armor tightly, which degrades the pose estimation with PnP in the next step. So I wondered whether I could use keypoints instead of the bbox, which led me to your project. The main problem now is that I don't know how to convert my labels into a format your model can be trained on. My labels look like this:
id+x1,y1+x2,y2+...+x4,y4
What should I change in your code to adapt it to my dataset? I'd appreciate any suggestions.

@corkillj

"id+x1,y1+x2,y2+...+x4,y4"
I have no idea what this is supposed to be, but you just need to provide the format "x_min,y_min,x_max,y_max", i.e. the top-left and bottom-right corners of the bounding box. Maybe draw it on a piece of paper and you will understand.

@Hezhexi2002
Author

Thank you for your reply. Maybe the format I described here is not clear. I will take a picture of it this afternoon when I go to the school lab, because I left my computer there :-)

@Hezhexi2002
Author

I know the format you described above; it's the original format of the YOLO series' label.txt. However, what I want now is to use the four corner keypoints in place of the bbox. RetinaFace was originally trained on WIDER FACE, whose labels have 5 keypoints plus the xywh of the bbox, so I wonder what I should do to make it possible to train RetinaFace on my dataset. Maybe it needs a conversion script or something else?

@Hezhexi2002
Author

(1).txt
I've posted my label.txt here. The first number is the class id, and the remaining 8 numbers are the normalized coordinates of the four corner points of the bbox. Could I use the script the author provides in the repo to convert the txt into JSON?

@corkillj

This RetinaFace repo expects "x_min,y_min,x_max,y_max" for the bounding box, not xywh. It also expects the input in pixel coordinates. You will have to write a converter that unnormalizes the coordinates according to the image size, or adjust the RetinaFace code. To train without landmarks, I guess you could set the weight of the landmark loss to 0 and just give it dummy landmarks. (RetinaFace will give worse results when trained without landmarks.)
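
A converter along those lines could look roughly like the sketch below. It is only an illustration under stated assumptions: each label line is assumed to be whitespace-separated as `class_id x1 y1 x2 y2 x3 y3 x4 y4` with coordinates normalized to [0, 1], and the function name `parse_label_line` is made up for this example, not something from the repo. The bounding box is derived as the axis-aligned box enclosing the four corner points.

```python
from typing import List, Tuple


def parse_label_line(
    line: str, img_w: int, img_h: int
) -> Tuple[int, List[float], List[Tuple[float, float]]]:
    """Convert one normalized label line to pixel coordinates.

    Assumed (hypothetical) line layout: class_id x1 y1 x2 y2 x3 y3 x4 y4,
    whitespace-separated, all coordinates normalized to [0, 1].
    Returns (class_id, [x_min, y_min, x_max, y_max], four (x, y) points).
    """
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")

    class_id = int(fields[0])
    coords = [float(v) for v in fields[1:]]

    # Unnormalize: x is scaled by the image width, y by the image height.
    points = [
        (coords[i] * img_w, coords[i + 1] * img_h) for i in range(0, 8, 2)
    ]

    # Axis-aligned box enclosing the four corners, as x_min,y_min,x_max,y_max.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    bbox = [min(xs), min(ys), max(xs), max(ys)]
    return class_id, bbox, points
```

The image width and height have to come from the image itself (for example by opening it with an image library), since the label file as described does not carry them.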

@Hezhexi2002
Author

Thank you for your reply again; I think I get your point now. So the labels should contain unnormalized coordinates rather than normalized ones, unless the code is adjusted. However, the landmarks, i.e. the four corner points of the armor, are exactly what I need; otherwise I wouldn't use RetinaFace. I still have a question: what dataset format does this repo need, txt or JSON? The README.md says you need to convert the txt into JSON first and then you can train. I'm now considering using CVAT to label my images, since it can export datasets in the same format as WIDER FACE. Does that mean I can use it to train directly? I would appreciate it if you could give me some more guidance :-)
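
For the txt-to-JSON step, a minimal sketch of writing pixel-coordinate boxes and corner points to a JSON file might look like this. The key names `file_name`, `annotations`, `bbox`, and `landmarks` are an assumption about the layout the training code reads and should be checked against the README; the helper names `make_entry` and `dump_dataset` are hypothetical. Since the model was trained with five landmarks per face, using four corner points would presumably also need changes on the model side, which this sketch does not cover.

```python
import json
from typing import Dict, List, Sequence, Tuple

Box = Sequence[float]                       # x_min, y_min, x_max, y_max in pixels
Landmarks = Sequence[Tuple[float, float]]   # (x, y) points in pixels


def make_entry(file_name: str, objects: Sequence[Tuple[Box, Landmarks]]) -> Dict:
    """Build one per-image record (assumed layout, verify against the README)."""
    annotations = []
    for bbox, landmarks in objects:
        annotations.append(
            {
                "bbox": [float(v) for v in bbox],
                "landmarks": [[float(x), float(y)] for x, y in landmarks],
            }
        )
    return {"file_name": file_name, "annotations": annotations}


def dump_dataset(entries: List[Dict], path: str) -> None:
    """Write all per-image records to a single JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entries, f, indent=2)
```

A call such as `dump_dataset([make_entry("img_0001.jpg", [(bbox, points)])], "train.json")` would then produce one file for the whole split; the file names here are placeholders.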
