`log_image`: log bounding boxes #766

dberenbaum · 2024-01-23T22:05:43Z

Related: iterative/dvc#10198, iterative/vscode-dvc#4917

We need a way to log bounding boxes (and maybe later other annotations like segmentation masks) for images saved with dvclive.

p1:

The API can look like this:

boxes = [
  {"label": "cat", "box": {"x_min": 100, "x_max": 110, "y_min": 5, "y_max": 20}},
  {"label": "cat", "box": {"x_min": 30, "x_max": 55, "y_min": 75, "y_max": 90}},
  {"label": "dog", "box": {"x_min": 80, "x_max": 100, "y_min": 25, "y_max": 50}}
]
live.log_image("myimg.png", myimg, boxes=boxes)

In addition to saving the image to dvclive/plots/images/myimg.png, this will also save annotations to dvclive/plots/images/myimg.json in the following format:

{"boxes":
  [
    {"label": "cat", "box": {"x_min": 100, "x_max": 110, "y_min": 5, "y_max": 20}},
    {"label": "cat", "box": {"x_min": 30, "x_max": 55, "y_min": 75, "y_max": 90}},
    {"label": "dog", "box": {"x_min": 80, "x_max": 100, "y_min": 25, "y_max": 50}}
  ]
}
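
As a rough illustration of the behavior described above, the sketch below writes the annotations to a `.json` file with the same stem as the image. The helper name `save_annotations` is hypothetical and is not part of dvclive; only the output layout (`{"boxes": [...]}` next to the image) comes from the proposal.

```python
import json
from pathlib import Path


def save_annotations(image_path, boxes):
    """Write boxes to a .json file next to the image.

    Hypothetical helper for illustration only, not dvclive's API.
    """
    # dvclive/plots/images/myimg.png -> dvclive/plots/images/myimg.json
    annotations_path = Path(image_path).with_suffix(".json")
    annotations_path.parent.mkdir(parents=True, exist_ok=True)
    with open(annotations_path, "w", encoding="utf-8") as f:
        json.dump({"boxes": boxes}, f, indent=2)
    return annotations_path
```

Keeping the same stem for both files means a consumer could find the annotations for an image without any extra index.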

p2:

Other box formats (using width, height, and x/y for the center/corner), e.g. {"x_center": 100, "y_center": 50, "width": 10, "height": 20}
Normalized coordinates (between 0 and 1) instead of pixel coordinates (we could probably auto-detect this)
Scores ("scores": {"acc": 0.9, "loss": 0.05}) so that users can filter boxes based on thresholds (e.g. only show boxes where acc > 0.8)
Segmentation masks (TBD, requires a class per pixel)
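
To make the first two p2 items more concrete, here is a minimal sketch of converting a center/size box to the corner format and of one possible auto-detection heuristic for normalized coordinates. All three function names are hypothetical, and the "every value lies in [0, 1]" rule is an assumption, not something decided in this issue.

```python
def center_to_corners(box):
    """Convert {"x_center", "y_center", "width", "height"} to corner format."""
    half_w = box["width"] / 2
    half_h = box["height"] / 2
    return {
        "x_min": box["x_center"] - half_w,
        "x_max": box["x_center"] + half_w,
        "y_min": box["y_center"] - half_h,
        "y_max": box["y_center"] + half_h,
    }


def looks_normalized(box):
    """Heuristic (assumption): normalized if every coordinate is in [0, 1].

    A pixel box that fits entirely inside the top-left pixel would be
    misclassified, so an explicit flag may still be needed.
    """
    return all(0 <= value <= 1 for value in box.values())


def to_pixels(box, image_width, image_height):
    """Scale a normalized corner box to pixel coordinates."""
    return {
        "x_min": box["x_min"] * image_width,
        "x_max": box["x_max"] * image_width,
        "y_min": box["y_min"] * image_height,
        "y_max": box["y_max"] * image_height,
    }
```

With the example box from the list, `center_to_corners` gives x from 95 to 105 and y from 40 to 60.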


dberenbaum · 2024-01-23T22:12:26Z

May need to consider whether it's necessary to list the universe of labels somewhere or if it's fine to parse them as the set of all individual labels.
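
For reference, the "parse them" option could be as small as the sketch below, which derives the label set from the boxes themselves. The function name `collect_labels` is hypothetical.

```python
def collect_labels(boxes):
    """Return the sorted set of labels that appear in at least one box."""
    return sorted({box["label"] for box in boxes})


boxes = [
    {"label": "cat", "box": {"x_min": 100, "x_max": 110, "y_min": 5, "y_max": 20}},
    {"label": "cat", "box": {"x_min": 30, "x_max": 55, "y_min": 75, "y_max": 90}},
    {"label": "dog", "box": {"x_min": 80, "x_max": 100, "y_min": 25, "y_max": 50}},
]
labels = collect_labels(boxes)
```

The trade-off is that a class the model knows about but never detected would not show up in the parsed set, which is the case an explicit list of labels would cover.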

AlexandreKempf · 2024-02-01T10:16:55Z

I'm not 100% sure this is the right place for my first discussion on a feature.
But I'll jump on that one and remove my comment if it is not the correct way of dealing with feature discussion internally.

I would advise using "left," "top," "right," and "bottom" instead of the "x" and "y" notation. The first leaves no ambiguity, while the second is quite ambiguous. First, it depends on what you consider x and y to be: images can be seen as a matrix (x is vertical, and y is horizontal) or as a plot (x is horizontal, and y is vertical). Second, "x_min" and "x_max" depend on your reference point. For instance, torchvision and Shapely don't use the same one: the first takes the top left of the image as the reference, and the second takes the bottom left (it is the same debate as matrix vs. plot). Honestly, after many years working on object detection, the only format that never confused us was "left," "top," "right," and "bottom". While I agree that the user interface should offer several options, internally I can't recommend enough that we use an unambiguous notation.
You mentioned a "score" feature, which is a great idea. In my opinion, it should probably be in P1, actually. A detection model outputs so many detections that they only make sense if each one has a score attached. It could also be very interesting to have a threshold set per class in the visual interface. Usually, some classes are more represented than others, so the threshold you want for each class can be very different (for the same model, it could be 0.3 for rare classes and 0.95 for common classes).
I realize you just wanted to give an example, but you don't usually have an accuracy score for each bounding box. The best that most libraries out there give you is the confidence for the winning class, and only during validation (not during training). Indeed, during training, the model (or framework) will only return the loss for the whole image. I would suggest we add this "score" at the same level as "label" and "box" and make it a float.
We should take advantage of other tools that deal with classification + detection + polygons + segmentation + multiclass, like Supervisely (a labeling platform) or lightning-flash. Honestly, having a nice and intuitive data format for all these use cases is not trivial. We might benefit from looking at their data schemas and possibly asking them what they would do differently if they could start over.
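
A minimal sketch of what the suggestions above could look like together: boxes use "left," "top," "right," and "bottom", each box carries a float "score" next to "label" and "box", and filtering uses a per-class threshold. The function name `filter_boxes`, the default threshold, and the "lynx" class are illustrative assumptions, not an agreed schema.

```python
def filter_boxes(boxes, thresholds, default=0.5):
    """Keep boxes whose score meets the threshold for their label.

    Labels missing from `thresholds` fall back to `default` (an assumption).
    """
    return [
        box for box in boxes
        if box["score"] >= thresholds.get(box["label"], default)
    ]


boxes = [
    {"label": "cat", "box": {"left": 30, "top": 75, "right": 55, "bottom": 90}, "score": 0.97},
    {"label": "cat", "box": {"left": 100, "top": 5, "right": 110, "bottom": 20}, "score": 0.60},
    {"label": "lynx", "box": {"left": 80, "top": 25, "right": 100, "bottom": 50}, "score": 0.35},
]

# Stricter threshold for the common class, looser for the rare one.
thresholds = {"cat": 0.95, "lynx": 0.3}
kept = filter_boxes(boxes, thresholds)
```

Here the second "cat" box is dropped because 0.60 is below the 0.95 threshold for that class, while the rare "lynx" box is kept at 0.35.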

Feel free to tell me if I should have done this discussion differently or elsewhere. I'll act accordingly.

dberenbaum · 2024-02-01T12:31:34Z

Great feedback @AlexandreKempf! Let's go with your suggestions here.

@mattseddon and @julieg18 have been working on this functionality, and you could work with them on getting this implemented. @AlexandreKempf is our newest ML product engineer who just joined the team.

julieg18 · 2024-02-02T22:27:29Z

@AlexandreKempf, great suggestions on this feature!

Feel free to take a look at iterative/vscode-dvc#5227 if you'd like to give any feedback on the plots' current design and reach out if you have any questions about VSCode's or Studio's side of things.

AlexandreKempf · 2024-02-27T07:28:54Z

TODO list for this project:

DVCLive should save the annotations in a .json file next to the image file
DVC should display the annotations when running the query dvc plots diff --json --split so that VSCode can read them
VSCode should display the annotations
DVC should send the annotations to Studio
DVC should send the annotations to Studio for live experiments
Studio should display the annotations

dberenbaum added the p1-important, A: log_image, A: studio, and A: vscode labels Jan 23, 2024

dberenbaum mentioned this issue Jan 23, 2024

plots: interactive plots with toggling bounding box iterative/dvc#10198

Open

dberenbaum mentioned this issue Jan 24, 2024

Add bounding boxes plot frontend components iterative/vscode-dvc#5227

Closed

4 tasks

julieg18 mentioned this issue Feb 1, 2024

Add backend logic for bounding box plots iterative/vscode-dvc#5250

Merged

dberenbaum assigned AlexandreKempf Feb 6, 2024

AlexandreKempf mentioned this issue Feb 7, 2024

Add Bounding Boxes annotations #776

Closed

2 tasks

dberenbaum added the p2-medium label and removed the p1-important label Apr 24, 2024
