Skip to content

Commit

Permalink
Add GroundingDINO on ODinW results, and support caption prompt of Gro…
Browse files Browse the repository at this point in the history
…undingDINO (#11187)
  • Loading branch information
Cycyes committed Nov 21, 2023
1 parent 24bb129 commit ee2e542
Show file tree
Hide file tree
Showing 6 changed files with 2,409 additions and 80 deletions.
116 changes: 59 additions & 57 deletions configs/odinw/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@

## Get Started

1. development Developmennt Setup can reger to hits /\\To download dataset, you can refer to [reference document](../../docs/zh_cn/user_guides/dataset_prepare.md)
1. To download dataset, you can refer to [reference document](../../docs/zh_cn/user_guides/dataset_prepare.md)

2. You can use the following data to run the inference.

Expand All @@ -22,73 +22,75 @@ Learning visual representations from natural language supervision has recently s

## Results and models of odinw13

| Method | GLIP-T(A) | Official | GLIP-T(B) | Official | GLIP-T(C) | Official |
| --------------------- | --------- | --------- | --------- | --------- | --------- | --------- |
| AerialMaritimeDrone | 0.123 | 0.122 | 0.110 | 0.11 | 0.130 | 0.130 |
| Aquarium | 0.175 | 0.174 | 0.173 | 0.169 | 0.191 | 0.190 |
| CottontailRabbits | 0.686 | 0.686 | 0.688 | 0.688 | 0.744 | 0.744 |
| EgoHands | 0.013 | 0.013 | 0.003 | 0.540 | 0.314 | 0.315 |
| NorthAmericaMushrooms | 0.502 | 0.502 | 0.367 | 0.051 | 0.297 | 0.296 |
| Packages | 0.589 | 0.589 | 0.083 | 0.030 | 0.699 | 0.699 |
| PascalVOC | 0.512 | 0.512 | 0.541 | 0.288 | 0.565 | 0.565 |
| pistols | 0.339 | 0.339 | 0.502 | 0.338 | 0.503 | 0.504 |
| pothole | 0.007 | 0.007 | 0.030 | 0.475 | 0.058 | 0.058 |
| Raccoon | 0.075 | 0.075 | 0.285 | 0.288 | 0.241 | 0.244 |
| ShellfishOpenImages | 0.372 | 0.372 | 0.337 | 0.338 | 0.300 | 0.302 |
| thermalDogsAndPeople | 0.372 | 0.372 | 0.475 | 0.475 | 0.510 | 0.510 |
| VehiclesOpenImages | 0.574 | 0.574 | 0.562 | 0.547 | 0.549 | 0.534 |
| Average | **0.334** | **0.324** | **0.320** | **0.318** | **0.392** | **0.392** |
| Method | GLIP-T(A) | Official | GLIP-T(B) | Official | GLIP-T(C) | Official | GroundingDINO-T | GroundingDINO-B |
| --------------------- | --------- | --------- | --------- | --------- | --------- | --------- | --------------- | --------------- |
| AerialMaritimeDrone | 0.123 | 0.122 | 0.110 | 0.110 | 0.130 | 0.130 | 0.173 | 0.281 |
| Aquarium | 0.175 | 0.174 | 0.173 | 0.169 | 0.191 | 0.190 | 0.195 | 0.445 |
| CottontailRabbits | 0.686 | 0.686 | 0.688 | 0.688 | 0.744 | 0.744 | 0.799 | 0.808 |
| EgoHands | 0.013 | 0.013 | 0.003 | 0.004 | 0.314 | 0.315 | 0.608 | 0.764 |
| NorthAmericaMushrooms | 0.502 | 0.502 | 0.367 | 0.367 | 0.297 | 0.296 | 0.507 | 0.675 |
| Packages | 0.589 | 0.589 | 0.083 | 0.083 | 0.699 | 0.699 | 0.687 | 0.670 |
| PascalVOC | 0.512 | 0.512 | 0.541 | 0.540 | 0.565 | 0.565 | 0.563 | 0.711 |
| pistols | 0.339 | 0.339 | 0.502 | 0.501 | 0.503 | 0.504 | 0.726 | 0.771 |
| pothole | 0.007 | 0.007 | 0.030 | 0.030 | 0.058 | 0.058 | 0.215 | 0.478 |
| Raccoon | 0.075 | 0.074 | 0.285 | 0.288 | 0.241 | 0.244 | 0.549 | 0.541 |
| ShellfishOpenImages | 0.253 | 0.253 | 0.337 | 0.338 | 0.300 | 0.302 | 0.393 | 0.650 |
| thermalDogsAndPeople | 0.372 | 0.372 | 0.475 | 0.475 | 0.510 | 0.510 | 0.657 | 0.633 |
| VehiclesOpenImages | 0.574 | 0.566 | 0.562 | 0.547 | 0.549 | 0.534 | 0.613 | 0.647 |
| Average | **0.325** | **0.324** | **0.320** | **0.318** | **0.392** | **0.392** | **0.514** | **0.621** |

Note:

1. The above are zero-shot evaluation results.
2. The config and weights can be found at [here](../glip/README.md)
2. The config and weights of GLIPs models can be found at [here](../glip/README.md)
3. The config and weights of GroundingDINO models can be found at [here](../grounding_dino/README.md)

## Results and models of odinw35

| Method | GLIP-T(A) | Official | GLIP-T(B) | Official | GLIP-T(C) | Official |
| --------------------------- | --------- | --------- | --------- | --------- | --------- | --------- |
| AerialMaritimeDrone_large | 0.123 | 0.122 | 0.110 | 0.110 | 0.130 | 0.130 |
| AerialMaritimeDrone_tiled | 0.174 | 0.174 | 0.172 | 0.172 | 0.172 | 0.172 |
| AmericanSignLanguageLetters | 0.001 | 0.001 | 0.003 | 0.003 | 0.009 | 0.009 |
| Aquarium | 0.175 | 0.175 | 0.173 | 0.171 | 0.192 | 0.182 |
| BCCD | 0.016 | 0.016 | 0.001 | 0.001 | 0.000 | 0.000 |
| boggleBoards | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| brackishUnderwater | 0.016 | 0..013 | 0.021 | 0.027 | 0.020 | 0.022 |
| ChessPieces | 0.001 | 0.001 | 0.000 | 0.000 | 0.001 | 0.001 |
| CottontailRabbits | 0.710 | 0.709 | 0.683 | 0.683 | 0.752 | 0.752 |
| dice | 0.005 | 0.005 | 0.004 | 0.004 | 0.004 | 0.004 |
| DroneControl | 0.016 | 0.017 | 0.006 | 0.008 | 0.005 | 0.007 |
| EgoHands_generic | 0.009 | 0.010 | 0.005 | 0.006 | 0.510 | 0.508 |
| EgoHands_specific | 0.001 | 0.001 | 0.004 | 0.006 | 0.003 | 0.004 |
| HardHatWorkers | 0.029 | 0.029 | 0.023 | 0.023 | 0.033 | 0.033 |
| MaskWearing | 0.007 | 0.007 | 0.003 | 0.002 | 0.005 | 0.005 |
| MountainDewCommercial | 0.218 | 0.227 | 0.199 | 0.197 | 0.478 | 0.463 |
| NorthAmericaMushrooms | 0.502 | 0.502 | 0.450 | 0.450 | 0.497 | 0.497 |
| openPoetryVision | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| OxfordPets_by_breed | 0.001 | 0.002 | 0.002 | 0.004 | 0.001 | 0.002 |
| OxfordPets_by_species | 0.016 | 0.011 | 0.012 | 0.009 | 0.013 | 0.009 |
| PKLot | 0.002 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 |
| Packages | 0.569 | 0.569 | 0.279 | 0.279 | 0.712 | 0.712 |
| PascalVOC | 0.512 | 0.512 | 0.541 | 0.540 | 0.565 | 0.565 |
| pistols | 0.339 | 0.339 | 0.502 | 0.501 | 0.503 | 0.504 |
| plantdoc | 0.002 | 0.002 | 0.007 | 0.007 | 0.009 | 0.009 |
| pothole | 0.007 | 0.010 | 0.024 | 0.025 | 0.085 | 0.101 |
| Raccoons | 0.075 | 0.074 | 0.285 | 0.288 | 0.241 | 0.244 |
| selfdrivingCar | 0.071 | 0.072 | 0.074 | 0.074 | 0.081 | 0.080 |
| ShellfishOpenImages | 0.253 | 0.253 | 0.337 | 0.338 | 0.300 | 0.302 |
| ThermalCheetah | 0.028 | 0.028 | 0.000 | 0.000 | 0.028 | 0.028 |
| thermalDogsAndPeople | 0.372 | 0.372 | 0.475 | 0.475 | 0.510 | 0.510 |
| UnoCards | 0.000 | 0.000 | 0.000 | 0.001 | 0.002 | 0.003 |
| VehiclesOpenImages | 0.574 | 0.566 | 0.562 | 0.547 | 0.549 | 0.534 |
| WildfireSmoke | 0.000 | 0.000 | 0.000 | 0.000 | 0.017 | 0.017 |
| websiteScreenshots | 0.003 | 0.004 | 0.003 | 0.005 | 0.005 | 0.006 |
| Average | **0.134** | **0.134** | **0.138** | **0.138** | **0.179** | **0.178** |
| Method | GLIP-T(A) | Official | GLIP-T(B) | Official | GLIP-T(C) | Official | GroundingDINO-T | GroundingDINO-B |
| --------------------------- | --------- | --------- | --------- | --------- | --------- | --------- | --------------- | --------------- |
| AerialMaritimeDrone_large | 0.123 | 0.122 | 0.110 | 0.110 | 0.130 | 0.130 | 0.173 | 0.281 |
| AerialMaritimeDrone_tiled | 0.174 | 0.174 | 0.172 | 0.172 | 0.172 | 0.172 | 0.206 | 0.364 |
| AmericanSignLanguageLetters | 0.001 | 0.001 | 0.003 | 0.003 | 0.009 | 0.009 | 0.002 | 0.096 |
| Aquarium | 0.175 | 0.175 | 0.173 | 0.171 | 0.192 | 0.182 | 0.195 | 0.445 |
| BCCD | 0.016 | 0.016 | 0.001 | 0.001 | 0.000 | 0.000 | 0.161 | 0.584 |
| boggleBoards | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.134 |
| brackishUnderwater | 0.016 | 0..013 | 0.021 | 0.027 | 0.020 | 0.022 | 0.021 | 0.454 |
| ChessPieces | 0.001 | 0.001 | 0.000 | 0.000 | 0.001 | 0.001 | 0.000 | 0.000 |
| CottontailRabbits | 0.710 | 0.709 | 0.683 | 0.683 | 0.752 | 0.752 | 0.806 | 0.797 |
| dice | 0.005 | 0.005 | 0.004 | 0.004 | 0.004 | 0.004 | 0.004 | 0.082 |
| DroneControl | 0.016 | 0.017 | 0.006 | 0.008 | 0.005 | 0.007 | 0.042 | 0.638 |
| EgoHands_generic | 0.009 | 0.010 | 0.005 | 0.006 | 0.510 | 0.508 | 0.608 | 0.764 |
| EgoHands_specific | 0.001 | 0.001 | 0.004 | 0.006 | 0.003 | 0.004 | 0.002 | 0.687 |
| HardHatWorkers | 0.029 | 0.029 | 0.023 | 0.023 | 0.033 | 0.033 | 0.046 | 0.439 |
| MaskWearing | 0.007 | 0.007 | 0.003 | 0.002 | 0.005 | 0.005 | 0.004 | 0.406 |
| MountainDewCommercial | 0.218 | 0.227 | 0.199 | 0.197 | 0.478 | 0.463 | 0.430 | 0.580 |
| NorthAmericaMushrooms | 0.502 | 0.502 | 0.450 | 0.450 | 0.497 | 0.497 | 0.471 | 0.501 |
| openPoetryVision | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.051 |
| OxfordPets_by_breed | 0.001 | 0.002 | 0.002 | 0.004 | 0.001 | 0.002 | 0.003 | 0.799 |
| OxfordPets_by_species | 0.016 | 0.011 | 0.012 | 0.009 | 0.013 | 0.009 | 0.011 | 0.872 |
| PKLot | 0.002 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.774 |
| Packages | 0.569 | 0.569 | 0.279 | 0.279 | 0.712 | 0.712 | 0.695 | 0.728 |
| PascalVOC | 0.512 | 0.512 | 0.541 | 0.540 | 0.565 | 0.565 | 0.563 | 0.711 |
| pistols | 0.339 | 0.339 | 0.502 | 0.501 | 0.503 | 0.504 | 0.726 | 0.771 |
| plantdoc | 0.002 | 0.002 | 0.007 | 0.007 | 0.009 | 0.009 | 0.005 | 0.376 |
| pothole | 0.007 | 0.010 | 0.024 | 0.025 | 0.085 | 0.101 | 0.215 | 0.478 |
| Raccoons | 0.075 | 0.074 | 0.285 | 0.288 | 0.241 | 0.244 | 0.549 | 0.541 |
| selfdrivingCar | 0.071 | 0.072 | 0.074 | 0.074 | 0.081 | 0.080 | 0.089 | 0.318 |
| ShellfishOpenImages | 0.253 | 0.253 | 0.337 | 0.338 | 0.300 | 0.302 | 0.393 | 0.650 |
| ThermalCheetah | 0.028 | 0.028 | 0.000 | 0.000 | 0.028 | 0.028 | 0.087 | 0.290 |
| thermalDogsAndPeople | 0.372 | 0.372 | 0.475 | 0.475 | 0.510 | 0.510 | 0.657 | 0.633 |
| UnoCards | 0.000 | 0.000 | 0.000 | 0.001 | 0.002 | 0.003 | 0.006 | 0.754 |
| VehiclesOpenImages | 0.574 | 0.566 | 0.562 | 0.547 | 0.549 | 0.534 | 0.613 | 0.647 |
| WildfireSmoke | 0.000 | 0.000 | 0.000 | 0.000 | 0.017 | 0.017 | 0.134 | 0.410 |
| websiteScreenshots | 0.003 | 0.004 | 0.003 | 0.005 | 0.005 | 0.006 | 0.012 | 0.175 |
| Average | **0.134** | **0.134** | **0.138** | **0.138** | **0.179** | **0.178** | **0.227** | **0.492** |

Note:

1. The above are zero-shot evaluation results.
2. The config and weights can be found at [here](../glip/README.md)
2. The config and weights of GLIPs models can be found at [here](../glip/README.md)
3. The config and weights of GroundingDINO models can be found at [here](../grounding_dino/README.md)

## Citation

Expand Down

0 comments on commit ee2e542

Please sign in to comment.