Detection score of the segment #42
Comments
Great point! The problem you mention isn't addressed in this work.
We agree with you that the cause is that the input videos always include segments that correspond to the text queries.
When testing, I input a 150s video into the model.
Test Scenario 1: The input video is of a woman dancing, and the query text is "a woman is dancing." The model correctly detects the corresponding segment, which meets expectations.
Test Scenario 2: The input video does not contain any clips of a woman dancing; it is just a video of a woman sitting on a chair. The query text is "a woman is dancing," yet the model still detects a corresponding segment, which does not meet expectations.
Test Scenario 3: The input is a combination of videos from Scenario 1 and Scenario 2. The query text is "a man is playing basketball." There are no men or basketball scenes in the video, but among the top 10 results, there are still segments with high scores.
My question is, for a test video and a query text, is there always a highly scored positive segment detected? What is the reason for this phenomenon? Is it because during your training, each video always has at least one segment that corresponds to the query text as a positive example?
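To make the question concrete, here is a minimal sketch of why a top-k ranking always returns segments, and how an absolute score cutoff could in principle return nothing. It is not this repository's code: the `(start_sec, end_sec, score)` tuple layout, the helper names `top_k` and `filter_by_threshold`, and the example scores are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical prediction format: (start_sec, end_sec, score).
Segment = Tuple[float, float, float]


def top_k(segments: List[Segment], k: int = 10) -> List[Segment]:
    """Return the k highest-scoring segments, whatever their scores are."""
    return sorted(segments, key=lambda s: s[2], reverse=True)[:k]


def filter_by_threshold(segments: List[Segment], min_score: float) -> List[Segment]:
    """Keep only segments whose score reaches an absolute cutoff."""
    return [s for s in segments if s[2] >= min_score]


# Made-up predictions for one video and one query.
preds = [(0.0, 12.0, 0.31), (40.0, 55.0, 0.92), (90.0, 100.0, 0.58)]

ranked = top_k(preds, k=2)                         # never empty if preds is non-empty
kept = filter_by_threshold(ranked, min_score=0.6)  # may be empty
```

Ranking alone cannot say "no match", because it only orders whatever the model predicts. A cutoff helps only if the scores actually separate matching from non-matching videos, which Scenarios 2 and 3 suggest may not hold here.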