About region caption #48

mu-jin-meng · 2024-05-08T08:54:36Z

The generated results only describe the region's content; they do not answer the specified prompt.

result:

kanguyen-vn · 2024-05-11T22:21:13Z

The model wasn't really trained to perform region-level reasoning; it was only trained to do region-level captioning. If you look in these region-level dataset classes, they only use the REGION_QUESTIONS and REGION_GROUP_QUESTIONS prompt templates from here as questions for LLM training, and those are all captioning questions. That is why the output describes the region regardless of what you ask. If you want region-level reasoning capabilities, GLaMM might not be the best solution for you. If you don't really need segmentation masks in the output, I'd try something like Shikra, for example.
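To illustrate the point, here is a minimal sketch of how a region-level dataset might build its training question. This is not code from the GLaMM repository: the template strings, the `<region>` placeholder, and the helper names `build_training_question` and `is_captioning_prompt` are assumptions made up for illustration. Only the name REGION_QUESTIONS comes from the thread.

```python
import random

# Hypothetical stand-ins for the captioning templates mentioned above.
# The real REGION_QUESTIONS list lives in the GLaMM repo and may differ.
REGION_QUESTIONS = [
    "Can you describe the region <region> in this image?",
    "Please give a short caption for <region>.",
]


def build_training_question(rng: random.Random) -> str:
    """Pick one captioning template, as a region-level dataset class might."""
    return rng.choice(REGION_QUESTIONS)


def is_captioning_prompt(question: str) -> bool:
    """Return True if the question is one of the templates seen in training."""
    return question in REGION_QUESTIONS


rng = random.Random(0)
question = build_training_question(rng)

# Every training question is a captioning template, so a free-form
# reasoning question falls outside what the model was trained on.
custom_prompt = "What color is the car in <region>?"
print(question)
print(is_captioning_prompt(custom_prompt))
```

Under these assumptions, every question the model sees during training asks for a caption, so at inference it tends to caption the region even when the prompt asks something else.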
