How to run captioning on any image? E.g. how to prepare test.yaml and the other files required by run_captioning.py? #183

Open
aliencaocao opened this issue Jan 22, 2022 · 11 comments

Comments

@aliencaocao

The image contains classes seen in the COCO captioning dataset, but I do not know how to extract the features for captioning with Oscar+.

@jontooy

jontooy commented Feb 16, 2022

Hi aliencaocao,

It takes a couple of steps to prepare the features and yaml files.

I attached a Colab notebook in which I generate the features with VinVL step by step.
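
For anyone who wants to sanity-check a generated feature file outside the notebook, here is a minimal sketch of how one row might be written and read back. It assumes each row is `image_key<TAB>JSON`, where the JSON holds `num_boxes` and a base64-encoded float32 buffer under `features`. That layout and those field names are my assumption about what run_captioning.py expects, so compare them against your own files. The helper names `encode_feature_row` and `decode_feature_row` are hypothetical.

```python
import base64
import json
from array import array


def encode_feature_row(image_key, features):
    """Build one TSV line from a list of per-box feature vectors.

    Assumed layout (not confirmed by this thread): key, tab, JSON payload.
    """
    num_boxes = len(features)
    # array("f") stores C floats, i.e. float32 on common platforms.
    flat = array("f", [value for box in features for value in box])
    payload = {
        "num_boxes": num_boxes,
        "features": base64.b64encode(flat.tobytes()).decode("utf-8"),
    }
    return image_key + "\t" + json.dumps(payload)


def decode_feature_row(line):
    """Parse one TSV line back into (image_key, list of per-box vectors)."""
    image_key, payload = line.rstrip("\n").split("\t")
    info = json.loads(payload)
    flat = array("f")
    flat.frombytes(base64.b64decode(info["features"]))
    num_boxes = info["num_boxes"]
    # Infer the per-box dimension from the buffer length.
    dim = len(flat) // num_boxes if num_boxes else 0
    boxes = [list(flat[i * dim:(i + 1) * dim]) for i in range(num_boxes)]
    return image_key, boxes


if __name__ == "__main__":
    row = encode_feature_row("img_0", [[0.5, 1.0], [2.0, -0.25]])
    key, boxes = decode_feature_row(row)
    print(key, len(boxes), "boxes of dimension", len(boxes[0]))
```

If decoding one of your real rows fails, or the inferred dimension differs between rows, the file layout is probably not what the captioning script expects.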

@Amir-mjafari

Hi, thank you so much for sharing. I tested it and it worked really well. Could you please also provide a demo of how to run run_captioning.py to get captions from the box features that your demo extracts for each image?

@Nidadadadada

Hello, thank you so much for sharing. I tested it and hit an error when running python tools/test_sg_net.py. I simply executed each command in your Colab notebook in sequence, so I wonder whether I did something wrong. Thank you for answering!
Here is the error output:

2022-03-08 05:49:34,799 maskrcnn_benchmark.data.build WARNING: When using more than one image per GPU you may encounter an out-of-memory (OOM) error if your GPU does not have sufficient memory. If this happens, you can reduce SOLVER.IMS_PER_BATCH (for training) or TEST.IMS_PER_BATCH (for inference). For training, you must also adjust the learning rate and schedule length according to the linear scaling rule. See for example: https://github.com/facebookresearch/Detectron/blob/master/configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml#L14
Traceback (most recent call last):
  File "tools/test_sg_net.py", line 197, in <module>
    main()
  File "tools/test_sg_net.py", line 193, in main
    run_test(cfg, model, args.distributed, model_name)
  File "tools/test_sg_net.py", line 55, in run_test
    data_loaders_val = make_data_loader(cfg, is_train=False, is_distributed=distributed)
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 170, in make_data_loader
    datasets = build_dataset(cfg, transforms, DatasetCatalog, is_train or is_for_period)
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 45, in build_dataset
    cfg, dataset_name, factory_name, is_train
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/utils/config_args.py", line 7, in config_tsv_dataset_args
    assert op.isfile(full_yaml_file)
AssertionError

@feifang24

@Nidadadadada I encountered the same error at first and I think it's because we also need to modify the config yaml file. Right above the cell you executed the author writes:

Configure sgg_configs/vgattr/vinvl_x152c4.yaml and make sure os.path.join(DATA_DIR, DATASETS.TEST) is to your dataset yaml file. Current settings:

  DATASETS.TEST: ("train.yaml",)
  OUTPUT_DIR: "output/"
  DATA_DIR: "tools/mini_tsv/data/"
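
A quick way to check this before launching tools/test_sg_net.py is to join the two settings yourself and see whether the file exists. The sketch below assumes the path is built exactly as the note describes, i.e. `os.path.join(DATA_DIR, name)` for each entry in `DATASETS.TEST`. The helper name `missing_dataset_yamls` is hypothetical, and the values are the ones quoted above.

```python
import os.path as op


def missing_dataset_yamls(data_dir, test_entries):
    """Return the joined paths that do not exist as regular files."""
    paths = [op.join(data_dir, name) for name in test_entries]
    return [path for path in paths if not op.isfile(path)]


if __name__ == "__main__":
    # Values copied from the settings quoted above; adjust to your own config.
    data_dir = "tools/mini_tsv/data/"
    test_entries = ("train.yaml",)

    missing = missing_dataset_yamls(data_dir, test_entries)
    if missing:
        print("Missing dataset yaml file(s):", missing)
    else:
        print("All dataset yaml files found.")
```

Run it from the same working directory you use for test_sg_net.py, since a relative DATA_DIR is resolved against the current directory. Any path it reports as missing would presumably trip the `assert op.isfile(full_yaml_file)` shown in the traceback.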

@2021202420

I ran into the same error, tried your method, and it works. Thank you.

@eslambakr

eslambakr commented Jan 28, 2023

Dear @2021202420 @feifang24 @Amir-mjafari @jontooy,

I ran it and successfully generated the needed features and files.
But when I run the Oscar model on the generated features using "run_captioning.py", I get wrong captions: just random stuff and weird words, which suggests there is an issue with the feature format or something similar.
However, I checked the generated labels and they seem to make sense: almost all objects in the images are detected correctly.

Have you faced this issue?
I see you said that you managed to run the code and it works, so could someone help me with this?

Thanks in advance!

@jontooy

jontooy commented Jan 28, 2023

Hi eslambakr,

Although this was a long time ago for me, I do recall having a similar issue. I don't think your features are wrong (if you double-checked them and they look right, they should be right).

Could you share the command you use to run the model? What settings do you use? I'd start by changing the BERT model.

@eslambakr

Thanks for your prompt response!
I figured out what the issue was.
I was using the basic weights for OSCAR; when I used the VinVL version (OSCAR+), it worked fine.
But I need to run the basic OSCAR, so I guess I have to extract the features using the Bottom-Up approach instead.

@hamzakhalil798

hamzakhalil798 commented Feb 16, 2023

@jontooy @eslambakr Hey! I've created the dataset using the three images included in the Colab notebook above.
I prepared the dataset using type=Test and caption=False.
The model I used for Oscar+ is checkpoiint_base,
but I'm getting an error on inference.
Can you tell me how you ran inference using run_captioning?

Here's my error:
[image]

@hamzakhalil798

Never mind, got it fixed.

@hamza13-12

@hamzakhalil798 I am also having trouble running image captioning. Could you please offer some guidance on how to set up Oscar correctly and how to load the pre-trained checkpoints for this?
