image-captioning-transduction-models

We implement RNNS(vanilla, LSTM) and use them to train a model that can generate novel captions for images.

We have used Microsoft COCO dataset which has become the standard testbed for image captioning. The dataset consists of 80,000 training images and 40,000 validation images, each annotated with 5 captions written by workers on Amazon Mechanical Turk.

MobileNet is used for image feature extraction. For vanilla RNN and LSTM, we use the pooled CNN feature activation.

Please check notebook file for detailed explanation of overall process.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
backend		backend
README.md		README.md
captioning_transduction.ipynb		captioning_transduction.ipynb
helper.py		helper.py
rnn_lstm_attention_captioning.py		rnn_lstm_attention_captioning.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

image-captioning-transduction-models

About

Releases

Packages

Languages

iafarhan/image-captioning-transduction-models

Folders and files

Latest commit

History

Repository files navigation

image-captioning-transduction-models

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages