End-to-End Learning Using Cycle Consistency for Image-to-Caption Transformations #10

soneo1127 · 2019-04-25T11:07:20Z

0. Paper

https://arxiv.org/abs/1903.10118

1. What is it?

Image generation from text when no paired data is available.

2. What is impressive compared with prior work?

3. What is the key to the technique or method?

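Image → caption
Two Generators
One is used to generate captions from the images in the dataset. Fig. 2(a)
Following the GSGAN approach, Gumbel-softmax is used to generate text from image features. This is so that a further image can be generated from the generated caption and backpropagation can still be carried out during training.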
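
The role of Gumbel-softmax here can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `gumbel_softmax`, the toy logits, and the temperature values are assumptions made for illustration, and plain Python stands in for a deep learning framework. The idea is that the sampled word is represented as a relaxed one-hot vector, so a downstream network that consumes the caption can still pass gradients back to the caption generator.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Return a relaxed one-hot sample over the vocabulary (entries sum to 1)."""
    noisy = []
    for logit in logits:
        # Keep u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1)
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)
    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Toy logits over a hypothetical 4-word vocabulary.
word_logits = [2.0, 0.5, 0.1, -1.0]
# Same seed for both calls, so only the temperature differs.
soft_sample = gumbel_softmax(word_logits, temperature=1.0, rng=random.Random(0))
sharp_sample = gumbel_softmax(word_logits, temperature=0.1, rng=random.Random(0))
```

A lower temperature pushes the sample toward a one-hot vector (closer to a discrete word choice), while a higher temperature keeps it smoother. In a real model the same computation would be done with differentiable tensor operations.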

Discriminator
Judges whether the image features extracted by VGG16 match the caption features extracted by an LSTM.
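
As a rough illustration of what "matching" two feature vectors can mean, the sketch below projects both into a shared space and turns their dot product into a probability. This is an assumption for illustration only: the note does not give the paper's scoring function, and the names `match_probability`, `w_img`, and `w_cap`, the tiny dimensions, and the hand-picked weights are all hypothetical. Real VGG16 and LSTM features would be far higher-dimensional and the projections would be learned.

```python
import math


def matvec(weights, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def match_probability(image_feat, caption_feat, w_img, w_cap):
    """Probability that an image and a caption belong together (toy model)."""
    img_proj = matvec(w_img, image_feat)    # image features -> shared space
    cap_proj = matvec(w_cap, caption_feat)  # caption features -> shared space
    logit = sum(a * b for a, b in zip(img_proj, cap_proj))
    return 1.0 / (1.0 + math.exp(-logit))   # sigmoid


# Hypothetical toy features and projection weights.
image_feat = [1.0, 0.0, 2.0]
w_img = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
w_cap = [[1.0, 0.0], [0.0, 1.0]]

matched = match_probability(image_feat, [1.0, 2.0], w_img, w_cap)
mismatched = match_probability(image_feat, [-1.0, -2.0], w_img, w_cap)
```

With these toy values the aligned caption scores above 0.5 and the opposed caption scores below it, which is the behavior a trained Discriminator of this kind would be optimized toward.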

Caption → image: shown in Fig. 2(b) and based on the method of Vinyals et al.
https://arxiv.org/abs/1411.4555
O. Vinyals, A. Toshev, S. Bengio, and D. Erhan. Show and tell: A neural image caption generator. In the IEEE conference on Computer Vision and Pattern Recognition (CVPR), 2015.

4. How was it validated?

5. Is there any discussion?

Achieving higher accuracy than a GAN that generates text from images is considered difficult.

6. Which paper should be read next?

soneo1127 added the labels reading (reading now, leave this issue as it is), text2image, and GAN on Apr 25, 2019
