VisualMRC

VisualMRC is a visual machine reading comprehension dataset. It defines the following task: given a question and a document image, a model must produce an abstractive answer.

You can find more details, analyses, and baseline results in our paper. You can cite it as follows:

@inproceedings{VisualMRC2021,
  author    = {Ryota Tanaka and
               Kyosuke Nishida and
               Sen Yoshida},
  title     = {VisualMRC: Machine Reading Comprehension on Document Images},
  booktitle = {AAAI},
  year      = {2021}
}

Statistics

10,197 images
30,562 QA pairs
10.53 average question tokens (tokenized with the NLTK tokenizer)
9.53 average answer tokens (tokenized with the NLTK tokenizer)
151.46 average OCR tokens (tokenized with the NLTK tokenizer)

Get Started

To use the dataset, including the ground-truth annotations, please contact us at [email protected]. In your message, please include your name, institution, and intended purpose.

Dataset Format

id: "image id",
url: "URL",
screenshot_filename: "screenshot file name",
image_filename: "image file name",
bounding_boxes: [
  {
    id: "bounding box id",
    structure: "semantic class of the bounding box",
    shape:
      {
        x: "INT, Top left x coordinate of the bounding box",
        y: "INT, Top left y coordinate of the bounding box",
        width: "INT, Width of the bounding box",
        height: "INT, Height of the bounding box",
      },
    ocr_info: [
      {
        word: "OCR token",
        confidence: "Confidence score produced by Tesseract",
        bbox:
          {
            x: "INT, Top left x coordinate of the OCR bounding box",
            y: "INT, Top left y coordinate of the OCR bounding box",
            width: "INT, Width of the OCR bounding box",
            height: "INT, Height of the OCR bounding box",
          }
      }
    ]
  }
],
qa_data: [
  {
    question:
      {
        text: "question"
      },
    answer:
      {
        text: "answer",
        relevant: ["bounding boxes needed to answer the question"]
      }
  }
]
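
As a rough illustration of how a record in this format might be read, the sketch below parses one made-up record and pulls out box corners, OCR text, and QA pairs. It assumes the annotations are stored as JSON with exactly the field names listed above; the sample values, the helper names (`corners`, `box_text`, `qa_pairs`), the `min_confidence` parameter, and the idea that `relevant` holds bounding box ids are all assumptions made for this example, not something the dataset documentation confirms.

```python
import json

# Hypothetical record that follows the field names in the format above.
# Every value is invented for illustration; "relevant" is ASSUMED to hold
# bounding box ids, which the format description does not state explicitly.
SAMPLE = json.loads("""
{
  "id": "0",
  "url": "https://example.com/page",
  "screenshot_filename": "screenshot.png",
  "image_filename": "image.png",
  "bounding_boxes": [
    {
      "id": "b0",
      "structure": "example-class",
      "shape": {"x": 10, "y": 20, "width": 300, "height": 40},
      "ocr_info": [
        {"word": "Hello", "confidence": 96,
         "bbox": {"x": 12, "y": 22, "width": 80, "height": 30}},
        {"word": "world", "confidence": 80,
         "bbox": {"x": 100, "y": 22, "width": 85, "height": 30}}
      ]
    }
  ],
  "qa_data": [
    {"question": {"text": "What does the box say?"},
     "answer": {"text": "It says hello world.", "relevant": ["b0"]}}
  ]
}
""")


def corners(shape):
    """Convert a top-left/width/height box to (left, top, right, bottom)."""
    left, top = shape["x"], shape["y"]
    return (left, top, left + shape["width"], top + shape["height"])


def box_text(box, min_confidence=0):
    """Join the OCR tokens of one bounding box, in stored order.

    Tokens whose confidence is below min_confidence are skipped.
    """
    return " ".join(
        token["word"]
        for token in box["ocr_info"]
        if float(token["confidence"]) >= min_confidence
    )


def qa_pairs(record):
    """Yield (question, answer, relevant) tuples for one record."""
    for qa in record["qa_data"]:
        answer = qa["answer"]
        yield qa["question"]["text"], answer["text"], answer.get("relevant", [])


if __name__ == "__main__":
    for box in SAMPLE["bounding_boxes"]:
        print(box["id"], corners(box["shape"]), box_text(box))
    for question, answer, relevant in qa_pairs(SAMPLE):
        print(question, "->", answer, relevant)
```

Converting `shape` to corner coordinates is convenient because many image libraries crop with (left, top, right, bottom) boxes; the same helper works for the per-token `bbox` entries since they share the same keys.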
