A tool to download and format NUS-WIDE dataset for multilabel classification

bbenligiray/nus_wide_formatter

NUS-WIDE Formatter

A tool to format the NUS-WIDE dataset. It outputs a single .h5 file that contains the following:

  • data_types: 'train' and 'val'
  • cats: names of the 81 categories

For each data type (replace x with 'train' or 'val', e.g. train_images):

  • x_images: flattened images (not preprocessed in any way)
  • x_shapes: shapes of the images, to reshape the flattened images
  • x_names: file names of the images
  • x_label: a binary (multi-hot) integer vector of labels, since an image can belong to several categories
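
As a minimal sketch of how the fields above fit together, the snippet below rebuilds one image from its flattened form and maps its label vector back to category names. It assumes each entry of x_images is a 1-D array and each entry of x_shapes holds the matching original shape; the helper names `load_sample` and `label_names` and the file name in the usage comment are hypothetical and not part of this tool.

```python
import numpy as np


def load_sample(h5, split, index):
    """Return (image, name, label) for one sample of `split` ('train' or 'val').

    `h5` is anything indexable by dataset name, for example an open h5py.File.
    Assumption: images are stored flattened, with their shape stored separately.
    """
    flat = np.asarray(h5[f"{split}_images"][index])
    shape = tuple(int(d) for d in h5[f"{split}_shapes"][index])
    image = flat.reshape(shape)  # undo the flattening
    name = h5[f"{split}_names"][index]
    label = np.asarray(h5[f"{split}_label"][index])
    return image, name, label


def label_names(label, cats):
    """Map a 0/1 label vector to the names of the active categories."""
    return [cats[i] for i in np.flatnonzero(label)]


# Possible use with h5py (the file name is a placeholder):
#   import h5py
#   with h5py.File("nus_wide.h5", "r") as f:
#       cats = [c.decode() if isinstance(c, bytes) else c for c in f["cats"][:]]
#       image, name, label = load_sample(f, "train", 0)
#       print(name, image.shape, label_names(label, cats))
```

Because the images are not preprocessed, their shapes can differ from one sample to the next, so any resizing or normalization has to happen after the reshape.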

Follow the instructions at the link below to get a download link for the raw dataset (don't bother scraping the images yourself; too many of them are missing):

http://lms.comp.nus.edu.sg/research/NUS-WIDE.htm

Stats

Train

  • Total: 161789
  • Missing: 0
  • Unlabeled: 81980
  • Remaining: 79809

Test

  • Total: 107859
  • Missing: 0
  • Unlabeled: 54227
  • Remaining: 53632
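
The counts above are consistent with Remaining = Total - Missing - Unlabeled. The short check below reproduces that arithmetic; the dictionary layout and the `remaining` helper are illustrative only and are not taken from the tool's code.

```python
# Counts copied from the Stats section above.
stats = {
    "train": {"total": 161789, "missing": 0, "unlabeled": 81980},
    "test": {"total": 107859, "missing": 0, "unlabeled": 54227},
}


def remaining(s):
    """Images left after dropping missing and unlabeled ones."""
    return s["total"] - s["missing"] - s["unlabeled"]


for split, s in stats.items():
    kept = remaining(s)
    print(f"{split}: {kept} kept ({kept / s['total']:.1%} of total)")
```

In both splits roughly half of the images carry none of the 81 labels and are dropped.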
