Object detection dataset based on images from Macaulay Library for 29 bird species in the Pittsburgh area. The 29 species are a subset of the 400 species in the NABirds dataset. For each species, the dataset contains about 1000 labeled images.

ankurdave/macaulay-bird-species-pittsburgh

See `bird_classes.py` for the meaning of each class id.

Use a pretrained YOLOv5 model:

| Model | Download links | val mAP@.5:.95 |
| --- | --- | --- |
| yolov5n-birds-pittsburgh | .pt, ONNX | .802 |
| yolov5s-birds-pittsburgh | .pt, ONNX | .838 |
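
As a rough illustration of using one of these checkpoints, the sketch below loads a `.pt` file through the YOLOv5 PyTorch Hub entry point and filters the detections by confidence. It assumes the weights have already been saved to `bird-models/yolov5n-birds-pittsburgh.pt`; the image path `bird.jpg`, the 0.5 threshold, and the helper names are hypothetical and not part of this repository.

```python
def load_detector(weights_path):
    """Load custom YOLOv5 weights via PyTorch Hub (needs torch and network access)."""
    import torch  # imported lazily so the helper below works without torch installed

    return torch.hub.load("ultralytics/yolov5", "custom", path=weights_path)


def confident_detections(rows, min_confidence=0.5):
    """Keep rows of (x1, y1, x2, y2, confidence, class_id) at or above a threshold."""
    kept = []
    for x1, y1, x2, y2, confidence, class_id in rows:
        if confidence >= min_confidence:
            kept.append(
                {
                    "box": (x1, y1, x2, y2),
                    "confidence": confidence,
                    "class_id": int(class_id),  # look this id up in bird_classes.py
                }
            )
    return kept


def main():
    # Assumed local path to the downloaded weights.
    model = load_detector("bird-models/yolov5n-birds-pittsburgh.pt")
    results = model("bird.jpg")  # hypothetical image path
    # results.xyxy[0] holds one row per detection for the first image.
    for detection in confident_detections(results.xyxy[0].tolist()):
        print(detection)
```

Calling `main()` runs inference on a single image; `confident_detections` is kept separate so the filtering can be checked without downloading a model.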

Or train your own as follows:

```shell
# Datasets
# ========
mkdir bird-datasets
pushd bird-datasets

# Clone this repo and run `python3 download_images.py && python3 create-train-test-val-split.py`,
# or download a pre-created version using
# `aws s3 cp s3://macaulay-bird-species-pittsburgh/macaulay.tar.gz . && tar xzf macaulay.tar.gz`.

# Edit macaulay/macaulay-bird-species-pittsburgh.yaml and set the path to bird-datasets.

# Optional: Download the NABirds dataset (creation instructions TODO) using
# `aws s3 cp s3://macaulay-bird-species-pittsburgh/nabirds_yolov5.tar.gz . && tar xzf nabirds_yolov5.tar.gz`.

popd

# Start from pretrained model (optional)
# ======================================
mkdir bird-models
curl -o bird-models/yolov5n-birds-pittsburgh.pt -L https://github.com/ankurdave/bird-models/raw/master/yolov5n-birds-pittsburgh.pt

# Training
# ========
git clone https://github.com/ultralytics/yolov5.git
cd yolov5
pip3 install -r requirements.txt
python3 train.py --data ../bird-datasets/macaulay/macaulay-bird-species-pittsburgh.yaml --weights ../bird-models/yolov5n-birds-pittsburgh.pt --cfg yolov5n.yaml --cache disk
```
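
The dataset step above asks you to edit the YAML file by hand. If you would rather script it, here is a minimal sketch that rewrites the dataset root. It assumes the file uses a top-level `path:` key, as YOLOv5 dataset files conventionally do; this README does not confirm the key name, and the function names are hypothetical.

```python
import re
from pathlib import Path


def set_dataset_root(yaml_text, dataset_root):
    """Return yaml_text with the top-level `path:` entry set to dataset_root.

    Assumes a top-level `path:` key (the YOLOv5 convention). The whole line is
    replaced, so a trailing comment on it is dropped. If the key is missing,
    it is appended at the end.
    """
    new_line = f"path: {dataset_root}"
    pattern = re.compile(r"^path:.*$", flags=re.MULTILINE)
    if pattern.search(yaml_text):
        # A callable replacement avoids backslash handling in dataset_root.
        return pattern.sub(lambda _match: new_line, yaml_text, count=1)
    return yaml_text.rstrip("\n") + "\n" + new_line + "\n"


def update_yaml_file(yaml_path, dataset_root):
    """Apply set_dataset_root to a file in place."""
    path = Path(yaml_path)
    path.write_text(set_dataset_root(path.read_text(), dataset_root))
```

For example, `update_yaml_file("bird-datasets/macaulay/macaulay-bird-species-pittsburgh.yaml", "/absolute/path/to/bird-datasets")` would point the file at your local copy. Check the result against the file's own comments before training.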

To add more labels to this dataset:

  1. Search for a single bird species on eBird. Download a CSV of the search results.
  2. Edit `scrape-macaulay-search-csv.py` to add the CSV to `csv_to_dir`, then run that script to download the images.
  3. Run `python3 autolabel.py <class_id> images/all/<dir>/` to label the new images with model assistance. For example, to label images of Mourning Doves: `python3 autolabel.py '0' images/all/0_mourning_dove/`
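
When labeling several species, the `autolabel.py` invocation in step 3 can be derived from the directory name. The sketch below assumes every image directory follows the `<class_id>_<species>` pattern seen in the `0_mourning_dove` example; that convention is inferred from the single example above, and both helper names are hypothetical.

```python
from pathlib import PurePosixPath


def parse_species_dir(directory):
    """Split a path such as 'images/all/0_mourning_dove/' into (0, 'mourning_dove')."""
    name = PurePosixPath(directory).name  # a trailing slash is ignored
    class_id, separator, species = name.partition("_")
    if not separator or not class_id.isdigit():
        raise ValueError(f"expected '<class_id>_<species>', got {name!r}")
    return int(class_id), species


def autolabel_command(directory):
    """Build the argument list for labeling one species directory."""
    class_id, _species = parse_species_dir(directory)
    return ["python3", "autolabel.py", str(class_id), directory]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root, once per species directory.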
