acmiyaguchi/iyse6420-birdcall-distributions

birdcall-distribution

This repository contains ISYE 6420 project code for building BirdCLEF birdcall distribution maps. It uses Bayesian modeling techniques to estimate frequencies of birdcall recording metadata. This project relies heavily on PyMC and Google Earth Engine.

Below are a few plots from the project, which are discussed in more detail in the report. Also check out a demo of the resulting plots for various models and regions here.

car distribution

Distribution of the California Quail, smoothed using a Poisson GLM with a CAR (conditional autoregressive) prior distribution for the random effects.

western predict 16

Poisson GLM posterior predictive species distribution map of the top 15 species in the Western US.

quickstart

Make sure you have Python and poetry installed.

poetry install
poetry shell

Notebooks with exploratory data analysis and initial modeling are found in the notebooks directory. We try to consolidate most code into the birdcall_distribution package.

If you are contributing, make sure to install and initialize pre-commit.

pip install pre-commit
pre-commit install

miscellaneous

fetching data against Google Earth Engine

You will need to be authenticated against Google Earth Engine. Check out one of the earlier notebooks to see how this works.

To get statistics about elevation, temperature, and land cover classification:

# v1 - includes basic information
python -m birdcall_distribution.commands.earth_engine data/earth_engine.parquet

# v2 - includes the grid size
python -m birdcall_distribution.commands.earth_engine --parallelism 16 data/earth_engine_v2.parquet

# v3 - add a new region, use percentiles as primary data summarization technique
python -m birdcall_distribution.commands.earth_engine --parallelism 16 ca 1 data/ee_v3_ca_1.parquet
python -m birdcall_distribution.commands.earth_engine --parallelism 16 western_us 2 data/ee_v3_western_us_2.parquet
python -m birdcall_distribution.commands.earth_engine --parallelism 16 americas 2 data/ee_v3_americas_2.parquet
python -m birdcall_distribution.commands.earth_engine --parallelism 16 americas 5 data/ee_v3_americas_5.parquet

The outputs are saved as Parquet files and are checked into the repository.
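The four v3 invocations above differ only in region and grid size, so they can be generated from a small table instead of being typed by hand. The following is a minimal sketch and not part of the repository: `V3_TARGETS` and `build_earth_engine_command` are hypothetical names, and the argument order and output file naming simply mirror the commands shown above. It only prints the commands; nothing is executed.

```python
import shlex

# (region, grid size) pairs copied from the v3 commands in this README.
V3_TARGETS = [("ca", 1), ("western_us", 2), ("americas", 2), ("americas", 5)]


def build_earth_engine_command(region, grid_size, parallelism=16):
    """Hypothetical helper: build the argv list for one v3 export."""
    output = f"data/ee_v3_{region}_{grid_size}.parquet"
    return [
        "python",
        "-m",
        "birdcall_distribution.commands.earth_engine",
        "--parallelism",
        str(parallelism),
        region,
        str(grid_size),
        output,
    ]


commands = [build_earth_engine_command(r, g) for r, g in V3_TARGETS]

# Dry run: print each command so it can be reviewed before running it.
for cmd in commands:
    print(shlex.join(cmd))
```

Once you are authenticated, each list could be passed to `subprocess.run(cmd, check=True)` to run the export.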

generating assets for demo

python -m birdcall_distribution.commands.model_assets intercept_car data/ee_v3_americas_5.parquet data/processed/models/intercept_car/americas/5 --n-species 10 --cores 4 --samples 5000
python -m birdcall_distribution.commands.model_assets intercept_car data/ee_v3_western_us_2.parquet data/processed/models/intercept_car/western_us/2 --n-species 10 --cores 4 --samples 5000
python -m birdcall_distribution.commands.model_assets intercept_car data/ee_v3_ca_1.parquet data/processed/models/intercept_car/ca/1 --n-species 10 --cores 4 --samples 5000

python -m birdcall_distribution.commands.model_assets intercept_covariate_car data/ee_v3_americas_5.parquet data/processed/models/intercept_covariate_car/americas/5 --n-species 10 --cores 4 --samples 5000
python -m birdcall_distribution.commands.model_assets intercept_covariate_car data/ee_v3_western_us_2.parquet data/processed/models/intercept_covariate_car/western_us/2 --n-species 10 --cores 4 --samples 5000
python -m birdcall_distribution.commands.model_assets intercept_covariate_car data/ee_v3_ca_1.parquet data/processed/models/intercept_covariate_car/ca/1 --n-species 10 --cores 4 --samples 5000
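The six commands above are the cross product of two model names and three region and grid-size pairs. As a rough sketch (not code from the repository), the same list can be produced programmatically; `MODELS`, `TARGETS`, and `build_model_assets_command` are hypothetical names, and the paths and flags are copied from the commands above. The script only prints the commands.

```python
import shlex
from itertools import product

# Model names and (region, grid size) pairs copied from the commands above.
MODELS = ["intercept_car", "intercept_covariate_car"]
TARGETS = [("americas", 5), ("western_us", 2), ("ca", 1)]


def build_model_assets_command(
    model, region, grid_size, n_species=10, cores=4, samples=5000
):
    """Hypothetical helper: build the argv list for one model_assets run."""
    input_path = f"data/ee_v3_{region}_{grid_size}.parquet"
    output_dir = f"data/processed/models/{model}/{region}/{grid_size}"
    return [
        "python",
        "-m",
        "birdcall_distribution.commands.model_assets",
        model,
        input_path,
        output_dir,
        "--n-species",
        str(n_species),
        "--cores",
        str(cores),
        "--samples",
        str(samples),
    ]


# product() iterates models in the outer loop, matching the order above.
commands = [
    build_model_assets_command(model, region, grid_size)
    for model, (region, grid_size) in product(MODELS, TARGETS)
]

for cmd in commands:
    print(shlex.join(cmd))
```

Adding a new region or model then means adding one entry to the corresponding list.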

We also generate the manifest, the bird name mapping, and the Earth Engine assets:

python -m birdcall_distribution.commands.generate_manifest data/processed data/processed/manifest.json
python -m birdcall_distribution.commands.bird_name_mapping data/processed data/processed
python -m birdcall_distribution.commands.earth_engine_assets data data/processed/earth_engine

uploading data directory to google cloud

We have set up a public-facing bucket with copies of the wheels and data files. To upload new data, authenticate with gcloud and ensure you have access to the acmiyaguchi project.

gcloud storage buckets create gs://iyse6420-birdcall-distribution
gsutil -m rsync -r data/ gs://iyse6420-birdcall-distribution/

installing cartopy on windows

The majority of development was done on a Windows 10 machine. Cartopy, which is used for plotting, is not available on Windows as a wheel because it needs to be linked against system packages. Download the prebuilt packages from: https://www.lfd.uci.edu/~gohlke/pythonlibs/

We also have a copy of the wheels in the cloud storage bucket, which can be synced down.

converting svg images

The Graphviz images need to be converted to PNG so we can embed them in the report. Use Inkscape to do this:

inkscape full_model.svg --export-type=png --export-filename=full_model.png
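If there are several SVG files to convert, the same command can be built for each one. This is a small sketch under the assumption that the SVGs sit in the current directory; `build_inkscape_command` is a hypothetical helper, and the flags are the ones used in the command above. It prints the commands without running them.

```python
import shlex
from pathlib import Path


def build_inkscape_command(svg_path):
    """Hypothetical helper: build the inkscape argv list for one SVG file."""
    svg_path = Path(svg_path)
    # Write the PNG next to the SVG, with the same base name.
    png_path = svg_path.with_suffix(".png")
    return [
        "inkscape",
        str(svg_path),
        "--export-type=png",
        f"--export-filename={png_path}",
    ]


# Dry run: print one command per SVG found in the current directory.
for svg in sorted(Path(".").glob("*.svg")):
    print(shlex.join(build_inkscape_command(svg)))
```

Each list could be passed to `subprocess.run(cmd, check=True)` when Inkscape is on the `PATH`.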

References

Here's a list of many of the resources that were used while building this project. This does not encompass everything, because some references are in the report or notebooks.

google earth engine and remote sensing data

modeling geospatial data