public-tree-data

Public Tree Map uses open datasets to document publicly owned park and street trees in Santa Monica, California. We're working to add data about other public trees in LA County. See below for more information about the data sources and the project.

Running the Pipeline Locally

Prerequisites:

make
node

After a fresh clone, run npm install to install the necessary node modules.

To run the full pipeline, which will download the latest tree data and all images, run:

make release

To skip lengthy network requests, you can run a smaller version of the pipeline with:

make local-only

See the Makefile for other rules that are available.

Viewing the Logs

The scripts that make up the pipeline pass data to each other by reading from stdin and writing to stdout, so they can't log to stdout as you might expect. Instead, they write to a log file located at tmp/log.txt. To watch logs as they happen, run:

tail -f tmp/log.txt
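To illustrate why the logs go to a file, here is a minimal Python sketch of a stage-side logger that appends to tmp/log.txt and leaves stdout free for data. The function name make_file_logger and the log format are assumptions made for illustration; the pipeline's own scripts may set up logging differently.

```python
import logging
import os


def make_file_logger(path="tmp/log.txt"):
    """Return a logger that appends to `path` and never writes to stdout."""
    # Create the log directory if it does not exist yet.
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    # One logger per path, so repeated calls do not stack duplicate handlers.
    logger = logging.getLogger("pipeline-sketch:" + path)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep messages away from any root handler
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

A stage would call make_file_logger().info("...") for diagnostics and reserve sys.stdout for the data handed to the next stage.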

Command Documentation

find_missing_species.py

This covers how to run the find_missing_species.py script. Let's start with the command line options and what they do:

python find_missing_species.py -u <inventory url> -s <known species csv file> -o <output file>

-u: The URL from which to download the Santa Monica tree inventory CSV. The data must contain a column named Species ID that holds the species id. If not specified, it defaults to https://data.smgov.net/resource/w8ue-6cnd.csv?$limit=50000

-s: The CSV file containing all known species ids. The script expects the species id in a column named Species ID. If not specified, it defaults to data/species_attributes.csv

-o: The name of the output CSV file. If not specified, the script prints the CSV to stdout (the command line).

Here are some examples:

# this shows you all the options (and explanations)
python find_missing_species.py -h

# this grabs data from the santa monica trees dataset and saves them to missing_trees.csv
python find_missing_species.py -o missing_trees.csv

# uses all the defaults and prints the output to the command line
python find_missing_species.py
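The comparison the options above describe can be sketched as a set difference over the Species ID column. This is an illustrative sketch, not the script's actual code: the function names species_ids and find_missing and the sample rows are hypothetical, and the real script may handle downloading, headers, and output differently.

```python
import csv
import io


def species_ids(csv_text, column="Species ID"):
    """Collect the non-empty values of `column` from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    }


def find_missing(inventory_csv, known_csv):
    """Return species ids present in the inventory but absent from the known list."""
    return sorted(species_ids(inventory_csv) - species_ids(known_csv))


# Made-up sample data, for illustration only.
inventory = "Tree ID,Species ID\n1,10\n2,11\n3,12\n4,11\n"
known = "Species ID,Common Name\n10,Oak\n12,Palm\n"
print(find_missing(inventory, known))  # ['11']
```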

General Thoughts on the Pipeline

We don't want to run a server. Instead, we serve static data as JSON from a Google Cloud bucket. This keeps costs low and the client simple.

The pipeline in general works like this:

Start with tree data provided by Santa Monica.
End with one JSON file that can be used to render the map, and a series of JSON files that represent the details of each individual tree.
In between, we break down each augmentation/alteration of the data into a series of distinct processes, each of which reads from stdin and writes to stdout. Examples include doing the initial parse of the CSV, and finding images for each tree.
Each of these scripts is documented extensively.
The Makefile composes these scripts into a set of routines.
CircleCI will run the make release script nightly to update the data.
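The stdin-to-stdout design described above can be sketched as follows. This is a hedged illustration in Python (the pipeline's own stages include Node scripts), and it assumes the data passed between stages is a JSON array of tree objects; the names run_stage and add_field and the sample field are hypothetical.

```python
import json
import sys


def add_field(trees, key, value):
    """Return a copy of each tree record with one extra field set."""
    return [{**tree, key: value} for tree in trees]


def run_stage(transform, stdin=sys.stdin, stdout=sys.stdout):
    """Read a JSON array from stdin, apply `transform`, write JSON to stdout."""
    trees = json.load(stdin)
    json.dump(transform(trees), stdout)
```

Because every stage shares the same input and output contract, stages written this way can be chained with shell pipes, which is how a Makefile rule can compose them.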

Protocol for pull requests + code review

Please review open issues and link your pull request to the relevant issue.
Please create a new branch for your changes.
For all new changes, please submit your pull request to the test-circleci branch.
In your pull request, please list and explain all proposed changes to the code base (additions, deletions). If you reuse code from elsewhere, please make sure you've attributed it.
Please apply all relevant labels to your pull request.
Please request a review (either from a specific person or from the appropriate slack channel).
Reviewers: please review all proposed changes, write comments and questions in line notes. Please review all updates made at your request.
Reviewer and requester: please confirm with each other that the PR is ready to merge. Please make sure that the PR branch name documents the new changes.

Data Sources

Tree attributes and current sources

List of attribute fields and views for our application - gist
Initial views (desktop):
- 1 - no tree selected (home/map view)
- 2 - native CFP tree species view
- 3 - non-native tree species view
- 4 - Washingtonia filifera (only native CFP palm species) view
- 5 - non-native palm species view
- 6 - tree family view
Species Imagery - Encyclopedia of Life
CA Native status - Calflora.org and Theodore Payne Foundation
Nearest Address, GPS Coordinates, Height Range, Trunk Diameter (DBH) Range, Tree ID - Trees Inventory - Santa Monica Open Data
Geographic Range description (countries occurrence), IUCN Red List Status - IUCN Red List API - v3
Recommended Watering Frequency - City of Santa Monica Public Works Department PDF (pp. 9-13)
Species Growth, Shade Production, Shedability, Spread, Trunk clearance - Canopy Tree Library
Street View Imagery - Google Street View
For CA native species:
Species Height by Width, Native Distribution, Native Habitat - Theodore Payne Foundation
For non-native tree species:
Invasive Status - California Invasive Plant Council
Our public google drive folder
