CS513-Project

Authors

Team 194

Fabricio Brigagao [email protected]
Roberto Godoy [email protected]
Steve McHenry [email protected]

Summary

This repository contains the deliverables for the CS 513 group project. It includes the supplementary materials required by the final project instructions, the materials referenced in the Phase-II report, and additional analysis and profiling scripts that readers may find useful for their own research or extensions.

Organization of Contents

This repository is organized into directories containing different aspects of the project.

operation-history/

This directory contains the operation history for the project produced by workflow W. Except for the file named "Latitude-Longitude-Location-Stage2.json", all files in this directory are OpenRefine recipes generated by OpenRefine v3.5.2.

"Latitude-Longitude-Location-Stage2.json" describes the operations performed by SQL DML queries, presented in a recipe-like JSON format.
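
As a quick way to inspect a recipe from this directory, the sketch below counts operations by type. It assumes the usual OpenRefine export layout, a JSON array of objects that each carry an "op" key. The sample recipe and the function name `summarize_recipe` are hypothetical and are not taken from this repository. The hand-written Stage2 file may not follow the same layout.

```python
import json
from collections import Counter


def summarize_recipe(recipe_text):
    """Count operations by their "op" identifier in an OpenRefine-style recipe."""
    operations = json.loads(recipe_text)
    # Steps without an "op" key are grouped under a placeholder label.
    return Counter(step.get("op", "<unknown>") for step in operations)


# Hypothetical two-step recipe used only for illustration.
SAMPLE_RECIPE = """
[
  {"op": "core/text-transform", "columnName": "City",
   "expression": "value.trim()", "description": "Trim whitespace in column City"},
  {"op": "core/column-rename", "oldColumnName": "City",
   "newColumnName": "city", "description": "Rename column City to city"}
]
"""

summary = summarize_recipe(SAMPLE_RECIPE)
for op, count in sorted(summary.items()):
    print(f"{op}: {count}")
```

To apply it to a real file, read the file contents (for example with `pathlib.Path(...).read_text()`) and pass the text to `summarize_recipe`.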

python/

This folder contains Python scripts for the project as described in the Phase-II report.

sql/

This folder contains SQL queries. The prerequisites for executing them are:

  1. Microsoft SQL Server 2017 (or greater) as the database engine
  2. A database named CDPH with the default collation SQL_Latin1_General_CP1_CI_AS

This folder contains a separate README further describing the contents of the directory.
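
The two prerequisites for the sql/ folder can be prepared and checked before running the queries. The sketch below only builds the T-SQL text and evaluates the result. It does not connect to a server, so running the statements through a client of your choice is left as an assumption. The helper names are hypothetical. The major version 14 corresponds to SQL Server 2017.

```python
REQUIRED_MAJOR_VERSION = 14  # SQL Server 2017 reports product version 14.x
DATABASE_NAME = "CDPH"
COLLATION = "SQL_Latin1_General_CP1_CI_AS"


def create_database_sql(name=DATABASE_NAME, collation=COLLATION):
    """Return a T-SQL statement that creates the database with the given collation."""
    return f"CREATE DATABASE [{name}] COLLATE {collation};"


def check_prerequisites_sql(name=DATABASE_NAME):
    """Return a T-SQL query reporting the server major version and database collation."""
    return (
        "SELECT CAST(SERVERPROPERTY('ProductMajorVersion') AS int) AS major_version, "
        f"CAST(DATABASEPROPERTYEX('{name}', 'Collation') AS nvarchar(128)) AS collation;"
    )


def prerequisites_met(major_version, collation):
    """Evaluate the row returned by the query built in check_prerequisites_sql."""
    return major_version >= REQUIRED_MAJOR_VERSION and collation == COLLATION


print(create_database_sql())
print(check_prerequisites_sql())
```

After running the second statement, pass the two returned values to `prerequisites_met` to confirm that the environment matches the list above.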

yw-diagrams/

This folder contains YesWorkflow diagrams (.yw) and Graphviz files (.gv) for the project, organized in three levels. W0 gives an overview of the entire cleaning project. W1 describes the three phases of the project in more detail. Finally, W2 contains the diagrams for the OpenRefine recipes.
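
The .gv files can be rendered with the Graphviz `dot` command-line tool. The sketch below only builds the command, assuming Graphviz is installed and on the PATH. The file name `overview.gv` and the helper `dot_command` are hypothetical.

```python
from pathlib import Path


def dot_command(gv_path, fmt="pdf"):
    """Build a Graphviz command that renders a .gv file next to its source."""
    source = Path(gv_path)
    target = source.with_suffix(f".{fmt}")
    return ["dot", f"-T{fmt}", str(source), "-o", str(target)]


# Hypothetical file name; substitute a .gv file from this folder.
command = dot_command("overview.gv")
print(" ".join(command))
```

To render the file, pass the returned list to `subprocess.run(command, check=True)`.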

tableau/

This folder contains the Tableau Desktop work files for the dashboard visualization published to Tableau Public with the cleaned dataset.
