l2nguyen/metro

Study of Metro Ridership

This is the class project for my General Assembly Data Science class.

Goals:

  • To visualize the DC Metro rail historical ridership data
  • To determine the variables that affect ridership
  • To build a model that determines the relationship between the response (Metrorail ridership) and the feature variables (e.g., gas price, weather, unemployment)

Game Plan:

  • This is a regression problem, so I plan to use a linear regression model
  • The main model evaluation metric will be RMSE (root mean squared error)
  • I will build models of increasing complexity and compare them to see what works best
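
A minimal sketch of the game plan above, assuming nothing about the actual modeling code: it fits ordinary least squares with one, two, and then three features and compares test RMSE. The data is synthetic and generated on the spot; the feature names, coefficients, and helper functions (`fit_ols`, `predict`, `rmse`, `first_columns`) are made up for illustration and are not the project's data or code.

```python
import math
import random


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend an intercept column
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(coef, X):
    """Apply intercept + coefficients to each feature row."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def first_columns(data, n):
    """Keep only the first n features of every row."""
    return [r[:n] for r in data]


# Synthetic stand-in data (hypothetical, NOT the real ridership data)
random.seed(0)
features, ridership = [], []
for _ in range(200):
    gas = random.uniform(2.0, 4.5)     # made-up gas price
    temp = random.uniform(30, 90)      # made-up temperature
    unemp = random.uniform(4, 10)      # made-up unemployment rate
    features.append([gas, temp, unemp])
    ridership.append(600 + 40 * gas + 3 * temp - 10 * unemp + random.gauss(0, 5))

train_X, test_X = features[:150], features[150:]
train_y, test_y = ridership[:150], ridership[150:]

# Models of increasing complexity: 1, 2, then 3 features
test_rmse = []
for n_features in (1, 2, 3):
    coef = fit_ols(first_columns(train_X, n_features), train_y)
    score = rmse(test_y, predict(coef, first_columns(test_X, n_features)))
    test_rmse.append(score)
    print(f"{n_features} feature(s): test RMSE = {score:.2f}")
```

The sketch is dependency-free to keep it self-contained; a library such as scikit-learn would be a more typical choice for real work. Scoring on held-out rows rather than the training rows is a design choice here, so that a more complex model only wins if it generalizes.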

Guide:

  • A presentation can be found here
  • Want more details? A report can be found here
  • Graphs visualizing data can be found here
  • Data wrangling code can be found here
  • Modeling code can be found here
  • Data dictionary can be found here

To Do List

  • Clean up code to be more elegant/shorter
  • Study the large residuals to see if they have anything in common
  • Parameter tuning of the models
  • Add data for days when sports games take place
  • Find better proxy for tourism
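
One possible way to approach the residual study in the list above, as a rough sketch: flag days whose residual is more than two standard deviations from the mean residual, then tally the flagged days by some attribute to look for a shared pattern. The records, field names, and the `large_residuals` helper are hypothetical and only illustrate the idea.

```python
from collections import Counter
from statistics import mean, pstdev


def large_residuals(records, threshold=2.0):
    """Return records whose residual lies more than `threshold`
    standard deviations from the mean residual."""
    residuals = [r["actual"] - r["predicted"] for r in records]
    mu, sigma = mean(residuals), pstdev(residuals)
    if sigma == 0:
        return []
    return [
        dict(r, residual=res)
        for r, res in zip(records, residuals)
        if abs(res - mu) > threshold * sigma
    ]


# Hypothetical (day_of_week, predicted, actual) values, NOT real ridership
raw = [
    ("Mon", 700, 705), ("Tue", 700, 697), ("Wed", 700, 702),
    ("Thu", 700, 696), ("Fri", 700, 701), ("Mon", 700, 698),
    ("Tue", 700, 703), ("Wed", 700, 699),
    ("Sat", 700, 400),   # far below the prediction
    ("Sat", 700, 1000),  # far above the prediction
]
records = [
    {"day_of_week": d, "predicted": p, "actual": a} for d, p, a in raw
]

flagged = large_residuals(records)
by_day = Counter(r["day_of_week"] for r in flagged)
print(by_day.most_common())
```

Swapping `day_of_week` for another attribute (a holiday flag or a weather category, for instance) would apply the same tally to other candidate explanations.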

Wish List

  • Make interactive data visualizations using JavaScript/D3. Maybe something like this
