ShubhamGupta19/Taxi-Data-Analytics-ETL-pipeline

Taxi-Data-Analytics-ETL-pipeline

Introduction

This project analyzes taxi trip data end to end using GCP Storage, Python, a Compute Instance, the Mage data pipeline tool, BigQuery, and Looker Studio. The goal is to extract insights from the trip records that could inform transportation planning, traffic management, and related applications.

Architecture

Technology Used

  • Programming Language - Python

Google Cloud Platform

  1. Google Cloud Storage
  2. Compute Engine instance
  3. BigQuery
  4. Looker Studio

Modern Data Pipeline Tool - https://www.mage.ai/

Dataset Used

TLC Trip Record Data: yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.

More information about the dataset is available here:

  1. Website - https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page
  2. Data Dictionary - https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf
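
To make the fields listed above more concrete, here is a minimal sketch of the kind of transform step a pipeline like this might apply to each trip record. It uses only the Python standard library. The column names follow the yellow-taxi data dictionary, but the sample rows, the `transform_trip` and `load_trips` helpers, and the derived fields are assumptions for illustration, not code from this repository.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows (not real TLC data); column names follow
# the yellow-taxi data dictionary.
SAMPLE_CSV = """tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,fare_amount
2016-03-01 00:00:00,2016-03-01 00:07:55,1,2.50,9.0
2016-03-01 00:10:00,2016-03-01 00:40:00,2,10.00,30.5
"""

# Assumed timestamp layout for CSV exports of the trip records.
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def transform_trip(row):
    """Parse one raw CSV row into typed values plus a few derived fields."""
    pickup = datetime.strptime(row["tpep_pickup_datetime"], TIME_FORMAT)
    dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIME_FORMAT)
    return {
        "pickup_hour": pickup.hour,
        "pickup_weekday": pickup.weekday(),  # Monday == 0
        "duration_minutes": (dropoff - pickup).total_seconds() / 60,
        "passenger_count": int(row["passenger_count"]),
        "trip_distance": float(row["trip_distance"]),
        "fare_amount": float(row["fare_amount"]),
    }


def load_trips(text):
    """Read CSV text and return a list of transformed trip dictionaries."""
    return [transform_trip(row) for row in csv.DictReader(io.StringIO(text))]


trips = load_trips(SAMPLE_CSV)
for trip in trips:
    print(trip)
```

In the pipeline itself, a step like this would more likely run inside a Mage transformer block on a DataFrame before the result is loaded into BigQuery. The plain-Python version here only shows the shape of the transformation.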

Data Model
