
This repository contains my thesis for postgraduate studies in Big Data, written in Quarto


michkam89/big-data-pw-project


Big Data Studies project with PySpark and AWS EMR

This project was built using Quarto.

To reproduce the environment, use the Dockerfile provided in the root directory together with VS Code. Install the Python libraries listed in requirements.txt into a venv called env (Quarto detects it automatically).

Once you open VS Code, install the Quarto extension and render the book.
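The setup and render steps above can be sketched from a terminal as follows. This is a minimal sketch, not the author's exact procedure: it assumes you run it from the repository root inside the container built from the Dockerfile, that `python3` is on the PATH, and that the Quarto CLI is installed if you want to render outside VS Code.

```shell
# Assumption: run from the repository root, inside the project container.
VENV_DIR="env"                   # venv name given in this README
REQUIREMENTS="requirements.txt"  # library list in the repository root

# Create the virtual environment.
python3 -m venv "$VENV_DIR"

# Install the listed libraries into it, if the file is present.
if [ -f "$REQUIREMENTS" ]; then
    "$VENV_DIR/bin/python" -m pip install -r "$REQUIREMENTS"
fi

# Render the book from the command line, if the Quarto CLI is available.
if command -v quarto >/dev/null 2>&1; then
    quarto render
else
    echo "quarto not found; render from VS Code instead" >&2
fi
```

Calling pip through the venv's own interpreter avoids having to activate the environment first, which keeps the commands independent of the shell in use.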

The misc directory contains the Jupyter notebooks from EMR, configuration files, and the required JARs.

The thesis text and code are in the .qmd files.