This repository collects small tutorials on working with Apache Spark through PySpark, its Python API. The tutorials cover importing data from CSV and text files, cleaning and pre-processing datasets, running SQL queries in Spark, and using SQL joins. They end with a small PySpark project that applies a machine learning algorithm such as linear regression to an example dataset, for instance COVID-19 or stock market data, for predictive analysis.
I will start pushing material to the repository soon and will try to keep it full of content that is helpful for learning and practical use.