Labs for the course "Big Data: architectures and data analytics" @ Politecnico di Torino a.y. 2021/22
Updated Jan 4, 2022 - Java
Data preparation, visualization, feature engineering, and classification of survivors using PySpark libraries
Developed a Spark ML model and pipeline to identify potential customers who may purchase top-up services in the future.
This repository contains all the code developed during the Big Data Processing and Analytics laboratories. Data are processed and analyzed using Hadoop and Spark.
This repo contains code for a restaurant recommendation system for users, based on business rating values.
This repository contains Apache Spark, Apache Hive, Apache Pig work
A hands-on introduction to PySpark for data engineers
Solving Big Data problems using the Spark framework in Java. The project runs on HDFS clusters (BigData@Polito) to produce the results.
😅 A topic model of reddit.com/r/EmojiPasta trained with Spark and an LDA model (NSFW) - Trigger Warning: The r/emojipasta subreddit posts controversial content, and everything I have crawled is there to give visibility into a topic model of some of this controversial content. Unfortunately, there is also discriminatory speech, which must be called out!
Fire accidents data analysis with Spark
A Production Machine Learning Pipeline for Predicting Future Sales with Spark
Implemented an auto-clustering tool that finds the seed and the number of clusters. Optimizing algorithms: Silhouette, Elbow. Clustering algorithms: k-Means, Bisecting k-Means, Gaussian Mixture. The module includes micro-macro pivoting and dashboards displaying the radius, centroids, and inertia of clusters. Used: Python, PySpark, Matplotlib, Spark MLlib.
Yelp Toronto User Pattern Analysis and Recommender System
A UDF to evaluate Spark-MLlib classification model using PySpark
A movie recommender system implementing collaborative filtering using PySpark
Apache Spark MLlib example for the seminar 'AI with scala'
A movie recommendation system built using Apache Spark’s ML library