Data Engineering Course
Updated Jun 4, 2024 - TeX
💾 Kotlin Extensions for Apache Hadoop (MapReduce).
Some simple, kinda introductory projects based on Apache Hadoop to be used as guides in order to make the MapReduce model look less weird or boring.
Distributed training of an SVM with Spark ML.
A Hadoop MapReduce job in Java that combines different files and lists the temperatures of US states, using data from weather stations all over the world.
The repository showcases a series of exercises and projects focused on big data processing using Hadoop, HBase, Hive, and Spark with Python. Hosted on AWS EMR, these projects demonstrate efficient data handling and processing techniques, leveraging the power of cloud computing to tackle complex data challenges.
Cloud Shuffle Service (CSS) is a general-purpose remote shuffle solution for compute engines, including Spark/Flink/MapReduce.
This repository contains solutions to common mapper and reducer problems in Hadoop using Python
Hadoop MapReduce algorithm with Hadoop Streaming (Python)
Data repository
Big Data Analytics Assignment on Hadoop MapReduce
Average Temperature - Hadoop - Mapper - Reducer
Student projects in Big Data field.
This project leverages Java and Hadoop MapReduce to analyze text and flight data, focusing on a classic Word Count problem and detailed flight data analysis.
A Java project for a simple MapReduce Word Count problem.
This project investigates how to build Bloom filters using the MapReduce approach in Hadoop and Spark. Different implementations and further analysis of their performance are reported.