hadoop-cluster

Here are 146 public repositories matching this topic...

akshayavb99 / Ansible-Examples

This repository contains the playbooks and other files used to work with different applications through Ansible.

docker ansible webserver ansible-playbooks yum hadoop-cluster explanation webservers loadbalancer dynamic-inventory-aws webserver-setup rhel8 linux-scripting

Updated Apr 4, 2023
Python

dhitaj / bdc-sapienza

Assignments of Big Data course during the Spring 2017 semester at Sapienza

java big-data hadoop hadoop-cluster hadoop-filesystem hadoop-mapreduce hadoop-hdfs

Updated Mar 8, 2018
Java

shreyasshivakumara / Reddit-Analysis-Large-Dataset-Scientific-Application

Architected and developed a horizontally scalable data processing solution for the Reddit dataset. Demonstrated its scalability with weak-scaling and strong-scaling tests as part of the computational analysis.

github reddit spark python3 master-slave data-analysis hadoop-cluster hadoop-mapreduce large-dataset hadoop-hdfs

Updated Jul 3, 2020
Jupyter Notebook

comoyi / docker-hadoop-cluster

A Hadoop cluster on Docker

docker hadoop hadoop-cluster

Updated Feb 24, 2018
Shell

aogunwoolu / Ethereum-analysis

ETH analysis using big data for the QMUL Big Data Processing module. Intended to promote analysis of data retrieved via big data processing

python big-data hadoop ethereum hadoop-cluster hadoop-filesystem hadoop-mapreduce mrjob big-data-analytics hadoop-hdfs mrjob-dataproc

Updated Dec 25, 2021
Jupyter Notebook

HeliaHashemipour / Hadoop-Spark

Third homework of CloudComputing - Fall 2022

spark hadoop collaborative-filtering hadoop-cluster als cosine-similarity spark-sql

Updated Feb 9, 2023
Jupyter Notebook

deepakag5 / Cloud-Computing-AWS

Cloud Computing Tutorials for AWS

s3-bucket load-balancer vpc hadoop-cluster aws-rds disaster-recovery hadoop-streaming rds-database iam-users emr-cluster

Updated Nov 14, 2019
Python

uncleislearning / learning-Hadoop

Principles and hands-on practice of HDFS, MapReduce, Hive, and ZooKeeper

hadoop hadoop-cluster hadoop-filesystem hadoop-mapreduce hadoop-ecosystem

Updated Feb 15, 2018

jbw / hadoop-docker-cluster

Hadoop cluster on Docker (single host)

docker hadoop hadoop-cluster hadoop-mapreduce hadoop-docker

Updated Aug 3, 2018
Shell

silencebingo / hadoop-spark-cluster

A Hadoop and Spark Cluster on Docker

hadoop-cluster spark-cluster

Updated Apr 12, 2018
Shell

lk5164 / hadoop-cluster-setup

ubuntu hadoop-cluster fully-distributed

Updated Aug 25, 2019
Shell

shubhambhardwaj007 / Ansible-Hadoop-DataNode-Role

An Ansible role to configure and set up a Hadoop DataNode.

ansible big-data hadoop cluster ansible-role hadoop-cluster ansible-roles ansible-galaxy hadoop-hdfs hadoop-data-platform

Updated May 18, 2021
Jinja

DanMolenhouse / Distributed-Systems-Project5-Hadoop-and-Spark

In this project, we used both Hadoop MapReduce and Spark for distributed computing. The first task was to perform a series of operations using Mapper and Reducer Java files implemented on a Hadoop server. The second task was to perform similar operations on Spark instead.

spark apache-spark hadoop hadoop-cluster mapreduce hadoop-mapreduce spark-cluster mapreduce-java hadoop-hdfs

Updated Oct 31, 2022
Java

vineetdcunha / Hadoop_Ecosystem

Processing and transforming data via Hadoop Ecosystem

python hive hadoop python-script hbase pyspark mahout pig hadoop-cluster hadoop-mapreduce hadoop-streaming hadoop-ecosystem hiveql multinode hadoop-hdfs hbase-standalone

Updated Nov 26, 2020
Python

mr-ravin / Smart-Hadoop-Cluster-SMHACL

This is an automated Hadoop cluster building tool, which implements distributed computing for creating the cluster over the network. It is implemented in Python 2.7.

docker distributed-systems automation big-data hadoop hadoop-cluster python-2 hadoop-docker hadoop-hdfs