AWS Cloudera Hadoop setup with H2O, Spark, MR
Updated Apr 24, 2017 - Java
Big data technologies can be defined as software tools for analyzing, processing, and extracting data from extremely large and complex data sets that traditional data management tools cannot handle.
Applying MapReduce in Java on a Twitter dataset using Apache Hadoop
Implement a Hive data warehouse to store meaningful data and apply machine learning techniques such as clustering or regression to business problems
Hadoop, HBase, Phoenix, and ZooKeeper Integration
This repository contains all the material related to this big data certification.
🌟Spark Ceph Connector: Implementation of Hadoop Filesystem API for Ceph
Learning Apache Hadoop for big data, and exploring MapReduce, Apache Spark RDDs, distributed processing, and stream processing
Implementation of statistical methods using the Hadoop MapReduce library.
Big Data pipeline for real-time sensor fusion and predictive analysis.
A Bash script to set up Apache Hadoop and Apache Hive with a Derby database on Debian GNU/Linux
A Hadoop-based Java project that counts the maximum number of word occurrences for each letter in a text file from a folder.
Data science project for the 'Advanced Topics in Database Systems' M.Sc. course, ECE @ntua
Apache Hadoop. Apache Hadoop is a collection of open-source software utilities that facilitate using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. Originally designed for co…
Repository for the master's course Cloud Computing of the TU Berlin in the winter term 2020/21.