lynnlangit/learning-hadoop-and-spark

Learning Hadoop and Spark

Contents

This is the companion repo to my LinkedIn Learning courses on Apache Hadoop and Apache Spark.

🐘 1. Learning Hadoop - link
- in this course's demos I mostly use GCP Dataproc
- for running Hadoop and associated library (e.g. Hive, Pig, Spark...) workloads

🌩️ 2. Cloud Hadoop: Scaling Apache Spark - link
- in this course's demos I use GCP Dataproc or AWS EMR, or
- I use Databricks on AWS or on GCP

⛈️ 3. Azure Databricks Spark Essential Training - link
- in this course's demos I use Azure with Databricks
- for scaling Apache Spark workloads


Other LinkedIn Learning Courses on Hadoop or Spark

There are about 10 courses on Hadoop and Spark topics on LinkedIn Learning. See the graphic below.
Learning Paths

  • Hadoop for Data Science Tips and Tricks - link
    • Set up Cloudera Environment
    • Working with Files in HDFS
    • Connecting to Hadoop Hive
    • Complex Data Structures in Hive
  • Spark courses - link
    • Various Topics - see screenshot below

LinkedInLearningSpark