kudeh/udacity-dsnd-projects

Data Streaming Nanodegree Projects

My project solutions from the Udacity Data Streaming Nanodegree program.

Worked on ingesting data into Kafka and processing streams on Kafka topics.

  • Tasks Completed:

    • Set up Kafka topics and schemas
    • Set up the PostgreSQL Kafka connector
    • Implement Kafka producers using kafka-client and REST Proxy
    • Implement Kafka consumers/pipelines with Faust and KSQL
  • Concepts Learned:

    • Apache Kafka Infrastructure
    • Data Schemas and Apache Avro
    • Kafka Connect and REST Proxy
    • Stream Processing Fundamentals
    • Stream Processing with Faust
    • KSQL
  • Core Technologies Used:

    • Python (confluent-kafka, faust)
    • Apache Kafka
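
As a rough illustration of the REST Proxy producer task listed above, the sketch below builds (but does not send) a produce request for Avro-encoded values, using the Confluent REST Proxy v2 request format. The schema, topic name, proxy URL, and the `build_produce_request` helper are hypothetical and are not taken from the project code.

```python
import json
from urllib.request import Request

# Hypothetical Avro schema, for illustration only.
ARRIVAL_SCHEMA = {
    "type": "record",
    "name": "Arrival",
    "fields": [
        {"name": "station_id", "type": "int"},
        {"name": "line", "type": "string"},
    ],
}


def build_produce_request(proxy_url, topic, schema, records):
    """Build a REST Proxy v2 produce request carrying Avro values.

    The proxy expects the Avro schema as a JSON *string* under
    "value_schema" and each record wrapped as {"value": ...}.
    """
    body = {
        "value_schema": json.dumps(schema),
        "records": [{"value": record} for record in records],
    }
    return Request(
        url=f"{proxy_url}/topics/{topic}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.avro.v2+json"},
        method="POST",
    )


# Assumed local proxy address and topic name.
request = build_produce_request(
    "http://localhost:8082",
    "arrivals",
    ARRIVAL_SCHEMA,
    [{"station_id": 1, "line": "blue"}],
)
```

Passing `request` to `urllib.request.urlopen` would send it, which requires a running REST Proxy and Schema Registry; building the request separately keeps the payload easy to inspect.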

Worked on processing Apache Kafka data streams using the PySpark streaming APIs.

  • Tasks Completed:

    • Implement Spark streaming pipelines
    • Consume from an Apache Kafka topic
    • Decode base64-encoded JSON data from the stream
    • Perform transformations and aggregations on the data
    • Join streams and output to a new Kafka topic or the console
  • Concepts Learned:

    • Spark DataFrames, views and Spark SQL
    • Reading binary, JSON and base64-encoded data from Kafka streams
    • Joining streams
  • Core Technologies Used:

    • Python (pyspark)
    • Apache Spark
    • Apache Kafka
    • Redis
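
The base64 decoding step listed above can be sketched in plain Python, assuming each message value is a base64-encoded UTF-8 JSON document. The `decode_event` helper and the sample fields are hypothetical. In a PySpark pipeline the same idea would typically be expressed with the `unbase64` and `from_json` column functions from `pyspark.sql.functions`.

```python
import base64
import json


def decode_event(encoded: str) -> dict:
    """Decode a base64 string into the JSON object it wraps."""
    raw_bytes = base64.b64decode(encoded)
    return json.loads(raw_bytes.decode("utf-8"))


# Build a sample payload the way an upstream producer might (assumed shape).
original = {"customer": "a@example.com", "score": 3.5}
sample = base64.b64encode(json.dumps(original).encode("utf-8")).decode("ascii")

event = decode_event(sample)
```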
