Lambda-Streaming-HousePricePredict

Architecture Overview

Architecture Decision

Spark Streaming and Flink both require a cluster to submit jobs, so no attempt is made with them for now; Kafka Streams, which runs as a library inside the application, is used instead.

Components

Four main sections, all four implemented (4/4); a topology sketch and an interactive-query sketch follow the list.

  • 1. Kafka Producer
  • 2. Kafka Consumer
  • 3. Kafka Streaming
    • Materialize the KTable.
    • Sink to BigQuery using the Streaming API provided by GCP.
    • Hand over topology management to Spring, so the connection info only needs to be configured in one place.
  • 4. Interactive Query on the KTable
    • Get the time-accumulated result.
    • Get the KafkaStreams instance by bean injection in Spring; refer to here.
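A minimal sketch of what section 3 could look like, assuming Spring manages the topology via @EnableKafkaStreams and the sink uses the BigQuery streaming insert API (tabledata.insertAll); the topic, store, dataset, and table names are invented for illustration and are not the repository's actual values:

```java
import java.util.HashMap;
import java.util.Map;

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.InsertAllRequest;
import com.google.cloud.bigquery.TableId;

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.Materialized;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafkaStreams;

// Sketch only: Spring builds and starts the topology, so the connection
// info lives in one place (the yml). All names below are assumptions.
@Configuration
@EnableKafkaStreams
public class StreamingTopology {

    private final BigQuery bigQuery = BigQueryOptions.getDefaultInstance().getService();

    @Bean
    public KTable<String, Long> housePriceCounts(StreamsBuilder builder) {
        KStream<String, String> source = builder.stream("house-price-in");

        // Materialize the KTable under a queryable store name.
        KTable<String, Long> counts = source
                .groupByKey()
                .count(Materialized.as("house-price-counts"));

        // Append each update to BigQuery via the streaming insert API.
        counts.toStream().foreach((district, count) -> {
            Map<String, Object> row = new HashMap<>();
            row.put("district", district);
            row.put("count", count);
            bigQuery.insertAll(InsertAllRequest
                    .newBuilder(TableId.of("my_dataset", "house_price"))
                    .addRow(row)
                    .build());
        });

        return counts;
    }
}
```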
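And a minimal sketch of section 4, assuming the KafkaStreams instance is obtained from spring-kafka's StreamsBuilderFactoryBean and the store name matches the one materialized above:

```java
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;
import org.springframework.kafka.config.StreamsBuilderFactoryBean;
import org.springframework.stereotype.Service;

// Sketch only: interactive query against the materialized store.
@Service
public class InteractiveQueryService {

    private final StreamsBuilderFactoryBean factoryBean;

    public InteractiveQueryService(StreamsBuilderFactoryBean factoryBean) {
        this.factoryBean = factoryBean;
    }

    // Returns the time-accumulated count for a key, or null if absent.
    public Long countFor(String key) {
        KafkaStreams streams = factoryBean.getKafkaStreams();
        ReadOnlyKeyValueStore<String, Long> store = streams.store(
                StoreQueryParameters.fromNameAndType(
                        "house-price-counts", QueryableStoreTypes.keyValueStore()));
        return store.get(key);
    }
}
```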

Before you deploy

  • Prepare the Kafka cluster; you can follow the instructions here and run it on top of Docker.
  • Create the topics declared in the yml in advance.
  • Check that the BigQuery connection info is defined correctly in the yml; an illustrative yml shape is sketched after this list.
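As a rough illustration of the yml shape these steps refer to (all keys and values below are assumptions, not the repository's actual configuration):

```yaml
# Illustrative shape only; the real keys in this repository may differ.
spring:
  kafka:
    bootstrap-servers: localhost:9092   # the Kafka cluster prepared above
    streams:
      application-id: house-price-streaming
topics:
  input: house-price-in                 # create this topic in advance
bigquery:
  project-id: my-gcp-project
  dataset: my_dataset
  table: house_price
```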

After you deploy

Once you start this project, the producer endpoint (to send messages) and the streaming interactive-query endpoint are exposed on the Swagger page; a hypothetical controller shape is sketched below.

The incoming topics can be viewed in Kafdrop or in the logs.
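As a rough idea of what those two Swagger-exposed endpoints might look like (paths, topic name, and payload shape are assumptions; the query side reuses the InteractiveQueryService sketched earlier):

```java
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.web.bind.annotation.*;

// Hypothetical controller shape; check the Swagger UI for the real routes.
@RestController
public class ApiController {

    private final KafkaTemplate<String, String> kafkaTemplate;
    private final InteractiveQueryService queryService; // sketched earlier

    public ApiController(KafkaTemplate<String, String> kafkaTemplate,
                         InteractiveQueryService queryService) {
        this.kafkaTemplate = kafkaTemplate;
        this.queryService = queryService;
    }

    // Producer: publish a message onto the input topic (assumed name).
    @PostMapping("/producer/send")
    public void send(@RequestParam String key, @RequestBody String message) {
        kafkaTemplate.send("house-price-in", key, message);
    }

    // Interactive query: read the time-accumulated count for a key.
    @GetMapping("/streaming/query")
    public Long query(@RequestParam String key) {
        return queryService.countFor(key);
    }
}
```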

About

Uses Kafka Streams to implement a streaming ETL job, appending data to GCP BigQuery after formatting.
