Skip to content

CQRS implementation using Debezium + Spring Cloud Stream + Kafka Connect + Kafka + Druid

Notifications You must be signed in to change notification settings

mgorav/CqrsWithCDC

Repository files navigation

CQRS + Event Sourcing Using CDC + OLAP Using Druid

CQRS & Event Sourcing are "THE BUZZ" words these days and how it fits in real time analytics. Microservices architecture causes segregation of applications (also called mushrooming of application landscape). This posses challenge for the Data Engineering teams. The proposed architecture (in the GitHub blog) demystifies CQRS + Event Sourcing in Real time analytics and hence bridges the gap between Full Stack Developers & Data Engineer.

alt text

NOTE: CQRS + Event Sourcing = Elegant DDD

Event Souring - “All changes to an application state are stored as a sequence of events.” Martin Fowler

  • Change made to state are tracked as events
  • Event are stored in event store (any database)
  • Use stored events and summation of all these events (always arrive in current state)

NOTE  Event Sourcing is not part of CQRS

alt text

The Change Data Capture (CDC) provides an easy mechanism to implement these. Further CDC, in the world of micro-services, data services & real time analytics is a central part of a modern architecture these days.Check out my GitHub project which demonstrates:

  1. CQRS
  2. Event Sourcing
  3. CDC using Debezium (good bye to expensive CDC products & complicated integration)
  4. Kafka Connect
  5. Kafka
  6. Real time streaming
  7. Spring Cloud Stream

CQRS Bank Application - Using CQRS + Event Sourcing with Events relay using CDC

A bank application which demonstrates CQRS design pattern. This application performs following operations:

  1. Money withdrawal using debit card

  2. List all the money withdrawal (mini bank statement)

This application uses following two tables for above operations:

  1. debit_card

  2. money_withdrawal

A debit card withdrawl operation is stored in debit_card table. Once the transaciton is successfully committed in the debit_card table, using Debezium's & Kafka connect the CDC is moved to Kafka. Once the message arrive in Kafka topic, using Spring Cloud Stream Stream Listener, an entry in made to money_withdrawl table. This table is used to create mini statement (query)

NOTE For sake of simplicity same DB is used but as can be seen - "a command to perform debit operation" is separated from mini statement.

Following picture shows architecture of this application:

alt text

To demonstrate OLAP capabilities, this application write mini statement to Kafka topic - "ministatement". From this topic, Druid picks it up and provide fast querying abilities

Pre-requisite

  1. MySQL

  2. Apache Kafka

  3. Kafka Connect

  4. Debezium

  5. Spring Cloud Stream

  6. Zookeeper

  7. Druid

  8. Docker

NOTE: This application is completely dockerized.

Run Application

The complete reference architecture implementation is dockerized. Hence it takes just few minutes to run this application locally or any cloud provider of your choice.

Execute following steps to run the application:

  • Build bank app
    mvn clean install -DskipTests
  • Run bank application complete infrastructure:
    docker-compose up
  • Instruct Kafka Connect to tail transaction log of MySQL DB and start sending messages as CDC to Kafka:
    curl -i -X POST -H "Accept:application/json" -H  "Content-Type:application/json" http://localhost:8083/connectors/ -d @mysqlsource.json --verbose
  • Instruct Druid connect with Kafka
curl -XPOST -H'Content-Type: application/json' -d @bank-app-mini-statement-supervisor.json http://localhost:8081/druid/indexer/v1/supervisor
  • Money withdrawal operation:
    curl http://localhost:8080/moneywithdrawals -X POST --header 'Content-Type: application/json' -d '{"debitCard":"123456789", "amount": 10.00}' --verbose
  • Mini statement fetching operation (query/read model)
    curl http://localhost:8080/moneywithdrawals?debitCardId=123456789 --verbose

CQRS,Event Sourcing & Its Usage In Analytics Demystified

In section some of key concepts of CQRS & Event Sourcing are explained along with it's usage in analytics

Command Vs Event

Command – “submit order”

  • A request (imperative statement)
  • May fail
  • May affect multiple aggregates

NOTE: Rebuild Aggregate State from Commands

Event – “order submitted”

  • Statement of fact (past tense)
  • Never fails
  • May affect a single aggregate

NOTE: Rebuild Aggregate State from Events

Command to Event

Following digram explain creation of Event from a Command

alt text

Once a Command is converted to an Event, the state can be reterived as shown below:

alt text

Classical BI

A classical BI is shown below: NOT A NEW IDEA

alt text

Tying knots together - CQRS & Event Sourcing

Below shows CQRS & Event Sourcing:

alt text

But "?" is still missing block. This missing "?" is called "State" or "Materialized View"

alt text

  • Query a RDMS?  Old style
    • RDMS is option (will become bottleneck as volume of data increases)
  • Views are optimized for specific query use cases
    • multiple view of same events
  • Update is asynchronously  delayed  eventual consistency  build to scale
  • Easy to evolve or fix
    • Change or fix logic; rebuild view from events  event log is the source of truth not the view
  • Views can be rebuilt from the events

Indexing is key concerns when "Command" is separated from "Query". Following diagram shows architecture to achieve this:

alt text

Snapshotting Snapshotting is one of the key concepts in "Event Sourcing". Following diagram shows this:

alt text

Event sourcing/cqrs drawback

  • No “One-Size-Fits-All”
    • Multiple “topic’ implementation
  • Delayed reads
  • No ACID transactions
  • Additional complexity

Event sourcing/cqrs benefits

  • No “One-Size-Fits-All”
    • “topic’ are optimized for usecases
  • Eventual consistency
  • History, temporal queries
  • Robust for data corruption

The Complete Picture

The below picture shows the end to end architecture using CQRS & Event Sourcing

alt text

About

CQRS implementation using Debezium + Spring Cloud Stream + Kafka Connect + Kafka + Druid

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published