Streaming E-Commerce Analytics with Flink, Elasticsearch, Kibana and MySQL

This repository contains an Apache Flink application for real-time sales analytics built using Docker Compose to orchestrate the necessary infrastructure components, including Apache Flink, Elasticsearch, Kibana, and MySQL. The application processes financial transaction data from Kafka, performs aggregations, and stores the results in both MySQL and Elasticsearch for further analysis.

Requirements

  • Docker
  • Docker Compose
  • Python (3.9.18)

Architecture

System architecture diagram: SystemArchitecture.png

Installation and Setup

  1. Clone this repository.
  2. Navigate to the repository directory.
  3. Run docker-compose up -d to start the required services (Apache Flink, Elasticsearch, MySQL, Kafka).
  4. Run python src to start the project, which generates the data and processes it.

Usage

  1. Ensure all Docker containers are up and running.
  2. The sales transaction generator (generate_data.py) produces sales transactions and publishes them to Kafka (a minimal sketch follows this list).
  3. stream_process.py consumes those transactions from Kafka, transforms them, and loads the results into the destinations (MySQL and Elasticsearch).
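
For orientation, a minimal sketch of such a generator is shown below. It assumes a local Kafka broker at localhost:9092, a topic named financial_transactions, and a hypothetical transaction schema; none of these names are taken from the repository, so adjust them to match generate_data.py and the docker-compose setup.

    # Illustrative generator sketch -- not the repository's generate_data.py.
    import json
    import random
    import time
    import uuid
    from datetime import datetime, timezone

    from kafka import KafkaProducer  # pip install kafka-python

    # Assumed broker address; adjust to the docker-compose setup.
    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    CATEGORIES = ["electronics", "clothing", "groceries", "books"]

    while True:
        # Hypothetical transaction schema, for illustration only.
        price = round(random.uniform(5, 500), 2)
        quantity = random.randint(1, 5)
        transaction = {
            "transactionId": str(uuid.uuid4()),
            "productCategory": random.choice(CATEGORIES),
            "productPrice": price,
            "productQuantity": quantity,
            "totalAmount": round(price * quantity, 2),
            "transactionDate": datetime.now(timezone.utc).isoformat(),
        }
        producer.send("financial_transactions", value=transaction)  # assumed topic name
        time.sleep(1)  # roughly one transaction per second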

Application Details

The application consumes financial transaction data from Kafka, performs various transformations, and stores aggregated results in both MySQL and Elasticsearch.

Components

Apache Flink

  • Sets up the Flink execution environment.
  • Connects to Kafka as a source for financial transaction data.
  • Processes, transforms, and performs aggregations on transaction data streams (a rough PyFlink sketch follows below).
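
The actual job lives in stream_process.py; as a rough sketch, the source setup and a per-category aggregation in PyFlink could look like the following. The topic name, broker address, field names, and running-total aggregation are assumptions for illustration, and the Kafka connector import path differs slightly between Flink versions.

    # Illustrative PyFlink sketch -- not the repository's stream_process.py.
    import json

    from pyflink.common.serialization import SimpleStringSchema
    from pyflink.common.typeinfo import Types
    from pyflink.datastream import StreamExecutionEnvironment
    # On some Flink versions the import is: from pyflink.datastream.connectors import FlinkKafkaConsumer
    from pyflink.datastream.connectors.kafka import FlinkKafkaConsumer

    env = StreamExecutionEnvironment.get_execution_environment()
    env.set_parallelism(1)
    # The Kafka connector JAR must be available to the job, e.g.:
    # env.add_jars("file:///path/to/flink-sql-connector-kafka.jar")

    # Assumed topic and broker; adjust to the docker-compose setup.
    consumer = FlinkKafkaConsumer(
        topics="financial_transactions",
        deserialization_schema=SimpleStringSchema(),
        properties={"bootstrap.servers": "localhost:9092", "group.id": "sales-analytics"},
    )

    # Parse each JSON record, keep (category, amount), and maintain a running sum per category.
    sales_per_category = (
        env.add_source(consumer)
        .map(lambda raw: json.loads(raw))
        .map(
            lambda t: (t["productCategory"], t["totalAmount"]),
            output_type=Types.TUPLE([Types.STRING(), Types.DOUBLE()]),
        )
        .key_by(lambda pair: pair[0])
        .reduce(lambda a, b: (a[0], a[1] + b[1]))
    )

    sales_per_category.print()
    env.execute("sales-analytics-sketch")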

MySQL

  • Stores transaction data and aggregated results in tables (Transactions, sales_per_category, sales_per_day, sales_per_month); a table-creation sketch follows below.
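
The exact schemas are defined by the job itself; a hedged sketch of what the table creation could look like from Python is shown below. The column names and credentials are assumptions, not taken from the repository.

    # Illustrative table creation -- column names and credentials are assumptions.
    import mysql.connector  # pip install mysql-connector-python

    conn = mysql.connector.connect(
        host="localhost", user="root", password="root", database="sales"
    )
    cursor = conn.cursor()

    cursor.execute("""
        CREATE TABLE IF NOT EXISTS transactions (
            transaction_id   VARCHAR(64) PRIMARY KEY,
            product_category VARCHAR(64),
            total_amount     DOUBLE,
            transaction_date DATETIME
        )
    """)
    # sales_per_day and sales_per_month follow the same pattern with a different key.
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS sales_per_category (
            transaction_date DATE,
            category         VARCHAR(64),
            total_sales      DOUBLE,
            PRIMARY KEY (transaction_date, category)
        )
    """)
    conn.commit()
    cursor.close()
    conn.close()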

Elasticsearch

  • Stores transaction data for further analysis (an indexing sketch follows below).
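
A minimal sketch of indexing a single transaction with the official Python client might look like this; the endpoint, index name, and document fields are assumptions.

    # Illustrative indexing sketch -- endpoint, index name, and fields are assumptions.
    from elasticsearch import Elasticsearch  # pip install elasticsearch

    es = Elasticsearch("http://localhost:9200")

    doc = {
        "transactionId": "3f2b6c2e-0000-0000-0000-000000000000",
        "productCategory": "electronics",
        "totalAmount": 199.99,
        "transactionDate": "2024-01-01T12:00:00Z",
    }

    # elasticsearch-py 8.x passes the payload as `document=`; older 7.x clients use `body=`.
    es.index(index="transactions", id=doc["transactionId"], document=doc)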

Kibana

  • Visualizes the data through dashboards.

Code Structure

  • stream_process.py: Contains the Flink application logic, including Kafka source setup, stream processing, transformations, and sinks for MySQL and Elasticsearch.

Configuration

  • Kafka settings (bootstrap servers, topic) are configured within the Kafka source setup.
  • MySQL connection details (URL, username, password) are defined in the jdbcUrl, username, and password variables (illustrative values follow below).
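
Illustrative values for these settings might look like the following; the broker address, database name, and credentials are placeholders to be adjusted to the docker-compose setup.

    # Illustrative configuration values -- placeholders, not the repository's actual settings.
    KAFKA_PROPERTIES = {
        "bootstrap.servers": "localhost:9092",
        "group.id": "sales-analytics",
    }
    KAFKA_TOPIC = "financial_transactions"

    jdbcUrl = "jdbc:mysql://localhost:3306/sales"
    username = "root"
    password = "root"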

Sink Operations

  • The application uses a Python MySQL client to create the tables (Transactions, sales_per_category, sales_per_day, sales_per_month) and perform insert/update operations; an upsert sketch follows below.
  • Additionally, it uses the Elasticsearch Python client to index transaction data for further analysis.
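
The insert/update behaviour for the aggregate tables can be expressed as a MySQL upsert; a sketch of the statement pattern is shown below, with column names and credentials as assumptions.

    # Illustrative upsert -- column names and credentials are assumptions.
    import mysql.connector  # pip install mysql-connector-python

    conn = mysql.connector.connect(
        host="localhost", user="root", password="root", database="sales"
    )
    cursor = conn.cursor()

    cursor.execute(
        """
        INSERT INTO sales_per_category (transaction_date, category, total_sales)
        VALUES (%s, %s, %s)
        ON DUPLICATE KEY UPDATE total_sales = VALUES(total_sales)
        """,
        ("2024-01-01", "electronics", 1234.56),
    )
    conn.commit()
    cursor.close()
    conn.close()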
