This project offers a business-focused solution for analyzing SQL query logs and predicting memory usage, primarily for AWS Athena.
Updated May 30, 2024 (Python)
Handwritten notes on Coursera's Practical Data Science Specialization.
Codebase for CCAO data infrastructure construction and management
This AWS-based data pipeline manages data from storage in S3 data lakes, through transformation with AWS Glue and Lambda, to refined storage in separate S3 repositories. Using Athena for SQL querying and QuickSight for interactive dashboards, this solution optimizes data processing and visualization, facilitating informed decision-making and insigh
🌳 A sustainable Terraform Package which creates resources for Data Services on AWS
An end-to-end data engineering project on real-time stock market data using Kafka. It uses Python, Amazon Web Services (AWS), Apache Kafka, Glue, Athena, and SQL.
Stream CDC into an Amazon S3 data lake in Apache Iceberg format with AWS Glue Streaming and DMS
Describes the concepts of lambda architecture and the actual deployment process, with an example of building a serverless business intelligence system using Amazon Kinesis, S3, Athena, OpenSearch Service, and QuickSight.
Analyzing and detecting anomalies in S3 Data using Athena JDBC Driver
This project aims to securely manage, streamline, and analyze structured and semi-structured YouTube video data based on video categories and trending metrics.
ETL data pipeline using AWS services
This project demonstrates a technology shift at an automobile firm to resolve the data engineering challenge of manual data ops. The AWS services used are an S3 bucket for data lake storage of incoming batches, a Lambda Python script that automates the validation function call, and a Glue crawler that generates a relational table, with successful testing.
☁️ Data analysis of the AWS COVID-19 data lake
Streamlit EDA Dashboard Powered by AWS Cloud
This project provides a comprehensive data pipeline solution to extract, transform, and load (ETL) Reddit data into a Redshift data warehouse. The pipeline leverages a combination of tools and services including Apache Airflow, Celery, PostgreSQL, Amazon S3, AWS Glue, Amazon Athena, and Amazon Redshift.
Unveiling job market trends with Scrapy and AWS
Athena-Query provides a simple interface for getting Athena query results.
A set of commands that can help when working with AWS
An end-to-end data pipeline built with AWS S3, Glue, Crawler, Athena, and Tableau visualization