Shows how to set up a robust data engineering environment by using Docker to containerize components of the Hadoop ecosystem and other essential services. The setup includes Hadoop (HDFS, YARN), Apache Hive, PostgreSQL, and Apache Airflow, all configured to work together seamlessly.
Updated May 29, 2024 - Shell
Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
Complete solution dedicated to performing ETL on currency exchange rate data using Python. Data source: https://docs.awesomeapi.com.br/api-de-moedas
A simple ETL for temperature data from the OpenWeatherMap API that stores it in an Azure SQL Database
Airflow DAGs for the Stellar ETL project
ETL pipeline 🪈 for scraping, transforming, and loading YTS movie data 🎞️ into a PostgreSQL 🛢️ container using Docker 🐳
Jayvee is a domain-specific language and runtime for automated processing of data pipelines
TED Semantic Web Services
A CLI tool for transforming large RDF datasets using pure SPARQL.
No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents
This project creates an end-to-end data platform covering data ingestion, data transformation, data loading, and reporting.
ETL (Extract, Transform, Load) pipeline that modifies the data from the report generated by Drag'n Survey so that it can be used in the Power BI visualization tool.
One ETL tool to rule them all
Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
Open-source data platform to build event-driven systems. It's like Debezium for Golang :)
Tutorial for Pydantic and SQLAlchemy
Ayush @ Data Engineering Portfolio