The open source high performance ELT framework powered by Apache Arrow
Updated May 17, 2024 (Go)
CloudQuery Go SDK for source and destination plugins
Powerful RDF Knowledge Graph Generation with RML Mappings
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
Conduit streams data between data stores. Kafka Connect replacement. No JVM required.
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
Business Automations is a collection of automations built to enhance productivity, increase revenue, and reduce manual data manipulation at a retail store location that integrates an NCR Counterpoint SQL database with the BigCommerce e-commerce platform.
Infinitely scalable, event-driven, language-agnostic orchestration and scheduling platform to manage millions of workflows declaratively in code.
Translator of spreadsheet mappings into R2RML, RML or YARRRML
An intuitive and flexible RDF pipeline solution designed to simplify and automate ETL processes for efficient data management.
Hop Orchestration Platform
Upserts, Deletes And Incremental Processing on Big Data.
🧙 Build, run, and manage data pipelines for integrating and transforming data.
This repository hosts the code for a hybrid ML model integrating technical, fundamental, and sentiment data for stock prediction. It features advanced techniques like Random Forest, XGBoost, and LSTM networks, alongside an automated trading bot. A comprehensive solution for enhancing stock market prediction accuracy.
Turns Data and AI algorithms into production-ready web applications in no time.
ingestr is a CLI tool to copy data between any databases with a single command seamlessly.
An orchestration platform for the development, production, and observation of data assets.
Jitsu is an open-source Segment alternative. Fully-scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.