
# AWS Redshift

Sample of CSV data ingestion from S3 to Redshift.

## Redshift

Create the infrastructure:

```sh
terraform init
terraform apply -auto-approve
```
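
The Terraform definitions live in this repository. As a rough sketch only (resource names and values below are illustrative, not the repository's actual configuration), the stack needs an S3 bucket for the CSV, an IAM role that Redshift can assume to read it, and the cluster itself:

```hcl
# Illustrative sketch -- the real definitions are in the repository's .tf files.

variable "master_password" {
  type      = string
  sensitive = true
}

# Bucket holding the CSV file to be ingested
resource "aws_s3_bucket" "data" {
  bucket = "my-redshift-ingest-bucket"
}

# Role that the cluster assumes to read the CSV from S3
resource "aws_iam_role" "redshift" {
  name = "redshift-s3-read"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "redshift.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "s3_read" {
  role       = aws_iam_role.redshift.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}

# Minimal single-node cluster with the role attached
resource "aws_redshift_cluster" "main" {
  cluster_identifier  = "redshift-sample"
  database_name       = "dev"
  master_username     = "awsuser"
  master_password     = var.master_password
  node_type           = "dc2.large"
  cluster_type        = "single-node"
  iam_roles           = [aws_iam_role.redshift.arn]
  skip_final_snapshot = true
}
```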

Once done, connect to the Redshift Query Editor v2, and copy the CSV to the table:

```sql
copy products from 's3://<your-bucket-name>/data/products.csv'
credentials 'aws_iam_role=arn:aws:iam::<aws-account-id>:role/<role-name>'
csv;
```
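
The COPY assumes the target table already exists. The actual schema is defined in the repository; as a placeholder, a `products` table could look something like this (column names and types are illustrative only and must match the CSV):

```sql
-- Illustrative schema only; the real column list must match the CSV in the repo.
create table public.products (
  id    integer,
  name  varchar(100),
  price decimal(10, 2)
);
```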

A ready-to-use COPY command is printed in the Terraform outputs.
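
For reference, such an output could be produced along these lines (a sketch only, reusing the hypothetical resource names from the sketch above; the repository's actual output may differ):

```hcl
# Hypothetical output that renders the COPY statement with real values filled in.
output "copy_command" {
  value = <<-EOT
    copy products from 's3://${aws_s3_bucket.data.bucket}/data/products.csv'
    credentials 'aws_iam_role=${aws_iam_role.redshift.arn}'
    csv;
  EOT
}
```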

After loading the data, you should vacuum and analyze the database:

> Whenever you add, delete, or modify a significant number of rows, you should run a VACUUM command and then an ANALYZE command. A vacuum recovers the space from deleted rows and restores the sort order. The ANALYZE command updates the statistics metadata, which enables the query optimizer to generate more accurate query plans.

```sql
vacuum;
analyze;
```

You should now be able to query the data with good performance:

```sql
select * from public.products;
```

## Spectrum

You may also use Redshift Spectrum to query files directly from S3, without loading them into the cluster first.
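
As a reference, the usual pattern is to register an external schema backed by the Glue Data Catalog and define an external table over the S3 prefix. The schema, table, and column names below are illustrative, not taken from this repository:

```sql
-- Illustrative only: external schema backed by the Glue Data Catalog.
create external schema spectrum
from data catalog
database 'spectrum_db'
iam_role 'arn:aws:iam::<aws-account-id>:role/<role-name>'
create external database if not exists;

-- External table pointing at the CSV prefix in S3 (columns are hypothetical).
create external table spectrum.products_ext (
  id    integer,
  name  varchar(100),
  price decimal(10, 2)
)
row format delimited
fields terminated by ','
stored as textfile
location 's3://<your-bucket-name>/data/';

-- Query the files in place.
select * from spectrum.products_ext;
```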

## Glue

For an example of using Glue, check my other repository aws-glue-athena.

## Data Pipelines

You can also create a Data Pipeline to ingest data periodically; a rough sketch is shown below.
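
A minimal Terraform sketch of the pipeline resource might look like this. The name and description are placeholders, and the actual activities (for example, a copy from S3 into Redshift) and the schedule are not defined here; they would still need to be attached separately, e.g. via a pipeline definition or in the console:

```hcl
# Hypothetical sketch: creates an (empty) AWS Data Pipeline.
# The ingestion activity and its schedule are attached separately.
resource "aws_datapipeline_pipeline" "ingest" {
  name        = "redshift-csv-ingest"
  description = "Periodic CSV ingestion from S3 into Redshift"
}
```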


## Clean-up

```sh
terraform destroy -auto-approve
```