
Support multi-part uploads in AWS S3 sink connector #1053

Open
brandon-powers opened this issue Mar 18, 2024 · 1 comment

Comments

@brandon-powers
Contributor

brandon-powers commented Mar 18, 2024

Issue Guidelines

Please review these questions before submitting an issue.

What version of the Stream Reactor are you reporting this issue for?

6.1.0, latest stable release.

Are you running the correct version of Kafka/Confluent for the Stream reactor release?

Yes.

Do you have a supported version of the data source/sink, i.e. Cassandra 3.0.9?

Yes.

Have you read the docs?

Yes.

What is the expected behaviour?

The AWS S3 sink connector should add configuration for the S3 part size used in multi-part uploads and implement it in the storage interface. The underlying S3 client already supports multi-part uploads; the current implementation simply never invokes that flow.
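A hypothetical connector configuration for this could look like the following. The property names below (e.g. `connect.s3.multipart.part.size`) are illustrative assumptions, not existing Stream Reactor settings; only the 5 MiB minimum part size is an actual S3 constraint:

```properties
# Hypothetical properties -- names are assumptions, not shipped configuration
connect.s3.multipart.enabled=true
# Part size in bytes; S3 requires a minimum of 5 MiB per part (except the last)
connect.s3.multipart.part.size=8388608
```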

What was observed?

Files produced in S3 must be buffered in full in the tmpfs /tmp mount before upload. For example, with a flush size of 100 MB and 100 topic-partitions producing files on the worker, the connector(s) would require 10 GB of RAM per flush (excluding the Kafka consumer fetch-request storage).
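The memory math above can be sketched as a quick calculation; the 8 MB part size in the second half is an assumed value for comparison, not a proposed default:

```python
# Worst case today: each file is buffered whole in tmpfs before upload.
flush_size_mb = 100      # flush size per file (MB)
topic_partitions = 100   # topic-partitions writing files on one worker

buffered_mb = flush_size_mb * topic_partitions
print(f"{buffered_mb / 1000:.0f} GB buffered per flush")  # 10 GB

# With multi-part uploads, memory scales with the part size, not the file size.
part_size_mb = 8         # assumed part size for illustration
multipart_mb = part_size_mb * topic_partitions
print(f"{multipart_mb / 1000:.1f} GB buffered with {part_size_mb} MB parts")
```

This is why multi-part uploads matter here: the per-worker memory footprint drops from flush-size × partitions to part-size × partitions.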

What is your Connect cluster configuration (connect-avro-distributed.properties)?

N/A

What is your connector properties configuration (my-connector.properties)?

N/A

Please provide full log files (redact any sensitive information)

N/A

References

@brandon-powers
Contributor Author

Verified in Slack that this is not supported. Created this issue to track it; I may take a look at the implementation when I have some time.
