javascript client takes too much memory upon call to putObject #1045

Open
jpchev opened this issue Jun 27, 2022 · 7 comments

@jpchev

jpchev commented Jun 27, 2022

Please try the code in #1043 (comment) to reproduce the issue described in #1043.

@jpchev
Author

jpchev commented Aug 23, 2022

hello, any update on this?

@prakashsvmx
Member

I could not replicate this locally. I will try again when I get some time.
In the tests I ran, I was able to upload GiBs of data without any memory issues. I tried files of up to 16 GiB.

@Villain88

Villain88 commented Aug 25, 2022

I have a similar problem.
I noticed that if you pass a size (any number) to the putObject method, memory consumption drops from 1.5 GB to 300 MB.
At the same time, 64 MB parts are used to write the object; if you do not pass the size, the chunks are about 5 MB.

size (number): Size of the object (optional).

@jpchev
Author

jpchev commented Aug 25, 2022

> I have a similar problem. I noticed that if you pass a size (any number) to the PutObject method then the memory consumption drops from 1.5GB to 300MB At the same time, parts of 64 megabytes are used to write the object, if you do not pass the size, then the chunks have a size of about 5 MB
>
> size number Size of the object (optional).

that's interesting, I think that's probably due to this line

size = this.maxObjectSize

which defaults to the maxObjectSize when no size is passed

@trim21
Contributor

trim21 commented Apr 16, 2023

> > I have a similar problem. I noticed that if you pass a size (any number) to the PutObject method then the memory consumption drops from 1.5GB to 300MB At the same time, parts of 64 megabytes are used to write the object, if you do not pass the size, then the chunks have a size of about 5 MB
> > size number Size of the object (optional).
>
> that's interesting, I think that's probably due to this line
>
> size = this.maxObjectSize
>
> which defaults to the maxObjectSize when no size is passed

Yes, the solution is to always pass a size argument, or use a Buffer/string as input, or use the fPutObject API, which will stat the file to get its size.

Otherwise minio-js doesn't know the size of the object and will just assume you are trying to upload a 5 TiB file, using about 500 MB as the partSize to stay within the S3 multipart upload limits. That means if you want to upload a file of several GBs and you call putObject(...,..., fs.createReadStream(...)), it will load many 500 MB chunks into memory and then try to do a multipart upload.

In the example prakashsvmx used in the previous issue, he always passes a size argument, so minio-js doesn't consume a huge amount of memory.

https://docs.aws.amazon.com/AmazonS3/latest/userguide/qfacts.html
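The roughly 500 MB figure follows from the S3 limits on that page: an object can be at most 5 TiB and a multipart upload can have at most 10,000 parts, with a 5 MiB minimum part size. The sketch below only reproduces that arithmetic; it is not minio-js's actual part-size code, which may round differently, and the function name `minimumPartSize` is made up for illustration.

```javascript
// S3 multipart limits (see the AWS page linked above).
const MAX_PARTS = 10000;
const MIN_PART_SIZE = 5 * 1024 * 1024; // 5 MiB
const MAX_OBJECT_SIZE = 5 * 1024 ** 4; // 5 TiB

// Smallest part size that fits `objectSize` bytes into at most MAX_PARTS parts.
// With no size given, assume the worst case (5 TiB), as described in the thread.
function minimumPartSize(objectSize = MAX_OBJECT_SIZE) {
  if (objectSize > MAX_OBJECT_SIZE) {
    throw new RangeError('object exceeds the 5 TiB S3 limit');
  }
  return Math.max(MIN_PART_SIZE, Math.ceil(objectSize / MAX_PARTS));
}

const MiB = 1024 * 1024;
console.log((minimumPartSize() / MiB).toFixed(1)); // 524.3 (unknown size)
console.log(minimumPartSize(1024 ** 3) / MiB);     // 5 (a 1 GiB object)
```

So a stream of unknown size has to be cut into parts of about half a gigabyte, while a 1 GiB object of known size could get by with the 5 MiB minimum.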

Personally, I think putObject and fPutObject should not use multipart upload implicitly.

@harshavardhana
Member

The implicit use is normal; this is true of all SDKs, such as the AWS SDK. The AWS SDK forces you to provide a size > 0 so that it's easier to calculate. Parallelism is necessary by default.

However, what we can do on our end is simply never upload in parallel for a stream that doesn't provide the correct length.

We simply provide added functionality by accepting a size of -1: we assume that you know what you are doing and that you have sufficient memory to facilitate the upload.

All of this is nice to have; ideally you should provide the right size of the object. The memory usage is needed simply because that's what the AWS S3 API mandates:

  • It mandates a checksum, via either sha256 or md5sum. For both you need to buffer, as there is no way to send these checksums as trailer headers in AWS S3.
  • Newer checksums are now available that use a trailer-based mechanism, which can be supported in minio-js, but it is a sizable effort that would take time. So if any of you folks have time, feel free to send a fix, or let us work on it when we find time.
  • We can make a tiny improvement: use the same buffer and make the transfer serial when size is provided as -1, turning off all concurrency. This avoids memory build-up by re-using the same 500MiB buffer for the entire transfer.
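The last point could look roughly like the sketch below: one pre-allocated buffer is filled from the stream and each part is uploaded before the next one is read. This is only an illustration of the idea, not minio-js code. `uploadSerially` and the `uploadPart(partNumber, data)` callback are hypothetical names, and the callback is assumed to finish with `data` before its promise resolves, since the buffer is overwritten afterwards.

```javascript
const { Readable } = require('node:stream');

// Hypothetical sketch: upload a stream of unknown length one part at a time,
// re-using a single buffer so memory stays at about one partSize.
async function uploadSerially(stream, partSize, uploadPart) {
  const buffer = Buffer.allocUnsafe(partSize); // the only large allocation
  let filled = 0;      // bytes currently held in `buffer`
  let partNumber = 1;  // S3 part numbers start at 1

  for await (const chunk of stream) {
    let offset = 0;
    while (offset < chunk.length) {
      // Copy as much of this chunk as still fits into the buffer.
      const n = Math.min(chunk.length - offset, partSize - filled);
      chunk.copy(buffer, filled, offset, offset + n);
      filled += n;
      offset += n;
      if (filled === partSize) {
        // Awaiting here keeps the transfer serial: no second part is in flight.
        await uploadPart(partNumber++, buffer.subarray(0, filled));
        filled = 0;
      }
    }
  }
  if (filled > 0) {
    // Final, possibly shorter, part.
    await uploadPart(partNumber++, buffer.subarray(0, filled));
  }
  return partNumber - 1; // number of parts uploaded
}
```

For example, feeding it `Readable.from([Buffer.alloc(3), Buffer.alloc(7), Buffer.alloc(2)])` with a `partSize` of 5 calls `uploadPart` three times, with 5, 5 and 2 bytes. The trade-off is throughput: with concurrency turned off, each part has to finish before the next one is read.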

// cc @prakashsvmx

@prakashsvmx
Member

Thank you @harshavardhana, I will look into the -1 size further and implement it.
