
How to read huge Excel file without load entire file in memory and append new rows? #2707

Open
aislanmaia opened this issue Jun 9, 2022 · 6 comments

Comments

@aislanmaia

I need to append new rows to a huge Excel file. How can I do it without reading the entire file into memory, which could potentially slow down the process or cause a heap out-of-memory error?

@farideliyev

I am also struggling with the same issue, but in my case it is a CSV file.

@HappyFerry

Stream the Excel file to CSV, then handle the CSV file with a pipeline. That is my way; it can append data items anywhere.

@farideliyev

@HappyFerry could you please share your implementation?

@flaushi

flaushi commented Sep 3, 2022

Yeah, @HappyFerry, I'd be highly interested in how you stream Excel to CSV, especially with large Excel files. I would be very grateful if you shared some details.

@SheetJSDev
Contributor

This library, like every other JS library, is limited to the browser APIs on the frontend and to platform APIs on the backend.

XLSX is a ZIP-based file format and, for structural reasons, cannot be incrementally processed from a streaming data source, as described in https://docs.sheetjs.com/docs/solutions/input#example-readable-streams . ZIP was designed for streaming *writes* and assumes readers have random access.

That means, to first order, the browser must retain a seekable copy of the original file. (In NodeJS there is a way to avoid this using child_process to extract the ZIP file to the filesystem using the unzip command-line tool.)

Assuming retaining the original file in memory is not an issue, that specific workflow (reading a file and appending rows) can be optimized at the expense of being able to extract data in the process. Our Pro Edit build can skip the worksheet generation and surgically edit the raw XML.

If there is interest in a NodeJS-specific large file processor, please let us know.

@RameshDhanapal2022

Save the file in XLSX or CSV format, or convert it to a binary sheet (XLSB), so the huge data file can be accessed more easily.

6 participants