PERF: read list of parquet files at once #5723

Open
mvashishtha opened this issue Mar 1, 2023 · 0 comments · May be fixed by #5724
Labels
new feature/request 💬 Requests and pull requests for new features P2 Minor bugs or low-priority feature requests Performance 🚀 Performance related issues and pull requests.

Comments

@mvashishtha
Collaborator

You can pass a list of local files (but not a list of directories or S3 files) to read_parquet, as in #5698. For #5698, I will make a fix that reads the files separately and concatenates the results. I attempted a more general solution, but it required too much surgery on Modin's read_parquet code, which assumes in many places that there is just one file or directory to read. One difficulty is that ParquetDirectory can take a list of files but not a list of directories, so we can't convert the input to a list of directories at the beginning.
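
For illustration, the "read the files separately and concatenate the results" approach could look roughly like the sketch below. This is not the change in the linked pull request: the helper name `read_parquet_list` and its `read_one` / `concat` parameters are hypothetical, and the defaults simply assume that `pandas.read_parquet` and `pandas.concat` (or their `modin.pandas` counterparts) are available.

```python
from typing import Any, Callable, Iterable, Optional, Union


def read_parquet_list(
    paths: Union[str, Iterable[str]],
    read_one: Optional[Callable[..., Any]] = None,
    concat: Optional[Callable[[list], Any]] = None,
    **kwargs: Any,
) -> Any:
    """Read each path on its own, then concatenate the frames in input order.

    Hypothetical helper, not part of Modin's API. `read_one` and `concat`
    are injectable so the sketch can be exercised without parquet files.
    """
    # Accept a single path as well as a list of paths.
    if isinstance(paths, str):
        paths = [paths]
    paths = list(paths)
    if not paths:
        raise ValueError("expected at least one parquet path")

    if read_one is None or concat is None:
        # Lazy import: only needed when the defaults are used.
        # Assumption: swap in `import modin.pandas as pandas` for Modin.
        import pandas

        read_one = read_one or pandas.read_parquet
        concat = concat or pandas.concat

    # One read per file; extra keyword arguments go to every read.
    frames = [read_one(path, **kwargs) for path in paths]
    return concat(frames)
```

Reading file by file issues one read per path and concatenates afterwards, which is the overhead this issue proposes to remove by reading the whole list at once.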

@mvashishtha mvashishtha added new feature/request 💬 Requests and pull requests for new features Triage 🩹 Issues that need triage P2 Minor bugs or low-priority feature requests labels Mar 1, 2023
mvashishtha pushed a commit to mvashishtha/modin that referenced this issue Mar 1, 2023
@mvashishtha mvashishtha linked a pull request Mar 1, 2023 that will close this issue
@mvashishtha mvashishtha added Performance 🚀 Performance related issues and pull requests. and removed Triage 🩹 Issues that need triage labels Mar 1, 2023