In the ORC chunked reader, the decoded stripes are materialized into a cudf table before that table is split into multiple output chunks through slicing. Materializing such a table is memory-intensive. We can reduce memory usage in the chunked reader by avoiding that step altogether, so that only the part of the decoded stripes needed for one output chunk is materialized at a time.
This requires some amount of work to rewrite cudf::io::detail::column_buffer.
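
To illustrate the difference in peak memory, here is a minimal, self-contained sketch. It does not use any cudf API: `DecodedStripes`, `Result`, `materialize_then_slice`, and `materialize_per_chunk` are hypothetical names invented for this example, and a `std::vector<int>` stands in for a materialized table. It only models how many rows are held in materialized form at once under each strategy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for decoded stripe data (not a cudf type).
// Rows are only "materialized" when copied into an output vector.
struct DecodedStripes {
  std::size_t num_rows;
  int row(std::size_t i) const { return static_cast<int>(i); }
};

struct Result {
  std::size_t peak_rows;   // most rows held in materialized form at once
  long long checksum;      // sum of emitted values, to compare both paths
  std::size_t num_chunks;  // number of output chunks produced
};

// Approach described in the issue: materialize everything, then slice.
Result materialize_then_slice(const DecodedStripes& s, std::size_t chunk_rows) {
  assert(chunk_rows > 0);
  std::vector<int> table(s.num_rows);  // the full temporary table
  for (std::size_t i = 0; i < s.num_rows; ++i) table[i] = s.row(i);

  Result r{table.size(), 0, 0};
  for (std::size_t begin = 0; begin < table.size(); begin += chunk_rows) {
    std::size_t end = std::min(begin + chunk_rows, table.size());
    for (std::size_t i = begin; i < end; ++i) r.checksum += table[i];
    ++r.num_chunks;
  }
  return r;
}

// Proposed direction: materialize only the rows of the current output chunk.
Result materialize_per_chunk(const DecodedStripes& s, std::size_t chunk_rows) {
  assert(chunk_rows > 0);
  Result r{0, 0, 0};
  for (std::size_t begin = 0; begin < s.num_rows; begin += chunk_rows) {
    std::size_t end = std::min(begin + chunk_rows, s.num_rows);
    std::vector<int> chunk;  // lives only for this iteration
    chunk.reserve(end - begin);
    for (std::size_t i = begin; i < end; ++i) chunk.push_back(s.row(i));

    r.peak_rows = std::max(r.peak_rows, chunk.size());
    for (int v : chunk) r.checksum += v;
    ++r.num_chunks;
  }
  return r;
}
```

Both functions emit the same chunks, but the second never holds more than `chunk_rows` rows in materialized form. In the real reader the equivalent change would have to happen where output columns are built from the decoded data, which is why the rewrite mentioned above is needed.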
ttnghia changed the title from "[FEA] Avoid materialize temporary table in ORC chunked reader" to "[FEA] Avoid materializing temporary table in ORC chunked reader" on May 16, 2024