Speed-up reading of large header files #204

astaric · 2022-06-09T22:18:59Z

When a large file was passed to the -H parameter, bwa-mem2 spent more than 10 minutes just reading the lines from the file (I experienced this on a 70MB file with ~1.5M lines).

This was caused by computing the length of the already parsed header and reallocating the string for every single line in the input file, which results in quadratic complexity. The new implementation allocates a buffer large enough to hold the whole file up front, then reads all "valid" lines into that buffer before calling the original bwa_insert_header implementation. With this change, reading the header file takes <1s.
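
For illustration, the single-allocation idea can be sketched roughly as below. This is not the code from the commit: `slurp_header_lines` is a hypothetical helper name, and treating a line as "valid" when it starts with `@` is an assumption based on the SAM header convention. The buffer is sized from the file length once, and each line is read directly at the current write position, so no per-line `strlen` over the accumulated header and no per-line `realloc` are needed.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not the PR's code): collect every line that starts
 * with '@' from fp into one buffer sized from the file length.
 * Returns a NUL-terminated string the caller must free, or NULL on error. */
char *slurp_header_lines(FILE *fp)
{
    long size;
    char *buf, *end;

    if (fseek(fp, 0, SEEK_END) != 0) return NULL;
    size = ftell(fp);
    /* Sketch limitation: fgets takes an int, so reject very large files. */
    if (size < 0 || size >= INT_MAX - 2) return NULL;
    rewind(fp);

    /* One allocation up front; the two spare bytes keep the fgets size
     * argument at 2 or more even when the buffer is completely full. */
    buf = malloc((size_t)size + 2);
    if (buf == NULL) return NULL;
    end = buf;
    *end = '\0';

    /* Each line is read straight to the current end of the kept data. */
    while (fgets(end, (int)(size - (end - buf)) + 2, fp) != NULL) {
        if (end[0] == '@')
            end += strlen(end);  /* keep the line: advance the write position */
        else
            *end = '\0';         /* skip the line: the next read overwrites it */
    }
    *end = '\0';
    return buf;
}
```

Total work is proportional to the file size, because every byte is read once and measured at most once. The resulting string could then be handed to the existing header-insertion code in a single call.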

Speed-up reading of large header files

99f794c
