Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

radix_sort_128x() produces unstable sorting output #1191

Open
gsc74 opened this issue Apr 12, 2024 · 0 comments
Open

radix_sort_128x() produces unstable sorting output #1191

gsc74 opened this issue Apr 12, 2024 · 0 comments

Comments

@gsc74
Copy link

gsc74 commented Apr 12, 2024

@lh3, I was benchmarking the radix_sort_128x() function from minimap2 to perform anchor sorting (map.c [Line 202, Line 289]). For small input sizes of anchors i.e. $\approx 10^5$, radix_sort_128x() produces stable sorting output. In contrast, the outputs are not stable for large inputs (checked for input anchors of size $10^6$). I have used std::stable_sort() to verify the stability of the implementation.

If I want to use std::stable_sort() to sort the anchors, then the PAF output is different for whole-genome alignment (i.e. the alignment of T2T assembly of HG002 against CHM13 v2.0).

Question: Let us say minimap2 produces three different PAF files, X, Y, and Z, by using std::sort(), radix_sort_128x() and std::stable_sort() for anchors sorting at [Line 202, Line 289] of map.c. Which output is preferable to use and why?

The key used for sorting the anchors in std::stable_sort() is specified here, and this repository (https://github.com/gsc74/test_ksort.git) contains the code for verification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant