Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support to Windows NUMA core count #4167

Draft
wants to merge 1 commit into
base: dev
Choose a base branch
from
Draft

Conversation

wargio
Copy link
Member

@wargio wargio commented Jan 29, 2024

Your checklist for this pull request

  • I've read the guidelines for contributing to this repository
  • I made sure to follow the project's coding style
  • I've documented or updated the documentation of every function and struct this PR changes. If not so I've explained why.
  • I've added tests that prove my fix is effective or that my feature works (if possible)
  • I've updated the rizin book with the relevant information (if needed)

Detailed description

This is not testable normally but it "should work".

@karliss
Copy link
Member

karliss commented Jan 29, 2024

As was already pointed out to GermanAizek multiple times in other issues/PR that he spammed . The NUMA behavior on windows is a bit more complicated than simply bigger number goes faster. Even if you ignore the implications of memory access from multiple sockets, there is windows scheduler which isn't necesarily going to spread the threads across all CPUs (with behavior differing depending on Windows version). Actually utilizing multiple CPUs requires more work than returning bigger number in get_thread_count().

The smaller number that you get by default from windows is intentional.

I don't have the required hardware to confirm how exactly each windows version behaves.

See
llvm/llvm-project#72270 (review)
https://developercommunity.visualstudio.com/t/hardware-concurrency-returns-an-incorrect-result/350854#T-N354351
Xaymar/obs-StreamFX#1093

@wargio
Copy link
Member Author

wargio commented Jan 30, 2024

that is a good point, now the issue is that even if we wanted to implement that, we would need to have a NUMA system (which we don't).

@wargio wargio marked this pull request as draft January 30, 2024 02:22
@GermanAizek
Copy link

that is a good point, now the issue is that even if we wanted to implement that, we would need to have a NUMA system (which we don't).

I can test branches, but I need time to build as in CI under Windows, it is easy to building on Linux. Why is there no cmake build, because it is so convenient to change C CXX compiler for building via -DCXX_COMPILER, but how do I change MinGW to MSVC using meson build?

@wargio
Copy link
Member Author

wargio commented Jan 31, 2024

just run the build from the cmd prompt or powershell.

@wargio
Copy link
Member Author

wargio commented Mar 1, 2024

rizinorg/cutter#3286 (comment)

As per cutter thread, this works but numa support requires some better implementation of the multhreading code, which we cannot right now implement.

@GermanAizek
Copy link

rizinorg/cutter#3286 (comment)

As per cutter thread, this works but numa support requires some better implementation of the multhreading code, which we cannot right now implement.

@wargio, you can take this, it really quickly check NUMA support. I can helped with NUMA portability.

bool IsNUMA() noexcept
{
    ULONG HighestNodeNumber;
    return !(!GetNumaHighestNodeNumber(&HighestNodeNumber) || HighestNodeNumber == 0);
}

@wargio
Copy link
Member Author

wargio commented Mar 2, 2024

the issue mentioned above is that in order to use all the numa capabilities, we need a more numa-friendly code, which requires specific malloc etc.. on windows. we don't have that yet. potentially we could implement a piece of code that detects if numa and allows to run with more cores, but i'm not sure.
@XVilka what do you think?

@XVilka
Copy link
Member

XVilka commented Mar 2, 2024

I don't think it worth the trouble at this point, especially since currently file parsing and analysis aren't multithreaded, thus benefits of this will be quite limited.

@GermanAizek
Copy link

we need a more numa-friendly code, which requires specific malloc etc.. on windows.

@wargio,
Are you talking about WinAPI allocation?
Their implementation has malloc functions for working with NUMA, previously I tried to send PR to x64dbg
x64dbg/x64dbg#3307

@wargio
Copy link
Member Author

wargio commented Mar 21, 2024

yes, but the issue is related on how rizin works. we do not have a centralized malloc function.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants