Wrong thread count detection in NUMA configurations on Windows #5524

GermanAizek · 2024-02-16T11:04:01Z

@ggerganov,
I have fixed a similar issue in other projects on GitHub. The solution is simple: add a wrapper function for getting the thread count. On Windows, this wrapper needs to sum the logical processors across all NUMA nodes.

Problem lines in llama.cpp (common, tests, and examples):

llama.cpp/common/common.cpp

Lines 85 to 86 in 5f5808c

    unsigned int n_threads = std::thread::hardware_concurrency();
    return n_threads > 0 ? (n_threads <= 4 ? n_threads : n_threads / 2) : 4;

llama.cpp/common/common.cpp

Lines 160 to 187 in 5f5808c

    } else if (arg == "-t" || arg == "--threads") {
        if (++i >= argc) {
            invalid_param = true;
            break;
        }
        params.n_threads = std::stoi(argv[i]);
        if (params.n_threads <= 0) {
            params.n_threads = std::thread::hardware_concurrency();
        }
    } else if (arg == "-tb" || arg == "--threads-batch") {
        if (++i >= argc) {
            invalid_param = true;
            break;
        }
        params.n_threads_batch = std::stoi(argv[i]);
        if (params.n_threads_batch <= 0) {
            params.n_threads_batch = std::thread::hardware_concurrency();
        }
    } else if (arg == "-td" || arg == "--threads-draft") {
        if (++i >= argc) {
            invalid_param = true;
            break;
        }
        params.n_threads_draft = std::stoi(argv[i]);
        if (params.n_threads_draft <= 0) {
            params.n_threads_draft = std::thread::hardware_concurrency();
        }
    } else if (arg == "-tbd" || arg == "--threads-batch-draft") {

llama.cpp/common/common.cpp

Lines 187 to 196 in 5f5808c

    } else if (arg == "-tbd" || arg == "--threads-batch-draft") {
        if (++i >= argc) {
            invalid_param = true;
            break;
        }
        params.n_threads_batch_draft = std::stoi(argv[i]);
        if (params.n_threads_batch_draft <= 0) {
            params.n_threads_batch_draft = std::thread::hardware_concurrency();
        }
    } else if (arg == "-p" || arg == "--prompt") {

llama.cpp/common/common.cpp

Lines 1089 to 1099 in 5f5808c

    std::string get_system_info(const gpt_params & params) {
        std::ostringstream os;
        os << "system_info: n_threads = " << params.n_threads;
        if (params.n_threads_batch != -1) {
            os << " (n_threads_batch = " << params.n_threads_batch << ")";
        }
        os << " / " << std::thread::hardware_concurrency() << " | " << llama_print_system_info();
        return os.str();
    }
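The excerpts above all consume the detected count directly, so an undercount propagates into both the default and the `<= 0` fallbacks. Here is a minimal sketch of how the detected value could be passed through two shared helpers instead. The names `default_thread_count` and `resolve_thread_count` are hypothetical and not part of llama.cpp, and the fallback to 1 when detection returns 0 is my assumption rather than the current behavior.

```cpp
// Hypothetical helper: the default from lines 85 to 86, with the detected
// logical processor count passed in instead of read inside the function.
unsigned int default_thread_count(unsigned int detected) {
    return detected > 0 ? (detected <= 4 ? detected : detected / 2) : 4;
}

// Hypothetical helper: one shared fallback for -t, -tb, -td and -tbd.
// A positive request is kept; otherwise the detected count is used.
// Assumption: fall back to 1 if detection itself reports 0.
int resolve_thread_count(int requested, unsigned int detected) {
    if (requested > 0) {
        return requested;
    }
    return detected > 0 ? static_cast<int>(detected) : 1;
}
```

For example, if a host has 128 logical processors but only 64 are detected, `default_thread_count` yields 32 instead of 64, and `-t 0` resolves to 64 instead of 128. With helpers like these, only the detection call would need to change.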

Solutions:

[❌] C variant detection is not done here: git-for-windows/git#4766

[✔️] C++11, Windows XP minimum (rewritten modern variant by @mrexodia): x64dbg/x64dbg@d2f6ba7

[✔️] Modern C++17 (variant by @GermanAizek and #llvm-project maintainers): GermanAizek/llvm-project@d1fa25f

If you need a NUMA detection function in addition to my variant, here is a more optimized version that detects NUMA and returns the thread count of the host. The check must be added before the call to GetLogicalProcessorInformationEx:

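    // Single NUMA node: use the regular count.
    if (!IsNUMA())
        return single_cpu_concurrency();

    // The size query is expected to fail and fill in `length`;
    // if it unexpectedly succeeds, fall back to the regular count.
    if (GetLogicalProcessorInformationEx(RelationAll, nullptr, &length) != FALSE)
    {
        return single_cpu_concurrency();
    }

    // True when the system reports more than one NUMA node.
    bool IsNUMA() noexcept
    {
        ULONG HighestNodeNumber;
        return GetNumaHighestNodeNumber(&HighestNodeNumber) && HighestNodeNumber != 0;
    }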
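To show how these pieces could fit together, here is a rough sketch of a complete wrapper. It is not the code from the linked commits. `get_logical_processor_count` is a hypothetical name, `single_cpu_concurrency` is given an assumed definition, and the sketch queries `RelationGroup` rather than `RelationAll` because group records already carry an active processor count. On non-Windows platforms it falls back to `std::thread::hardware_concurrency()`.

```cpp
#include <thread>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Assumed definition: the standard library count, never less than 1.
static unsigned int single_cpu_concurrency() {
    const unsigned int n = std::thread::hardware_concurrency();
    return n > 0 ? n : 1;
}

// Hypothetical wrapper name; not part of llama.cpp.
unsigned int get_logical_processor_count() {
#ifdef _WIN32
    // Same check as IsNUMA(): only one node means the regular count is used.
    ULONG highest_node = 0;
    if (!GetNumaHighestNodeNumber(&highest_node) || highest_node == 0) {
        return single_cpu_concurrency();
    }

    // Size query: expected to fail with ERROR_INSUFFICIENT_BUFFER.
    DWORD length = 0;
    if (GetLogicalProcessorInformationEx(RelationGroup, nullptr, &length) != FALSE ||
        GetLastError() != ERROR_INSUFFICIENT_BUFFER) {
        return single_cpu_concurrency();
    }

    std::vector<unsigned char> buffer(length);
    if (!GetLogicalProcessorInformationEx(
            RelationGroup,
            reinterpret_cast<PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX>(buffer.data()),
            &length)) {
        return single_cpu_concurrency();
    }

    // Walk the variable-sized records and sum active processors per group.
    unsigned int total = 0;
    for (DWORD offset = 0; offset < length;) {
        const auto * info = reinterpret_cast<const SYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX *>(
            buffer.data() + offset);
        if (info->Relationship == RelationGroup) {
            for (WORD g = 0; g < info->Group.ActiveGroupCount; ++g) {
                total += info->Group.GroupInfo[g].ActiveProcessorCount;
            }
        }
        offset += info->Size;
    }
    return total > 0 ? total : single_cpu_concurrency();
#else
    return single_cpu_concurrency();
#endif
}
```

Every failure path returns the regular count, so the wrapper should never report fewer threads than the current code does.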


GermanAizek · 2024-03-01T16:03:12Z

@ggerganov, can anyone confirm this bug?

Here is another good example of a fix for the same bug:
rizinorg/rizin#4167

github-actions · 2024-04-15T02:46:56Z

This issue was closed because it has been inactive for 14 days since being marked as stale.

GermanAizek added the bug-unconfirmed label Feb 16, 2024

GermanAizek mentioned this issue Mar 5, 2024

Does not using all threads on NUMA configuration (server motherboards 2, 4, 6 multisocket CPU) ollama/ollama#2936

Open

github-actions bot added the stale label Apr 1, 2024

github-actions bot closed this as completed Apr 15, 2024
