Hello, I plan to deploy a model using ggml on a Qualcomm chip. I'm curious how running inference with ggml on an SoC (e.g. a Qualcomm SoC with CPU, GPU, NPU, etc.) compares with using the inference engine provided by the chip vendor (such as Qualcomm SNPE). Since ggml inference runs primarily on the CPU, whereas the vendor's engine can offload computation to the GPU or NPU, does using ggml lead to a significant increase in CPU memory usage and %CPU, potentially impacting other tasks? Has anyone run a similar comparative test?
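For anyone wanting to run such a comparison themselves, below is a minimal measurement sketch (not part of the ggml API) that records wall time, CPU time, and peak resident memory around a workload on a Linux/Android device. The `run_inference_placeholder` function is a stand-in; you would replace it with the actual ggml/SNPE inference call being compared.

```cpp
// Minimal sketch: measure CPU time and peak resident memory around a workload.
// The inference call here is a placeholder, not a real ggml or SNPE API.
#include <cstdio>
#include <sys/resource.h>
#include <sys/time.h>

// Placeholder workload. Replace with the real inference call
// (e.g. a ggml graph evaluation or an SNPE execute() loop).
static void run_inference_placeholder() {
    volatile double x = 0.0;
    for (long i = 0; i < 100000000L; ++i) x += 1.0 / (double)(i + 1);
}

int main() {
    struct timeval wall_start, wall_end;
    struct rusage ru_start, ru_end;

    gettimeofday(&wall_start, nullptr);
    getrusage(RUSAGE_SELF, &ru_start);

    run_inference_placeholder();

    getrusage(RUSAGE_SELF, &ru_end);
    gettimeofday(&wall_end, nullptr);

    double wall_s = (wall_end.tv_sec - wall_start.tv_sec) +
                    (wall_end.tv_usec - wall_start.tv_usec) / 1e6;
    double cpu_s  = (ru_end.ru_utime.tv_sec - ru_start.ru_utime.tv_sec) +
                    (ru_end.ru_utime.tv_usec - ru_start.ru_utime.tv_usec) / 1e6 +
                    (ru_end.ru_stime.tv_sec - ru_start.ru_stime.tv_sec) +
                    (ru_end.ru_stime.tv_usec - ru_start.ru_stime.tv_usec) / 1e6;

    // %CPU averaged over the run: values above 100% mean more than one core was busy.
    printf("wall time : %.2f s\n", wall_s);
    printf("cpu time  : %.2f s (~%.0f%% CPU)\n", cpu_s, 100.0 * cpu_s / wall_s);
    // ru_maxrss is the peak resident set size in KiB on Linux/Android.
    printf("peak RSS  : %ld KiB\n", ru_end.ru_maxrss);
    return 0;
}
```

Running the same harness once with the CPU-only ggml path and once with the vendor engine offloading to GPU/NPU would give a rough side-by-side view of the CPU load and memory footprint each approach leaves for other tasks.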