Update memory info to v2 #57

Open
marcelroed opened this issue Mar 25, 2024 · 0 comments · May be fixed by #58

nvmlDeviceGetMemoryInfo_v2 has been available for a few years now. I propose updating the implementation to v2, so that the results match what nvidia-smi and gpustat report.

Currently, device.memory_info() returns an nvmlMemory_t struct with total, free and used. nvmlMemory_v2_t is defined in the existing code, but never used. Calling the NVML function with a struct set up for v2 gives a slightly different result: the used value no longer includes cache memory and other non-allocated memory. Additionally, the v2 struct has fields for version and reserved memory.
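
For reference, here is a minimal sketch of how a struct "set up for v2" could look on the Rust side. The field layout and the version formula follow my reading of nvml.h (the NVML_STRUCT_VERSION macro, which combines the struct size with the version number shifted left by 24 bits) and should be checked against the header. The names MemoryV2, memory_v2_version and new_memory_v2 are hypothetical and not part of the existing code.

```rust
use std::mem::size_of;

/// Mirror of NVML's nvmlMemory_v2_t.
/// Assumption: field order and types are taken from nvml.h and
/// should be verified against the header shipped with the driver.
#[repr(C)]
#[derive(Debug, Default, Clone, Copy)]
pub struct MemoryV2 {
    /// Struct version tag; must be filled in before the call.
    pub version: u32,
    /// Total physical device memory, in bytes.
    pub total: u64,
    /// Memory reserved for system use (driver or firmware), in bytes.
    pub reserved: u64,
    /// Unallocated device memory, in bytes.
    pub free: u64,
    /// Allocated device memory, in bytes.
    pub used: u64,
}

/// Version tag in the style of NVML_STRUCT_VERSION(Memory, 2):
/// the struct size in the low bits, the version number in the top byte.
pub const fn memory_v2_version() -> u32 {
    (size_of::<MemoryV2>() as u32) | (2 << 24)
}

/// Build a zeroed struct with the version field set, ready to be
/// passed by pointer to nvmlDeviceGetMemoryInfo_v2.
pub fn new_memory_v2() -> MemoryV2 {
    MemoryV2 {
        version: memory_v2_version(),
        ..Default::default()
    }
}
```

The actual FFI call is left out on purpose, since it depends on how the bindings expose nvmlDeviceGetMemoryInfo_v2. The point is only that the version field has to be initialized before the struct is handed to NVML.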

I implemented this change and did some testing, and it works! Importantly, device.memory_info()?.used reported 912MB on our A100s before the change, and now reports 7.8MB with the v2 version, which matches nvidia-smi and gpustat.
