SBGrid/SGE: How to exclude nodes by hostname? #107

Open
HenrikBengtsson opened this issue Feb 10, 2023 · 5 comments

@HenrikBengtsson
Contributor

On https://wynton.ucsf.edu/hpc/software/sbgrid.html#sbgrid-programs-with-gpu-support we suggest:

"You may need to specify a beta version of the SBGrid programs, or avoid the qb3-atgpu* nodes."

but we don't give instructions anywhere on how to avoid those nodes. Is that done with:

-l h="!qb3-atgpu*"

?

@ellestad
Contributor

ellestad commented Feb 10, 2023

Actually, most GPU-enabled SBGrid software is now compiled for versions of CUDA new enough to run on the AMD/Nvidia A40 nodes. At least GROMACS and RELION are. I'm not sure what other software people use; those are the ones we get the most comments about.

@ellestad
Contributor

But, the above limit would avoid the atgpu nodes.
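
As a minimal sketch of the request confirmed above, the snippet below assembles a submit command that carries the `-l h=...` host exclusion. The script name `job.sh` is hypothetical, and the use of single quotes is an assumption added here: in an interactive bash session, `!` inside double quotes can trigger history expansion, so single quotes are the safer way to keep `!` and `*` literal. The command is printed rather than submitted.

```bash
#!/usr/bin/env bash
# Sketch only: job.sh is a hypothetical job script name.
# Single quotes keep "!" and "*" literal, even in an interactive bash
# session where "!" inside double quotes triggers history expansion.
host_request='h=!qb3-atgpu*'

# Assemble the submit command as an array so each argument stays one word.
submit_cmd=(qsub -l "$host_request" job.sh)

# Print the command for inspection instead of submitting it.
printf '%s\n' "${submit_cmd[*]}"
```

To actually submit, run `"${submit_cmd[@]}"` on a host where `qsub` is available.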

@ellestad
Contributor

Also, this comment can be removed: "Because of this, you have to make sure you load a corresponding CUDA environment module, e.g. module load cuda/10.1." SBGrid includes NVIDIA libraries where necessary; it doesn't depend on the system CUDA.
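
For anyone who wants to check whether a system CUDA module happens to be loaded in a session or job, a rough sketch follows. It assumes the module system exports `LOADEDMODULES` as a colon-separated list, as Environment Modules and Lmod do; the function name `cuda_module_loaded` is hypothetical. It only reports the state and says nothing about whether a loaded module would conflict with SBGrid.

```bash
#!/usr/bin/env bash
# Sketch only: assumes the module system exports LOADEDMODULES as a
# colon-separated list of loaded module names.
cuda_module_loaded() {
  case ":${LOADEDMODULES:-}" in
    *:cuda*) return 0 ;;   # some loaded module name starts with "cuda"
    *)       return 1 ;;   # nothing CUDA-related is loaded
  esac
}

if cuda_module_loaded; then
  echo "A CUDA module is loaded; SBGrid does not need it."
else
  echo "No CUDA module is loaded."
fi
```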

@HenrikBengtsson
Contributor Author

I see. To be honest, I had to read that whole paragraph so many times to understand it. I blame lack of experience with GPU/CUDA.

> Also, this "Because of this, you have to make sure you load a corresponding CUDA environment module, e.g. module load cuda/10.1." comment can be removed. SBGrid includes NVIDIA libraries where necessary, it doesn't depend on the system cuda.

Oh, I added that yesterday because I thought it had been forgotten. Should it be rephrased to: "WARNING: There is no need to load cuda modules when using SBGrid software, because they are included."? Also, if one loads a cuda module, is there a risk it will conflict with SBGrid? That is, do we need to warn against loading them?

Since you're much more experienced with this, would you mind updating that section? I'm mostly guessing and winging it here.

@HenrikBengtsson
Contributor Author

Also, when using SBGrid, does the user have to declare -l compute_cap=<version> as mentioned on https://wynton.ucsf.edu/hpc/scheduler/gpu.html#gpu-relevant-resource-requests?
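
Whether SBGrid jobs need this request is the open question here. As a sketch that works either way, the hypothetical helper `compute_cap_request` below emits the `-l compute_cap=...` request only when a value is supplied, so one submit line covers both cases. The variable `CAP` and the script name `job.sh` are assumptions for illustration, and no particular version value is implied.

```bash
#!/usr/bin/env bash
# Sketch only: compute_cap_request is a hypothetical helper, not part of SGE.
# It prints a "-l compute_cap=..." request when given a value and prints
# nothing otherwise.
compute_cap_request() {
  local cap="${1:-}"
  if [ -n "$cap" ]; then
    # "--" stops printf from parsing the leading "-l" as an option.
    printf -- '-l compute_cap=%s\n' "$cap"
  fi
}

# CAP is left empty by default; set it only if the request turns out
# to be required. job.sh is a hypothetical job script name.
CAP="${CAP:-}"
echo "qsub $(compute_cap_request "$CAP") job.sh"
```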
