Quickstart instructions are outdated: they ask to install unneeded stuff (many GB) #106

Open
set-soft opened this issue May 12, 2023 · 0 comments


The quickstart instructions, last modified by @sunway513 6 months ago, say: Step 1: Install amdgpu.

Which kernel needs it? I'm running Debian 11.7 (current stable, soon to be replaced by 12), which is quite old and conservative, and this step isn't needed. The kernel is 5.10.178 and it has the amdgpu module.
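
For anyone who wants to check their own system before following Step 1, here is a minimal sketch. The helper name `has_module` is my own, and it only looks for a loadable module file under `/lib/modules`, so it assumes the driver is not built directly into the kernel image.

```shell
# has_module DIR NAME: succeeds if NAME.ko (plain or compressed) exists under DIR.
# Hypothetical helper, not part of any AMD or Debian tooling.
has_module() {
    dir=$1
    name=$2
    [ -n "$(find "$dir" -name "$name.ko*" 2>/dev/null | head -n 1)" ]
}

# Look in the module tree of the kernel that is currently running.
if has_module "/lib/modules/$(uname -r)" amdgpu; then
    echo "amdgpu is shipped with kernel $(uname -r)"
else
    echo "amdgpu module file not found for kernel $(uname -r)"
fi
```

If the first message is printed, the stock kernel already provides the module and the DKMS package is probably not required just to get the driver.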

I installed amdgpu 5.18.2.22.40.50303 and couldn't find any difference. It was also really hard to install, because the DKMS build links too many object files in the last stage (instead of using intermediate libs, like the competition does). The extra-long package name makes this worse (come on, AMD should use a release name like 5.18.2; the "22.40.50303" part should be something internal). The result is a command line with one single argument exceeding 128 kB (a ridiculous limitation in the Linux kernel), so I had to do some tricks. All to get nothing interesting.
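
To put a number on that limit: the kernel caps a single argument string at `MAX_ARG_STRLEN`, which is defined as 32 pages, so 131072 bytes with 4 KiB pages. A small sketch to print it next to the total `ARG_MAX` budget (the function name `max_arg_strlen` is just for illustration):

```shell
# Longest single argv/envp string the kernel accepts: 32 pages (MAX_ARG_STRLEN).
max_arg_strlen() {
    echo $(( 32 * $(getconf PAGESIZE) ))
}

echo "Longest single argument: $(max_arg_strlen) bytes"
echo "Total for all arguments plus environment (ARG_MAX): $(getconf ARG_MAX) bytes"
```

Note that the per-argument limit is separate from `ARG_MAX`, so a link step that passes the whole object list as one string can fail even when the total command line is well within `ARG_MAX`.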

And things can go wrong here: when I tried to install 6.0.5.50500-1581431.20.04, I found that AMD had changed the page flip API, making it incompatible. The Debian video driver worked, but my log was flooded with page flip error messages. I can understand that a major release change (5 to 6) implies some sort of incompatibility (IMHO unneeded: how much code is needed to keep it compatible? It's a kernel module of several MB!).

Conclusion: asking people to install amdgpu is:

  1. Not needed on modern Linux systems (at the very least, name which kernels need it)
  2. Potentially a source of problems if you end up installing a module that isn't compatible

But the instructions are even worse, because they say you need to:

# Install the ROCm rock-dkms kernel modules, reboot required
sudo apt-get update
wget https://repo.radeon.com/amdgpu-install/5.3/ubuntu/focal/amdgpu-install_5.3.50300-1_all.deb 
sudo apt-get install ./amdgpu-install_5.3.50300-1_all.deb
sudo amdgpu-install --usecase=rocm

This will install the whole ROCm stack, not just the amdgpu module!!
The amdgpu-install script has a usecase that installs only the amdgpu module; it is amdgpu, if I recall correctly.

Why should you install all the ROCm stuff on the host? Avoiding that is the main reason to create a Docker image. Am I wrong?
