Clear error when install_cuda_host_injections fails to run easybuild #496

@Flamefire

The script first tries to run eb CUDA-x.y.eb. If that fails, it runs eb -S '^CUDA-*', and if that also fails, it just shows a generic error.

However, if the actual failure is with eb itself, that message is misleading. For example:

# /cvmfs/software.eessi.io/versions/2023.06/scripts/gpu_support/nvidia/install_cuda_host_injections.sh --cuda-version 12.8.6 --temp-dir /tmp/ --accept-cuda-eula
Attempting to load an EasyBuild module to do actual install
ERROR: You seem to be running EasyBuild with root privileges which is not wise, so let's end this here.
ERROR: You seem to be running EasyBuild with root privileges which is not wise, so let's end this here.
ERROR: The easyconfig CUDA-12.8.6.eb was not found in EasyBuild version:
  This is EasyBuild 5.1.0 (framework: 5.1.0, easyblocks: 5.1.0) on host c144.
You either need to give a different version of CUDA to install _or_ 
use a different version of EasyBuild for the installation.

I've seen a similar failure when something else fails during initialization, e.g. pre-creating the modules/all folder.

It would be good if the script could detect such cases and report an error along the lines of "EasyBuild failed to run, check the above output for reasons".
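
One possible way to do this, as a rough sketch only: probe whether eb can start at all before asking it for the easyconfig, and print the dedicated message if the probe fails. The function name eb_sanity_check is hypothetical and not part of the current script, and using "eb --version" as the probe assumes it goes through the same start-up checks (root check, directory creation) that fail in the output above, which I have not verified.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of install_cuda_host_injections.sh.
# Runs a cheap probe command and reports a dedicated error if it fails.
# With no arguments the probe is "eb --version" (assumption: this runs the
# same initialization that fails in the reported case). Passing a command
# overrides the probe, which is mainly useful for testing.
eb_sanity_check() {
    if [ "$#" -eq 0 ]; then
        set -- eb --version
    fi
    # The probe's own output is left untouched so the real reason stays visible.
    if ! "$@"; then
        echo "ERROR: EasyBuild failed to run, check the above output for reasons" >&2
        return 1
    fi
    return 0
}
```

The script could then call "eb_sanity_check || exit 1" right after loading the EasyBuild module, so the "easyconfig was not found" message is only reached when eb itself works.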
