Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dkms.in: Handle error cases in autoinstall #262

Merged

Conversation

SparkingSpork
Copy link
Contributor

Currently, autoinstall has two issues with error handling:

  1. If a module fails to install, the log still acts as though it installed and continues with that assumption.
  2. If any module fails to install or isn't installed, the exit code is still 0 and indicates success.

These two commits correct these behaviors to reflect the actual errors that occur.

Testing:

Ran locally and confirmed the error cases are now handled.

@evelikov
Copy link
Collaborator

Thanks for the fixes - can you add some tests? It will help visualise the changes and should be less likely to break in the future.

@SparkingSpork
Copy link
Contributor Author

Thanks for the fixes - can you add some tests? It will help visualise the changes and should be less likely to break in the future.

I've added some failing test modules and tests to the bash script. Let me know if there's anything further required.

@SparkingSpork SparkingSpork force-pushed the dev/error-on-autoinstall-failure branch from 8ae1626 to 1163414 Compare October 25, 2022 22:25
@SparkingSpork SparkingSpork force-pushed the dev/error-on-autoinstall-failure branch from 1163414 to aeb2ece Compare October 26, 2022 20:05
@SparkingSpork
Copy link
Contributor Author

SparkingSpork commented Oct 26, 2022

@evelikov - It looks like the changes in #265 broke all tests for me. Even when I switch back to the master branch, clean, and rebuild, I can't run the tests. I don't want this PR to get more complicated than it already is by trying to fix those tests here. Any chance of either getting the tests fixed or reverting those changes for now?

Edit to add:
I'm on Debian GNU/Linux 11 (bullseye) with kernel 5.10.0-19-amd64

Edit 2:
OK, it appears that if I run sudo make install the new tests pass. Is that the intended way for the tests to run? I see the PR builds run the tests that way. I've been running with just a local copy in the repo root directory and the new tests behave differently in that case. For example, running the Test framework file hijacking test:

  • When installed, nothing is printed and the test passes.
  • When run from the repo instead, do_status() is hijacked! is printed and the test fails.

@SparkingSpork SparkingSpork force-pushed the dev/error-on-autoinstall-failure branch from aeb2ece to 059efff Compare October 27, 2022 18:32
Charlie Johnston added 5 commits October 31, 2022 10:01
The autoinstall functionality was treating modules as
installed even if the install failed. This change
updates the logic to try installing the module but
not treat it as successfully installed if an error
occurs during the install attempt.

Signed-off-by: Charlie Johnston <[email protected]>
…tall.

If a module fails to autoinstall due to errors during install or missing
dependencies, dkms currently still returns an exit code of 0. This change
ensures that a failing install or missing dependency returns a non-zero
exit code and reports an error instead.

Signed-off-by: Charlie Johnston <[email protected]>
This change updates helper methods in run-test.sh such
that modules other than dkms_test could be used for new
tests in the future. The helper methods have been updated
to allow arbitrary module names as inputs.

Signed-off-by: Charlie Johnston <[email protected]>
The dkms status case is special in these tests as the
tests filter to just the status of the module in quesiton.
This means instead of checking a dkms command for the
expected output, it was checking a helper method's output.
This change adds a new helper method specifically for dkms
status and checking its output.

Signed-off-by: Charlie Johnston <[email protected]>
The logic to genericize the expected output via sed was
previously only used for run_with_expected_output and
not by run_with_expected_error. Splitting that into a
helper method allows it to be used in both places.

This change also uses a variable for the log file names
so that only one place needs updated in each helper.

Signed-off-by: Charlie Johnston <[email protected]>
@SparkingSpork SparkingSpork force-pushed the dev/error-on-autoinstall-failure branch from 059efff to 654b39c Compare October 31, 2022 15:13
@evelikov
Copy link
Collaborator

Looks great thank you. Assuming the CI doesn't flag any issues, will merge this shortly - don't want another set of merge conflicts.

@SparkingSpork
Copy link
Contributor Author

Looks great thank you. Assuming the CI doesn't flag any issues, will merge this shortly - don't want another set of merge conflicts.

Hmm, looks like this line gets printed in the output on the Ubuntu tests in the CI:
+python3: can't open file '/usr/share/apport/package-hooks/dkms_packages.py': [Errno 2] No such file or directory
I'll add that to the genericize output as that seems unrelated to the tests.

@SparkingSpork
Copy link
Contributor Author

Looks great thank you. Assuming the CI doesn't flag any issues, will merge this shortly - don't want another set of merge conflicts.

Hmm, looks like this line gets printed in the output on the Ubuntu tests in the CI: +python3: can't open file '/usr/share/apport/package-hooks/dkms_packages.py': [Errno 2] No such file or directory I'll add that to the genericize output as that seems unrelated to the tests.

@evelikov - Added a filter for the error output in the CI.

@evelikov
Copy link
Collaborator

evelikov commented Oct 31, 2022

The CI is failing - if I'm reading it correctly, that's because the autoinstall commands do not provide a kernel_ver, so dkms tries all kernels present. Whereby the 5.15.0-1022-azure is missing its headers.

See the successful test "Running dkms autoinstall" aka run_with_expected_output dkms autoinstall -k "${KERNEL_VER}" for reference

@SparkingSpork
Copy link
Contributor Author

The CI is failing - if I'm reading it correctly, that's because the autoinstall commands do not provide a kernel_ver, so dkms tries all kernels present. Whereby the 5.15.0-1022-azure is missing its headers.

See the successful test "Running dkms autoinstall" aka run_with_expected_output dkms autoinstall -k "${KERNEL_VER}" for reference

@evelikov - Done.

@SparkingSpork SparkingSpork force-pushed the dev/error-on-autoinstall-failure branch from dbd7812 to 5835499 Compare October 31, 2022 17:24
@SparkingSpork
Copy link
Contributor Author

Forget to escape the sed command string, should be good now.

@evelikov
Copy link
Collaborator

The pesky python3 warning is still there - the rest have been fixed.

@SparkingSpork SparkingSpork force-pushed the dev/error-on-autoinstall-failure branch from 5835499 to e62e368 Compare October 31, 2022 17:47
@SparkingSpork
Copy link
Contributor Author

The pesky python3 warning is still there - the rest have been fixed.

Should be fixed now. Tested it manually since I wasn't seeing the issue on my machine.

Charlie Johnston added 2 commits November 4, 2022 17:41
This change adds tests for cases where dkms autoinstall
will fail and return an error. This required adding two
new test modules to cover the case where a module fails
to build and the case where a module is missing a
dependency.

Signed-off-by: Charlie Johnston <[email protected]>
@evelikov evelikov force-pushed the dev/error-on-autoinstall-failure branch from e62e368 to a39fec7 Compare November 4, 2022 17:50
@evelikov evelikov merged commit 3d55cf1 into dell:master Nov 4, 2022
@SparkingSpork
Copy link
Contributor Author

@evelikov - Thank you for getting this completed! I was on vacation so definitely appreciate it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants