Fix potential race condition in _rmapiRmControl #656

Open
wants to merge 1 commit into
base: main


@Cwndmiao Cwndmiao commented Jun 5, 2024

```diff
diff --git a/src/nvidia/src/kernel/rmapi/control.c b/src/nvidia/src/kernel/rmapi/control.c
index 0ed2e1e7..140386ba 100644
--- a/src/nvidia/src/kernel/rmapi/control.c
+++ b/src/nvidia/src/kernel/rmapi/control.c
@@ -427,13 +427,6 @@ _rmapiRmControl(NvHandle hClient, NvHandle hObject, NvU32 cmd, NvP64 pUserParams
         }
     }
 
-    // Potential race condition if run lockless?
-    if (serverutilGetClientUnderLock(hClient) == NULL)
-    {
-        rmStatus = NV_ERR_INVALID_CLIENT;
-        goto done;
-    }
-
     // only kernel clients can issue raised IRQL or lock bypass cmds
     // bypass client priv check for internal calls done on behalf of lower priv
     // clients
```

The serverutilGetClientUnderLock() call in _rmapiRmControl() is useless, and it may cause a race condition if another thread is creating or destroying an RmClient at the same time.


@Cwndmiao Cwndmiao marked this pull request as ready for review June 5, 2024 08:25
@mtijanic
Collaborator

mtijanic commented Jun 5, 2024

Thanks for pointing this out! While not entirely useless (it validates hClient) that call is quite unfortunate, both in the race condition/crash potential and in that it performs a somewhat expensive lookup only to discard the result.

I'm not sure simply removing it is the right way to go, as then apps can poke around the state quite a bit without allocating the client (e.g. you could get all the cached data).

So while this path definitely needs to be refactored, I'm gonna have to dwell on this for a bit and get back to you in a few days. Thanks again!
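To make the concern above concrete, here is a minimal sketch of the two patterns being discussed: a lookup that only tests for NULL and discards the result, versus a lookup whose result is validated and used inside one critical section. Every name in it (`client_t`, `g_clients_lock`, `control_check_only`, `control_locked`, and so on) is hypothetical and invented for illustration; this is not the driver's code, and it says nothing about how `serverutilGetClientUnderLock()` is actually implemented.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the driver's types or API. */
typedef uint32_t handle_t;
typedef enum { STATUS_OK = 0, STATUS_INVALID_CLIENT, STATUS_NO_SPACE } status_t;

typedef struct {
    handle_t handle;
    int      in_use;
    unsigned control_count;   /* bumped by every successful control call */
} client_t;

#define MAX_CLIENTS 8
static client_t        g_clients[MAX_CLIENTS];
static pthread_mutex_t g_clients_lock = PTHREAD_MUTEX_INITIALIZER;

/* Linear lookup. The caller is expected to hold g_clients_lock. */
static client_t *client_find_locked(handle_t h)
{
    for (size_t i = 0; i < MAX_CLIENTS; i++) {
        if (g_clients[i].in_use && g_clients[i].handle == h)
            return &g_clients[i];
    }
    return NULL;
}

/* Register a client handle in the first free slot. */
status_t client_add(handle_t h)
{
    status_t status = STATUS_NO_SPACE;
    pthread_mutex_lock(&g_clients_lock);
    for (size_t i = 0; i < MAX_CLIENTS; i++) {
        if (!g_clients[i].in_use) {
            g_clients[i].handle = h;
            g_clients[i].in_use = 1;
            g_clients[i].control_count = 0;
            status = STATUS_OK;
            break;
        }
    }
    pthread_mutex_unlock(&g_clients_lock);
    return status;
}

/* Release the slot that holds handle h, if any. */
void client_remove(handle_t h)
{
    pthread_mutex_lock(&g_clients_lock);
    client_t *c = client_find_locked(h);
    if (c != NULL)
        c->in_use = 0;
    pthread_mutex_unlock(&g_clients_lock);
}

/* Check-and-discard: the lock is NOT taken, so this can race with
 * client_add/client_remove, and the pointer it found is thrown away. */
status_t control_check_only(handle_t h)
{
    if (client_find_locked(h) == NULL)
        return STATUS_INVALID_CLIENT;
    return STATUS_OK;
}

/* Alternative: validate and use the client inside one critical section. */
status_t control_locked(handle_t h, unsigned *count_out)
{
    status_t status = STATUS_INVALID_CLIENT;
    pthread_mutex_lock(&g_clients_lock);
    client_t *c = client_find_locked(h);
    if (c != NULL) {
        *count_out = ++c->control_count;
        status = STATUS_OK;
    }
    pthread_mutex_unlock(&g_clients_lock);
    return status;
}
```

In a single-threaded run both functions agree on which handles are valid. The difference only shows up under concurrency: `control_check_only` can observe a half-updated table, and any later use of the client needs a second lookup anyway.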

@mtijanic mtijanic self-assigned this Jun 5, 2024
@CLAassistant

CLAassistant commented Jun 6, 2024

CLA assistant check
All committers have signed the CLA.

@mtijanic
Collaborator

mtijanic commented Jun 6, 2024

Hey, I think we'll merge this change as-is, and then maybe handle the rest as a separate thing, or not at all. I'll start the process of applying it internally, but it will likely only show up in the r565.xx release. This PR will have to stay open until then. Sorry for the slow and arcane process.

And thanks again for the PR, this is really appreciated! If you come across anything else, please let us know, as a PR or a bug report.

@Cwndmiao
Author

Cwndmiao commented Jun 6, 2024

Thanks for your timely reply :)

@mtijanic mtijanic added Implemented Fixed, in test prior to release integration labels Jun 6, 2024
@mtijanic
Collaborator

Unfortunately, had to revert it from 565.xx because it consistently breaks certain Windows tests. Filed internal bug 4749826 to root cause that, and we'll re-apply the change when the issue is understood.

(Apparently something in Windows usermode depends on this behavior. Possibly it sends a control that is invalid in multiple ways, e.g. a bad hClient and bad parameters, and it expects a specific error status. With this change, if there are multiple problems with a call, we might fail for other reasons before returning NV_ERR_INVALID_CLIENT.)
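The error-precedence effect described above can be sketched as follows. This is a hedged illustration under stated assumptions, not the driver's validation logic: `ctl_status_t`, `demo_client_is_valid`, `control_client_first`, and `control_params_first` are all hypothetical names, and the parameter check is an invented example of "some other reason to fail".

```c
#include <stddef.h>

/* Hypothetical status codes for illustration only. */
typedef enum {
    CTL_OK = 0,
    CTL_INVALID_CLIENT,
    CTL_INVALID_ARGUMENT
} ctl_status_t;

/* Hypothetical validity rule: only handle 0x1 counts as a live client. */
static int demo_client_is_valid(unsigned h_client)
{
    return h_client == 0x1u;
}

/* Hypothetical parameter rule: a non-zero size needs a non-NULL buffer. */
static int demo_params_are_valid(const void *params, size_t size)
{
    return !(params == NULL && size != 0);
}

/* Order A: the client is checked first, so a call that is wrong in
 * several ways reports CTL_INVALID_CLIENT. */
ctl_status_t control_client_first(unsigned h_client, const void *params, size_t size)
{
    if (!demo_client_is_valid(h_client))
        return CTL_INVALID_CLIENT;
    if (!demo_params_are_valid(params, size))
        return CTL_INVALID_ARGUMENT;
    return CTL_OK;
}

/* Order B: with the early client check gone, the parameter check is
 * reached first, so the same call reports CTL_INVALID_ARGUMENT. */
ctl_status_t control_params_first(unsigned h_client, const void *params, size_t size)
{
    if (!demo_params_are_valid(params, size))
        return CTL_INVALID_ARGUMENT;
    if (!demo_client_is_valid(h_client))
        return CTL_INVALID_CLIENT;
    return CTL_OK;
}
```

For a call with exactly one fault the two orders return the same status. They diverge only when the handle and the parameters are both bad, which is the kind of caller the comment above speculates about.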

@mtijanic mtijanic added NV-Triaged An NVBug has been created for dev to investigate and removed Implemented Fixed, in test prior to release integration labels Jul 16, 2024

@hema203 hema203 left a comment

Nice
