-
Notifications
You must be signed in to change notification settings - Fork 148
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move allocated pci deletion in defer of del cmd #278
base: master
Are you sure you want to change the base?
Move allocated pci deletion in defer of del cmd #278
Conversation
The allocated pci is supposed to be deleted when the del command is completed. However, placing this logic at the end of the function does not always guaranteed its execution. For instance, if ResetVfConfig or ReleaseVF fail, the err variable is overwritten which causes the cache to be deleted in the defer function and the next del retry to exit without deleting the allocated pci. By placing this logic along the deletion of the cache will ensure that all operations in the del command are done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general there is a protection for that in the allocation function.
if the namespace doesn't exist we will clean the allocation for new pods
if we failed on finishing the Del function and the namespace still exist we will have a generic problem in cmdAdd for a new pod saying just failed to find the vf to mode
First of all, it is not correct to rely on protections in the add command. We need to delete the allocation when the del command finishes for all cases. That's with the aim of having consistency. It is not correct to delete for some cases and for others not. Also, note that the error when the delete allocation fails is useless, the err var is not overwritten which will lead a retry that will surely fail (most likely in ReleaseVf). The cache will be deleted and the next retry will exit again without deleting the allocation. Second, by introducing this synchronization logic (not to run add before del), we are seeing new issues when for some reason the namespace is not deleted but all networking inside is removed. Before, the plugin simply continue but now it stays stuck in the add command, even though, it is safe to continue. This won't solve the above, but it will help reduce it by having the previous behavior when the del command correctly finishes for all cases. |
Hi @mlguerrero12 thanks for the comment! the thing is that if we failed on moving back the vf because of an issue (and we already encounter some of them) it will not be right to delete the lock because again the next pod that wants to use this will not work. I agree we can improve when to remove the allocation file if the VF is not in the namespace anymore but for some bug the network namespace is still preset we can remove the lock. WDYT? |
@SchSeba, I think we are mixing up the issues here. I agree that we need to find a way to inspect a namespace that failed to be deleted due to a bug to see if the networking has been removed. The protection that is currently present in the add command is supposed to be when the del command fails to complete. When the del command correctly terminates, the lock is supposed to be deleted. However, this is not always the case since the deletion of the lock is done at the end of the function. For instance, the del command terminates in line 205, 226 and 254 and the lock is not deleted. Do you have a valid reason for this? Also, if the lock deletion fails, the next retry will fail in ReleaseVF for no dpdk drivers. So, it's useless. What I want to achieve in this PR is to consistently delete the lock for all cases that the del command terminates and not rely on the protection in the add command which is supposed to be only when the del command is skipped. |
I get your point. but for this to work I think we need also another step we need to change the places where we are creating a new
this way we will really clean this up only if there was not error in the Delete function no? |
Based on the comments around the pci allocation code. The main purpose is to have a synchronization mechanism to not execute an add command before a del command. In config.go, line 41. " // This is to prevent issues where kubelet request to delete a pod and in the same time a new pod using the same Line 71, " // To prevent a locking of a PCI address for every pciAddress file we also add the netns path where it's been used Now, it seems you want to use this as well to detect errors in the del command and force a check in the add command. This is not mentioned anywhere. I thought the protection was only when the del command is not executed. |
Hi @mlguerrero12 thanks for the comment! you are right maybe we can add a better comment for this. also we saw use-cases where crio is not able to remove the network namespace and the VF remains on that namespace instead of moving back into the host network. in the case without this block, we will get again the generic error in the new pod creation saying enable to find vf interface |
The allocated pci is supposed to be deleted when the del command is completed. However, placing this logic at the end of the function does not always guaranteed its execution. For instance, if ResetVfConfig or ReleaseVF fail, the err variable is overwritten which causes the cache to be deleted in the defer function and the next del retry to exit without deleting the allocated pci.
By placing this logic along the deletion of the cache will ensure that all operations in the del command are done.