- 
                Notifications
    
You must be signed in to change notification settings  - Fork 2.2k
 
fix runc's poststart behaviour doesn't match the runtime-spec #4348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
6ffc50a    to
    307c2b6      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code should be moved to container.Exec (keep it inside libcontainer).
| 
           Please write a more detailed commit message. "fix poststart error" is not a good one.  | 
    
307c2b6    to
    a78a083      
    Compare
  
    
          
 done.thank you. please review my pr. @kolyshkin @cyphar  | 
    
c6c42bc    to
    b766494      
    Compare
  
    b766494    to
    db988e0      
    Compare
  
    There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a test case to avoid the similar error in future.
You can see:
#4323 (comment)
| return err | ||
| } | ||
| if c.config.Hooks != nil { | ||
| if err := c.config.Hooks.Run(configs.Poststart, s); err != nil { | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the runtime-spec said:
The poststart hooks MUST be invoked by the runtime. If any poststart hook fails, the runtime MUST log a warning, but the remaining hooks and lifecycle continue as if the hook had succeeded.
So maybe this is another bug, we should write a new function to run poststart and poststop hooks.
@kolyshkin WDYT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If yes, please help to open a new issue to track this bug, thanks.
      
        
              This comment was marked as outdated.
        
          
      
    
    This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the current code logs a warning (like the spec says) but also terminates container's init and returns an error (unlike the spec says).
I'm not sure whether we should change the spec or runc behavior. In theory spec should be left as is, and runc should be changed to follow the spec. In practice, though, changing runc might result in backward incompatibilities and complexities in upper runtime. So yes, we have a choice to make here.
It's complicated 😿
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I researched the history a little bit.
- The initial implementation of poststart hooks was done by @mrunalp in 2015 (commit 452e8a7, PR Add poststart hooks #392, merged Nov 2015). If a hook was returning an error, the start was terminated and the error was returned.
 - The "start must continue even if poststart hook failed" clause was added to runtime-spec by @RenaudWasTaken in Add create-runtime, create-container, start-container hooks runtime-spec#1008 (merged Dec 30 2019).
 - More hooks were added to runc by @RenaudWasTaken (commit ccdd757, PR Add CreateRuntime, CreateContainer and StartContainer Hooks #2229, merged Jun 2020). The same commit touched the poststart code but didn't change it to not return an error if a hook failed.
 
In my mind, this is the spec that needs to be changed. The reasons are:
- initial implementation predates the spec wording by a few years;
 - the wording in the spec was never implemented in runc;
 - returning an error (and stopping the container) seems like a more versatile approach (a hook can choose whether to return an error or not).
 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how to stop container by using SIGTERM ,SIGKILL or both use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably SIGTERM followed by SIGKILL or just SIGKILL.
db988e0    to
    0f17a69      
    Compare
  
    0f17a69    to
    3d9a118      
    Compare
  
    | 
           already backport #4351  | 
    
| 
           The commit description still doesn't say what it does and why.  | 
    
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR removes the part where the container is killed when a hook fails. This should have been caught by an integration test, alas, it doesn't check that.
Opened #4352
d2fac01    to
    adcf828      
    Compare
  
    5d066a8    to
    df2c44a      
    Compare
  
    df2c44a    to
    5a2aeea      
    Compare
  
    Signed-off-by: ningmingxiao <[email protected]>
5a2aeea    to
    9bfaca5      
    Compare
  
    | 
           Moved the milestone to 1.3.0. Dealing with this is going to take a while, and we should just get 1.2.0 out (this is a pre-existing issue anyway).  | 
    
| 
           A draft until opencontainers/runtime-spec#1262 is decided on  | 
    
| 
           Damn, I missed this because it wasn't in the -rc.1 milestone. I guess we'll need to wait for 1.4.0-rc.1 before fixing this.  | 
    
| 
           The spec PR is not yet merged, moving to 1.4. Now to 1.4.0-rc.1, so hopefully you don't miss it @cyphar :)  | 
    
| 
           Ok thanks  | 
    
fix:#4347