-
Notifications
You must be signed in to change notification settings - Fork 301
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
peer: Synchronize net.Conn with a mutex #3478
Conversation
The access guarded by an atomic int32 was incorrect. For example, access to the p.conn could be performed as long as p.connected was non-zero, but p.connected would be incremented before p.conn was ever assigned by AssociateConnection. While here, also add missing mutex locking protecting timeConnected and na.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any issue with this being made harder to misuse, particularly because it's exported code, but I don't see any issue with how it is used currently in dcrd.
It is only ever initialized via inboundPeerConnected and outboundPeerConnected.
For inboundPeerConnected
, it's given a net.Conn
from the listener and it then calls peer.NewInboundPeer
which only creates the struct, followed by the call to sp.AssociateConnection
which sets the connected atomic and then assigns the conn.
Similarly, for outboundPeerConnected
, it's given a *connmgr.ConnReq
and net.Conn
from the connection manager and it then calls peer.NewOutboundPeer
which, similar to peer.NewInboundPeer
, only creates the struct and does some other misc setup not involving the connection, followed by the call to sp.AssociateConnection
which, the same as before, sets the connected atomic and then assigns the conn.
You are correct that there could theoretically be a race between setting the atomic and then assigning the conn if there were multiple goroutines racing on them, but no other goroutines aside from the one launched inside AssociateConnection
itself have access to the peer until after AssocateConnection
returns in either case, so there would be no race.
It is only after the initial fields are set in AssociateConnection
that it launches the internal peer.start()
in a goroutine and returns at which point the newly created and initialized peer is then further used in another goroutine running sp.Run()
and finally returned to other code for use.
It's for the same reason the stats mutex is not really needed for the timeConnected
and na
fields. No other goroutines are racing on them at the point they are initialized.
That's right, it's because these are exported methods. I didn't see a race with how dcrd is using them. Speaking of, I found it odd that since both inboundPeerConnected and outboundPeerConnected already have the connection available, why wouldn't peer.New{Inbound,Outbound}Peer require passing this to the constructor? It's only test code (banmanager) that ever creates (outbound) peers without an underlying connection. |
And my initial concern that got me looking into this code is that peer.Disconnect, which is allowed to be called multiple times, would only actually close the underlying connection if it had already been associated. But if this order was called:
then neither the AssociateConnection call, nor the final disconnect would close the underlying connection. |
Aside from requiring a new major module version since changing it would be a breaking API change, it's mostly just an artifact of much older code. Originally, all of this code lived in the main server code as opposed to a separate package (along with all of the connection management and listening code too) and so it wasn't really written with the goal of being a separate package. The person who split it out pretty closely followed how things were already implemented as opposed to a more clean room package design, so there are certainly various aspects that could be improved. Back at that time, the initial outbound connection was made and assigned by On the other hand, Rather obviously, having |
Right. That doesn't actually happen with dcrd's usage, but since it's exported code, that sequence of events is indeed possible if a caller were to try to disconnect in between calling the constructor and associating the connection for some reason. This PR accomplishes making it more robust against misuse without having to change the API and require a new major module version. |
The access guarded by an atomic int32 was incorrect. For example, access to the p.conn could be performed as long as p.connected was non-zero, but p.connected would be incremented before p.conn was ever assigned by AssociateConnection.