Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storage: Add NVMe and SDC storage connectors #14710

Merged
merged 14 commits into from
Jan 29, 2025

Conversation

MusicDin
Copy link
Member

@MusicDin MusicDin commented Dec 20, 2024

This PR contains cherry-picks from #14700 to include only connectors and PowerFlex changes (without Pure Storage bits)

The aim of the connector is to handle connections with the target storage. Currently, connectors support NVMe, and SDC.

Once PowerFlex will use Connect/Diconnect, the following connector functions can be removed:

  • ConnectAll
  • DisconnectAll
  • SessionID

Closes #14708

@MusicDin
Copy link
Member Author

MusicDin commented Dec 20, 2024

As previously discussed with @roosterfish, I have also replaced nvme "disconnect-all" and "connect-all" by connecting to just a specific target. This prevents disconnecting all nvme volumes, which would affect other storage drivers as well. However, needs to be checked that NQN is correctly parsed and passed to the connector.

Copy link
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, looking great already.

I'll grab your branch and make some tests on the PowerFlex cluster for validation. I think the only change that has to be performed upfront is using the actual (Connect|Disconnect)All funcs to resemble the original behavior.
When we have merged the PR I can take care of switching to the Connect/Disconnect instead.

lxd/storage/connectors/connector_nvme.go Show resolved Hide resolved
lxd/storage/drivers/driver_powerflex_utils.go Outdated Show resolved Hide resolved
lxd/storage/drivers/driver_powerflex.go Show resolved Hide resolved
lxd/storage/drivers/driver_powerflex_utils.go Show resolved Hide resolved
lxd/storage/drivers/driver_powerflex_utils.go Outdated Show resolved Hide resolved
lxd/storage/drivers/driver_powerflex_utils.go Outdated Show resolved Hide resolved
lxd/storage/drivers/driver_powerflex_utils.go Outdated Show resolved Hide resolved
lxd/storage/drivers/driver_powerflex_utils.go Outdated Show resolved Hide resolved
lxd/storage/drivers/driver_powerflex_utils.go Outdated Show resolved Hide resolved
@MusicDin MusicDin force-pushed the feat/storage-connector branch from 67f9bf3 to 4846572 Compare December 20, 2024 18:19
@MusicDin
Copy link
Member Author

@roosterfish Thanks for the comments. I've addressed all of them, and reverted fixes for ConnectAll -> Connect, and DisconnectAll -> Disconnect to retain the previous implementation.

@roosterfish
Copy link
Contributor

I've addressed all of them, and reverted fixes

Thanks, I'll run it through some tests on the PowerFlex env and report back here.

lxd/storage/connectors/utils.go Outdated Show resolved Hide resolved
lxd/storage/connectors/utils.go Outdated Show resolved Hide resolved
@MusicDin MusicDin force-pushed the feat/storage-connector branch from 4846572 to 0e0ad2b Compare January 10, 2025 13:00
@MusicDin MusicDin force-pushed the feat/storage-connector branch 7 times, most recently from e7953f9 to 33ad86e Compare January 17, 2025 12:45
@MusicDin MusicDin force-pushed the feat/storage-connector branch 3 times, most recently from 520b075 to eac5ab5 Compare January 21, 2025 11:30
@MusicDin MusicDin marked this pull request as ready for review January 21, 2025 13:38
@MusicDin MusicDin force-pushed the feat/storage-connector branch 3 times, most recently from 9388a02 to 0652fc7 Compare January 21, 2025 16:41
@MusicDin MusicDin marked this pull request as draft January 21, 2025 16:47
@MusicDin MusicDin force-pushed the feat/storage-connector branch from 0652fc7 to 4e96af0 Compare January 21, 2025 16:58
Copy link
Member

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nearly there, few small changes to make. Ta

@MusicDin MusicDin force-pushed the feat/storage-connector branch 4 times, most recently from 35c0582 to c426055 Compare January 28, 2025 14:39
cancel()

// Wait until all connection attempts have finished.
wg.Wait()
Copy link
Member

@tomponline tomponline Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

once the wait group has finished in the go routine above, the lock is released.
Is it OK that we can call c.Disconnect below after this when potentially another caller has taken the lock and is in the process of connecting again? @roosterfish @MusicDin ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mh, in this situation the cleanup call to Disconnect will rip out the connection from another caller because Disconnect obtains the lock and tries to disconnect the session if found. @MusicDin please confirm this.
But it doesn't anymore have the knowledge about whether or not the connection was made on its own behalf.

Maybe a solution could be to have an atomic counter that tracks the "connection demands" for all the vols (grouped by session) and only if this counter is zero Disconnect will actually get rid of the connection?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this one is tricky. It could happen. But how long should we keep it?

Copy link
Member Author

@MusicDin MusicDin Jan 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a counter would be a viable solution. But when do we release the initial lock/reduce the counter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MusicDin @roosterfish and I discussed it and we'll merge this as-is, and then @roosterfish is going to see if he can address this with his other up coming PowerFlex fixes.

That way we can unblock the Pure driver.

@MusicDin MusicDin force-pushed the feat/storage-connector branch from c426055 to b969d1d Compare January 28, 2025 15:14
@MusicDin MusicDin force-pushed the feat/storage-connector branch from b969d1d to 7d53cf4 Compare January 29, 2025 12:57
MusicDin and others added 11 commits January 29, 2025 13:03
Previously the mapVolume returned a reverter that disconnected NVMe session
in case of an error. Since mapVolume acquires a lock, calling disconnect
outside of the mapVolume or unmapVolume functions may result in a race
condition because another operation could establish a connection before the
reverter is called. In such case, the reverter would carelessly just terminate
the newly estabished connection.

Therefore, the outerReverter ensures that instead of disconnecting from the volume
the unmapVolume is called, which acquires the lock and ensures the disconnect is
called only if the session is not in used.

Co-authored-by: Julian Pelizäus <[email protected]>
Co-authored-by: Din Music <[email protected]>
Signed-off-by: Din Music <[email protected]>
@MusicDin MusicDin force-pushed the feat/storage-connector branch from 7d53cf4 to 343edef Compare January 29, 2025 13:03
Copy link
Contributor

@roosterfish roosterfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ran another bunch of tests on NVMe/SDC with VMs and containers. All good and cleaned up afterwards. LGTM!

Copy link
Member

@tomponline tomponline left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks!

@tomponline tomponline merged commit 5622894 into canonical:main Jan 29, 2025
28 checks passed
@MusicDin MusicDin deleted the feat/storage-connector branch January 29, 2025 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants