Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mutexmap #95

Merged
merged 17 commits into from
May 23, 2024
Merged

Mutexmap #95

merged 17 commits into from
May 23, 2024

Conversation

elena-kolevska
Copy link
Contributor

@elena-kolevska elena-kolevska commented Apr 28, 2024

Description

This PR adds a map of mutexes that provide an inner and outer locking mechanism to support keeping separate locks on dynamic objects, as is the case with namespaced stream pools, for example.

Checklist

Please make sure you've completed the relevant tasks for this PR, out of the following list:

  • Code compiles correctly
  • Created/updated tests

Signed-off-by: Elena Kolevska <[email protected]>
Signed-off-by: Elena Kolevska <[email protected]>
Signed-off-by: Elena Kolevska <[email protected]>
Signed-off-by: Elena Kolevska <[email protected]>
Signed-off-by: Elena Kolevska <[email protected]>
@elena-kolevska elena-kolevska marked this pull request as ready for review May 14, 2024 00:44
@elena-kolevska elena-kolevska requested review from a team as code owners May 14, 2024 00:44
@ItalyPaleAle
Copy link
Contributor

In general, the code seems good... But what is this going to be used for? We already have other libraries used in the codebase that support thread-safe maps and do that in a more performant way too. For example, we already use github.com/alphadose/haxmap in other parts of Dapr (go.mod here). Although it has some caveat, it's a lot faster, thanks to being lock-free. Could that be used here too?

Co-authored-by: Cassie Coyle <[email protected]>
Signed-off-by: Elena Kolevska <[email protected]>
@codecov-commenter
Copy link

Codecov Report

Attention: Patch coverage is 74.57627% with 30 lines in your changes are missing coverage. Please review.

Project coverage is 79.83%. Comparing base (0c7cfce) to head (d43a33c).
Report is 2 commits behind head on main.

Files Patch % Lines
concurrency/atomicmap.go 66.66% 22 Missing ⚠️
concurrency/mutexmap.go 84.61% 7 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #95      +/-   ##
==========================================
- Coverage   79.94%   79.83%   -0.11%     
==========================================
  Files          56       62       +6     
  Lines        4382     4760     +378     
==========================================
+ Hits         3503     3800     +297     
- Misses        733      805      +72     
- Partials      146      155       +9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

concurrency/mutexmap.go Outdated Show resolved Hide resolved
concurrency/mutexmap.go Show resolved Hide resolved
concurrency/atomicmap.go Show resolved Hide resolved
Signed-off-by: Elena Kolevska <[email protected]>
Signed-off-by: Elena Kolevska <[email protected]>
@elena-kolevska
Copy link
Contributor Author

Thanks @ItalyPaleAle , haxmap looks great, I wasn't aware of it. I updated the PR in runtime to use it instead and removed the atomic map from kit. I'd like to keep the mutexmap though, because it has some more specialised methods that keep the code in runtime cleaner.

@elena-kolevska elena-kolevska changed the title Mutexmap & AtomicMaps Mutexmap May 19, 2024
.gitignore Outdated
**/.DS_Store
.idea
.vscode
.vs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: newline

concurrency/mutexmap.go Show resolved Hide resolved
Signed-off-by: Elena Kolevska <[email protected]>
concurrency/mutexmap.go Outdated Show resolved Hide resolved
Co-authored-by: Josh van Leeuwen <[email protected]>
Signed-off-by: Elena Kolevska <[email protected]>
JoshVanL
JoshVanL previously approved these changes May 20, 2024
@ItalyPaleAle
Copy link
Contributor

Thanks @ItalyPaleAle , haxmap looks great, I wasn't aware of it. I updated the PR in runtime to use it instead and removed the atomic map from kit

Just keep in mind one important thing about haxmap. If an item is added to the map while it's being resized internally, the item shows up only after the resizing is done. See here

Would this be a problem?

Signed-off-by: Elena Kolevska <[email protected]>
@elena-kolevska
Copy link
Contributor Author

If an item is added to the map while it's being resized internally, the item shows up only after the resizing is done.

Thanks for pointing that out. Does this also relate to GetOrSet? Will it block until the resize is complete, or?
Also... the issue doesn't seem to have reached a consensus on the cause of this error. Someone else suggested it's related to deletes. Do you remember more details? I know it's been a while, but I think this is important to get right 🙏

@ItalyPaleAle
Copy link
Contributor

It won't block. How haxmap works is that it allocates memory to store the items. If more items are needed than the memory that's available, it will need to allocate a new block of memory, copy the items, and then move the pointer to the new memory.

Haxmap doesn't block. As the docs say:

If a resizing operation is happening concurrently while calling Set() then the item might show up in the map only after the resize operation is finished

If you know in advance how many items you'll have, you can pre-allocate the haxmap with enough capacity for them.

@elena-kolevska
Copy link
Contributor Author

Unfortunately, we don't know the number of elements, it's going to depend on how users use the system (specifically, for now, it will be the number of namespaces). I suppose we could set the initial map size to a higher starting number of ~50, or make the size a placement service start up parameter, but I don't really like either of these options.

I'm concerned that a non-replicated data structure can't give me a read-your-write consistency. I didn't dig deep into the paper and the inner structure, and I'm sure if it were easy, it would have been done, but this doesn't look like an unsolvable problem. For example, the resize could be triggered in the background when the size reaches X% of full capacity, and even if we still get to a point where a write has to happen before the resize completes, the write operation shouldn't be acknowledged to the user before the resize has completed, for correctness.

Since we can have multiple sidecars from the same namespace trying to send their updates at the same time, we might indeed have a situation where this issue affects us.

Let me know if I'm understanding this correctly, or missing something.

@elena-kolevska
Copy link
Contributor Author

Looking at this issue now, there also seems to be a race on the GetOrSet operation, where a delete wouldn't be observed if it's concurrent to a GetOrSet. It's still eventually consistent, but we need strong consistency here.
If everyone's ok with it, I would like to move back to the sync.Mutex based map we had before, even though it's sad to give up on the performance benefits of haxmap.
The library is really interesting though, and maybe we could contribute to it in the future and address these issues.

@artursouza
Copy link
Member

artursouza commented May 21, 2024

+1 to use sync.Mutex map.

artursouza
artursouza previously approved these changes May 21, 2024
@artursouza artursouza dismissed their stale review May 21, 2024 14:26

Elena needs to undo the changes to move back to sync.Mutex map.

This reverts commit 20ca9ad.

Signed-off-by: Elena Kolevska <[email protected]>
@elena-kolevska
Copy link
Contributor Author

Thanks @artursouza , I reverted it.

yaron2
yaron2 previously approved these changes May 21, 2024
JoshVanL
JoshVanL previously approved these changes May 21, 2024
concurrency/atomicmap.go Outdated Show resolved Hide resolved
Signed-off-by: Elena Kolevska <[email protected]>
@elena-kolevska elena-kolevska dismissed stale reviews from JoshVanL and yaron2 via 8e24aad May 21, 2024 15:08
@artursouza artursouza merged commit 106329e into dapr:main May 23, 2024
8 checks passed
@ItalyPaleAle
Copy link
Contributor

@artursouza you merged this PR when there was only 1 vote. The comment should not have been resolved as the conversation was not over.

If the problem was the need to support both int64 and uint32, you don't need to duplicate the code. You can still use the atomic pattern, internally using only atomic.Int64. Since uint32 can safely be contained within int64, you can still use a generic class and then cast the response appropriatley.

@artursouza
Copy link
Member

@artursouza you merged this PR when there was only 1 vote. The comment should not have been resolved as the conversation was not over.

If the problem was the need to support both int64 and uint32, you don't need to duplicate the code. You can still use the atomic pattern, internally using only atomic.Int64. Since uint32 can safely be contained within int64, you can still use a generic class and then cast the response appropriatley.

Can you provide the code for that? I did hit limitations in the casting capabilities of Go - maybe there is a viable path. I am happy to take another PR on this but I don't think this possible improvement justifies blocking progress on 1.14, given the deadline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants