Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slices move position of MappedByte underneath user. #512

Closed
TwoClocks opened this issue Apr 4, 2023 · 7 comments
Closed

Slices move position of MappedByte underneath user. #512

TwoClocks opened this issue Apr 4, 2023 · 7 comments
Assignees

Comments

@TwoClocks
Copy link

If you take a slice (bytesForRead()) from a MappedBytes there are instances where if you move the slice, the position in the underlay buffer changes.

See PR #511 for an example.

@JerryShea
Copy link
Contributor

@nicktindall could you have a quick look at this?

@nicktindall
Copy link
Contributor

nicktindall commented Apr 12, 2023

It looks like this is potentially a bug in the creation of the slice, and perhaps a quirk of the API. I will write up some tests to pin down what the issue is. One thing that's almost certainly an issue is how the bytes are created when the MappedBytes#isClear(), we use bytesStore.writeLimit() to set the write limit and if no chunk has been loaded bytesStore is a NoBytesStore, so writeLimit is 0 (that's why doing a read first alleviates it).

The failing attempt to read a long from the underlying bytes could just be a misuse of the API. There have been no relative writes to that instance at that point so readLimit and writeLimit are all zero. I'm not sure what API says those limits should be, but I know they're not persisted so it's likely they'd need to be set appropriately before attempts are made to read/write.

@peter-lawrey
Copy link
Member

It lazily picks the first chunk and some methods might not be handling that correctly.
It is possible this doesn't show as a problem for the order we call methods, but it should work in any order.

@yevgenp
Copy link
Contributor

yevgenp commented Oct 3, 2023

The test provided by issue reporter @TwoClocks is incorrect in it's expectation. Though the Bytes slice is obtained from MappedBytes root, it's contract (explicitly mentioned in the JavaDocs and README file in this repository) prohibits reading from an empty buffer (BytesStore) (i.e. when slice.readPosition() >= slice.writePosition()).
That's ensured by every implementation of Bytes/BytesStore (except for UncheckedBytes/UncheckedNativeBytes due to performance reasons) to delegate readPosition() calls to AbstractBytes.readPosition method which contains common validation logic.
The above contract/behavior is verified by BytesTest.testReadPosition() test.

@yevgenp yevgenp closed this as completed Oct 3, 2023
@TwoClocks
Copy link
Author

This explanation doesn't really make much sense.

  1. When opening an existing mapped file, the buffer isn't empty. It should contain the contents of the existing file. Are you saying your library doesn't do that? That's very strange behavior. The underlying OS calls don't mark the buffer as empty, and neither does any other mapping library in existence.

  2. If your library is going to take this very weird approach to mapped files, it should be consistent. Why would one readPositon variant work, and the other fail?

That's a bug no matter what.

Personally, I'd fix it so it works the way people would expect. But if you want to be contrarian, you should at least make it fail consistently.

@yevgenp
Copy link
Contributor

yevgenp commented Oct 5, 2023

Bytes and MappedBytes are completely different abstractions/interfaces though both extend AbstractBytes.
Bytes is an abstraction layer over a ByteBuffer, byte[], POJO, or native memory. It's contract is similar to java.nio.ByteBuffer and can't be read unless is written first.
MappedBytes is a specialized implementation of AbstractBytes that wraps memory-mapped data for random file access.
Though MappedBytes.bytesForRead() returns Bytes instance, it's not really useful for a random file access (as per design).

Bytes and MappedBytes serve different purposes. Just use MappedBytes for file related access.

Please read respective JavaDocs/README for details.

@TwoClocks
Copy link
Author

Then you should remove bytesForRead() from MappedBytes. If you're returning something that has UB in a specific instance, then that's a bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants