Reduce copies when reading files in pyio, match behavior of _io #129005
Comments
`os.read` allocated and filled a buffer by calling `read(2)`, then that data was copied into the user-provided buffer. Read directly into the caller's buffer instead by using `os.readv`. `self.read()` was doing the closed and readable checks, so move those into `readinto`.
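A minimal sketch of the before/after pattern described here (not the actual `_pyio` patch; `self._fd`, `_checkClosed()`, and `_checkReadable()` stand in for the `_pyio.FileIO` internals):

```python
import os

# Old pattern: os.read() allocates a fresh bytes object via read(2),
# which is then copied into the caller's buffer.
def readinto_with_copy(self, buffer):
    data = self.read(len(buffer))   # closed/readable checks happen inside read()
    n = len(data)
    buffer[:n] = data               # extra copy into the caller's buffer
    return n

# New pattern: os.readv() writes straight into the caller's buffer,
# so the closed/readable checks move into readinto() itself.
def readinto_direct(self, buffer):
    self._checkClosed()
    self._checkReadable()
    return os.readv(self._fd, [buffer])
```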
I took a tangent and looked at the code complexity of adding `os.readinto`.
Happy to work on either course, with a slight preference for adding `os.readinto`. I know moving to a full PR for `os.readinto` would need added tests and news; for now I'm just focusing on "adding os.readinto" vs. "using os.readv". cc: @tomasr8, @vstinner, @gpshead (reviewers where os.read vs. _Py_read / readinto has come up).
Can you please open a separate issue for `os.readinto`?
`os.read()` allocated and filled a buffer by calling `read(2)`, then that data was copied into the user-provided buffer. Read directly into the caller's buffer instead by using `os.readinto()`. `os.readinto()` uses `PyObject_GetBuffer()` to make sure the passed-in buffer is writable and bytes-like, so the manual check is dropped.
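For contrast with the `os.readv` sketch above, the same method using `os.readinto()` (a newer addition to the `os` module); again `self._fd` and the check helpers are placeholders for the `_pyio.FileIO` internals:

```python
import os

def readinto(self, buffer):
    self._checkClosed()
    self._checkReadable()
    # os.readinto() validates `buffer` through the buffer protocol
    # (it must be writable and bytes-like), so no manual check is
    # needed here; it returns the number of bytes read into it.
    return os.readinto(self._fd, buffer)
```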
The full set of changes, on my Linux machine debug build, reduces
Both now use a pre-allocated buffer of length `bufsize`, fill it using a readinto(), and have matching "expand buffer" logic. On my machine this takes `./python -m test -M8g -uall test_largefile -m test_large_read -v` from ~3.7 seconds to ~3.4 seconds.

Slicing `buf` and appending `chunk` would always result in a copy. Commonly in a readall() there is no already-read data in `buf`, and the amount of data read may be large, so the copy is expensive.

This aligns the memory usage between _pyio and _io: both now use the same amount of memory when reading a file.
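A rough sketch of the readall() shape described above (illustrative only; the real `_io`/`_pyio` code differs in details such as the growth policy and EINTR handling, and `self._fd` is a stand-in):

```python
import os

DEFAULT_BUFFER_SIZE = 8 * 1024   # illustrative; the real code has its own constant

def readall(self):
    """Read until EOF into one pre-allocated, growable bytearray."""
    bufsize = DEFAULT_BUFFER_SIZE
    try:
        pos = os.lseek(self._fd, 0, os.SEEK_CUR)
        end = os.fstat(self._fd).st_size
        if end >= pos:
            bufsize = end - pos + 1   # +1 so EOF is seen without an extra grow
    except OSError:
        pass                          # e.g. pipes: no size estimate available

    result = bytearray(bufsize)
    bytes_read = 0
    while True:
        if bytes_read >= len(result):
            # "Expand buffer" step; growing without copying in a block of
            # zeros is the awkward part discussed further down this thread.
            result.extend(bytes(DEFAULT_BUFFER_SIZE))
        view = memoryview(result)[bytes_read:]
        try:
            n = os.readv(self._fd, [view])   # fill the buffer tail in place
        finally:
            view.release()                   # allow the bytearray to resize again
        if n == 0:                           # EOF
            break
        bytes_read += n

    del result[bytes_read:]                  # trim the unused tail
    return bytes(result)
```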
The _pyio changes seem to have broken a number of buildbots; currently investigating.
https://buildbot.python.org/#/builders/338/builds/7968/steps/6/logs/stdio (failed on the latest commit in main, 10ee2d9, but I think it relates to these changes). cc: @vstinner
…hon#129454)" This reverts commit e1c4ba9.
I think I found the issue in #129458. I tried using

(Pipes in the test are a case where we don't have a size estimate.) Validated by adding an assert locally, which fails every time:

```python
result[bytes_read:bufsize] = b'\0'
assert len(result) == bufsize, f"Should have expanded in size. {len(result)=}, {bufsize=}"
```
I don't see a clear way to resize the bytearray without making a whole array of null bytes which then gets copied into it... The tests just use the C API to resize: https://github.com/python/cpython/blob/main/Lib/test/test_capi/test_bytearray.py#L134-L160. Would adding a Python-level API to resize a bytearray be reasonable? cc: @vstinner
Move to a linear slice append with an iterator which has a length hint. This is more expensive than `PyByteArray_Resize`, but I think as efficient as we can get without a new bytearray Python API to resize. The previous code didn't append as I had intended:

```python
>>> a = bytearray()
>>> a[0:5] = b'\0'
>>> a
bytearray(b'\x00')
>>> a[5:16] = b'\01'
>>> a
bytearray(b'\x00\x01')
>>> len(a)
2
```
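A small demonstration of that slice-append technique (`grow_to` is a hypothetical helper name; the actual change may differ in exact form):

```python
import itertools

def grow_to(buf, target_size):
    """Grow `buf` (a bytearray) in place to target_size, zero-filling the tail.

    Assigning an iterator that exposes __length_hint__ (itertools.repeat here)
    to the empty tail slice grows the bytearray in one linear append rather
    than element by element, without needing a C-level resize.
    """
    missing = target_size - len(buf)
    if missing > 0:
        buf[len(buf):] = itertools.repeat(0, missing)
    return buf

buf = bytearray(b"partial data")
grow_to(buf, 64)
assert len(buf) == 64 and buf.startswith(b"partial data")
```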
Feature or enhancement
Proposal:
Currently `_pyio` uses ~2x as much memory to read all data from a file compared to `_io`. This is because it makes more than one copy of the data.

Details from test_fileio run
Plan:
- Use `os.readv()` (rather than adding a new `os.readinto()`) to do a readinto like the C `_Py_read` used by `_io` does. `os.read()` can't take a buffer to use. This aligns behavior between `_io.FileIO.readall` and `_pyio.FileIO.readall`. `os.readv` works well today and takes a caller-allocated buffer rather than needing to add a new `os` API. `readv(2)` mirrors the behavior and errors of `read(2)`, so this should keep the same end behavior.
- Update `_pyio.BufferedIO` to not force a copy of the buffer for readall when its internal buffer is empty. Currently it always slices its internal buffer then adds the result of `_pyio.FileIO.readall` to it.

For iterating, I'm using a small tracemalloc script to find where copies are:
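A minimal sketch of such a tracemalloc harness (path and sizes here are illustrative, not from the original script):

```python
import _pyio
import tracemalloc

PATH = "/tmp/copies-demo.bin"           # illustrative test file
with open(PATH, "wb") as f:
    f.write(b"x" * (64 * 1024 * 1024))  # 64 MiB of data to read back

tracemalloc.start(25)                   # keep deep tracebacks for attribution
with _pyio.FileIO(PATH, "r") as f:      # swap in io.FileIO to compare with _io
    data = f.readall()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# The biggest allocations show where extra copies of the file data live.
for stat in snapshot.statistics("traceback")[:5]:
    print(stat)
    for line in stat.traceback.format():
        print("   ", line)
```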
Loose Ends

- `os.readv` seems to be well supported but is currently guarded by a configure check. I'd like to just make pyio require `readv`, but can do conditional code if needed. If making `readv` non-optional generally is feasible, I'm happy to work on that.
- `os.readv` is not supported on WASI, so conditional code is needed there (a sketch of such a fallback follows this list).
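A hedged sketch of what that conditional code could look like (`_readinto_fd` is an illustrative helper name, not from the actual patch):

```python
import os

if hasattr(os, "readv"):
    def _readinto_fd(fd, buffer):
        # Preferred path: readv(2) fills the caller's buffer directly.
        return os.readv(fd, [buffer])
else:
    def _readinto_fd(fd, buffer):
        # Fallback for platforms without readv (e.g. WASI): read into a
        # temporary bytes object, then copy it into the caller's buffer.
        data = os.read(fd, len(buffer))
        n = len(data)
        buffer[:n] = data
        return n
```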
Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere.
Links to previous discussion of this feature:
No response
Linked PRs
- `_pyio.FileIO.readall()` #129496