Add read_entire_file_as_cstring to os2 #5647

thetarnav · 2025-09-02T11:38:11Z

Add extra byte to the allocated read_entire_file buffer so it could be safely cast to a cstring.
Just so you can do cstring(raw_data(file_data)) without copying.

Kelimion · 2025-09-02T13:26:23Z

It would be very surprising for read_entire_file to return more than the file, and can lead to problems with binary file formats, or cause sanity checks to now fail on the file length or content hash.

It would be better to have a complementary read_entire_file_as_cstring that zero pads and returns as cstring instead of []u8. And if you add two zero bytes, you can then also cheaply cast it to utf-16.

JesseRMeyer · 2025-09-02T13:30:20Z

Alternatively, a proc group entry could be added that accepts a user provided buffer to write the file data into, which would be pre-sized to accommodate the null terminated character.

gingerBill · 2025-09-02T14:49:02Z

I think this is probably a bad idea, especially to do by default. If anything, we should have a separate procedure as @Kelimion states like read_entire_file_as_cstring or something just have a buffer to read into with that NUL terminator.

Kelimion · 2025-09-02T15:14:24Z

I think this is probably a bad idea, especially to do by default. If anything, we should have a separate procedure as @Kelimion states like read_entire_file_as_cstring or something just have a buffer to read into with that NUL terminator.

I'm leaning toward the buffer for its flexibility. Something like read_file that takes a file descriptor and buffer and returns the number of bytes read and an OS error. Allows you to read into a buffer with 1 extra byte, two extra bytes, or even read several files into a larger buffer at an increasing offset to concatenate them.

thetarnav · 2025-09-02T16:00:53Z

It would be very surprising for read_entire_file to return more than the file, and can lead to problems with binary file formats, or cause sanity checks to now fail on the file length or content hash.

The returned data is unchanged, a null byte is only added to the backing memory, after the file data. Which in a lot of cases is already there due to aligning. This just guarantees that it's there so you can cast safely.

It would be better to have a complementary read_entire_file_as_cstring that zero pads and returns as cstring instead of []u8. And if you add two zero bytes, you can then also cheaply cast it to utf-16.

That's fine as well. I only modified read_entire_file as it was a small change.

I think this is probably a bad idea, especially to do by default. If anything, we should have a separate procedure as @Kelimion states like read_entire_file_as_cstring or something just have a buffer to read into with that NUL terminator.

I'm leaning toward the buffer for its flexibility. Something like read_file that takes a file descriptor and buffer and returns the number of bytes read and an OS error. Allows you to read into a buffer with 1 extra byte, two extra bytes, or even read several files into a larger buffer at an increasing offset to concatenate them.

Isn't that just read?

Kelimion · 2025-09-02T16:31:04Z

I'm leaning toward the buffer for its flexibility. Something like read_file that takes a file descriptor and buffer and returns the number of bytes read and an OS error. Allows you to read into a buffer with 1 extra byte, two extra bytes, or even read several files into a larger buffer at an increasing offset to concatenate them.

Isn't that just read?

read_entire_file loops on read because it may not read the entire file at once, and "read_file" would do the same and thus differ from plain read.

It's not a bad idea to make read_entire_file a procedure group with the original procedure, and a "read_file" or read_entire_file_into_buffer that also loops until read fails (with EOF or otherwise) or the buffer is full.

The returned data is unchanged, a null byte is only added to the backing memory, after the file data. Which in a lot of cases is already there due to aligning. This just guarantees that it's there so you can cast safely.

I'm curious to know what @gingerBill thinks about this distinction. I still feel like it should be its own procedure.

thetarnav · 2025-09-02T18:13:58Z

read_entire_file loops on read because it may not read the entire file at once, and "read_file" would do the same and thus differ from plain read.

It's not a bad idea to make read_entire_file a procedure group with the original procedure, and a "read_file" or read_entire_file_into_buffer that also loops until read fails (with EOF or otherwise) or the buffer is full.

So the input buffer param would be ^[dynamic]u8 or some large enough []u8?
read_entire_file uses both methods depending on if the file size can be obtained or not.
If it's ^[dynamic]u8 then I don't mind. Although at this point it might as well accept a string builder :D

Should read_entire_file_as_cstring return the size as well? I guess if I need a cstring to pass it to some c funciton, I don't need a size, but it could be useful if I also want to use the string somewhere else and the size calculated anyway.
The patched read_entire_file is nice enough to provide both.

gingerBill · 2025-09-04T09:37:08Z

Another issues: adding an extra NUL terminator does cause issues with things like the mem.Tracking_Allocator. So it'll result in many false positives in those cases.

thetarnav · 2025-09-14T10:16:44Z

I've changed it to add an explicit read_entire_file_as_cstring procedure, which seems to work fine with a tracking allocator.

Tetralux · 2025-09-17T19:59:49Z

One concern I have here is that if there are any null bytes in the middle of the file then this is stuffed.

Add read_entire_file_as_cstring to os2

816696e

thetarnav force-pushed the read_entire_file-null-terminated branch from 235825e to 816696e Compare September 14, 2025 10:11

thetarnav changed the title ~~Make read_entire_file buffer null terminated~~ Add read_entire_file_as_cstring to os2 Sep 14, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Add read_entire_file_as_cstring to os2 #5647

Add read_entire_file_as_cstring to os2 #5647

Uh oh!

thetarnav commented Sep 2, 2025

Uh oh!

Kelimion commented Sep 2, 2025 •

edited

Loading

Uh oh!

JesseRMeyer commented Sep 2, 2025

Uh oh!

gingerBill commented Sep 2, 2025

Uh oh!

Kelimion commented Sep 2, 2025

Uh oh!

thetarnav commented Sep 2, 2025

Uh oh!

Kelimion commented Sep 2, 2025 •

edited

Loading

Uh oh!

thetarnav commented Sep 2, 2025

Uh oh!

gingerBill commented Sep 4, 2025

Uh oh!

thetarnav commented Sep 14, 2025

Uh oh!

Tetralux commented Sep 17, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Uh oh!

Add read_entire_file_as_cstring to os2 #5647

Are you sure you want to change the base?

Add read_entire_file_as_cstring to os2 #5647

Uh oh!

Conversation

thetarnav commented Sep 2, 2025

Uh oh!

Kelimion commented Sep 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

JesseRMeyer commented Sep 2, 2025

Uh oh!

gingerBill commented Sep 2, 2025

Uh oh!

Kelimion commented Sep 2, 2025

Uh oh!

thetarnav commented Sep 2, 2025

Uh oh!

Kelimion commented Sep 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thetarnav commented Sep 2, 2025

Uh oh!

gingerBill commented Sep 4, 2025

Uh oh!

thetarnav commented Sep 14, 2025

Uh oh!

Tetralux commented Sep 17, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Kelimion commented Sep 2, 2025 •

edited

Loading

Kelimion commented Sep 2, 2025 •

edited

Loading