xsv table displays incorrectly #151
My guess is that this is happening because of the buffer size. That is, once the output exceeds the internal buffer size, the alignment resets. When I get a chance, I'll take a look at seeing what can be done to fix this. If there is a problem, it would need to be fixed in https://github.com/BurntSushi/tabwriter most likely. It's been a long time since I wrote that code, so I can't remember whether this is "by design" or not. |
I wasn't sure whether to file the bug here or there, but I don't really know how to write Rust, so I figured I'd let you know here. It limits the usefulness of (I really like xsv though, thanks for the work and thanks for looking at the issue so quickly) |
@llimllib A possible work-around, if you're on a Unix-like system and have the |
Hah, I went to |
Unfortunately the It seems like the "phase change" can occur in the middle of a line, and then the width of subsequent columns gets out of phase. Below it occurs in line 3, and in the lines after that the IP address column is extremely wide until the next phase change, when this repeats.
|
@ursetto You probably need to use There is nothing significant about when rows get out of alignment. It's not connected to the data. It's just an internal buffer size, at which point alignment is reset. This bug exists because the underlying implementation does not allow itself to use memory without bound. |
I'm guessing you are saying to replace
This seems to work fine as long as you can find a delimiter not in your data. |
I recently encountered this issue too. Having taken a look, @BurntSushi was right - the cause is an internal buffer being filled, then flushed to the underlying TabWriter, which causes the layout to break. The buffer in question is the internal buffer of
The effect of this flushing of TabWriter can be more easily demonstrated if we reduce the buffer size used by the CSV writer down here, from 32K to 14, then:
# 14th byte output to the tab writer is z
$ echo 'a,b\n1234,x\ny,z' | ./target/debug/xsv table
a b
1234 x
y z
# 14th byte output to the tab writer is the \t inserted after y
$ echo 'a,b\n12345,x\ny,z' | ./target/debug/xsv table
a b
12345 x
yz
# 14th byte output to the tab writer is y
$ echo 'a,b\n123456,x\ny,z' | ./target/debug/xsv table
a b
123456 x
y z
# 14th byte output to the tab writer is the \n after x
$ echo 'a,b\n1234567,x\ny,z' | ./target/debug/xsv table
a b
1234567 x
y z
Having made some tweaks in a local build, I found that this issue is resolved if the CSV writer does not flush the underlying writer (TabWriter) when its internal buffer becomes full, but simply writes the buffer out to the underlying writer, leaving it to flush as and when necessary. I'll raise an issue/PR against the csv library to do this. |
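To illustrate why a flush in the middle of the output breaks the layout, here is a minimal sketch using only the standard library. `ToyAligner` and `render` are hypothetical names invented for this example; this is not the tabwriter crate's code, only a toy that assumes the same basic idea: column widths are computed from whatever has been buffered since the previous flush. The chunks are split at row boundaries to keep the toy simple, whereas the real buffer can fill mid-line.

```rust
use std::io::{self, Write};

/// Toy stand-in for an aligning writer (hypothetical, not the tabwriter crate).
/// It holds tab-separated text until `flush`, then pads each column to the
/// widest cell seen since the previous flush.
struct ToyAligner<W: Write> {
    inner: W,
    pending: Vec<u8>,
}

impl<W: Write> ToyAligner<W> {
    fn new(inner: W) -> Self {
        ToyAligner { inner, pending: Vec::new() }
    }
}

impl<W: Write> Write for ToyAligner<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Nothing is emitted here; cells are only collected.
        self.pending.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        let text = String::from_utf8_lossy(&self.pending).into_owned();
        self.pending.clear();
        let rows: Vec<Vec<&str>> = text.lines().map(|l| l.split('\t').collect()).collect();

        // Column widths depend only on the rows buffered since the last flush.
        let mut widths: Vec<usize> = Vec::new();
        for row in &rows {
            for (i, cell) in row.iter().enumerate() {
                if i >= widths.len() {
                    widths.push(0);
                }
                widths[i] = widths[i].max(cell.len());
            }
        }

        for row in &rows {
            let mut line = String::new();
            for (i, cell) in row.iter().enumerate() {
                if i + 1 == row.len() {
                    line.push_str(cell); // last column is not padded
                } else {
                    line.push_str(&format!("{:<w$}", cell, w = widths[i] + 2));
                }
            }
            line.push('\n');
            self.inner.write_all(line.as_bytes())?;
        }
        self.inner.flush()
    }
}

/// Writes the chunks through a ToyAligner, optionally flushing after each one.
fn render(chunks: &[&str], flush_between: bool) -> String {
    let mut out = Vec::new();
    {
        let mut w = ToyAligner::new(&mut out);
        for chunk in chunks {
            w.write_all(chunk.as_bytes()).unwrap();
            if flush_between {
                w.flush().unwrap();
            }
        }
        w.flush().unwrap();
    }
    String::from_utf8(out).unwrap()
}

fn main() {
    let chunks = ["a\tb\n", "1234567\tx\n", "y\tz\n"];
    print!("{}", render(&chunks, false)); // one flush at the end: columns line up
    println!("--");
    print!("{}", render(&chunks, true)); // flush after every row: widths reset each time
}
```

With a single flush at the end, every row shares the width of the widest cell. With a flush after each row, each row is laid out on its own, which is the same kind of alignment reset reported in this issue.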
Thanks for the much more detailed investigation, @owst |
This fixes a bug where the CSV writer was erroneously flushing the underlying writer when its internal buffer was full. Instead, when the internal buffer is full, it should simply write the contents of the buffer to the underlying writer. There is no need to flush it. Indeed, this causes other problems such as those observed in BurntSushi/xsv#151. We are careful to ensure that calling `flush` explicitly still calls `flush` on the underlying writer. Fixes #173
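The behaviour change described in that commit message can be sketched roughly as follows. This is a standalone toy, not the csv crate's implementation: `CountingSink`, `ToyBufWriter`, `drain`, `count_flushes` and the `flush_when_full` switch are all hypothetical names made up for the illustration, and the capacity of 14 simply mirrors the reduced buffer size used earlier in this thread.

```rust
use std::io::{self, Write};

/// Hypothetical sink that records how many times `flush` is called on it.
#[derive(Default)]
struct CountingSink {
    bytes: Vec<u8>,
    flushes: usize,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.bytes.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        self.flushes += 1;
        Ok(())
    }
}

/// Toy buffered writer. `flush_when_full = true` models the old behaviour,
/// `false` models the fixed behaviour.
struct ToyBufWriter<W: Write> {
    inner: W,
    buf: Vec<u8>,
    cap: usize,
    flush_when_full: bool,
}

impl<W: Write> ToyBufWriter<W> {
    fn new(inner: W, cap: usize, flush_when_full: bool) -> Self {
        ToyBufWriter { inner, buf: Vec::with_capacity(cap), cap, flush_when_full }
    }

    /// Hands the buffered bytes to the inner writer without flushing it.
    fn drain(&mut self) -> io::Result<()> {
        self.inner.write_all(&self.buf)?;
        self.buf.clear();
        Ok(())
    }
}

impl<W: Write> Write for ToyBufWriter<W> {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        if self.buf.len() + data.len() > self.cap {
            self.drain()?;
            if self.flush_when_full {
                // Old behaviour: an extra flush the caller never asked for.
                self.inner.flush()?;
            }
        }
        self.buf.extend_from_slice(data);
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        // An explicit flush still reaches the inner writer in both modes.
        self.drain()?;
        self.inner.flush()
    }
}

/// Writes three rows and returns (flush count seen by the sink, bytes received).
fn count_flushes(flush_when_full: bool) -> (usize, Vec<u8>) {
    let mut w = ToyBufWriter::new(CountingSink::default(), 14, flush_when_full);
    for row in ["a\tb\n", "1234567\tx\n", "y\tz\n"] {
        w.write_all(row.as_bytes()).unwrap();
    }
    w.flush().unwrap();
    (w.inner.flushes, w.inner.bytes)
}

fn main() {
    println!("old behaviour:   {} flushes", count_flushes(true).0);
    println!("fixed behaviour: {} flushes", count_flushes(false).0);
}
```

In both modes the sink receives the same bytes; only the number of `flush` calls differs. That matters when the inner writer, like TabWriter here, treats each flush as the end of a layout block.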
While BurntSushi/rust-csv#174 fixes the underlying issue, I suspect it will be a bit of time before I get around to another |
Appreciate all your work, thanks for the update |
Thanks @BurntSushi! FYI to build from source locally I had to give a more-specific
diff --git a/Cargo.toml b/Cargo.toml
index 051fb3d..eea20da 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -30,7 +30,7 @@ opt-level = 3
 [dependencies]
 byteorder = "1"
 crossbeam-channel = "0.2.4"
-csv = "1"
+csv = "1.1"
 csv-index = "0.1.5"
 docopt = "1"
 filetime = "0.1"
|
@BurntSushi Any chance this will be fixed in the master branch sometime in the future? |
Given data of very regular format, `xsv table` has a phase change that is irregular. I've reduced it to a test csv where it happens around line 575.
The test file is available here: https://gist.githubusercontent.com/llimllib/91576d1fdbb1a0564932924af9313894/raw/3e716a3cd8aadd141a93e7683a5626876f5b548b/test.csv