Description
Current behavior 😯
The `gix-fs::fs snapshot::journey` test case was introduced in f93aa61 (#1884) along with the `into_owned_or_cloned` method that it tests. That test case fails intermittently.
This was first observed in #1892, but it is not related to the changes in that PR. It seems to happen occasionally across platforms, though so far I have observed failures only on Ubuntu and Windows, not on macOS. I suspect that this is merely because we have fewer macOS jobs than Ubuntu or Windows jobs, but I have not confirmed that this is why it hasn't happened on macOS. All failures observed so far have been on CI. I have not experimented with it locally.
Although #1892 makes a change to a Windows job that is one of the jobs where the failure was observed, the change is made for its future effect, and it does not change what software currently gets installed. More importantly, the observed failure in that job on Windows (as well as in another separate Windows job) is from before that PR. The failures observed in that PR have been on Ubuntu (and also in multiple jobs).
Besides being observed in that PR, I have observed failures on the main branch of my fork, where I have rerun jobs to check for it, and unintentionally in #1864 after a rebase that causes it to have f93aa61 in its history. It does not appear that failure is more or less likely on any branch compared to any other branch (for branches that have the test in their history).
The failures look like:
FAIL [ 0.005s] gix-fs::fs snapshot::journey
──── STDOUT: gix-fs::fs snapshot::journey
running 1 test
test snapshot::journey ... FAILED
failures:
failures:
snapshot::journey
test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 26 filtered out; finished in 0.00s
──── STDERR: gix-fs::fs snapshot::journey
thread 'snapshot::journey' panicked at gix-fs/tests/fs/snapshot.rs:31:5:
assertion `left == right` failed: it picks up the change
left: "content"
right: "change"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
The failing part is `gitoxide/gix-fs/tests/fs/snapshot.rs`, lines 29 to 31 in 0bf1d5b.
Specifically, that `assert_eq!` fails, due to the test observing the old contents `content`, rather than the new contents `change`.
Expected behavior 🤔
The test should pass. I don't know if the problem is the test or the code under test though. It seems like the test straightforwardly asserts that any write that is completed from the perspective of Rust code that runs in sequence with a subsequent access to the snapshot should be observed in that subsequent access. If so, then that suggests the bug could be in the code under test.
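To make the symptom concrete, here is a minimal sketch of how a cache keyed on a file's modification time can return stale contents after a completed write. It is an illustration under stated assumptions, not a diagnosis: `MtimeCache`, `write_with_mtime`, and `demo` are hypothetical names, this is not the `gix-fs` snapshot implementation, and the identical timestamp is forced with `File::set_modified` (Rust 1.75 or later) to simulate a filesystem whose timestamps are too coarse to tell two quick writes apart.

```rust
use std::fs::{self, File};
use std::io;
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime};

/// Hypothetical mtime-keyed cache; NOT the gix-fs snapshot type.
struct MtimeCache {
    path: PathBuf,
    seen_mtime: SystemTime,
    contents: String,
}

impl MtimeCache {
    fn load(path: &Path) -> io::Result<Self> {
        Ok(Self {
            path: path.to_owned(),
            seen_mtime: fs::metadata(path)?.modified()?,
            contents: fs::read_to_string(path)?,
        })
    }

    /// Re-read the file only if its modification time differs from the one seen last.
    fn refresh(&mut self) -> io::Result<&str> {
        let mtime = fs::metadata(&self.path)?.modified()?;
        if mtime != self.seen_mtime {
            self.contents = fs::read_to_string(&self.path)?;
            self.seen_mtime = mtime;
        }
        Ok(&self.contents)
    }
}

/// Write `data`, then force the modification time to `mtime` (simulates a coarse clock).
fn write_with_mtime(path: &Path, data: &str, mtime: SystemTime) -> io::Result<()> {
    fs::write(path, data)?;
    File::options().write(true).open(path)?.set_modified(mtime)
}

/// Returns what the cache reports after a same-mtime rewrite and after a later-mtime rewrite.
fn demo() -> io::Result<(String, String)> {
    let path = std::env::temp_dir().join(format!("mtime-cache-demo-{}", std::process::id()));
    let t = SystemTime::UNIX_EPOCH + Duration::from_secs(1_700_000_000);

    write_with_mtime(&path, "content", t)?;
    let mut cache = MtimeCache::load(&path)?;

    // Same timestamp as before: the rewrite is invisible to the cache.
    write_with_mtime(&path, "change", t)?;
    let stale = cache.refresh()?.to_owned();

    // A clearly later timestamp: the cache re-reads the file.
    write_with_mtime(&path, "change", t + Duration::from_secs(10))?;
    let fresh = cache.refresh()?.to_owned();

    fs::remove_file(&path)?;
    Ok((stale, fresh))
}

fn main() -> io::Result<()> {
    let (stale, fresh) = demo()?;
    println!("same mtime: {stale:?}, later mtime: {fresh:?}");
    Ok(())
}
```

In this sketch the first refresh still reports the old contents, which has the same shape as the `left: "content"`, `right: "change"` assertion failure above. Whether the real code fails for this reason is not established here.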
Git behavior
Not applicable.
Steps to reproduce 🕹
Running the test repeatedly seems to work to reproduce it, at least on CI. Usually the test passes but sometimes it fails.
Observed failures on CI so far, in the order they have been observed, have been:
- https://github.com/EliahKagan/gitoxide/actions/runs/13918804232/job/38947009079#step:10:1326 in `test-32bit` on `arm32v7`
- https://github.com/GitoxideLabs/gitoxide/actions/runs/13918804658/job/38948596656?pr=1892#step:8:1283 in `test-fast` on `ubuntu-latest`
- https://github.com/GitoxideLabs/gitoxide/actions/runs/13918804658/job/38954967812?pr=1892#step:10:1326 in `test-32bit` on `i386`
- https://github.com/EliahKagan/gitoxide/actions/runs/13913874365/job/38956318291#step:7:1373 in `test-fixtures-windows`
- https://github.com/EliahKagan/gitoxide/actions/runs/13913874365/job/38963529809#step:10:1324 in `test-32bit` on `arm32v7`
- https://github.com/EliahKagan/gitoxide/actions/runs/13913874365/job/38970448502#step:8:1333 in `test-fast` on `windows-latest`
- https://github.com/GitoxideLabs/gitoxide/actions/runs/13927061201/job/38974254636?pr=1864#step:8:1284 in `test-fast` on `ubuntu-24.04-arm`
I originally "reported" this in #1892 (comment) but I figured it should have an issue, since its connection to that PR is coincidental.
Activity
Byron commented on Mar 19, 2025
Thanks for reporting and collecting all available information, and of course, sorry for the hassle :/.
When I introduced the test I was lucky enough to believe it works consistently, but thinking about it, the implementation is flawed.
I think the issue is here: `gitoxide/gix-fs/tests/fs/snapshot.rs`, lines 7 to 9 in f93aa61.
And more specifically, here: `gitoxide/gix-fs/tests/fs/snapshot.rs`, lines 44 to 54 in f93aa61.
The comparison in line 53 would also be true if nanoseconds aren't supported and the time coincidentally switched over from one second to the next.
Fix is on the way.
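The flaw described above can be sketched as follows. The referenced lines are not reproduced in this thread, so `naive_seems_subsecond` is a hypothetical stand-in for "two modification times differ, so precision must be sub-second", and the two timestamps are constructed by hand to look like what a whole-second filesystem might report for writes made just before and just after a second boundary.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical stand-in for a "timestamps differ, so precision must be fine" check.
fn naive_seems_subsecond(first: SystemTime, second: SystemTime) -> bool {
    first != second
}

/// Sub-second part of a timestamp, in nanoseconds past the last whole second.
fn subsec_nanos(t: SystemTime) -> u32 {
    t.duration_since(UNIX_EPOCH).map_or(0, |d| d.subsec_nanos())
}

fn main() {
    // Assumed values: whole-second timestamps on either side of a second boundary.
    let first = UNIX_EPOCH + Duration::from_secs(1_700_000_000);
    let second = UNIX_EPOCH + Duration::from_secs(1_700_000_001);

    // The naive check accepts the pair even though neither timestamp
    // carries any sub-second information.
    println!("naive check: {}", naive_seems_subsecond(first, second));
    println!(
        "sub-second parts: {} and {}",
        subsec_nanos(first),
        subsec_nanos(second)
    );
}
```

Under these assumptions the check passes while both sub-second parts are zero, so a test guarded by such a check would occasionally run on a filesystem it was meant to be skipped on.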
improve detection of nanosecond support in `gix-fs` (#1896) #1897
Merge pull request #1897 from GitoxideLabs/fix-ci
EliahKagan commented on Mar 19, 2025
Which filesystems have nanosecond timestamps? If I understand you correctly, the test should be skipped on systems that don't have nanosecond timestamps but was sometimes wrongly not skipped on such systems. Failures occurred just about everywhere except macOS. Do all the non-macOS systems we test lack nanosecond timestamps?
Byron commented on Mar 19, 2025
It seemed so. If you take a look at the PR that fixed it, I also hope you will conclude that this actually is a fix.
I think despite what's written there, it could also be false-negative if the operation takes exactly 1s, or a multiple of a second. It can't be false-positive, I think (unless the filesystem is broken).
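One way to read the false-negative case is a check that looks at the sub-second part of the gap between two modification times. The sketch below assumes that reading; `gap_has_subsecond_part` is a hypothetical name and may not match the code in #1897. The third case goes beyond what is claimed above: it shows, as an assumption about coarse clocks, what such a check would report for timestamps that only ever land on whole milliseconds.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Hypothetical check: treat a non-zero sub-second part in the gap between two
/// modification times as evidence of fine-grained timestamps.
fn gap_has_subsecond_part(first: SystemTime, second: SystemTime) -> bool {
    second
        .duration_since(first)
        .map_or(false, |gap| gap.subsec_nanos() != 0)
}

fn main() {
    let base = UNIX_EPOCH + Duration::from_secs(1_700_000_000);

    // Whole-second timestamps, second boundary crossed: rejected, as intended.
    let whole_second = gap_has_subsecond_part(base, base + Duration::from_secs(1));

    // Fine-grained timestamps whose gap happens to be exactly 2s: false negative.
    let exact_multiple = gap_has_subsecond_part(
        base + Duration::from_nanos(123_456_789),
        base + Duration::from_nanos(2_123_456_789),
    );

    // Millisecond-only timestamps 3ms apart: accepted despite the coarse clock.
    let millis_only = gap_has_subsecond_part(base, base + Duration::from_millis(3));

    println!("{whole_second} {exact_multiple} {millis_only}");
}
```

A check of this shape distinguishes whole-second timestamps from finer ones, but by itself it cannot tell millisecond precision from nanosecond precision.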
EliahKagan commented on Mar 19, 2025
The same failure still occurs. Specifically, I rebased #1862 and #1864 onto main again, and the `test-fast` job on `windows-latest` failed on 720a23f in #1864.
The commit with the failure, 720a23f, has 2c9b214 (#1897) in its history, and contains the code that was intended as a fix: `gitoxide/gix-fs/tests/fs/snapshot.rs`, lines 53 to 56 in 720a23f.
So #1897 didn't have the intended effect, and this bug remains unfixed.
Edit: I've looked up the semantics of the operations performed in #1897, and also did some experiments with the function changed there and with some other code that is effectively an instrumented version of it, to measure how often it returns `true` on systems with only millisecond filesystem timestamp precision, as well as to examine the timestamps themselves and infer a likely mechanism. See #1897 (review) for details.
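An experiment of this general kind could look roughly like the sketch below. It is not the instrumentation used in that review: `Tally` and `sample_mtimes` are hypothetical names, and the sub-millisecond remainder test is only a heuristic for guessing a filesystem's timestamp granularity. The counts depend entirely on the filesystem and machine it runs on.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{SystemTime, UNIX_EPOCH};

/// Counts from one hypothetical sampling run.
#[derive(Debug, Default)]
struct Tally {
    samples: u32,
    /// Consecutive writes whose modification times differed.
    changed: u32,
    /// Timestamps carrying detail finer than one millisecond.
    finer_than_ms: u32,
}

/// Rewrite one file `samples` times and record what its modification times look like.
fn sample_mtimes(dir: &Path, samples: u32) -> io::Result<Tally> {
    let path = dir.join(format!("mtime-sample-{}", std::process::id()));
    let mut tally = Tally::default();
    let mut previous: Option<SystemTime> = None;

    for i in 0..samples {
        fs::write(&path, i.to_string())?;
        let mtime = fs::metadata(&path)?.modified()?;
        let nanos = mtime
            .duration_since(UNIX_EPOCH)
            .map_or(0, |d| d.subsec_nanos());

        tally.samples += 1;
        if previous.is_some_and(|p| p != mtime) {
            tally.changed += 1;
        }
        if nanos % 1_000_000 != 0 {
            tally.finer_than_ms += 1;
        }
        previous = Some(mtime);
    }

    fs::remove_file(&path)?;
    Ok(tally)
}

fn main() -> io::Result<()> {
    let tally = sample_mtimes(&std::env::temp_dir(), 1_000)?;
    println!(
        "samples: {}, changed: {}, finer than 1ms: {}",
        tally.samples, tally.changed, tally.finer_than_ms
    );
    Ok(())
}
```

If `finer_than_ms` stays at zero over many samples while `changed` is non-zero, that hints at millisecond or coarser timestamps, though it does not prove it.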