emmylua_check produces nondeterministic diagnostics across repeated runs on an unchanged workspace
Environment:
- emmylua_check 0.23.1
- Linux
- text output mode
- same workspace, same config, no file changes between runs
Running emmylua_check repeatedly on an unchanged workspace produces different warning counts and different per-file diagnostic sections.
Example from three consecutive runs on the same tree:
- run 1: 70 warnings, 2 hints
- run 2: 72 warnings, 2 hints
- run 3: 63 warnings, 2 hints
This is not just output ordering. Entire diagnostic sections for some files appear/disappear between runs.
Expected behavior: identical diagnostics on repeated runs against the same workspace/config
Actual behavior: warning counts change between runs, and diagnostics for whole files can appear in one run and disappear in the next.
How to reproduce:
Run emmylua_check multiple times against the same unchanged workspace.
Compare the outputs.
Example:
for i in 1 2 3; do
    emmylua_check -c .emmyrc.json src > "run-$i.txt" 2>&1 || true
done
sha256sum run-*.txt
diff -u run-1.txt run-2.txt
diff -u run-2.txt run-3.txt
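To back up the claim that this is more than output ordering, the comparison can be made order-insensitive by sorting each run's lines before hashing. The following is only a rough sketch: count_distinct_outputs is a hypothetical helper of mine, not part of emmylua_check, and it assumes sha256sum and a POSIX shell are available. Sorting drops the per-file grouping, so it can only show that the set of emitted lines differs, not which file section changed.

```shell
# Hypothetical helper (not part of emmylua_check): run a command N times and
# print how many distinct outputs it produced, ignoring line order.
# The body is a subshell "( ... )" so its variables do not leak to the caller.
count_distinct_outputs() (
    runs="$1"
    shift
    i=1
    while [ "$i" -le "$runs" ]; do
        # Merge stderr into stdout and ignore the exit status, as in the loop above.
        # Sorting first makes runs that differ only in ordering hash identically.
        { "$@" 2>&1 || true; } | LC_ALL=C sort | sha256sum
        i=$((i + 1))
    done | sort -u | wc -l | tr -d ' '
)

# Only invoke the checker if it is actually installed on this machine.
if command -v emmylua_check >/dev/null 2>&1; then
    echo "distinct outputs over 5 runs: $(count_distinct_outputs 5 emmylua_check -c .emmyrc.json src)"
fi
```

A result greater than 1 means at least two runs emitted different sets of lines, which cannot be explained by reordering alone.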
What I already ruled out:
- this is not caused by my wrapper script; the wrapper only execs:
    emmylua_check -c .emmyrc.json src
- the workspace was unchanged between runs
- setting TOKIO_WORKER_THREADS=1 reduced the spread but did not eliminate it
Why I suspect an internal race/order-dependence:
emmylua_check diagnoses files concurrently from a shared analysis object:
for file_id in need_check_files.clone() {
    let sender = sender.clone();
    let analysis = analysis.clone();
    tokio::spawn(async move {
        let cancel_token = CancellationToken::new();
        let diagnostics = analysis.diagnose_file(file_id, cancel_token);
        sender.send((file_id, diagnostics)).await.unwrap();
    });
}
So each file is diagnosed in its own task, all sharing the same Arc. The observed behavior looks like either:
- a race in shared lazy caches, or
- order-dependent analysis results across files
It would be useful to know whether diagnose_file is intended to be safely parallel on a shared analysis instance, or whether this should be serialized.
A --jobs 1 / serial-diagnosis mode would also help as a workaround if full determinism is not guaranteed today.