Running clj-kondo with --parallel sometimes produces different caches #2391

Open
1 task done
mrkam2 opened this issue Sep 5, 2024 · 9 comments

@mrkam2
Contributor

mrkam2 commented Sep 5, 2024

  • I have read the Clojure etiquette and will respect it when communicating on this platform.

version

2023.12.15

platform

Linux

problem

When we run clj-kondo in CI it occasionally produces linting flakes.

Here are some examples:
error: Unresolved symbol: db
error: Unresolved symbol: _

In CI, we start with a clean copy of our repository (we remove all untracked files with git clean -fdqx) and then run

clj-kondo --lint <classpath> --copy-configs --dependencies --parallel --config-dir <config-dir>

and then

clj-kondo --lint <paths> --config-dir <config-dir>

With this setup, we get false linter warnings on unchanged code roughly once every 200 runs. I collected the clj-kondo cache from those failing runs and compared it to the cache from a run without failures. Here are a few examples of the differences:

$ git diff clj-kondo-cache-30891054 clj-kondo-cache-30920595
diff --git a/clj-kondo-cache-30891054/a.b.c.db-test.transit.json b/clj-kondo-cache-30920595/a.b.c.db-test.transit.json
index 31a8539..3505d60 100755
--- a/clj-kondo-cache-30891054/a.b.c.db-test.transit.json
+++ b/clj-kondo-cache-30920595/a.b.c.db-test.transit.json
@@ -1 +1 @@
-["^ ","~$db-target->model-target",["^ ","~:fixed-arities",["~#set",[1]],"~:private",true,"~:ns","~$a.b.c.db-test","~:name","^0","~:type","~:fn","~:col",1,"~:top-ns","^5","~:row",12],"~$db-representation->model-representation",["^ ","^1",["^2",[1]],"^3",true,"^4","^5","^6","^<","^7","^8","^9",1,"^:","^5","^;",18],"~$model-target->db-target",["^ ","^1",["^2",[1]],"^3",true,"^4","^5","^6","^=","^7","^8","^9",1,"^:","^5","^;",26],"~$model-representation->db-representation",["^ ","^1",["^2",[1]],"^3",true,"^4","^5","^6","^>","^7","^8","^9",1,"^:","^5","^;",31],"~$remove!-test",["^ ","^;",38,"^9",1,"^1",["^2",[1]],"^6","^?","^4","^5","^:","^5","^7","^8"],"~$batch-upsert!-test",["^ ","^;",87,"^9",1,"^1",["^2",[1]],"^6","^@","^4","^5","^:","^5","^7","^8"],"~:filename","a/b/c/test/a/b/c/db_test.clj"]
\ No newline at end of file
+["^ ","~$db-target->model-target",["^ ","~:fixed-arities",["~#set",[1]],"~:private",true,"~:ns","~$a.b.c.db-test","~:name","^0","~:type","~:fn","~:col",1,"~:top-ns","^5","~:row",12],"~$db-representation->model-representation",["^ ","^1",["^2",[1]],"^3",true,"^4","^5","^6","^<","^7","^8","^9",1,"^:","^5","^;",18],"~$model-target->db-target",["^ ","^1",["^2",[1]],"^3",true,"^4","^5","^6","^=","^7","^8","^9",1,"^:","^5","^;",26],"~$model-representation->db-representation",["^ ","^1",["^2",[1]],"^3",true,"^4","^5","^6","^>","^7","^8","^9",1,"^:","^5","^;",31],"~:filename","a/b/c/test/a/b/c/db_test.clj"]
\ No newline at end of file
$ git diff clj-kondo-cache-30891054 clj-kondo-cache-30900245
diff --git a/clj-kondo-cache-30891054/d.jobs.db-test.transit.json b/clj-kondo-cache-30900245/d.jobs.db-test.transit.json
index aae1ca4..514ff9b 100755
--- a/clj-kondo-cache-30891054/d.jobs.db-test.transit.json
+++ b/clj-kondo-cache-30900245/d.jobs.db-test.transit.json
@@ -1 +1 @@
-["^ ","~$job-id",["^ ","~:row",12,"~:col",1,"~:name","^0","~:ns","~$d.jobs.db-test","~:top-ns","^5","~:type","~:string"],"~$job-status-tests",["^ ","^1",14,"^2",1,"~:fixed-arities",["~#set",[1]],"^3","^9","^4","^5","^6","^5","^7","~:fn"],"~$add-failure-test",["^ ","^1",33,"^2",1,"^:",["^;",[1]],"^3","^=","^4","^5","^6","^5","^7","^<"],"~:filename","d/clj/jobs/test/d/jobs/db_test.clj"]
\ No newline at end of file
+["^ ","~$job-id",["^ ","~:row",12,"~:col",1,"~:name","^0","~:ns","~$d.jobs.db-test","~:top-ns","^5","~:type","~:string"],"~:filename","d/clj/jobs/test/d/jobs/db_test.clj"]
\ No newline at end of file
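
Raw transit diffs like the ones above are hard to scan, so a small helper that lists only the var entries missing from one cache file may help. This is a minimal sketch in POSIX sh, not part of clj-kondo: `cache_vars` and `missing_vars` are hypothetical names, and the pattern relies on the assumption (taken from the files shown here) that every top-level var entry looks like `"~$name",["^ ",...`.

```shell
# Heuristic: a symbol key ("~$...") immediately followed by a transit map
# marker (["^ ") is taken to be a var entry. Nested symbol-keyed maps,
# if any exist, would be matched as well.
cache_vars() {
  grep -o '"~\$[^"]*",\["\^ "' "$1" |
    sed -e 's/^"~\$//' -e 's/",\["\^ "$//' |
    LC_ALL=C sort
}

# Print the vars present in the first cache file but absent from the second.
missing_vars() {
  a=$(mktemp) && b=$(mktemp) || return 1
  cache_vars "$1" > "$a"
  cache_vars "$2" > "$b"
  LC_ALL=C comm -23 "$a" "$b"
  rm -f "$a" "$b"
}
```

Called as `missing_vars <cache-a>/a.b.c.db-test.transit.json <cache-b>/a.b.c.db-test.transit.json`, it prints one var name per line. In the first diff above, the entries missing from the second cache are `remove!-test` and `batch-upsert!-test`.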

I also found that if I remove the --parallel switch from the first command, linting is stable: no failures in over 1500 repeats (whereas with --parallel it fails after 100-200 repeats).
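
For anyone trying to reproduce this, the repeated runs can be driven by a small loop. The sketch below is an assumption about what such a harness might look like, not the actual CI job: `repeat_until_failure` and `lint_once` are hypothetical names, and `<classpath>`, `<paths>` and `<config-dir>` are left as placeholders exactly as in the commands above.

```shell
# Hypothetical helper: run a command up to MAX times and stop at the first
# non-zero exit. The command's own stdout is sent to stderr so that the only
# thing printed on stdout is the 1-based number of the failing run.
# Returns 1 on the first failure, 0 if every run passed.
repeat_until_failure() {
  max=$1
  shift
  i=1
  while [ "$i" -le "$max" ]; do
    if ! "$@" 1>&2; then
      echo "$i"
      return 1
    fi
    i=$((i + 1))
  done
  return 0
}

# Usage sketch (placeholders as in the report, so left commented out):
#   lint_once() {
#     git clean -fdqx
#     clj-kondo --lint <classpath> --copy-configs --dependencies --parallel --config-dir <config-dir>
#     clj-kondo --lint <paths> --config-dir <config-dir>
#   }
#   repeat_until_failure 2000 lint_once || echo "flake reproduced"
```

When a run fails, the cache directory would need to be copied out of the working tree before the next iteration, since `git clean -fdqx` removes untracked files.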

repro

I can't provide an exact repro because this was observed in a closed repository.

expected behavior

No flakes; the cache should be identical on every run.

@borkdude
Member

borkdude commented Sep 5, 2024

Can you try with the newest version? Preferably the version on master since I fixed something post-release.

E.g. use version 2024.08.30-20240905.181846-5

@mrkam2
Contributor Author

mrkam2 commented Sep 6, 2024

Thanks, I started my test job to see if it reproduces.

@mrkam2
Contributor Author

mrkam2 commented Sep 6, 2024

It still failed after 282 repeats. However, a second instance of the same job has been running for 873 repeats without a failure. So overall, there is 1 failure in 1155 repeats.

The diff looks similar:

$ git diff clj-kondo-cache-30891054 clj-kondo-cache-31098793
diff --git a/clj-kondo-cache-30891054/test.transit.json b/clj-kondo-cache-31098793/test.transit.json
index 106308f..47164ad 100755
--- a/clj-kondo-cache-30891054/test.transit.json
+++ b/clj-kondo-cache-31098793/test.transit.json
@@ -1 +1 @@
-["^ ","~$configure-test-app!",["^ ","~:row",24,"~:col",1,"~:fixed-arities",["~#set",[1]],"~:name","^0","~:ns","~$test","~:top-ns","^7","~:type","~:fn"],"~$post-http*",["^ ","^1",56,"^2",1,"~:varargs-min-arity",2,"^5","^;","^6","^7","^8","^7","^9","^:"],"~$mock-ms",["^ ","^1",30,"^2",1,"^3",["^4",[1]],"^5","^=","^6","^7","^8","^7","~:arities",["^ ","~i1",["^ ","~:ret","~:number"]],"^9","^:"],"~$get-http*",["^ ","^1",61,"^2",1,"^<",2,"^5","^A","^6","^7","^8","^7","^9","^:"],"~$run!-mock",["^ ","^1",37,"^2",1,"^<",0,"^5","^B","^6","^7","^8","^7","^>",["^ ","~:varargs",["^ ","^?","~:seq","~:min-arity",0]],"^9","^:"],"~$h-r-m",["^ ","^1",127,"^2",1,"^3",["^4",[1]],"^5","^F","^6","^7","^8","^7","^9","^:"],"~$run!-mock-rm",["^ ","^1",42,"^2",1,"^<",0,"^5","^G","^6","^7","^8","^7","^>",["^ ","^C",["^ ","^?","^D","^E",0]],"^9","^:"],"~$getting-prod-run-id",["^ ","^1",343,"^2",1,"^3",["^4",[1]],"^5","^H","^6","^7","^8","^7","^9","^:"],"~$parse-route",["^ ","^1",49,"^2",1,"^3",["^4",[1]],"^5","^I","^6","^7","^8","^7","^>",["^ ","~i1",["^ ","^?","~:string"]],"^9","^:"],"~$hypertuning",["^ ","^1",66,"^2",1,"^3",["^4",[1]],"^5","^K","^6","^7","^8","^7","^9","^:"],"~:filename","m/core_test.clj","~$h-with-past-results",["^ ","^1",178,"^2",1,"^3",["^4",[1]],"^5","^M","^6","^7","^8","^7","^9","^:"],"~$build-mock-result",["^ ","^1",170,"^2",1,"^3",["^4",[1]],"^5","^N","^6","^7","^8","^7","^>",["^ ","~i1",["^ ","^?",["^ ","^9","~:map","~:val",["^ ","~:context_parameters",["^ ","^1",171,"~:end-row",172,"^2",24,"~:end-col",48,"~:tag",["^ ","^9","^O","^P",["^ ","~:mversion",["^ ","^1",171,"^R",171,"^2",40,"^S",72],"~:end-date",["^ ","^1",172,"^R",172,"^2",35,"^S",47,"^T","^J"]]]],"~:hp",["^ ","^1",173,"^R",173,"^2",21,"^S",58,"^T",["^ ","^9","^O","^P",["^ ","~:l2",["^ ","^1",173,"^R",173,"^2",26,"^S",28],"~:lr",["^ ","^1",173,"^R",173,"^2",44,"^S",57]]]],"~:metrics",["^ ","^1",174,"^R",175,"^2",13,"^S",73,"^T",["^ ","^9","^O","^P",["^ ","~:im",["^ ","^1",174,"^R",175,"^2",32,"^S",72]]]],"~:r_config",["^ ","^1",176,"^R",176,"^2",27,"^S",36]]]]],"^9","^:"],"~$h-a-prod-m",["^ ","^1",248,"^2",1,"^3",["^4",[1]],"^5","^11","^6","^7","^8","^7","^9","^:"]]
\ No newline at end of file
+["^ ","~$configure-test-app!",["^ ","~:row",24,"~:col",1,"~:fixed-arities",["~#set",[1]],"~:name","^0","~:ns","~$test","~:top-ns","^7","~:type","~:fn"],"~$post-http*",["^ ","^1",56,"^2",1,"~:varargs-min-arity",2,"^5","^;","^6","^7","^8","^7","^9","^:"],"~$mock-ms",["^ ","^1",30,"^2",1,"^3",["^4",[1]],"^5","^=","^6","^7","^8","^7","~:arities",["^ ","~i1",["^ ","~:ret","~:number"]],"^9","^:"],"~$get-http*",["^ ","^1",61,"^2",1,"^<",2,"^5","^A","^6","^7","^8","^7","^9","^:"],"~$run!-mock",["^ ","^1",37,"^2",1,"^<",0,"^5","^B","^6","^7","^8","^7","^>",["^ ","~:varargs",["^ ","^?","~:seq","~:min-arity",0]],"^9","^:"],"~$run!-mock-rm",["^ ","^1",42,"^2",1,"^<",0,"^5","^F","^6","^7","^8","^7","^>",["^ ","^C",["^ ","^?","^D","^E",0]],"^9","^:"],"~$parse-route",["^ ","^1",49,"^2",1,"^3",["^4",[1]],"^5","^G","^6","^7","^8","^7","^>",["^ ","~i1",["^ ","^?","~:string"]],"^9","^:"],"~:filename","m/core_test.clj","~$build-mock-result",["^ ","^1",170,"^2",1,"^3",["^4",[1]],"^5","^J","^6","^7","^8","^7","^>",["^ ","~i1",["^ ","^?",["^ ","^9","~:map","~:val",["^ ","~:context_parameters",["^ ","^1",171,"~:end-row",172,"^2",24,"~:end-col",48,"~:tag",["^ ","^9","^K","^L",["^ ","~:mversion",["^ ","^1",171,"^N",171,"^2",40,"^O",72],"~:end-date",["^ ","^1",172,"^N",172,"^2",35,"^O",47,"^P","^H"]]]],"~:hp",["^ ","^1",173,"^N",173,"^2",21,"^O",58,"^P",["^ ","^9","^K","^L",["^ ","~:l2",["^ ","^1",173,"^N",173,"^2",26,"^O",28],"~:lr",["^ ","^1",173,"^N",173,"^2",44,"^O",57]]]],"~:metrics",["^ ","^1",174,"^N",175,"^2",13,"^O",73,"^P",["^ ","^9","^K","^L",["^ ","~:im",["^ ","^1",174,"^N",175,"^2",32,"^O",72]]]],"~:r_config",["^ ","^1",176,"^N",176,"^2",27,"^O",36]]]]],"^9","^:"]]
\ No newline at end of file

It looks like the newer version is a bit more stable.

@borkdude
Member

borkdude commented Sep 6, 2024

Can you attach both transit files? I'll convert them to EDN and look at the diff. Perhaps it helps, not sure.

@mrkam2
Contributor Author

mrkam2 commented Sep 9, 2024

Update on the run that was "still running without failures for 873 repeats": it also failed, on the 1008th attempt. Surprisingly, in this case there were no differences in the cache. There were failures in three files, all related to the use of the context and GET macros, which are defined and configured like this:

(defmacro context
  {:clj-kondo/lint-as 'compojure.core/context}
  [path args & routes]
  `(#'make-context
    ~path
    ~(#'compojure/context-route path)
    (fn [request#]
      (compojure/let-request [~args request#]
        (compojure/routes ~@routes)))))

(defmacro GET
  "Generate a `GET` route."
  {:clj-kondo/lint-as 'compojure.core/GET}
  [path args & body]
  (#'metrics-compile-route :get path args body))

This made me wonder whether the problem is related to using compojure.core vars as targets for :clj-kondo/lint-as. I recall that these are not correct targets for the lint-as configuration, so I'll try removing them.

@mrkam2
Contributor Author

mrkam2 commented Sep 9, 2024

For the previous three failures, which did result in different cache files, I realized that all of them are related to macros that look like this:

(defmacro defdbtest
  {:clj-kondo/lint-as 'clojure.core/defn}
  [name binding & forms]
  (let [db-sym (first binding)]
    `(clojure.test/deftest ~name
       (some-setup ~db-sym ~test-db-name
         ~@forms))))

Overall, there seem to be several causes for the flakes I'm seeing, but they all manifest in the same way. To pinpoint the problem, I might need to inspect some of clj-kondo's internal state across runs, for example by capturing the results of the :analyze run and comparing them to the expected values. I would do this for every run, not only the failing ones, because clj-kondo may be arriving at a different internal state on some runs without that resulting in any reported error. Thoughts?
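
One cheap way to check every run, not only the failing ones, would be to fingerprint the whole cache directory after each run and compare it with a known-good baseline. The following is a minimal sketch under the assumption that the cache is byte-for-byte reproducible on good runs, which is what this report expects but has not confirmed. `cache_fingerprint` is a hypothetical helper name; `cksum` is used only because it is POSIX, and a stronger hash could be swapped in.

```shell
# Print a single checksum line summarising every file under DIR.
# Both relative paths and file contents feed into the result, so a missing,
# extra, or modified cache file changes the fingerprint.
cache_fingerprint() {
  (
    cd "$1" || exit 1
    find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      cksum "$f"
    done | cksum
  )
}
```

Logging this value on every run would show whether the cache also varies on runs that report no lint errors.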

@borkdude
Member

I don't have any thoughts so far. Is your suggestion that it might have something to do with inline configs? That could be the case. Thanks for the additional information so far.

Deraen pushed a commit to Deraen/clj-kondo that referenced this issue Sep 24, 2024
@borkdude
Member

You could try removing the inline configs (or just change the keyword to :clj-kondo/lint-asx or so) and see if that makes a difference. You could also move this configuration to your local .clj-kondo/config.edn file and see if that stabilises things. If so, then we know where to look. I looked at the code for inline configs, but nothing immediately stands out to me as thread-unsafe.
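
The keyword-renaming experiment could be scripted roughly as follows. This is only a sketch: `disable_inline_lint_as` is a hypothetical helper name, it assumes the inline key is always written literally as `:clj-kondo/lint-as`, and it edits files in place, so it is best run on a scratch checkout.

```shell
# Rename :clj-kondo/lint-as to :clj-kondo/lint-asx in every Clojure source
# file under DIR so that the inline config is no longer recognised.
# sed -i.bak is used because that form works with both GNU and BSD sed;
# the backup file is removed after a successful edit.
disable_inline_lint_as() {
  find "$1" -type f \( -name '*.clj' -o -name '*.cljc' -o -name '*.cljs' \) |
    while IFS= read -r f; do
      sed -i.bak \
        -e 's|:clj-kondo/lint-as\([[:space:]]\)|:clj-kondo/lint-asx\1|g' \
        -e 's|:clj-kondo/lint-as$|:clj-kondo/lint-asx|' "$f" &&
        rm -f "$f.bak"
    done
}
```

The patterns only match the key when it is followed by whitespace or the end of the line, so running the helper twice leaves already renamed keys untouched.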

@mrkam2
Contributor Author

mrkam2 commented Sep 27, 2024

Thanks for looking into it. For now, we'll be running it without --parallel.
