perf: Avoid iterating over already processed files with gcov #611

krlmlr · 2025-06-27T15:53:52Z

This is why covr runs so slowly on igraph. The change reduces the check time from over one hour (hitting the timeout) to 8 minutes. Other projects with many C/C++ files will also benefit.

https://github.com/igraph/rigraph/actions/runs/15930556545/job/44938446998

jimhester · 2025-06-27T16:48:08Z

tests/testthat/test-Compiled.R

  # This header contains a C++ template, which requires you to run gcov for
  # each object file separately and merge the results together.


I think the problem with this change is described by this comment: if headers are included in multiple files, they get tracked only once if you try to process all the object files at once, so you need to run gcov for each object file separately if you want accurate counting. This means that a test can exercise a header via one object file, and the header might still show up as uncovered depending on the order of the object files.

This is also why the coverage counts have now changed in the test.

krlmlr · 2025-06-27

I believe gcov is still run separately for each file. The only difference is that parse_gcov() (in R) is now run once for each file, and not n times for the first file and once for the last file (giving a total of O(n^2) runs). I admit the patch is a little difficult to read. What am I missing?

jimhester · 2025-06-27

For each compilation unit, you need to run gcov, then parse the results, then remove the output files. You can't just run gcov multiple times and then parse all the output files after the fact, because the output files get overwritten by each individual run of gcov and you lose coverage information.

It is possible that there is another, better way to do this, but this was the only way I found back when I originally wrote this code.
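The run, parse, remove cycle described above can be illustrated with a small simulation. This is only a sketch: it does not call gcov, `fake_gcov()` and the per-object line counts are made up, and it assumes (as stated above) that every run writes the same output file for a shared header.

```r
# Hypothetical per-object line counts for one shared header (made-up data).
runs <- list(
  a.o = c(line1 = 3, line2 = 0),
  b.o = c(line1 = 0, line2 = 5)
)

out_file <- tempfile(fileext = ".gcov")

# Stand-in for a gcov run: every run writes the same output file.
fake_gcov <- function(counts) saveRDS(counts, out_file)

# Variant 1: run, parse, and remove for each object, summing the counts.
merged <- c(line1 = 0, line2 = 0)
for (object in names(runs)) {
  fake_gcov(runs[[object]])
  merged <- merged + readRDS(out_file)
  unlink(out_file)
}

# Variant 2: run everything first and parse once at the end.
# Each run overwrites the previous output, so only the last run survives.
for (object in names(runs)) fake_gcov(runs[[object]])
parsed_late <- readRDS(out_file)
unlink(out_file)

merged       # line1 = 3, line2 = 5
parsed_late  # line1 = 0, line2 = 5
```

In this toy setup, parsing after every run keeps the hits recorded via `a.o`, while parsing only at the end reports `line1` as uncovered.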

krlmlr · 2025-06-27

Thanks. I thought I saw the runtime increase with each file (https://github.com/igraph/rigraph/actions/runs/15911464511/job/44880531901). Let me do some experiments and get back to you.

krlmlr · 2025-06-27T19:00:58Z

It seems that with `clean = FALSE`, we process more and more files with each new object. Possible solutions:

- Collect gcov artifacts in separate subdirectories
- Only look for changed files (could be unreliable with some file systems)
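The accumulation can be reproduced with a toy simulation. This is a sketch under assumptions, not covr's code: `fake_gcov()` and `count_parses()` are hypothetical helpers, no real gcov is invoked, and "parsing" is reduced to counting the output files found after each run.

```r
# Hypothetical stand-in for one gcov run: writes one output file per object.
fake_gcov <- function(object, out_dir) {
  writeLines("coverage data", file.path(out_dir, paste0(object, ".gcov")))
}

# Count how many output files would be parsed in total across all objects.
count_parses <- function(objects, clean) {
  out_dir <- tempfile("gcov-demo-")
  dir.create(out_dir)
  on.exit(unlink(out_dir, recursive = TRUE), add = TRUE)

  parsed <- 0L
  for (object in objects) {
    fake_gcov(object, out_dir)
    outputs <- list.files(out_dir, pattern = "\\.gcov$", full.names = TRUE)
    parsed <- parsed + length(outputs)  # stand-in for parsing each file
    if (clean) unlink(outputs)          # leftovers are re-parsed otherwise
  }
  parsed
}

objects <- paste0("file", 1:10, ".o")
count_parses(objects, clean = TRUE)   # 10
count_parses(objects, clean = FALSE)  # 55, i.e. 10 * 11 / 2
```

With n objects, the uncleaned variant parses n(n+1)/2 files in this model, which matches the quadratic growth discussed in the review thread. Either proposed solution would bring the count back to n by keeping earlier outputs out of the listing.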


krlmlr · 2025-06-27T20:23:10Z

Now:

- Entirely running inside `src/`, shorter path names
- Moving generated files out of sight with `clean = FALSE`
- More verbosity

krlmlr · 2025-06-29T18:58:28Z

- No more changes to test files
- Example run (12 minutes instead of >1 hour): https://github.com/igraph/rigraph/actions/runs/15950882297/job/44990920986#step:9:8982

mcol · 2025-07-01T22:17:40Z

R/compiled.R

@@ -72,27 +72,56 @@ run_gcov <- function(path, quiet = TRUE, clean = TRUE,
     return()
  }

-  gcov_inputs <- list.files(path, pattern = rex::rex(".gcno", end), recursive = TRUE, full.names = TRUE)
+  res <- withr::local_dir(src_path)


This seems unused?

krlmlr mentioned this pull request Jun 27, 2025

Extend file_coverage to allow faster C file coverage? #605

Open

krlmlr requested a review from jimhester June 27, 2025 16:19

jimhester reviewed Jun 27, 2025


krlmlr changed the title ~~fix: Avoid iterating over already processed files with gcov~~ perf: Avoid iterating over already processed files with gcov Jun 27, 2025

perf: Avoid accumulation of work for each output file with `clean = FALSE`

4969c8d

krlmlr force-pushed the b-painter branch from 08946da to 4969c8d Compare June 27, 2025 20:21

Cleaner output

1a15fb3

mcol reviewed Jul 1, 2025


Unused

03f917d
