Skip to content

feat: finer dependency analysis between libraries (#4572)#14021

Closed
robinbb wants to merge 10 commits intoocaml:mainfrom
robinbb:robinbb-issue-4572
Closed

feat: finer dependency analysis between libraries (#4572)#14021
robinbb wants to merge 10 commits intoocaml:mainfrom
robinbb:robinbb-issue-4572

Conversation

@robinbb
Copy link
Copy Markdown
Collaborator

@robinbb robinbb commented Apr 3, 2026

Implements per-module library dependency filtering as proposed in #4572. When
library A depends on library B, Dune currently assumes every module in A
depends on every module in B — causing full recompilation of A whenever any
module in B changes. This PR uses ocamldep's per-module output to declare
cross-library file dependencies only for libraries a module actually
references.

This PR builds on prerequisite PRs:

Depends on #14030 and #14129.

The changes in this PR (on top of the above):

  • A new Lib_index (in lib_file_deps.ml) maps library entry module names
    back to their libraries. It is computed once per stanza and stored in
    Compilation_context.t.
  • read_immediate_deps_raw_of reads raw ocamldep output (including
    cross-library references that were previously silently dropped), filters
    out stanza-internal names, and looks up the remainder in the index to
    determine which libraries are actually needed.
  • Conservative fallbacks ensure correctness: special module kinds, link-time
    generated modules, and unresolvable references fall back to all-library
    deps.
  • Baseline test expectations are updated to show the improved behaviour.

Resolves #4572

@robinbb robinbb force-pushed the robinbb-issue-4572 branch 2 times, most recently from 13c8e53 to 618ea7f Compare April 3, 2026 03:40
@rgrinberg
Copy link
Copy Markdown
Member

Thanks for working on this. I expect users are going to be quite pleased to have this feature. Some preliminary questions:

Includes.make no longer bundles Hidden_deps with -I flags. Include
flags remain shared across all modules in a stanza; file dependencies are
now declared per-module in build_cm.

Why are the include flags are needed if we don't expect the library to be used by the module?

For each module, read_immediate_deps_raw_of reads the raw ocamldep output
(including cross-library references that were previously silently dropped),
filters out stanza-internal names, and looks up the remainder in the index
to determine which libraries are actually needed.

Do we have tests for when an internal module name is shadowing a library?

Conservative fallbacks ensure correctness: alias and wrapped-compat modules
skip library deps entirely; special module kinds, link-time generated
modules, and unresolvable references fall back to all-library deps.

Is it possible to move these changes to a separate PR and rebase this work on top of that?

@robinbb robinbb force-pushed the robinbb-issue-4572 branch 3 times, most recently from 7373af6 to 1892da3 Compare April 3, 2026 22:31
@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 3, 2026

Is it possible to move these changes to a separate PR and rebase this work on top of that?

@rgrinberg I have tried to do so with this PR #14030, depending on what you really meant by "these changes". :-)

@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 3, 2026

Thanks for working on this. I expect users are going to be quite pleased to have this feature.

You're very welcome, and thank you for your prompt attention to it!

@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 1892da3 to 76a7875 Compare April 3, 2026 22:47
@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 3, 2026

Do we have tests for when an internal module name is shadowing a library?

I guess not. I have made #14031 for this. Please have a look.

Comment thread src/dune_rules/module_compilation.ml Outdated
@robinbb robinbb force-pushed the robinbb-issue-4572 branch 5 times, most recently from c2b7f96 to e22e456 Compare April 6, 2026 17:06
@nojb
Copy link
Copy Markdown
Collaborator

nojb commented Apr 6, 2026

Thanks for working on this patch @robinbb!

I tried this in our codebase by adding a new (unused) module to a library which is close to the root of the dependency tree, and doing dune build. Then I modified the unused module and redid dune build. At this point I expected for there not to be any recompilation at all, since the modified module is not used anywhere in the codebase. However, I observe some recompilation of modules (in other libraries) which do not depend on it. The libraries involved are all unwrapped.

Do you have any insight as to what may be causing the extra recompilation?

@Alizter Alizter requested review from Alizter and art-w April 6, 2026 17:22
@nojb
Copy link
Copy Markdown
Collaborator

nojb commented Apr 6, 2026

Do you have any insight as to what may be causing the extra recompilation?

I think one case where unexpected extra recompilation is triggered is when there a library consisting of a single module. If memory serves there is an optimization that avoids calling ocamldep in this case, but that may be defeating the optimization in this patch.

@Alizter
Copy link
Copy Markdown
Collaborator

Alizter commented Apr 6, 2026

Transparent module aliases hide library references from ocamldep when a module re-exports a library wrapper via module M = Mylib, ocamldep will miss the dependency on Mylib.

Here is the regression:


  $ cat > dune-project << EOF
  > (lang dune 3.0)
  > EOF
  $ mkdir lib && cat > lib/dune << EOF
  > (library (name mylib))
  > EOF
  $ cat > lib/mylib.ml << EOF
  > let v = 42
  > EOF
  $ cat > lib/mylib.mli << EOF
  > val v : int
  > EOF
  $ cat > dune << EOF
  > (executable (name main) (libraries mylib))
  > EOF
  $ cat > re.ml << EOF
  > module M = Mylib
  > EOF
  $ cat > re.mli << EOF
  > module M = Mylib
  > EOF
  $ cat > main.ml << EOF
  > let () = print_int Re.M.v
  > EOF

  $ dune build ./main.exe

Change mylib's interface:

  $ cat > lib/mylib.mli << EOF
  > val v : int
  > val w : string
  > EOF
  $ cat > lib/mylib.ml << EOF
  > let v = 42
  > let w = ""
  > EOF

The incremental build must succeed:

  $ dune build ./main.exe
  File "_none_", line 1:
  Error: Files .main.eobjs/native/dune__exe__Main.cmx and lib/mylib.cmxa
         make inconsistent assumptions over interface Mylib
  [1]

This probably also occurs for any module from Mylib not just the wrapper, so affects unwrapped, wrapped and singleton buildable stanzas. Before this change, the coarse glob would handle this unknowingly.

@nojb
Copy link
Copy Markdown
Collaborator

nojb commented Apr 6, 2026

Do you have any insight as to what may be causing the extra recompilation?

I think one case where unexpected extra recompilation is triggered is when there a library consisting of a single module. If memory serves there is an optimization that avoids calling ocamldep in this case, but that may be defeating the optimization in this patch.

Continuing to investigate this, an .mli in a library A containing only

val foo: Buffer.t

triggers the recompilation of the .cmi every time when a module in a library on which A depends on is changed. From a distance this looks like some missing logic to handle dependencies on stdlib modules.

@robinbb robinbb force-pushed the robinbb-issue-4572 branch 4 times, most recently from 4ff661e to eecf3ac Compare April 10, 2026 06:14
@art-w
Copy link
Copy Markdown
Collaborator

art-w commented Apr 10, 2026

Very cool work indeed!

Is you new PR #14116 superseding this one?

Either way, I think it's worth spending more time to improve the test suite, because it's easy to accidentally rely on subtle behaviors that are more permissive than the general case. For example, the -opaque flag makes it possible to skip some recompilations on benign changes, even though these recompilations would normally be required.

Here's a small reproduction which should help debug the issue where a library re-exports one of its dependencies with a module alias. This cram test also shows the subtleties in the setup that could make it pass (when it shouldn't!)


First we need a library where we'll perform the changes:

$ mkdir impl
$ cat > impl/dune <<EOF
> (library (name impl))
> EOF
$ cat > impl/impl.ml <<EOF
> let foo = "initial build"
> EOF

Then another library, which exposes an alias to impl:

$ mkdir alias
$ cat > alias/dune <<EOF
> (library (name alias) (libraries impl))
> EOF
$ cat > alias/alias.ml <<EOF
> module Impl = Impl
> EOF

And a binary which depends on Alias to access Impl:

$ mkdir bin
$ cat > bin/dune <<EOF
> (executable (name main) (libraries alias))
> EOF
$ cat > bin/main.ml <<EOF
> let () = print_endline Alias.Impl.foo
> EOF

Compilation is different when the executable is defined in a single file, so the bug doesn't trigger unless we also add an empty unused file: (!!)

$ touch bin/unused.ml

The first build succeeds:

$ dune exec ./bin/main.exe
initial build

Now we introduce a "soft" update, such that impl.cmi isn't modified:

$ cat > impl/impl.ml <<EOF
> let foo = "second build, no change to cmi"
> EOF

In that specific case, we are allowed to not recompile bin/main.ml because the -opaque flag guarantees we are not dependent on the implementation of Impl: (the linker checks that the cmi digest hasn't changed)

$ dune exec ./bin/main.exe
second build, no change to cmi

But if the cmi of Impl changed, then skipping the recompilation of main.ml is wrong: (the linker check fails)

$ cat > impl/impl.ml <<EOF
> let new_value = 42
> let foo = "third build, forced a cmi update"
> EOF

$ dune exec ./bin/main.exe
File "_none_", line 1:
Error: Files bin/.main.eobjs/native/dune__exe__Main.cmx and impl/impl.cmxa
       make inconsistent assumptions over interface Impl
[1]

Regarding revdeps testing, one issue to keep in mind is that testing a clean build (and not an incremental build) is likely to hide the failures:

$ dune clean
$ dune exec ./bin/main.exe
third build, forced a cmi update

@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 12e0764 to d6ae7e8 Compare April 10, 2026 15:00
@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 10, 2026

Is you new PR #14116 superseding this one?

No, not necessarily. I have 2 PRs because I need to test more than one thing at a time, and I need access to the CI invocations. (Very difficult to replicate the CI corner cases that are failing, now, for this effort. Non-corner cases are passing.)

@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 10, 2026

Either way, I think it's worth spending more time to improve the test suite, because it's easy to accidentally rely on subtle behaviors that are more permissive than the general case. For example, the -opaque flag makes it possible to skip some recompilations on benign changes, even though these recompilations would normally be required.

Yes, agreed - we need more tests. The absence of test coverage is what is making it so difficult to get this right. It is difficult to see just how many different scenarios are actually working (but non-optimal), in the status quo. I have submitted many augmentations already to the test suite in dependency PRs to this one (14017, 14031, 14100, 14101, 14103) in an effort to reduce the risk, already. More still are required.

I see your advice for a suggested new test(s). I'll try to recreate that in a cram test. Possibly in a different PR (onto which I will rebase this branch).

@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 0cd6556 to 8d0f084 Compare April 10, 2026 16:44
rgrinberg pushed a commit that referenced this pull request Apr 10, 2026
…14103)

## Summary
Add baseline test documenting that single-module library consumers are
unnecessarily recompiled when an unused dependency module changes. This
happens because dune skips ocamldep for singleton stanzas, so per-module
filtering cannot determine which libraries the module actually
references.

The code change to remove the singleton skip is deferred to the
per-module
filtering PR (#14021) where it is actually needed and can be tested
end-to-end.

Ref: #4572

## Test plan
- [x] `dune runtest
test/blackbox-tests/test-cases/per-module-lib-deps/single-module-lib.t`
passes

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 8d0f084 to eecf3ac Compare April 10, 2026 17:21
robinbb added 9 commits April 10, 2026 16:10
…ml#4572)

Add test verifying that when a library re-exports a dependency via a
module alias (module Impl = Impl), incremental builds correctly
recompile consumers when the re-exported library's .cmi changes.

Soft changes (implementation only, no .cmi change) can safely skip
recompilation due to -opaque. Hard changes (.cmi modified) must
trigger recompilation to avoid inconsistent interface assumptions.

Reproduction case from @art-w.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Move Hidden_deps (library file dependencies) out of Includes.make and
into per-module computation in build_cm. This is a pure refactor with no
behavioral change — all modules still depend on all libraries in the
stanza's requires.

This separation is necessary for issue ocaml#4572: Includes carries -I flags
(which must be shared across all modules) while Hidden_deps can vary
per-module. With this refactor, a future change can use ocamldep output
to filter Hidden_deps per-module without affecting -I flags.

- Includes.make no longer takes ~opaque or bundles Hidden_deps
- New deps_of_entries in lib_file_deps.ml handles opaque logic
- Alias and Wrapped_compat modules skip library deps (matching their
  existing Includes.empty behavior)
- deps_with_exts removed (sole caller was the old opaque branch)

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
…issue ocaml#4572)

Add a new function `read_immediate_deps_raw_of` to the ocamldep module that
returns raw module names (as Module_name.Set.t) from ocamldep output, without
resolving them to Module.t values.

This function will be used to determine which external library modules a
module actually references, enabling finer-grained dependency tracking where
modules are only recompiled when libraries they actually use change.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Change deps_of_entries from Lib.t list to (Lib.t * Module.t option) list.
Add deps_of_module (unused). Update all callers to pass (lib, None).
No behavioral change — the None path is functionally identical.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Add Lib_index module to lib_file_deps.ml/mli. Not used anywhere yet.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Add lib_index field and accessor to Compilation_context. Wired up
with Lib_index.create but not yet used in module_compilation.ml.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Split lib_cm_deps into lib_cm_deps_arg (static Command.Args.t) and
lib_cm_deps_dyn (Action_builder, no-op for now). Add can_filter
(always false). No behavioral change — lib_cm_deps_arg carries the
same Hidden_deps as before, lib_cm_deps_dyn is a no-op.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Add real can_filter computation (Melange excluded, dep_graph_dir check,
module kind check). The value is not used yet — lib_cm_deps_arg still
provides static deps for all modules. This tests whether merely computing
the can_filter conditions has side effects.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
Enable can_filter and add full filtering logic in lib_cm_deps_dyn
(Lib_index lookup, ocamldep refs, transitive deps, -open extraction).
Static lib_cm_deps_arg remains for ALL modules — never removed.

Both paths fire: modules get a superset of deps. The filtering
optimization does not activate (tests unchanged), but the dynamic
machinery runs for real. This tests whether the dyn_deps computation
itself causes CI failures.

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572 branch 3 times, most recently from c60087d to 652ef16 Compare April 10, 2026 21:29
@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 10, 2026

Is you new PR #14116 superseding this one?

No, not necessarily. I have 2 PRs because I need to test more than one thing at a time, and I need access to the CI invocations. (Very difficult to replicate the CI corner cases that are failing, now, for this effort. Non-corner cases are passing.)

In the end, yes, it is.

When can_filter is true, remove static Hidden_deps so dyn_deps becomes
the sole library dep provider. Modules that don't reference a library
are no longer recompiled when that library changes.

Also includes -open flag extraction and -as-argument-for/-parameter
module name handling to ensure filtering doesn't drop implicit deps.

Test expectations updated. Reporting-of-cycles @package-cycle updated.
Alias-reexport test updated to document known limitation.

Resolves ocaml#4572

Signed-off-by: Robin Bate Boerop <me@robinbb.com>
@robinbb robinbb force-pushed the robinbb-issue-4572 branch from 652ef16 to fe0ce36 Compare April 10, 2026 22:00
@robinbb
Copy link
Copy Markdown
Collaborator Author

robinbb commented Apr 10, 2026

Deprecated in favour of #14116

@robinbb robinbb closed this Apr 10, 2026
@robinbb robinbb deleted the robinbb-issue-4572 branch April 10, 2026 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Finer dependency analysis between libraries

7 participants