Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question regarding build classpath resolution problem with Tycho 4 #4653

Closed
zladdi opened this issue Jan 27, 2025 · 9 comments · Fixed by #4658
Closed

Question regarding build classpath resolution problem with Tycho 4 #4653

zladdi opened this issue Jan 27, 2025 · 9 comments · Fixed by #4658

Comments

@zladdi
Copy link

zladdi commented Jan 27, 2025

Hello,

Context

At the company I work for, we have a long-standing Tycho project (> 10 years) that has some peculiarities. And before I ask the development team (which I am not part of) to address certain issues with the project setup, I wanted to ask here to get some clarifications.

The project consists of many (we are talking 100-eds of) Eclipse plugins, some of which may reference (via manifest entry Require-Bundle) each other. In some cases, the dependee plugin exports packages from an embedded JAR (whose relative path is specified in the Bundle-Classpath entry of its manifest) which are then imported and classes from which are referenced in the dependent's bundle.

The problem

Up until Tycho 3.0.5, the above setup built without problems and worked during runtime, too. But we now how to upgrade to Tycho 4 (at least 4.0.8, but we plan on getting the latest, which is 4.0.10 at time of writing). However, we cannot do this, as we run into build classpath problems - i.e. the packages/classes referenced by a plugin that reside in JARs embedded in a required bundle cannot be resolved anymore (i.e. errors of the form The import x.y.z cannot be resolved and The type x.y.z.T1 cannot be resolved. It is indirectly referenced from required type u.v.w.T2).

Small reproducible example

To demonstrate the exact problem and error messages we get, I have created a quite small (but not minimal) reproducible example here: https://github.com/zladdi/tycho-4-classpath-issue/

The setup is as follows:

  • Plugin dependee uses classes from commons-lang3-3.17.0.jar, which is embedded in the plugin under lib/ (i.e. Bundle-ClassPath: lib/commons-lang3-3.17.0.jar), and exports the package org.apache.commons.lang3 (via Export-Package entry in the plugin's manifest)
  • Plugin dependent requires the plugin dependee (manifest entry Require-Bundle: dependee) and imports the package org.apache.commons.lang3 (manifest entry Import-Package: org.apache.commons.lang3)
  • Within the dependent plugin there is some code referencing directly and indirectly (via return type of a class from plugin dependee) classes from package org.apache.commons.lang3
  • There is a parent/aggregator POM that contains the Tycho configuration and is used to build both plugins in one Reactor build

Attempting to build the the parent/aggregator project with the following command:

mvn -B clean verify -DskipTests -Dtycho.localArtifacts=ignore

fails with Tycho 4 (but succeeds with Tycho 3.0.5 and e.g. 2.7.5, for that matter) with the following errors, where <parent-of-repo> is the folder containing this repository/the aggregator/parent project:

[ERROR] Failed to execute goal org.eclipse.tycho:tycho-compiler-plugin:4.0.10:compile (default-compile) on project dependent: Compilation failure: Compilation failure:
[ERROR] <parent-of-repo>/tycho-4-classpath-issue/dependent/src/DependentExample.java:[1]
[ERROR]         import org.apache.commons.lang3.StringUtils;
[ERROR]                ^^^^^^^^^^
[ERROR] The import org.apache cannot be resolved
[ERROR] <parent-of-repo>/tycho-4-classpath-issue/dependent/src/DependentExample.java:[9]
[ERROR]         boolean sIsBlank = StringUtils.isBlank(s);
[ERROR]                            ^^^^^^^^^^^
[ERROR] StringUtils cannot be resolved
[ERROR] <parent-of-repo>/tycho-4-classpath-issue/dependent/src/DependentExample.java:[12]
[ERROR]         System.out.println("bitfield is: " + dependeeEx.getBitField());
[ERROR]                                              ^^^^^^^^^^^^^^^^^^^^^^^^
[ERROR] The type org.apache.commons.lang3.BitField cannot be resolved. It is indirectly referenced from required type dependee.DependeeExample
[ERROR] <parent-of-repo>/tycho-4-classpath-issue/dependent/src/DependentExample.java:[12]
[ERROR]         System.out.println("bitfield is: " + dependeeEx.getBitField());
[ERROR]                                                         ^^^^^^^^^^^
[ERROR] The method getBitField() from the type DependeeExample refers to the missing type BitField
[ERROR] 4 problems (4 errors)
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <args> -rf :dependent

A few notes:

  • As in the example, we are using the eager resolver, i.e.
<plugin>
    <groupId>org.eclipse.tycho</groupId>
    <artifactId>target-platform-configuration</artifactId>
    <version>${tycho-version}</version>
    <configuration>
        <requireEagerResolve>true</requireEagerResolve>
    ...

While the example project would build successfully when omitting this configuration option, the actual project won't. The reason for this is that with Tycho 4 we get an error of the form The projects in the reactor contain a cyclic reference: ... when using the default resolver. We will address this problem separately with the development team, but it could take quite some time until they will actually address this. That is why we switched to using <requireEagerResolve>true</requireEagerResolve>, which does not report that problem, in the first place

  • As mentioned above, this is a long-standing project, and the approach taken back in the day was a manifest-first approach. We want to keep it that way, as this would require quite some rework on the project, which, as mentioned before, is rather big in terms of number of plugins.

Some more info acquired during build debugging

I noticed the following log output when running the above Maven command with the -X switch

[DEBUG] Cannot fetch artifact info for dependee and location <parent-of-repo>/tycho-4-classpath-issue/ependee/target/dependee-0.0.1-SNAPSHOT.jar, using raw jar item for classpath

This message is logged at tycho-core-4.0.10.jar:org.eclipse.tycho.core.osgitools.OsgiBundleProject line 234. I stepped through the build in the debugger to find out why this happens. At OsgiBundleProject line 217, there is the code that fetches the artifact descriptor for a given dependency:

ArtifactDescriptor otherArtifact = getArtifact(artifacts, location, entry.getSymbolicName());

When building dependent and the entry in above code corresponds to dependee, the return value of getArtifact is null, which leads to above debug-level log message. Digging deeper into getArtifact, the key under which dependee is stored in a (linked hash) map that holds all artifacts is <parent-of-repo>/tycho-4-classpath-issue/dependee, i.e. the Maven project's base directory, while the lookup key is <parent-of-repo>/tycho-4-classpath-issue/dependee/target/dependee-0.0.1-SNAPSHOT.jar, i.e. the path to the Maven project's target artifact. As such, the lookup in getArtifact will not find the location of dependee and return null.

Question

Could someone please explain if the change in build classpath resolution behavior from Tycho 3 to 4 for above use case is intentional or a bug?

  • And if it is intentional: is there a way to get the constellation of class/package references working in Tycho 4 with the eager resolver, besides the obvious solutions of refactoring to not reference classes within an embedded JAR of one plugin by another plugin or fixing the circular-dependency problem encountered by the new default resolver?
@zladdi zladdi changed the title Clarification requested regarding build classpath resolution problem with Tycho 4 Question regarding build classpath resolution problem with Tycho 4 Jan 27, 2025
@laeubi
Copy link
Member

laeubi commented Jan 28, 2025

Hi @zladdi mayn thanks for your detailed report and example!

Embedding external jars like commons-lang3 is really bad and one should almost never need such things these days, so you should really work towards getting rid of it, if you see any issue with that let us know because this will simplify a lot of things and will make upgrading much easier.

In any case of course in general it should still work and I'll try to take a look into it the next days, if you like you can try to transform your example into an integration-test to demonstrate the issue that will ensure such issue do not get unnoticed in the future.

@zladdi
Copy link
Author

zladdi commented Jan 28, 2025

Thank you for the very fast response, @laeubi .

Agreed, the embedding of JAR files is bad practice. Both our team and the development team are aware of this painful technical debt in the project.

I appreciate the help I will try to transform the example into an integration test.

Thanks for all the great work by you and all the other Tycho contributors!

@laeubi
Copy link
Member

laeubi commented Jan 28, 2025

I have took a quick look and found out that the issue is caused by a slightly different classpath.

While in the working case one see the following classpath:

[DEBUG] Resolved 
[DEBUG]  dependee/lib/commons-lang3-3.17.0.jar[+dependee/*:+org/apache/commons/lang3/*:?**/*]
[DEBUG]  dependee/target/classes[+dependee/*:+org/apache/commons/lang3/*:?**/*]
[DEBUG]  dependent/target/classes

And with the failing case one can see

[DEBUG] Resolved
[DEBUG]  dependee/target/dependee-0.0.1-SNAPSHOT.jar[+dependee/*:+org/apache/commons/lang3/*:?**/*]
[DEBUG]  dependent/target/classes

so the issue is that we use the final build jar in one case and then it can't find the embedded jar of course. While it is strange that it happens explicitly with the <requireEagerResolve>true</requireEagerResolve> (I would have expected that this works "better" for such projects) I think the issue can in general arise also when a jar is already build and then consumed but Tycho should already handle the case then special.

I need to investigate deeper, but think the example itself is already small enough to be used as an integration case, so I just can encourage you to just integrate it then we should be able to verify if any fix makes the situation better here.

laeubi added a commit to laeubi/tycho that referenced this issue Jan 29, 2025
Currently we cache the "known location" for an artifact, but for a
reactor project this location can change (e.g. when it is finally packed
to a jar). This leads to the unfortunate situation that we probably no
longer find it in the map.

This now first lookup the project with the map key, but if it is not
found performs a deeper analysis of the artifacts and check if the
location actually maps to the artifact location of the reactor project.
If that is the case we use that instead of return nothing.

Fix eclipse-tycho#4653
@laeubi
Copy link
Member

laeubi commented Jan 29, 2025

I now found the culprit, and why it is a problem with eager resolve, fix is here:

  • Find reactor projects by location even if their file location changed #4658

  • If we do eager resolve, the the project is cached under a different key as we later find it (when it was packaged as a real jar), this then leads to a cache miss and we add the jar as a plain one and then we also do not perform expansion of the inner jars.

  • If eager resolve is false, then this binding is only performed later (when the file is actually already build) and everything works.

With the fix it works in both modes, I still want to encourage you to provide an integration-test to demonstrate the issue so we are sure such things do not happen in the future.

@laeubi laeubi closed this as completed in 4d75930 Jan 29, 2025
eclipse-tycho-bot pushed a commit that referenced this issue Jan 29, 2025
Currently we cache the "known location" for an artifact, but for a
reactor project this location can change (e.g. when it is finally packed
to a jar). This leads to the unfortunate situation that we probably no
longer find it in the map.

This now first lookup the project with the map key, but if it is not
found performs a deeper analysis of the artifacts and check if the
location actually maps to the artifact location of the reactor project.
If that is the case we use that instead of return nothing.

Fix #4653

(cherry picked from commit 4d75930)
@laeubi
Copy link
Member

laeubi commented Jan 30, 2025

Please try out the current tycho snapshot build if it also fixes the problem in your larger project setup!

@zladdi
Copy link
Author

zladdi commented Jan 30, 2025

Hi @laeubi
Thank you very much. Sorry for my sluggish replying - other stuff at work had higher priority.
I plan to try out the snapshot tomorrow, as well as providing the integration test.
Thank you again for the great work!

@laeubi
Copy link
Member

laeubi commented Jan 30, 2025

Sorry for my sluggish replying - other stuff at work had higher priority.

No problem @zladdi the report here was very well written and the example was helpful to identify the issue quickly, well done!

Thank you again for the great work!

If your company want to help with the development of Tycho itself, you might want to consider sponsor me to help with ongoing open source work.

@zladdi
Copy link
Author

zladdi commented Jan 31, 2025

Hi @laeubi
Since the build for the PR for branch tycho-4.0.x seems to have failed (PR #4659), and the currently latest 4.0.11-SNAPSHOT version does not seem to contain the fix (I got the same error message with that verison), I tried building with the latest 5.0.0-SNAPSHOT version. Unexpectedly, this actually worked and the build for the company-internal large project succeeded.

I also noticed that with version 4.0.11-SNAPSHOT OSGi execution environment filters seem to be checked more strictly than e.g. in version 4.0.10 (I had to update the Bundle-RequiredExecutionEnvironment manifest field of all our plugins to get the build past errors of the form [...] requires Execution Environment that matches (&(osgi.ee=JavaSE)(version=17)) but the current resolution context uses [a.jre.javase 11.0.0])). So, it is good to know this early that we will have to update a bit more once 4.0.11 is released.

I am presently finalizing the port of the example project to the Tycho integration tests. Looks good so far, but want to make another round of tests before I push to my fork and create a PR. Hopefully at the beginning of next week I ll have everything ready.

If your company want to help with the development of Tycho itself, you might want to consider sponsor me to help with ongoing open source work.

I will see what I can do, as I am just an engineer at the bottom of the corporate ladder. But maybe I can convince the higher-ups to contribute. Let's keep our fingers crossed 🤞

@laeubi
Copy link
Member

laeubi commented Jan 31, 2025

About the BREE error see: https://github.com/eclipse-tycho/tycho/blob/tycho-4.0.x/RELEASE_NOTES.md#4011

I'll try to take a look at the build sadly there are some infra issues at the moment :-(

laeubi added a commit that referenced this issue Jan 31, 2025
Currently we cache the "known location" for an artifact, but for a
reactor project this location can change (e.g. when it is finally packed
to a jar). This leads to the unfortunate situation that we probably no
longer find it in the map.

This now first lookup the project with the map key, but if it is not
found performs a deeper analysis of the artifacts and check if the
location actually maps to the artifact location of the reactor project.
If that is the case we use that instead of return nothing.

Fix #4653

(cherry picked from commit 4d75930)
eclipse-tycho-bot pushed a commit that referenced this issue Jan 31, 2025
Currently we cache the "known location" for an artifact, but for a
reactor project this location can change (e.g. when it is finally packed
to a jar). This leads to the unfortunate situation that we probably no
longer find it in the map.

This now first lookup the project with the map key, but if it is not
found performs a deeper analysis of the artifacts and check if the
location actually maps to the artifact location of the reactor project.
If that is the case we use that instead of return nothing.

Fix #4653

(cherry picked from commit 4d75930)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants