Native Image Committer Community Meeting 2022-02-10 #4310
christianwimmer
asked this question in
Show and tell
As suggested by @olpaw in graalvm/mx#213 (comment), I would like to discuss the possibility of moving the mx projects. For a deep dive topic I would vote for "Module system implementation: state and implications" if time permits. Thanks
-
List of all past and upcoming meetings: #3933
New and Noteworthy
GraalVM 22.1 feature freeze is on March 11.
JDK 8 removal
[GR-36671] [GR-35917] Re-introduce JDK8OrEarlier/JDK11OrLater and extend build output. #4277 Frameworks like Quarkus depend on internal SVM classes.
Compatibility improvements
[GR-36408] [GR-36500] Add a mode where the reference handling can be triggered manually. #4296 By default, reference handling is now done in a separate thread. Without a separate thread, reference handling needs to be initiated manually by the application at a point where ideally no locks are held, to avoid potential deadlocks.
[GR-36601] Make ResourceURLConnection module aware. #4281
Proper getStackAccessControlContext implementation #3854
d52b7f1 Added support for ThreadMXBean.getThreadAllocatedBytes
Apple M1 support
[GR-36766] Darwin-aarch64: add support for MAP_JIT/pthread_jit_write_protect_np #4286
[GR-36135] Set parameters for aarch64 native abi correctly. #4234
Tracing agent
[GR-27173] Automatic conditional configuration generation. #4270
[GR-36755] Prevent conflicting writes to configuration file directories. #4307
Static analysis
[GR-25050] Concurrent heap scanning for points-to analysis. #4204 The new image heap scanning, as presented in the December meeting, is now merged.
Image build time
[GR-36568] Economy compiler configuration for SVM image building #4308
Also working on performance optimizations for the points-to analysis
JFR
Implementation of GC phase pause events #4272
Adding support for JFR ExecutionSample event. #4005
04668a0 Improve JFR testing infrastructure
8676f60 Moved all JFR-related files to com.oracle.svm.core
Other stuff
[GR-35915] Search for vulnerable log4j libraries in native images #4205
[GR-34749] Continuation support independent of Project Loom. #4114
Deep Dive: How static analysis and AOT compilation get their data, and how substitutions are implemented
For the most up-to-date version and clickable links, this description is also the JavaDoc of the class HostedUniverse
The native image generator uses multiple layers of implementations of JVMCI interfaces for types, methods and fields. In this documentation, we use the term "elements" to refer to types, methods, and fields. All elements of one particular layer are called a "universe". There are 4 layers in use: the HotSpot universe, the analysis universe, the hosted universe, and the substitution layer.
Not covered in this documentation is the "substrate universe", i.e., elements that are used for JIT compilation at image run time when a native image contains the GraalVM compiler itself. JIT compilation is only used when a native image contains a Truffle language, e.g., the JavaScript implementation that is part of GraalVM. For "normal" applications, all code is AOT compiled and no JIT compilation is necessary.
Navigating the Layers
Elements of higher layers wrap elements of lower layers. There is generally a "get wrapped" method available, e.g., HostedMethod.getWrapped() and AnalysisMethod.getWrapped(). The conversion from a lower layer to a higher layer is done via the universe classes, e.g., HostedUniverse.lookup(JavaMethod) and AnalysisUniverse.lookup(JavaMethod).
There is no standard way to navigate the substitution layer, because each element there has a different behaviour. In general, reaching directly into the substitution layer should be avoided as much as possible. But sometimes it is unavoidable, and then code needs to be written for a specific substitution element. For example, when it is necessary to introspect a method substitution, a direct cast to SubstitutionMethod is necessary.
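The navigation rules above can be illustrated with a small, self-contained model. This is only a sketch: the Toy* classes are hypothetical stand-ins written for this example, not the real JVMCI or native image classes. For brevity the toy hosted universe creates elements lazily on lookup, whereas in the native image generator hosted elements are created by UniverseBuilder.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for a JVMCI method interface; not the real ResolvedJavaMethod. */
interface ToyMethod {
    String getName();
}

/** Lowest layer: an element as provided by the host VM. */
final class ToyHotSpotMethod implements ToyMethod {
    private final String name;
    ToyHotSpotMethod(String name) { this.name = name; }
    @Override public String getName() { return name; }
}

/** Substitution element: callers that introspect it must know its concrete type. */
final class ToySubstitutionMethod implements ToyMethod {
    private final ToyMethod original;
    private final String substitutedName;
    ToySubstitutionMethod(ToyMethod original, String substitutedName) {
        this.original = original;
        this.substitutedName = substitutedName;
    }
    ToyMethod getOriginal() { return original; }
    @Override public String getName() { return substitutedName; }
}

/** Analysis layer: wraps a HotSpot or substitution element. */
final class ToyAnalysisMethod implements ToyMethod {
    private final ToyMethod wrapped;
    ToyAnalysisMethod(ToyMethod wrapped) { this.wrapped = wrapped; }
    ToyMethod getWrapped() { return wrapped; }
    @Override public String getName() { return wrapped.getName(); }
}

/** Hosted layer: wraps an analysis element. */
final class ToyHostedMethod implements ToyMethod {
    private final ToyAnalysisMethod wrapped;
    ToyHostedMethod(ToyAnalysisMethod wrapped) { this.wrapped = wrapped; }
    ToyAnalysisMethod getWrapped() { return wrapped; }
    @Override public String getName() { return wrapped.getName(); }
}

/** Going up a layer is done by the universe, which hands out one wrapper per element. */
final class ToyAnalysisUniverse {
    private final Map<ToyMethod, ToyAnalysisMethod> methods = new HashMap<>();
    ToyAnalysisMethod lookup(ToyMethod method) {
        if (method instanceof ToyAnalysisMethod) {
            return (ToyAnalysisMethod) method;
        }
        return methods.computeIfAbsent(method, ToyAnalysisMethod::new);
    }
}

final class ToyHostedUniverse {
    private final Map<ToyAnalysisMethod, ToyHostedMethod> methods = new HashMap<>();
    // Simplification: created lazily here, not by a separate builder phase.
    ToyHostedMethod lookup(ToyAnalysisMethod method) {
        return methods.computeIfAbsent(method, ToyHostedMethod::new);
    }
}

final class LayerNavigation {
    /** Walks down from the hosted layer; the cast is needed only for substitution elements. */
    static String originalName(ToyHostedMethod hosted) {
        ToyMethod lower = hosted.getWrapped().getWrapped();
        if (lower instanceof ToySubstitutionMethod) {
            lower = ((ToySubstitutionMethod) lower).getOriginal();
        }
        return lower.getName();
    }
}
```

Going down is uniform (getWrapped) until the substitution layer is reached, where the code has to know the concrete element class.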
JVMCI vs. Reflection
The JVMCI interfaces are similar to the reflection objects: ResolvedJavaType for Class, ResolvedJavaMethod for Method, ResolvedJavaField for Field. But using the JVMCI interfaces has many advantages over reflection. They provide access to VM-level information such as the bytecode array of a method, the constant pool of a class, and the offset of a field. But more importantly, it is not necessary that there is an actual bytecode representation (and therefore a reflection object) of a JVMCI element. We make use of that in the substitution layer.
In general, it is always easy and possible to convert a reflection object to a JVMCI object. MetaAccessProvider has all the appropriate lookup methods for it. In JVMCI itself, there is no link back from a JVMCI object to a reflection object. But in the native image generator, it turned out to be necessary to sometimes convert back to a reflection object because not all necessary information is available via JVMCI. This can be done via the interfaces OriginalClassProvider, OriginalMethodProvider, and OriginalFieldProvider. The elements from the analysis universe and hosted universe implement these interfaces. But it is very important to state again: it is not necessary that there is an actual bytecode representation for JVMCI elements. This means there are JVMCI objects that do not have a corresponding reflection object. All code that uses reflection objects must therefore be prepared for the returned reflection object to be null. And due to the substitution layer, any information returned by the reflection object can be different compared to the JVMCI object. This applies even to the most trivial things, such as the name of an element.
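A minimal sketch of the defensive pattern this implies, using only java.lang.reflect. ToyOriginalMethodProvider and ToyElement are hypothetical names invented for this example; they only mimic the idea of the Original*Provider interfaces and are not their actual signatures.

```java
import java.lang.reflect.Method;

/** Hypothetical analogue of an "original method provider"; the result may be null. */
interface ToyOriginalMethodProvider {
    Method getJavaMethod();
}

/** Toy element whose own name can differ from the name of its reflection object. */
final class ToyElement implements ToyOriginalMethodProvider {
    private final String name;
    private final Method javaMethod; // null when there is no bytecode representation

    private ToyElement(String name, Method javaMethod) {
        this.name = name;
        this.javaMethod = javaMethod;
    }

    /** Element backed by a real reflection object (looked up without parameters). */
    static ToyElement fromReflection(String name, Class<?> declaringClass, String methodName) {
        try {
            return new ToyElement(name, declaringClass.getMethod(methodName));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Element without any reflection object, e.g. a synthetic method. */
    static ToyElement synthetic(String name) {
        return new ToyElement(name, null);
    }

    String getName() { return name; }

    @Override public Method getJavaMethod() { return javaMethod; }
}

final class ReflectionFallback {
    /** Never dereferences the reflection object without a null check. */
    static String describe(ToyElement element) {
        Method method = element.getJavaMethod();
        if (method == null) {
            return element.getName() + " (no reflection object)";
        }
        return element.getName() + " (reflection name: " + method.getName() + ")";
    }
}
```

For example, ReflectionFallback.describe(ToyElement.fromReflection("renamedLength", String.class, "length")) reports both names, which shows that the element name and the reflection name need not agree.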
The HotSpot Universe
Most elements in a native image originate from .class files. The native image generator does not contain a class file parser, so the only way information from class files flows in is via JVMCI from the Java HotSpot VM. Since JVMCI is VM independent, in theory any other Java VM that implements JVMCI could be the source of information. In practice, the Java HotSpot VM is the only known and supported VM for now. Still, it is frowned upon to reach directly into any JVMCI object of the HotSpot universe. Many of the HotSpot implementation classes are not public anyway, but even the public ones must not be used directly.
Using the HotSpot universe keeps a lot of complexity out of the native image generator. Examples of code that does not exist in the native image generator include a class file parser, virtual method resolution, and subtype checks.
The Analysis Universe
The AnalysisUniverse manages the types, methods, and fields that the static analysis operates on. These elements store information used during static analysis as well as the static analysis results. For example, AnalysisType.isReachable() returns whether that type was seen as reachable by the static analysis.
A static analysis implements BigBang. Currently, the only analysis in the project is PointsToAnalysis, but ongoing research projects investigate different kinds of static analysis. Therefore, the element types are extensible, for example there is PointsToAnalysisMethod as the implementation class used by PointsToAnalysis. Using these implementation classes should be avoided as much as possible, to keep the static analysis implementation exchangeable.
The elements in the analysis universe generally do not change the behavior of the elements they wrap. One could therefore argue that there should not be an analysis universe at all, and information used and computed by the static analysis should be stored in classes that do not extend the JVMCI interfaces. It is however quite convenient to have parsed Graal IR graphs that reference JVMCI objects from a consistent universe. The analysis universe therefore acts as a unifying layer above the quite unstructured substitution layer. And there are a few places where analysis elements do not delegate to the wrapped layer, for example to query if a type is initialized. The analysis layer also implements caches for a few operations that are expensive in the HotSpot layer, to reduce the time spent in the static analysis.
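The caching role of a wrapping layer can be sketched as follows. This is an illustration only: the CachedType interface, the operation being cached, and the query counter are assumptions made up for the example, and the text above does not say which operations the analysis layer actually caches.

```java
import java.util.List;

/** Toy stand-in for a type interface with one expensive query. */
interface CachedType {
    List<String> getDeclaredMethodNames();
}

/** Lower layer: counts how often the (supposedly expensive) query is executed. */
final class SlowLowerType implements CachedType {
    private int queries;

    @Override public List<String> getDeclaredMethodNames() {
        queries++;
        return List.of("a", "b");
    }

    int getQueries() { return queries; }
}

/** Wrapping layer: delegates once, then answers from its cache. Not thread safe. */
final class CachingWrapperType implements CachedType {
    private final CachedType wrapped;
    private List<String> cachedNames;

    CachingWrapperType(CachedType wrapped) { this.wrapped = wrapped; }

    CachedType getWrapped() { return wrapped; }

    @Override public List<String> getDeclaredMethodNames() {
        if (cachedNames == null) {
            cachedNames = wrapped.getDeclaredMethodNames();
        }
        return cachedNames;
    }
}
```

Repeated calls on the wrapper reach the lower layer only once, which is the effect the wrapping layer is after. A concurrent analysis would need a thread-safe variant.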
The Hosted Universe
The HostedUniverse manages the types, methods, and fields that the ahead-of-time (AOT) compilation operates on. These elements are created by the UniverseBuilder after the static analysis has finished. They store information such as the layout of objects (offsets of fields), the vtable used for virtual method calls, or information for is-assignable type checks in AOT compiled code.
For historic reasons, HostedType has subclasses for different kinds of types: HostedInstanceClass, HostedArrayClass, HostedInterface, and HostedPrimitiveType. There is no necessity to keep this class hierarchy, but also no need to remove it.
Having a separate analysis universe and hosted universe complicates some things. For example, graphs parsed for static analysis need to be "transplanted" from the analysis universe to the hosted universe (see code around CompileQueue.replaceAnalysisObjects). But the separate universes make AOT compilation more flexible because elements can be duplicated as necessary. For example, a method can be compiled with different optimization levels or for different optimization contexts. One concrete example is methods compiled as deoptimization entry points. Therefore, code must not assume a 1:1 relationship between analysis and hosted elements, but rather a 1:n relationship where there are multiple hosted elements for a single analysis element.
In theory, only analysis elements that are found reachable by the static analysis would need a corresponding hosted element. But in practice, this optimization did not work and therefore UniverseBuilder creates hosted elements also for unreachable analysis elements. It is therefore safe to assume that HostedUniverse returns a hosted element for every analysis element that is passed as an argument.
HostedUniverse returns the hosted element that was created by UniverseBuilder for the corresponding analysis element. If multiple hosted elements exist for an analysis element, the additional elements must be maintained in a secondary storage used by the AOT compilation phases that need them. For example, the mapping between a regular method and a deoptimization entry point method is maintained in CompilationInfo.
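The 1:n relationship and the secondary storage can be pictured with a toy model. All Model* names are hypothetical and exist only for this sketch; in particular, ModelDeoptTargets merely plays the role that the text assigns to CompilationInfo and does not reflect its actual structure.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy analysis element. */
final class ModelAnalysisMethod {
    final String name;
    ModelAnalysisMethod(String name) { this.name = name; }
}

/** Toy hosted element; several of them may wrap the same analysis element. */
final class ModelHostedMethod {
    final ModelAnalysisMethod wrapped;
    final boolean deoptTarget;
    ModelHostedMethod(ModelAnalysisMethod wrapped, boolean deoptTarget) {
        this.wrapped = wrapped;
        this.deoptTarget = deoptTarget;
    }
}

/** Primary storage: exactly one "regular" hosted element per analysis element. */
final class ModelHostedUniverse {
    private final Map<ModelAnalysisMethod, ModelHostedMethod> methods = new HashMap<>();

    /** Plays the role of the builder phase that runs after the analysis. */
    ModelHostedMethod create(ModelAnalysisMethod method) {
        return methods.computeIfAbsent(method, m -> new ModelHostedMethod(m, false));
    }

    ModelHostedMethod lookup(ModelAnalysisMethod method) {
        ModelHostedMethod result = methods.get(method);
        if (result == null) {
            throw new IllegalStateException("no hosted element for " + method.name);
        }
        return result;
    }
}

/** Secondary storage: additional hosted elements, owned by the phase that needs them. */
final class ModelDeoptTargets {
    private final Map<ModelHostedMethod, ModelHostedMethod> targets = new HashMap<>();

    ModelHostedMethod getOrCreate(ModelHostedMethod regular) {
        return targets.computeIfAbsent(regular, r -> new ModelHostedMethod(r.wrapped, true));
    }
}
```

Both hosted elements wrap the same analysis element, but only the regular one is reachable through the universe lookup. Going from an analysis element to "the" hosted element is therefore well defined, while the reverse direction is not unique.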
The Substitution Layer
The substitution layer is a not-so-well-defined set of elements that sit between the HotSpot universe and the analysis universe. These elements do not form a complete universe. This means that for the majority of elements that are not affected by any substitution, the analysis element directly wraps the HotSpot element. For example, for most types AnalysisType.getWrapped() returns a HotSpotResolvedJavaType.
Substitutions are processed by a chain of SubstitutionProcessor that are typically registered by a Feature via FeatureImpl.DuringSetupAccessImpl.registerSubstitutionProcessor (note that this is not an API exposed to application developers). Pairs of lookup/resolve methods perform the substitution.
The annotations like Substitute, Alias, TargetClass (and several more) are processed by one particular implementation of SubstitutionProcessor: AnnotationSubstitutionProcessor. This is a prominent and flexible substitution processor, but by far not the only one. Since many substitution processors are chained, there can also be chains of elements between a HotSpot element and an analysis element.
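How a chain of processors with paired lookup/resolve methods can produce a chain of elements is sketched below. The classes are a simplified model written for this example (one element kind, a renaming-only processor) and are not the real SubstitutionProcessor API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy element with a single observable aspect. */
interface ChainElement {
    String getName();
}

final class PlainElement implements ChainElement {
    private final String name;
    PlainElement(String name) { this.name = name; }
    @Override public String getName() { return name; }
}

/** Toy processor: lookup maps original to substitute, resolve maps back. */
abstract class ToyProcessor {
    ChainElement lookup(ChainElement element) { return element; }
    ChainElement resolve(ChainElement element) { return element; }
}

/** Substitutes every element named 'from' with a wrapper element named 'to'. */
final class RenamingProcessor extends ToyProcessor {
    private static final class Renamed implements ChainElement {
        final ChainElement original;
        final String name;
        Renamed(ChainElement original, String name) {
            this.original = original;
            this.name = name;
        }
        @Override public String getName() { return name; }
    }

    private final String from;
    private final String to;
    private final Map<ChainElement, Renamed> substitutes = new HashMap<>();

    RenamingProcessor(String from, String to) {
        this.from = from;
        this.to = to;
    }

    @Override ChainElement lookup(ChainElement element) {
        if (!element.getName().equals(from)) {
            return element; // not affected by this processor
        }
        return substitutes.computeIfAbsent(element, e -> new Renamed(e, to));
    }

    @Override ChainElement resolve(ChainElement element) {
        if (element instanceof Renamed) {
            Renamed renamed = (Renamed) element;
            // Only undo substitutions that this processor created itself.
            if (substitutes.get(renamed.original) == renamed) {
                return renamed.original;
            }
        }
        return element;
    }
}

/** Applies processors in order for lookup and in reverse order for resolve. */
final class ToyChain extends ToyProcessor {
    private final List<ToyProcessor> processors;

    ToyChain(ToyProcessor... processors) {
        this.processors = List.of(processors);
    }

    @Override ChainElement lookup(ChainElement element) {
        for (ToyProcessor processor : processors) {
            element = processor.lookup(element);
        }
        return element;
    }

    @Override ChainElement resolve(ChainElement element) {
        for (int i = processors.size() - 1; i >= 0; i--) {
            element = processors.get(i).resolve(element);
        }
        return element;
    }
}
```

With a chain of an "A" to "B" renamer followed by a "B" to "C" renamer, looking up an element named "A" yields an element named "C" that wraps an intermediate element named "B", and resolve walks the same chain back to the original.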
Elements produced by a substitution processor usually do one of the following things:
Substitution processors can modify many aspects of elements, but there are also hard limitations: they cannot modify aspects that are not implemented by the native image generator itself, but accessed via the HotSpot universe. For example, they cannot modify virtual method resolution and subtype checks (see the list in the section about the HotSpot universe). In general it is safe to say that substituted elements can change any behavior of one particular element, but not how multiple elements interact with each other (because substitutions are not a complete universe).
For example, SubstitutionType changes a lot of aspects that are local to an existing type (name, instance fields, ...). But it would be quite impossible to inject a new synthetic type into a class hierarchy because that type would not participate properly in virtual method resolution or subtype checks.
Open Discussion
Possible deep dive topics for next meeting
Please send suggestions, or "upvote" a suggestion, by adding a comment to this discussion.