Native Image Committer Community Meeting 2022-02-10 #4310

christianwimmer · 2022-02-10T05:28:45Z


List of all past and upcoming meetings: #3933

New and Noteworthy

GraalVM 22.1 feature freeze is on March 11.
JDK 8 removal
[GR-36671] [GR-35917] Re-introduce JDK8OrEarlier/JDK11OrLater and extend build output. #4277 Frameworks like Quarkus depend on internal SVM classes.
Compatibility improvements
[GR-36408] [GR-36500] Add a mode where the reference handling can be triggered manually. #4296 By default, reference handling is now done in a separate thread. Without a separate thread, reference handling needs to be initiated manually by the application, ideally at a point where no locks are held, to avoid potential deadlocks.
[GR-36601] Make ResourceURLConnection module aware. #4281
Proper getStackAccessControlContext implementation #3854
d52b7f1 Added support for ThreadMXBean.getThreadAllocatedBytes
Apple M1 support
[GR-36766] Darwin-aarch64: add support for MAP_JIT/pthread_jit_write_protect_np #4286
[GR-36135] Set parameters for aarch64 native abi correctly. #4234
Tracing agent
[GR-27173] Automatic conditional configuration generation. #4270
[GR-36755] Prevent conflicting writes to configuration file directories. #4307
Static analysis
[GR-25050] Concurrent heap scanning for points-to analysis. #4204 The new image heap scanning, as presented in the December meeting, is now merged.
Image build time
[GR-36568] Economy compiler configuration for SVM image building #4308
Also working on performance optimizations for the points-to analysis
JFR
Implementation of GC phase pause events #4272
Adding support for JFR ExecutionSample event. #4005
04668a0 Improve JFR testing infrastructure
8676f60 Moved all JFR-related files to com.oracle.svm.core
Other stuff
[GR-35915] Search for vulnerable log4j libraries in native images #4205
[GR-34749] Continuation support independent of Project Loom. #4114

Deep Dive: How static analysis and AOT compilation get their data, and how substitutions are implemented

For the most up-to-date version and clickable links, this description is also the JavaDoc of the class HostedUniverse

The native image generator uses multiple layers of implementations of JVMCI interfaces for types, methods and fields. In this documentation, we use the term "elements" to refer to types, methods, and fields. All elements of one particular layer are called a "universe". There are 4 layers in use:

The "HotSpot universe": the original source of elements, as parsed from class files
The "substitution layer" to modify some of the elements coming from class files, without modifying class files
The "analysis universe": elements that the static analysis operates on
The "hosted universe": elements that the AOT compilation operates on

Not covered in this documentation is the "substrate universe", i.e., elements that are used for JIT compilation at image run time when a native image contains the GraalVM compiler itself. JIT compilation is only used when a native image contains a Truffle language, e.g., the JavaScript implementation that is part of GraalVM. For "normal" applications, all code is AOT compiled and no JIT compilation is necessary.

Navigating the Layers

Elements of higher layers wrap elements of lower layers. There is generally a "get wrapped" method available, e.g., HostedMethod.getWrapped() and AnalysisMethod.getWrapped(). The conversion from a lower layer to a higher layer is done via the universe classes, e.g., HostedUniverse.lookup(JavaMethod) and AnalysisUniverse.lookup(JavaMethod).
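The wrapping described above can be illustrated with a small toy model. This is only a sketch: the `Toy*` classes are hypothetical stand-ins for the real JVMCI and SVM types, and the hosted universe creates elements lazily here purely for brevity (in the native image generator they are created by UniverseBuilder).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-in for the JVMCI method interface; every name below is hypothetical.
interface ToyMethod {
    String getName();
}

// Lowest layer: plays the role of an element from the HotSpot universe.
final class ToyHotSpotMethod implements ToyMethod {
    private final String name;
    ToyHotSpotMethod(String name) { this.name = name; }
    @Override public String getName() { return name; }
}

// Analysis layer: wraps an element of a lower layer.
final class ToyAnalysisMethod implements ToyMethod {
    private final ToyMethod wrapped;
    ToyAnalysisMethod(ToyMethod wrapped) { this.wrapped = wrapped; }
    ToyMethod getWrapped() { return wrapped; }
    @Override public String getName() { return wrapped.getName(); }
}

// Hosted layer: wraps an analysis element.
final class ToyHostedMethod implements ToyMethod {
    private final ToyAnalysisMethod wrapped;
    ToyHostedMethod(ToyAnalysisMethod wrapped) { this.wrapped = wrapped; }
    ToyAnalysisMethod getWrapped() { return wrapped; }
    @Override public String getName() { return wrapped.getName(); }
}

// Going up a layer is done by asking the universe of the higher layer.
final class ToyAnalysisUniverse {
    private final Map<ToyMethod, ToyAnalysisMethod> methods = new ConcurrentHashMap<>();
    ToyAnalysisMethod lookup(ToyMethod lower) {
        return methods.computeIfAbsent(lower, ToyAnalysisMethod::new);
    }
}

final class ToyHostedUniverse {
    private final Map<ToyAnalysisMethod, ToyHostedMethod> methods = new ConcurrentHashMap<>();
    // Simplification: created lazily here, not by a separate builder.
    ToyHostedMethod lookup(ToyAnalysisMethod lower) {
        return methods.computeIfAbsent(lower, ToyHostedMethod::new);
    }
}

final class LayersDemo {
    public static void main(String[] args) {
        ToyAnalysisUniverse analysis = new ToyAnalysisUniverse();
        ToyHostedUniverse hosted = new ToyHostedUniverse();
        ToyHotSpotMethod original = new ToyHotSpotMethod("toString");

        // Up: lookup through the universes. Down: getWrapped on the elements.
        ToyHostedMethod hostedMethod = hosted.lookup(analysis.lookup(original));
        System.out.println(hostedMethod.getWrapped().getWrapped() == original);
    }
}
```

Going down is a property of the element, going up is a property of the universe; that asymmetry is the point of the sketch.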

There is no standard way to navigate the substitution layer, because each element there behaves differently. Reaching directly into the substitution layer should be avoided as much as possible. Sometimes it is unavoidable, and then code needs to be written for a specific substitution element. For example, when it is necessary to introspect a method substitution, a direct cast to SubstitutionMethod is necessary.

JVMCI vs. Reflection

The JVMCI interfaces are similar to the reflection objects: ResolvedJavaType for Class, ResolvedJavaMethod for Method, ResolvedJavaField for Field. But using the JVMCI interfaces has many advantages over reflection. They provide access to VM-level information such as the bytecode array of a method, the constant pool of a class, and the offset of a field. More importantly, it is not necessary that there is an actual bytecode representation (and therefore a reflection object) of a JVMCI element. We make use of that in the substitution layer.

In general, it is always possible, and easy, to convert a reflection object to a JVMCI object. MetaAccessProvider has all the appropriate lookup methods for it. In JVMCI itself, there is no link back from a JVMCI object to a reflection object. But in the native image generator, it turned out to be necessary to sometimes convert back to a reflection object because not all necessary information is available via JVMCI. This can be done via the interfaces OriginalClassProvider, OriginalMethodProvider, and OriginalFieldProvider. The elements from the analysis universe and hosted universe implement these interfaces. But it is very important to state again: it is not necessary that there is an actual bytecode representation for JVMCI elements. This means there are JVMCI objects that do not have a corresponding reflection object. All code that uses reflection objects must therefore be prepared for the returned reflection object to be null. And due to the substitution layer, any information returned by the reflection object can differ from what the JVMCI object returns, even something as trivial as the name of an element.
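The two caveats above (the reflection object may be null, and its information may disagree with the element) can be sketched as follows. This is a toy model, not the OriginalMethodProvider API: `ToyElement`, `ToyClassFileMethod`, `ToySyntheticMethod`, and `ReflectionBridge` are hypothetical names, and only `java.lang.reflect.Method` and `java.util.Optional` are real library types.

```java
import java.lang.reflect.Method;
import java.util.Optional;

// Hypothetical stand-in for an element that can hand back a reflection object.
interface ToyElement {
    String getName();
    // May return null: not every element has a bytecode representation.
    Method getJavaMethod();
}

// Element that originates from a class file, possibly renamed by a substitution.
final class ToyClassFileMethod implements ToyElement {
    private final String name;
    private final Method reflectionMethod;
    ToyClassFileMethod(String name, Method reflectionMethod) {
        this.name = name;
        this.reflectionMethod = reflectionMethod;
    }
    @Override public String getName() { return name; }
    @Override public Method getJavaMethod() { return reflectionMethod; }
}

// Synthetic element without any reflection counterpart.
final class ToySyntheticMethod implements ToyElement {
    private final String name;
    ToySyntheticMethod(String name) { this.name = name; }
    @Override public String getName() { return name; }
    @Override public Method getJavaMethod() { return null; }
}

final class ReflectionBridge {
    private ReflectionBridge() { }

    // Helper that looks up a public no-argument method by name.
    static Method find(Class<?> clazz, String methodName) {
        try {
            return clazz.getMethod(methodName);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Callers never dereference the reflection object without a null check,
    // and they take the name from the element, not from reflection.
    static String describe(ToyElement element) {
        return Optional.ofNullable(element.getJavaMethod())
                .map(m -> element.getName() + " (reflection name: " + m.getName() + ")")
                .orElse(element.getName() + " (no reflection object)");
    }

    public static void main(String[] args) {
        ToyElement renamed = new ToyClassFileMethod("renamedLength", find(String.class, "length"));
        System.out.println(describe(renamed));
        System.out.println(describe(new ToySyntheticMethod("factory")));
    }
}
```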

The HotSpot Universe

Most elements in a native image originate from .class files. The native image generator does not contain a class file parser, so the only way information from class files flows in is via JVMCI from the Java HotSpot VM. Since JVMCI is VM independent, in theory any other Java VM that implements JVMCI could be the source of information. In practice, the Java HotSpot VM is the only known and supported VM for now. Still, it is frowned upon to reach into any JVMCI object of the HotSpot universe. Many of the HotSpot implementation classes are not public anyway, but even the public ones must not be used directly.

Using the HotSpot universe keeps a lot of complexity out of the native image generator. Here are some examples of code that does not exist in the native image generator:

A parser for the binary format of .class files.
Code to resolve and interpret constant pool entries.
Code to resolve virtual method calls, i.e., to compute the actual method invoked given a base class method and a concrete implementation type.
Code for subtype check, i.e., is type A assignable from type B.

The Analysis Universe

The AnalysisUniverse manages the types, methods, and fields that the static analysis operates on. These elements store information used during static analysis as well as the static analysis results. For example, AnalysisType.isReachable() returns whether that type was seen as reachable by the static analysis.

A static analysis implements BigBang. Currently, the only analysis in the project is PointsToAnalysis, but ongoing research projects investigate different kinds of static analysis. Therefore, the element types are extensible, for example there is PointsToAnalysisMethod as the implementation class used by PointsToAnalysis. Using these implementation classes should be avoided as much as possible, to keep the static analysis implementation exchangeable.

The elements in the analysis universe generally do not change the behavior of the elements they wrap. One could therefore argue that there should not be an analysis universe at all, and information used and computed by the static analysis should be stored in classes that do not extend the JVMCI interfaces. It is however quite convenient to have parsed Graal IR graphs that reference JVMCI objects from a consistent universe. The analysis universe therefore acts as a unifying layer above the quite unstructured substitution layer. And there are a few places where analysis elements do not delegate to the wrapped layer, for example to query if a type is initialized. The analysis layer also implements caches for a few operations that are expensive in the HotSpot layer, to reduce the time spent in the static analysis.

The Hosted Universe

The HostedUniverse manages the types, methods, and fields that the ahead-of-time (AOT) compilation operates on. These elements are created by the UniverseBuilder after the static analysis has finished. They store information such as the layout of objects (offsets of fields), the vtable used for virtual method calls, or information for is-assignable type checks in AOT compiled code.

For historic reasons, HostedType has subclasses for different kinds of types: HostedInstanceClass, HostedArrayClass, HostedInterface, and HostedPrimitiveType. There is no necessity to keep this class hierarchy, but also no need to remove it.

Having a separate analysis universe and hosted universe complicates some things. For example, graphs parsed for static analysis need to be "transplanted" from the analysis universe to the hosted universe (see code around CompileQueue.replaceAnalysisObjects). But the separate universes make AOT compilation more flexible because elements can be duplicated as necessary. For example, a method can be compiled with different optimization levels or for different optimization contexts. One concrete example is methods compiled as deoptimization entry points. Code must therefore not assume a 1:1 relationship between analysis and hosted elements, but rather a 1:n relationship in which there can be multiple hosted elements for a single analysis element.

In theory, only analysis elements that are found reachable by the static analysis would need a corresponding hosted element. But in practice, this optimization did not work and therefore UniverseBuilder also creates hosted elements for unreachable analysis elements. It is therefore safe to assume that HostedUniverse.lookup returns a hosted element for every analysis element that is passed as an argument.

HostedUniverse.lookup returns the hosted element that was created by UniverseBuilder for the corresponding analysis element. If multiple hosted elements exist for an analysis element, the additional elements must be maintained in a secondary storage used by the AOT compilation phases that need them. For example, the mapping between a regular method and a deoptimization entry point method is maintained in CompilationInfo.
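The 1:n relationship and the split between primary lookup and secondary storage can be sketched with a toy model. All `Sketch*` classes below are hypothetical; in particular, `SketchCompilationInfo` only mimics the role the text assigns to CompilationInfo and does not reflect its real fields or methods.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical analysis element.
final class SketchAnalysisMethod {
    private final String name;
    SketchAnalysisMethod(String name) { this.name = name; }
    String getName() { return name; }
}

// Hypothetical hosted element; several may wrap the same analysis element.
final class SketchHostedMethod {
    private final SketchAnalysisMethod wrapped;
    private final boolean deoptEntryPoint;
    SketchHostedMethod(SketchAnalysisMethod wrapped, boolean deoptEntryPoint) {
        this.wrapped = wrapped;
        this.deoptEntryPoint = deoptEntryPoint;
    }
    SketchAnalysisMethod getWrapped() { return wrapped; }
    boolean isDeoptEntryPoint() { return deoptEntryPoint; }
}

// Primary storage: exactly one hosted element per analysis element.
final class SketchHostedUniverse {
    private final Map<SketchAnalysisMethod, SketchHostedMethod> methods = new HashMap<>();

    // Plays the role of the builder: creates hosted elements for ALL
    // analysis elements, whether or not they are reachable.
    void build(Collection<SketchAnalysisMethod> allAnalysisMethods) {
        for (SketchAnalysisMethod method : allAnalysisMethods) {
            methods.put(method, new SketchHostedMethod(method, false));
        }
    }

    SketchHostedMethod lookup(SketchAnalysisMethod method) {
        SketchHostedMethod result = methods.get(method);
        if (result == null) {
            throw new IllegalStateException("no hosted element for " + method.getName());
        }
        return result;
    }
}

// Secondary storage: additional hosted elements live outside the universe.
final class SketchCompilationInfo {
    private final Map<SketchHostedMethod, SketchHostedMethod> deoptTargets = new HashMap<>();

    SketchHostedMethod getOrCreateDeoptTarget(SketchHostedMethod regular) {
        return deoptTargets.computeIfAbsent(regular,
                r -> new SketchHostedMethod(r.getWrapped(), true));
    }
}
```

In this sketch, `lookup` always returns the regular hosted method, while the deoptimization variant is reachable only through the secondary map, so two distinct hosted methods share one analysis method.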

The Substitution Layer

The substitution layer is a not-so-well-defined set of elements that sit between the HotSpot universe and the analysis universe. These elements do not form a complete universe. This means that for the majority of elements that are not affected by any substitution, the analysis element directly wraps the HotSpot element. For example, for most types AnalysisType.getWrapped() returns a HotSpotResolvedJavaType.

Substitutions are processed by a chain of SubstitutionProcessor instances that are typically registered by a Feature via FeatureImpl.DuringSetupAccessImpl.registerSubstitutionProcessor (note that this is not an API exposed to application developers). Pairs of lookup/resolve methods perform the substitution.

The annotations like Substitute, Alias, TargetClass (and several more) are processed by one particular implementation of SubstitutionProcessor: AnnotationSubstitutionProcessor. This is a prominent and flexible substitution processor, but by far not the only one. Since many substitution processors are chained, there can also be chains of elements between a HotSpot element and an analysis element.
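How chained processors with lookup/resolve pairs can produce a chain of elements is sketched below. This is a toy model under stated assumptions: `ToyProcessor`, `RenamingProcessor`, and `ChainedProcessor` are hypothetical and only imitate the lookup/resolve idea; the ordering chosen here (lookup front to back, resolve back to front) is an assumption of the sketch, not a statement about the real SubstitutionProcessor.

```java
import java.util.HashMap;
import java.util.Map;

interface ToyType {
    String getName();
}

final class PlainType implements ToyType {
    private final String name;
    PlainType(String name) { this.name = name; }
    @Override public String getName() { return name; }
}

// Base processor: by default an element is not substituted at all.
abstract class ToyProcessor {
    ToyType lookup(ToyType type) { return type; }   // original -> substituted
    ToyType resolve(ToyType type) { return type; }  // substituted -> original
}

// Substitution element that wraps one other element and changes its name.
final class RenamedType implements ToyType {
    final ToyType original;
    final ToyProcessor owner;
    private final String newName;
    RenamedType(ToyType original, String newName, ToyProcessor owner) {
        this.original = original;
        this.newName = newName;
        this.owner = owner;
    }
    @Override public String getName() { return newName; }
}

final class RenamingProcessor extends ToyProcessor {
    private final String from;
    private final String to;
    private final Map<ToyType, RenamedType> cache = new HashMap<>();
    RenamingProcessor(String from, String to) { this.from = from; this.to = to; }

    @Override ToyType lookup(ToyType type) {
        if (!type.getName().equals(from)) {
            return type; // not affected by this processor
        }
        return cache.computeIfAbsent(type, t -> new RenamedType(t, to, this));
    }

    @Override ToyType resolve(ToyType type) {
        // Only undo substitutions that this processor created.
        if (type instanceof RenamedType && ((RenamedType) type).owner == this) {
            return ((RenamedType) type).original;
        }
        return type;
    }
}

final class ChainedProcessor extends ToyProcessor {
    private final ToyProcessor[] chain;
    ChainedProcessor(ToyProcessor... chain) { this.chain = chain.clone(); }

    @Override ToyType lookup(ToyType type) {
        for (ToyProcessor processor : chain) {
            type = processor.lookup(type);
        }
        return type;
    }

    @Override ToyType resolve(ToyType type) {
        for (int i = chain.length - 1; i >= 0; i--) {
            type = chain[i].resolve(type);
        }
        return type;
    }
}
```

With two renaming processors chained (A to B, then B to C), looking up a type named A yields an element that wraps another substitution element, which in turn wraps the original.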

Elements produced by a substitution processor usually do one of the following things:

A substitution element wraps one other element and changes some aspects. For example, LambdaSubstitutionType does little more than change the name of its wrapped type and inject an annotation (so that the type appears to carry an annotation that is not actually present in the .class file).
A substitution element wraps two other elements and combines aspects. For example, SubstitutionMethod wraps a SubstitutionMethod.getOriginal() method and a SubstitutionMethod.getAnnotated() method and then forwards calls to either of these depending on the operation (name and signature come from the original method, the bytecode from the annotated method).
A substitution element produces a new element that has no bytecode representation. For example, FactoryMethod is a synthetic method that combines the allocation and the constructor invocation of a type.
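The second kind of element in the list above, one that wraps two elements and forwards to either of them depending on the operation, might look roughly like this. The sketch is a toy model: `ToySubstitutionMethod` and the `SketchMethod` interface are hypothetical, and only the forwarding rule (name and signature from the original, bytecode from the annotated method) is taken from the text.

```java
// Hypothetical, much reduced method interface.
interface SketchMethod {
    String getName();
    String getSignature();
    byte[] getCode();
}

// Element that comes straight from a class file.
final class SketchClassFileMethod implements SketchMethod {
    private final String name;
    private final String signature;
    private final byte[] code;
    SketchClassFileMethod(String name, String signature, byte[] code) {
        this.name = name;
        this.signature = signature;
        this.code = code.clone();
    }
    @Override public String getName() { return name; }
    @Override public String getSignature() { return signature; }
    @Override public byte[] getCode() { return code.clone(); }
}

// Substitution element that combines aspects of two wrapped elements.
final class ToySubstitutionMethod implements SketchMethod {
    private final SketchMethod original;
    private final SketchMethod annotated;
    ToySubstitutionMethod(SketchMethod original, SketchMethod annotated) {
        this.original = original;
        this.annotated = annotated;
    }
    SketchMethod getOriginal() { return original; }
    SketchMethod getAnnotated() { return annotated; }

    // Identity (name, signature) is taken from the original method ...
    @Override public String getName() { return original.getName(); }
    @Override public String getSignature() { return original.getSignature(); }
    // ... while the code that is executed comes from the annotated method.
    @Override public byte[] getCode() { return annotated.getCode(); }
}

final class SubstitutionDemo {
    public static void main(String[] args) {
        SketchMethod original = new SketchClassFileMethod("exit", "(I)V", new byte[] {1, 2});
        SketchMethod annotated = new SketchClassFileMethod("exitSubstitute", "(I)V", new byte[] {9});
        SketchMethod substitution = new ToySubstitutionMethod(original, annotated);
        System.out.println(substitution.getName() + substitution.getSignature()
                + " with " + substitution.getCode().length + " code byte(s)");
    }
}
```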

Substitution processors can modify many aspects of elements, but there are also hard limitations: they cannot modify aspects that are not implemented by the native image generator itself, but accessed via the HotSpot universe. For example, they cannot modify virtual method resolution and subtype checks (see the list in the section about the HotSpot universe). In general it is safe to say that substituted elements can change any behavior of one particular element, but not how multiple elements interact with each other (because substitutions are not a complete universe).

For example, SubstitutionType changes a lot of aspects that are local to an existing type (name, instance fields, ...). But it would be quite impossible to inject a new synthetic type into a class hierarchy because that type would not participate properly in virtual method resolution or subtype checks.

Open Discussion

Possible deep dive topics for next meeting

Please send suggestions, or "upvote" a suggestion, by adding a comment to this discussion.

Reflection configuration: "query only" vs. "allow invocation" configuration for methods
Conditional reflection configuration
Module system implementation: state and implications

zakkak · 2022-02-24T10:11:54Z


As suggested by @olpaw in graalvm/mx#213 (comment) I would like to discuss the possibility of moving the mx projects com.oracle.svm.graal and com.oracle.svm.truffle out of the SVM distribution, in a separate distribution. This would essentially allow distributions that focus solely on the native-image capabilities of GraalVM to reduce their dependencies.

For a deep dive topic I would vote for "Module system implementation: state and implications" if time permits

Thanks


christianwimmer Feb 28, 2022
Author

Confirmed, @olpaw will prepare some material about the module system support, and we can discuss the distribution structure.
