8303762: Optimize vector slice operation with constant index using VPALIGNR instruction #24104
jatin-bhateja wants to merge 18 commits into openjdk:master
Conversation
👋 Welcome back jbhateja! A progress list of the required criteria for merging this PR into `master` will be added to the body of your pull request.

❗ This change is not yet ready to be integrated.
@jatin-bhateja The following labels will be automatically applied to this pull request:

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

/label add hotspot-compiler-dev
@jatin-bhateja This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request.
/keepalive |
@jatin-bhateja The pull request is being re-evaluated and the inactivity timeout has been reset.
@jatin-bhateja this pull request can not be integrated into `master` due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the root of your working copy:

git checkout JDK-8303762
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push
@jatin-bhateja This pull request has been inactive for more than 8 weeks and will be automatically closed if another 8 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request.
Performance after AVX2 backend modifications
Hi @XiaohongGong, your comments have been addressed.
XiaohongGong left a comment:

LGTM! Thanks for the update!
Thanks @jatin-bhateja. I will take a look next week.
Hi @sviswa7, please let me know if you have comments on the x86 backend part.

Hi @sviswa7, @merykitty, can you please review the x86 backend implementation?
merykitty left a comment:
I still think it is not a good solution to add an intrinsic method for this operation. We should add constant info to TypeVect and transform the slice pattern into VectorSliceNode. I think it is adequate to add bit info (zeros and ones) to each TypeVect instance so that we can do decent inference without too much additional memory overhead.
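To make the discussed idiom concrete, here is a hypothetical scalar model (plain Java arrays, not the actual Vector API or C2 IR; all names are illustrative) of the subgraph such pattern matching would have to recognize: with a constant origin, a slice is equivalent to two rearranges with a constant shuffle followed by a blend with a constant mask.

```java
// Scalar model of the slice idiom: with a constant origin, both the
// shuffle and the blend mask are compile-time constants, so
// slice(a, b, origin) == blend(rearrange(a, s), rearrange(b, s), m).
class SliceIdiom {
    static int[] rearrange(int[] v, int[] shuffle) {
        int[] r = new int[v.length];
        for (int i = 0; i < v.length; i++) r[i] = v[shuffle[i]];
        return r;
    }

    static int[] blend(int[] x, int[] y, boolean[] mask) {
        int[] r = new int[x.length];
        for (int i = 0; i < x.length; i++) r[i] = mask[i] ? y[i] : x[i];
        return r;
    }

    // Slice expressed as the rearrange + blend subgraph.
    static int[] sliceViaPattern(int[] a, int[] b, int origin) {
        int n = a.length;
        int[] shuffle = new int[n];      // constant once `origin` is constant
        boolean[] mask = new boolean[n]; // constant once `origin` is constant
        for (int i = 0; i < n; i++) {
            shuffle[i] = (i + origin) % n;
            mask[i] = i + origin >= n;   // true for lanes taken from b
        }
        return blend(rearrange(a, shuffle), rearrange(b, shuffle), mask);
    }

    // Direct semantics: what a dedicated VectorSliceNode would compute.
    static int[] sliceDirect(int[] a, int[] b, int origin) {
        int n = a.length;
        int[] r = new int[n];
        for (int i = 0; i < n; i++) {
            r[i] = (i + origin < n) ? a[i + origin] : b[i + origin - n];
        }
        return r;
    }
}
```

The point of contention above is whether the compiler should recover `sliceDirect` from the `sliceViaPattern` shape via IR pattern matching, or be told directly via an intrinsic.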
JVMState* new_jvms = inline_cg()->generate(jvms);
// Attempt inlining fallback implementation in case of
// intrinsification failure.
if (new_jvms == nullptr && is_vector_late_inline()) {
This may be problematic if the intrinsification does not succeed because the arguments have not been constant-folded. This is because the order in which methods are processed during incremental inlining is not deterministic.
Hi @merykitty, intrinsification failure due to any such reason is the same with and without this patch. In case of slice intrinsic failure we simply inline the fallback implementation, which is composed of vector APIs; the VectorSupport* entry points of those APIs then go through intrinsification attempts independently and may succeed or fail if their constraints are not met.
CallStaticJavaNode::Ideal will enqueue the call for incremental inlining again when it is invoked. That means that if the method fails to get intrinsified at first and its arguments later constant-fold, the intrinsification may succeed upon retry.
I think there is no harm in inlining the fallback after the first unsuccessful intrinsification attempt for sliceOp: the fallback is composed of vector API calls, and we are giving those an opportunity for intrinsification. This saves costly boxing operations, and performance will be at par with what we have today. WDYT?
But this will affect other intrinsics too, and they are not implemented using other vector API operations.
We only perform fallback inlining on the first intrinsification failure for sliceOp; this is a very localized change.
https://github.com/jatin-bhateja/jdk/blob/1dfff5589c8b6c83dfc9810bddbb676c7982c904/src/hotspot/share/opto/callGenerator.cpp#L455
Thanks for pointing out the change. I think that's more hacky than I expected.
This is a selective enablement of inlining for intrinsic failures whose fallback implementation uses the vector API.
What you are asking for is a bigger, generic change which can be taken up as a separate RFE once this is committed.
I think there is no need to rush this functionality, and this will become unnecessary when …
Slice operations are used in simdjson's UtfValidator, and it's better to push this patch than to hold it for some future optimization.
I don't agree with this change because the benefit is small, the intrinsic will not stand in the future, and the implementation is hacky and non-trivial. I think I can accept this PR if:
Hi @merykitty, thank you for the detailed feedback. I've looked into the alternative you suggested (adding constant/lane info to TypeVect and having the compiler recognize the slice pattern, rearrange + blend with constant shuffle and mask, and replace it with VectorSliceNode, without a dedicated intrinsic), and I'd like to explain why I still believe the hybrid call generator is the more practical choice for this change, while keeping the door open for the pattern-based approach later.

Why the pattern-based approach is still very non-trivial: even after we have constant shuffle and constant mask, recognizing the slice idiom and recovering the slice index from it remains involved. So even with TypeVect constant info, we would still need a very complex, non-trivial pattern match, probably the biggest pattern match in idealization.

Why the hybrid call generator is a better fit, and the proposed path: I think the most practical path is to proceed with the hybrid call generator for this PR so we can ship a reliable slice optimization (including for use cases like simdjson) without taking on the full cost and risk of TypeVect constant info and slice-pattern matching right now. Once this is in, we have a clear baseline: constant-index slice is optimized, and variable-index slice uses the inlined fallback. If we later introduce TypeVect constant info and a more generic framework for recognizing vector idioms (e.g. patterns that consume constant shuffle/mask or that match subgraphs like iotaShuffle), we can revisit slice and add pattern-based recognition that either supplements or eventually replaces the intrinsic. Please note the x86 backend implementation will still be usable if we later remove the intrinsic and use complex pattern matching to deduce VectorSliceNode. Thanks again for the review.
I think you are misunderstanding; when …
Given that vector lanes themselves are variable, applying a constant shuffle will not fold the rearrange, so a complex pattern match to infer VectorSliceNode (with constant index), probably in blend idealization, would still have to deal with a subgraph consisting of a blend with a constant mask, two rearrange nodes with constant shuffles, and some other non-folded nodes. I think it's reasonable to proceed with the hybrid call generator for this PR so we can ship a reliable slice optimization (including for use cases like simdjson) without taking on the full cost and risk of TypeVect constant info and slice-pattern matching right now. In the future, once constant TypeVect is in production, we can replace the existing intrinsification approach with a pattern match and still use the backend implementation as it is. Let's think it over and also seek others' opinions (@sviswa7).
I'm fine with using the more straightforward approach to intrinsify the slice API when the origin is a constant. In my view, this could also benefit other APIs and future optimizations (for example, #28520), since slice is a general vector operation. Relying on pattern matching makes the compiler implementation significantly more complex in my opinion. Regarding inlining of the fallback implementation, I think we do need such a mechanism to handle APIs that fail to inline on the first attempt, given that the current fallback overhead is much heavier and leads to worse performance. And I agree with @merykitty that a more generic solution would be preferable.
Hi @merykitty, @XiaohongGong, based on the feedback received, I have modified the patch to not inline on the first intrinsic failure; instead, I now collect such CallGenerators and only towards the end of incremental inlining do I inline-expand the fallback implementation, along the lines of _string_late_inlines. This gives an opportunity to create a constant context for VectorSlice intrinsification, and if that fails we inline the fallback implementation to avoid any costly boxing penalties.
while (_late_inlines.length() > 0) {
  igvn_worklist()->ensure_empty(); // should be done with igvn
  ...
  while (inline_incrementally_one()) {
Is it possible to generate a _vector_late_inlines candidate again in inline_incrementally_one?
Compile::inline_vector_fallback(PhaseIterGVN& igvn) inlines the fallback implementation; if the fallback is composed of intrinsics, those are recorded in _late_inlines and go through intrinsification attempts. We don't append to _vector_late_inlines while executing the fallback generator:
https://github.com/jatin-bhateja/jdk/blob/444a35685eed8442b721795824cf0a43c7bb02f8/src/hotspot/share/opto/callGenerator.cpp#L722
There are regressions for these two cases. Do you know the root cause?
Hi @XiaohongGong, I observed that there is quite a lot of run-to-run variation in these micros even with the stock JDK. I collected PMU events and found that on an AVX-512 system there are MISALIGNED vector memory operations in the fallback, which cause this variation.
Hi @XiaohongGong, @merykitty, @erifan, let me know if you have other comments.

LGTM, thanks!
I'm still not convinced by this solution. If the pattern-matching method proves to be unreliable, then we can proceed with an intrinsic. Otherwise, we risk introducing a change that will eventually become redundant.
Hi @merykitty, as discussed earlier, your suggestions were incorporated in the latest version of the patch. The idea here is not to hold an optimization in anticipation of a future optimization. The x86 backend changes will still be usable if at a later point we decide to use complex pattern matching once TypeVect carries constant information. What we have currently is generic handling which can inline any fallback after failed intrinsification attempts. Looking forward to your comments on the backend part and any further improvements to the existing handling.

Hi @sviswa7, @iwanowww, may I request you to share your views / comments?
I briefly looked at the patch. First of all, I suggest separating out the logic that handles intrinsification failures. It's not specific to the proposed enhancement and will improve handling of intrinsification failures for vector operations in general. Speaking of the proposed approach, it aligns well with current Vector API implementation practices. I agree it would be nice to automatically detect equivalent IR shapes and transform them accordingly, but if it means hard-coding the shape of …
Thanks @iwanowww. I agree that the approach of inlining on intrinsic failure is generic enough and can benefit other vector operations as well, since it may absorb boxing penalties. For slice and unslice, since the fallback is completely written in vector APIs, it will give the most benefit, and that is the focus of this patch. Looking forward to your other comments on the current implementation.
The patch optimizes the Vector.slice operation with a constant index using the x86 VPALIGNR instruction.
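As a sketch of why (V)PALIGNR fits this operation: for byte vectors, the instruction concatenates two 16-byte sources and extracts a byte-aligned window, which is exactly a slice with a constant origin. Below is a hypothetical scalar model in plain Java (illustrative only, modelling a single 128-bit lane; not the actual backend code):

```java
// Scalar model of PALIGNR on one 128-bit lane: result byte i is byte
// (i + imm) of the 32-byte concatenation hi:lo (lo = bytes 0..15,
// hi = bytes 16..31); bytes shifted in beyond the end are zero.
class PalignrModel {
    static byte[] palignr(byte[] hi, byte[] lo, int imm) {
        byte[] r = new byte[16];
        for (int i = 0; i < 16; i++) {
            int idx = i + imm;
            r[i] = idx < 16 ? lo[idx] : (idx < 32 ? hi[idx - 16] : 0);
        }
        return r;
    }

    // A byte-vector a.slice(origin, b) then maps to a single
    // palignr(hi = b, lo = a, imm = origin) when origin is constant.
    static byte[] slice(byte[] a, byte[] b, int origin) {
        return palignr(b, a, origin);
    }
}
```

This is why a constant index matters: `imm` is an immediate operand of the instruction, so a variable origin cannot use this encoding directly.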
It also adds a new hybrid call generator to facilitate lazy intrinsification, falling back to procedural inlining to prevent call overhead and boxing penalties when the fallback implementation is expected to operate over vectors. The existing vector API based slice implementation is now the fallback code that gets inlined in case intrinsification fails.

The idea here is to add infrastructure support to enable intrinsification of the fast path for selected vector APIs, and otherwise to enable inlining of the fallback implementation if it is based on vector APIs. Existing call generators like PredictedCallGenerator, used to handle bi-morphic inlining, already make use of multiple call generators to handle hit/miss scenarios for a particular receiver type. The newly added hybrid call generator is lazy and invoked during incremental inlining. It also relieves the inline expander from handling slow paths, which can easily be implemented on the library side (in Java).
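The dispatch logic of such a hybrid call generator can be modeled roughly as follows. This is a toy sketch in plain Java with hypothetical names, not HotSpot's actual CallGenerator API: attempt the intrinsic fast path first, and expand the library fallback inline only if intrinsification fails.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Toy model of the hybrid call generator: try the intrinsic (which may
// fail, e.g. when the slice index is not a constant); on failure expand
// the fallback inline instead of leaving a call, avoiding call overhead
// and boxing penalties.
class HybridCallGenerator<T> {
    private final Supplier<Optional<T>> intrinsic; // fast path, may fail
    private final Supplier<T> fallback;            // always succeeds

    HybridCallGenerator(Supplier<Optional<T>> intrinsic, Supplier<T> fallback) {
        this.intrinsic = intrinsic;
        this.fallback = fallback;
    }

    T generate() {
        // Mirrors "if (new_jvms == nullptr) inline the fallback".
        return intrinsic.get().orElseGet(fallback);
    }
}
```

In the actual patch the fallback is itself composed of vector API calls, so its own intrinsics get a further intrinsification opportunity after inlining.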
Vector API jtreg tests pass at AVX level 2; the remaining validation is in progress.
Performance numbers:
Please share your feedback.
Best Regards,
Jatin