25 Oct 04:15

snnn

a83fc4d

Latest

ORT 1.23.2 cherrypick 1 (#26368)

Adds the following commits to the release-1.23.2 branch for ORT 1.23.2:

- [TensorRT] Fix DDS output bug during engine update
  - PR: https://github.com/microsoft/onnxruntime/pull/26272
  - commit id: 00e85dd3c84f511fee373d152d461f6e81d7f514
- Fix shape inference failure with in-memory external data
   - PR: https://github.com/microsoft/onnxruntime/pull/26263
   - commit id: d955476911997842cb058174c18f30f8dc3693b4
- [CUDA] replace 90a-virtual by 90-virtual for forward compatible 
  - PR: https://github.com/microsoft/onnxruntime/pull/26230
  - commit id: b58911f7445be56e45cb0f7993c0d43e6839c09e
- [QNN-EP] Fix logic flow bug
  - PR: https://github.com/microsoft/onnxruntime/pull/26148
  - commit id: b282379ac6066e8de9a5a68f1ce5ef1cf566dd04
- Internal Dupe of #25255 - [MLAS] Optimize MlasConv using thread partition opt
  - PR: https://github.com/microsoft/onnxruntime/pull/26103
  - commit id: 736251899137449311819bab36ff1c47ea09a62c
- Update qMoE spec to support block quantization
  - PR: https://github.com/microsoft/onnxruntime/pull/25641
  - commit id: 7a8ffa80b78c1e363a04eb7b8ebae22c4e45d140
- [VitisAI] add new api to VitisAI to save graph as a string
  - PR: https://github.com/microsoft/onnxruntime/pull/25602
  - commit id: 3361d723a526d6bcd9ac473ce6d3f0a1a89244da
- [Build] Lock torch, onnxscript and onnx-ir versions to latest
  - PR: https://github.com/microsoft/onnxruntime/pull/26315
  - commit id: ea69c4df0bd032ec1ca3790455f123de99187cee

---------

Co-authored-by: Hariharan Seshadri
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Edward Chen
Co-authored-by: Yateng Hong
Co-authored-by: Changming Sun
Co-authored-by: Dmitri Smirnov
Co-authored-by: Tianlei Wu
Co-authored-by: quic-calvnguy
Co-authored-by: quic_calvnguy <quic_calvnguy@quic_inc.com>
Co-authored-by: yifei410
Co-authored-by: yifei

08 Oct 04:12

snnn

v1.23.1

d9b2048

ONNX Runtime v1.23.1

What's Changed

Fix Attention GQA implementation on CPU (#25966)
Address GetMemInfo edge cases (#26021)
Implement new Python APIs (#25999)
MemcpyFromHost and MemcpyToHost support for plugin EPs (#26088)
[TRT RTX EP] Fix bug for generating the correct subgraph in GetCapability (#26132)
add session_id_ to LogEvaluationStart/Stop, LogSessionCreationStart (#25590)
[build] fix WebAssembly build on macOS/arm64 (#25653)
[CPU] MoE Kernel (#25958)
[CPU] Block-wise QMoE kernel for CPU (#26009)
[C#] Implement missing APIs (#26101)
Regenerate test model with ONNX IR < 12 (#26149)
[CPU] Fix compilation errors because of unused variables (#26147)
[EP ABI] Check if nodes specified in GetCapability() have already been assigned (#26156)
[QNN EP] Add dynamic option to set HTP performance mode (#26135)

Full Changelog: v1.23.0...v1.23.1

26 Sep 04:33

snnn

v1.23.0

be835ef

ONNX Runtime v1.23.0

Announcements

This release introduces the Execution Provider (EP) Plugin API, a new infrastructure for building plugin-based EPs. (#24887, #25137, #25124, #25147, #25127, #25159, #25191, #2524)
This release introduces the ability to dynamically download and install execution providers. This feature is available only in the WinML build and requires Windows 11 version 25H2 or later. To use it, C/C++/C# users should use the builds distributed through the Windows App SDK, and Python users should install the onnxruntime-winml package (to be published soon). We encourage users who can upgrade to the latest Windows 11 to use the WinML build to take advantage of this enhancement.

Upcoming Changes

The next release will stop providing x86_64 binaries for macOS and iOS operating systems.
The next release will increase the minimum supported macOS version from 13.4 to 14.0.
The next release will stop providing Python 3.10 wheels.

Execution & Core Optimizations

Shutdown logic on Windows is simplified

On Windows, some global objects are no longer destroyed when ONNX Runtime detects that the process is shutting down (#24891). This does not cause a memory leak, because all of a process's memory is returned to the operating system when the process ends. The change reduces the chance of crashes on process exit.

AutoEP/Device Management

ONNX Runtime can now automatically discover computing devices and select the best EPs to download and register. The EP downloading feature currently works only on Windows 11 version 25H2 or later.
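As a rough illustration of what "discover devices, then select the best EP" can mean, the sketch below picks a device by walking a preference order. The `Device` record, the policy names, and `select_device` are hypothetical and written only for this example; they are not ONNX Runtime's API, and the real selection logic may weigh other factors.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


# Hypothetical device record; ONNX Runtime's real device objects differ.
@dataclass(frozen=True)
class Device:
    ep_name: str  # execution provider that can drive this device
    kind: str     # "NPU", "GPU", or "CPU"


# Illustrative preference orders (policy names are assumptions).
POLICIES = {
    "prefer_npu": ("NPU", "GPU", "CPU"),
    "prefer_gpu": ("GPU", "NPU", "CPU"),
    "prefer_cpu": ("CPU",),
}


def select_device(devices: Sequence[Device], policy: str) -> Optional[Device]:
    """Return the first discovered device that matches the policy's order."""
    for kind in POLICIES[policy]:
        for device in devices:
            if device.kind == kind:
                return device
    return None


discovered = [
    Device("CPUExecutionProvider", "CPU"),
    Device("CUDAExecutionProvider", "GPU"),
]
# No NPU was discovered, so the policy falls through to the GPU.
print(select_device(discovered, "prefer_npu").ep_name)  # CUDAExecutionProvider
```

The fallback order is the point of the sketch: a policy expresses a preference, and the CPU remains available when nothing better is found.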

Execution Provider (EP) Updates

The ROCm EP was removed from the source tree. Users are advised to use the MIGraphX or Vitis AI EPs from AMD.
A new EP, Nvidia TensorRT RTX, was added.

Web

EMSDK is upgraded from 4.0.4 to 4.0.8

WebGPU EP

Added WGSL template support.

QNN EP

SDK Update: Added support for QNN SDK 2.37.

KleidiAI

Enhanced performance for SGEMM, IGEMM, and Dynamic Quantized MatMul operations, especially for Conv2D operators on hardware that supports SME2 (Scalable Matrix Extension v2).

Known Problems

A KleidiAI-related change in build.py may cause build failures when cross-compiling (#26175).

Contributions

Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members:

@1duo, @Akupadhye, @amarin16, @AndreyOrb, @ankan-ban, @ankitm3k, @anujj, @aparmp-quic, @arnej27959, @bachelor-dou, @benjamin-hodgson, @Bonoy0328, @chenweng-quic, @chuteng-quic, @clementperon, @co63oc, @daijh, @damdoo01-arm, @danyue333, @fanchenkong1, @gedoensmax, @genarks, @gnedanur, @Honry, @huaychou, @ianfhunter, @ishwar-raut1, @jing-bao, @joeyearsley, @johnpaultaken, @jordanozang, @JulienMaille, @keshavv27, @kevinch-nv, @khoover, @krahenbuhl, @kuanyul-quic, @mauriciocm9, @mc-nv, @minfhong-quic, @mingyueliuh, @MQ-mengqing, @NingW101, @notken12, @omarhass47, @peishenyan, @pkubaj, @qc-tbhardwa, @qti-jkilpatrick, @qti-yuduo, @quic-ankus, @quic-ashigarg, @quic-ashwshan, @quic-calvnguy, @quic-hungjuiw, @quic-tirupath, @qwu16, @ranjitshs, @saurabhkale17, @schuermans-slx, @sfatimar, @stefantalpalaru, @sunnyshu-intel, @TedThemistokleous, @thevishalagarwal, @toothache, @umangb-09, @vatlark, @VishalX, @wcy123, @xhcao, @xuke537, @zhaoxul-qti

Contributors

JulienMaille, wcy123, and 71 other contributors

13 Aug 16:53

vraspar

v1.22.2

5630b08

ONNX Runtime v1.22.2

What's new?

This release adds an optimized CPU/MLAS implementation of DequantizeLinear (8 bit) and introduces the build option client_package_build, which enables default options that are more appropriate for client/on-device workloads (e.g., disable thread spinning by default).

Build System & Packages

Add --client_package_build option (#25351) - @jywu-msft
Remove the python installation steps from win-qnn-arm64-ci-pipeline.yml (#25552) - @snnn

CPU EP

Add multithreaded/vectorized implementation of DequantizeLinear for int8 and uint8 inputs (SSE2, NEON) (#24818) - @adrianlizarraga
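For readers unfamiliar with the operator, the following scalar sketch shows the per-tensor DequantizeLinear formula, y = (x - zero_point) * scale, as defined by ONNX. It is plain Python for illustration only; the optimized kernel in this release is vectorized and multithreaded, and the sample values are made up.

```python
def dequantize_linear(values, scale, zero_point):
    """Per-tensor DequantizeLinear: y = (x - zero_point) * scale."""
    return [(v - zero_point) * scale for v in values]


# uint8 input with a mid-range zero point.
print(dequantize_linear([0, 128, 255], scale=0.5, zero_point=128))
# [-64.0, 0.0, 63.5]

# int8 input with a zero point of 0.
print(dequantize_linear([-128, 0, 127], scale=0.25, zero_point=0))
# [-32.0, 0.0, 31.75]
```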

QNN EP

Add support for the Upsample, Einsum, LSTM, and CumSum operators (#24265, #24616, #24646, #24820) - @quic-zhaoxul, @1duo, @chenweng-quic, @Akupadhye
Fuse scale into Softmax (#24809) - @qti-yuduo
Enable DSP queue polling when performance is set to “burst” mode (#25361) - @quic-calvnguy
Update QNN SDK to version 2.36.1 (#25388) - @qti-jkilpatrick
Include the license file from QNN SDK in the Microsoft.ML.OnnxRuntime.QNN NuGet package (#25158) - @HectorSVC

Contributors

snnn, 1duo, and 9 other contributors

08 Jul 22:08

vraspar

v1.22.1

89746dc

ONNX Runtime v1.22.1

What's new?

This release replaces static linking of dxcore.lib with optional runtime loading, lowering the minimum supported version from Windows 10 22H2 (10.0.22621) to 20H1 (10.0.19041). This enables compatibility with Windows Server 2019 (10.0.17763), where dxcore.dll may be absent.

Change dependency from GitLab eigen to GitHub eigen-mirror #24884 - @prathikr
Weaken dxcore dependency #24845 - @skottmckay
[DML] Restore compatibility with Windows SDK 10.0.17134.0 #24950 - @JulienMaille
Disable VCPKG's binary cache #24889 - @snnn
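The idea behind replacing a static link with optional runtime loading can be sketched in Python with `ctypes`. This is an analogy, not ONNX Runtime's C++ implementation: the library is looked up when the program runs, and the caller continues without it if the lookup fails. The helper name `load_optional_library` is made up for this example.

```python
import ctypes


def load_optional_library(name: str):
    """Try to load a shared library at runtime; return None if unavailable."""
    try:
        return ctypes.CDLL(name)
    except OSError:
        # The library is missing (or cannot be loaded) on this system.
        return None


dxcore = load_optional_library("dxcore.dll")
if dxcore is None:
    print("dxcore.dll not available; continuing without it")
else:
    print("dxcore.dll loaded")
```

With a static import, a missing DLL prevents the process from starting at all; with runtime loading, the absence becomes an ordinary condition the program can handle.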

Contributors

JulienMaille, snnn, and 2 other contributors

10 May 01:14

MaanavD

v1.22.0

f217402

ONNX Runtime v1.22

Announcements

This release introduces new APIs for Model Editor, Auto EP infrastructure, and AOT Compile.
ONNX Runtime GPU packages require CUDA 12.x; packages built for CUDA 11.x are no longer published.
The min supported Windows version is now 10.0.19041.

GenAI & Advanced Model Features

Constrained Decoding: Introduced new capabilities for constrained decoding, offering more control over generative AI model outputs.

Execution & Core Optimizations

Core

Auto EP Selection Infrastructure: Added foundational infrastructure to enable automatic selection of Execution Providers via selection policies, aiming to simplify configuration and optimize performance. (Pull Request #24430)
Compile API: Introduced new APIs to support explicit compilation of ONNX models.
- See: OrtCompileApi Struct Reference
- See: EP Context Design
Model Editor API: APIs for creating or editing ONNX models.
- See: OrtModelEditorApi

Execution Provider (EP) Updates

CPU EP/MLAS

KleidiAI Integration: Integrated KleidiAI into ONNX Runtime/MLAS for enhanced performance on Arm architectures.
MatMulNBits Support: Added support for MatMulNBits, enabling matrix multiplication with weights quantized to 8 bits.
GroupQueryAttention optimizations and enhancements
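To make "weights quantized to 8 bits" concrete, here is a minimal sketch of block-wise asymmetric quantization, where each block of a weight row gets its own scale and zero point. The block size, helper names, and rounding details are assumptions chosen for illustration; they do not describe the exact packing or layout that MatMulNBits uses.

```python
def quantize_block(block, bits=8):
    """Asymmetric quantization of one block to unsigned `bits`-bit integers."""
    qmax = (1 << bits) - 1
    lo = min(min(block), 0.0)  # include 0.0 so zero stays in range
    hi = max(max(block), 0.0)
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    quantized = [max(0, min(qmax, round(x / scale) + zero_point)) for x in block]
    return quantized, scale, zero_point


def dequantize_block(quantized, scale, zero_point):
    """Recover approximate float weights from one quantized block."""
    return [(q - zero_point) * scale for q in quantized]


def quantize_blockwise(row, block_size):
    """Split a weight row into blocks and quantize each independently."""
    return [
        quantize_block(row[i:i + block_size])
        for i in range(0, len(row), block_size)
    ]


row = [0.0, 1.0, 2.0, 3.0, -4.0, 4.0, 0.5, -0.5]
for quantized, scale, zero_point in quantize_blockwise(row, block_size=4):
    print(quantized, dequantize_block(quantized, scale, zero_point))
```

Because each block has its own scale, a block with a small value range keeps more precision than it would under a single per-tensor scale.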

OpenVINO EP

Added support up to OpenVINO 2025.1
Introduced Intel compiler level optimizations for QDQ models.
Added support to select Intel devices based on LUID
Load_config feature improvement to support AUTO, HETERO and MULTI plugin.
Miscellaneous bug fixes and optimizations.
For detailed updates, refer to Pull Request #24394: ONNXRuntime OpenVINO - Release 1.22

QNN EP

SDK Update: Added support for QNN SDK 2.33.2.
Operator updates/support for Sum, Softmax, Upsample, Expand, ScatterND, Einsum.
QNN EP can be built as a shared or static library.
Enabled the QnnGpu backend.
For detailed updates, refer to recent QNN-tagged PRs.

TensorRT EP

TensorRT Version: Added support for TensorRT 10.9.
- Note for onnx-tensorrt open-source parser users: Please check here for specific requirements.
New Features:
- EP option to enable TRT Preview Feature
- Support to load TensorRT V3 plugin
Bug Fixes:
- Resolved an issue related to multithreading scenarios.
- Fixed incorrect GPU usage that affected both TensorRT EP and CUDA EP.

NV TensorRT RTX EP

New Execution Provider: Introduced a new Execution Provider specifically for Nvidia RTX GPUs, leveraging TensorRT for optimized performance.

CUDA EP

MatMulNBits Enhancement: Added support for 8-bit weight-only quantization in MatMulNBits.
Bug Fixes:
- Fixed incorrect GPU usage (also mentioned under TensorRT EP).

VitisAI EP

Miscellaneous bug fixes and improvements.

Infrastructure & Build Improvements

Build System & Packages

QNN Nuget Package: The QNN Nuget package is now built as ARM64x.

Dependencies / Version Updates

CUDA Version Update: This release includes an update to the CUDA version. Users should consult the documentation for specific version requirements. CUDA 11 based GPU packages are no longer published.

Web

WebGPU Expansion:
- Added WebGPU support to the node.js package (Windows and macOS).
- Enabled WebGPU when building from source for macOS, Linux, and Windows.

Mobile

No major updates of note this release.

Contributions

Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members:

Yulong Wang, Jian Chen, Changming Sun, Satya Kumar Jandhyala, Hector Li, Prathik Rao, Adrian Lizarraga, Jiajia Qin, Scott McKay, Jie Chen, Tianlei Wu, Edward Chen, Wanming Lin, xhcao, vraspar, Dmitri Smirnov, Jing Fang, Yifan Li, Caroline Zhu, Jianhui Dai, Chi Lo, Guenther Schmuelling, Ryan Hill, Sushanth Rajasankar, Yi-Hong Lyu, Ankit Maheshkar, Artur Wojcik, Baiju Meswani, David Fan, Enrico Galli, Hans, Jambay Kinley, John Paul, Peishen Yan, Yateng Hong, amarin16, chuteng-quic, kunal-vaishnavi, quic-hungjuiw, Alessio Soldano, Andreas Hussing, Ashish Garg, Ashwath Shankarnarayan, Chengdong Liang, Clément Péron, Erick Muñoz, Fanchen Kong, George Wu, Haik Silm, Jagadish Krishnamoorthy, Justin Chu, Karim Vadsariya, Kevin Chen, Mark Schofield, Masaya, Kato, Michael Tyler, Nenad Banfic, Ningxin Hu, Praveen G, Preetha Veeramalai, Ranjit Ranjan, Seungtaek Kim, Ti-Tai Wang, Xiaofei Han, Yueqing Zhang, co63oc, derdeljan-msft, genmingz@AMD, jiangzhaoming, jing-bao, kuanyul-quic, liqun Fu, minfhong-quic, mingyue, quic-tirupath, quic-zhaoxul, saurabh, selenayang888, sfatimar, sheetalarkadam, virajwad, zz002, Ștefan Talpalaru

21 Apr 17:38

amarin16

v1.21.1

8f7cce3

ONNX Runtime v1.21.1

What's new?

Extend CMAKE_CUDA_FLAGS with all Blackwell compute capacity #23928 - @yf711
[ARM CPU] Fix fp16 const initialization on no-fp16 platform #23978 - @fajin-corp
[TensorRT EP] Call cudaSetDevice at compute function for handling multithreading scenario #24010 - @chilo-ms
Fix attention bias broadcast #24017 - @tianleiwu
Deleted the constant SKIP_CUDA_TEST_WITH_DML #24113 - @CodingSeaotter
[QNN EP] ARM64EC python package remove --vcpkg in build #24174 - @jywu-msft
[wasm] remove --vcpkg in wasm build #24179 - @fs-eire

Contributors

fs-eire, tianleiwu, and 5 other contributors

08 Mar 05:33

MaanavD

v1.21.0

e0b66ca

ONNX Runtime v1.21.0

Announcements

No large announcements of note this release! We've made a lot of small refinements to streamline your ONNX Runtime experience.

GenAI & Advanced Model Features

Enhanced Decoding & Pipeline Support

Added "chat mode" support for CPU, GPU, and WebGPU.
Provided support for decoder model pipelines.
Added support for Java API for MultiLoRA.

API & Compatibility Updates

Chat mode introduced breaking changes in the API (see migration guide).

Bug Fixes for Model Output

Fixed Phi series garbage output issues with long prompts.
Resolved gibberish issues with top_k on CPU.

Execution & Core Optimizations

Core Refinements

Reduced default logger usage for improved efficiency (#23030).
Fixed a visibility issue in threadpool (#23098).

Execution Provider (EP) Updates

General

Removed TVM EP from the source tree (#22827).
Marked NNAPI EP for deprecation (following Google's deprecation of NNAPI).
Fixed a DLL delay loading issue that impacts WebGPU EP and DirectML EP's usability on Windows (#23111, #23227)

TensorRT EP Improvements

Added support for TensorRT 10.8.
- onnx-tensorrt open-source parser user: please check here for requirement.
Assigned DDS ops (NMS, RoiAlign, NonZero) to TensorRT by default.
Introduced option trt_op_types_to_exclude to exclude specific ops from TensorRT assignment.
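A short sketch of how the `trt_op_types_to_exclude` option might be passed from Python: the option takes a comma-separated list of op types, and the helper below only builds the `providers` list. The helper name, the chosen op types, and the `model.onnx` path are illustrative assumptions.

```python
def tensorrt_providers(excluded_op_types):
    """Build a providers list that keeps the given op types off TensorRT."""
    trt_options = {"trt_op_types_to_exclude": ",".join(excluded_op_types)}
    return [
        ("TensorrtExecutionProvider", trt_options),
        "CUDAExecutionProvider",  # fallback for nodes TensorRT does not take
        "CPUExecutionProvider",
    ]


providers = tensorrt_providers(["NonMaxSuppression", "NonZero", "RoiAlign"])
print(providers[0])

# With a GPU build of onnxruntime installed, the list could be used as:
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

Excluding the three DDS ops this way would approximate the earlier behavior, where these ops were not assigned to TensorRT.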

CUDA EP Improvements

Added a Python API preload_dlls to coexist with PyTorch.
Miscellaneous enhancements for Flux model inference.

QNN EP Improvements

Introduced QNN shared memory support.
Improved performance for AI Hub models.
Added support for QAIRT/QNN SDK 2.31.
Added Python 3.13 package.
Miscellaneous bug fixes and enhancements.
QNN EP is now built as a shared library/DLL by default. To retain previous build behavior, use build option --use_qnn static_lib.

DirectML EP Support & Upgrades

Updated DirectML version from 1.15.2 to 1.15.4 (#22635).

OpenVINO EP Improvements

Introduced OpenVINO EP Weights Sharing feature.
Added support for various contrib Ops in OVEP:
- SkipLayerNormalization, MatMulNBits, FusedGemm, FusedConv, EmbedLayerNormalization, BiasGelu, Attention, DynamicQuantizeMatMul, FusedMatMul, QuickGelu, SkipSimplifiedLayerNormalization
Miscellaneous bug fixes and improvements.

VitisAI EP Improvements

Miscellaneous bug fixes and improvements.

Mobile Platform Enhancements

CoreML Updates

Added support for caching generated CoreML models.

Extensions & Tokenizer Improvements

Expanded Tokenizer Support

Now supports more tokenizer models, including ChatGLM, Baichuan2, Phi-4, etc.
Added full Phi-4 pre/post-processing support for text, vision, and audio.
Introduced RegEx pattern loading from tokenizer.json.

Image Codec Enhancements

ImageCodec now links to native APIs if available; otherwise, falls back to built-in libraries.

Unified Tokenizer API

Introduced a new tokenizer op schema to unify the tokenizer codebase.
Added support for loading tokenizer data from a memory blob in the C API.

Infrastructure & Build Improvements

Runtime Requirements

All prebuilt Windows packages now require VC++ Runtime version >= 14.40 (instead of 14.38). If your VC++ runtime version is lower than that, you may see a crash when ONNX Runtime is initializing. See https://github.com/microsoft/STL/wiki/Changelog#vs-2022-1710 for more details.

Updated minimum iOS and Android SDK requirements to align with React Native 0.76:

iOS >= 15.1
Android API >= 24 (Android 7)

All macOS packages now require macOS version >= 13.3.

CMake File Changes

CMake Version: Increased the minimum required CMake version from 3.26 to 3.28. Added support for CMake 4.0.
Python Version: Increased the minimum required Python version from 3.8 to 3.10 for building ONNX Runtime from source.
Improved VCPKG support
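Before building from source, the new minimums (CMake 3.28, Python 3.10) can be checked with a few lines of Python. This is a sketch of one possible check, not part of the ONNX Runtime build scripts; the helper names are made up, and the CMake strings below are sample inputs in the format printed by `cmake --version`.

```python
import re
import sys

MIN_PYTHON = (3, 10)
MIN_CMAKE = (3, 28)


def parse_cmake_version(text):
    """Extract (major, minor) from `cmake --version` output, or None."""
    match = re.search(r"cmake version (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def meets_minimum(version, minimum):
    """Compare the (major, minor) part of a version against a minimum."""
    return version is not None and tuple(version[:2]) >= minimum


print(meets_minimum(sys.version_info, MIN_PYTHON))
print(meets_minimum(parse_cmake_version("cmake version 3.28.1"), MIN_CMAKE))  # True
print(meets_minimum(parse_cmake_version("cmake version 3.26.4"), MIN_CMAKE))  # False
```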

Added the following cmake options for WebGPU EP

onnxruntime_USE_EXTERNAL_DAWN
onnxruntime_CUSTOM_DAWN_SRC_PATH
onnxruntime_BUILD_DAWN_MONOLITHIC_LIBRARY
onnxruntime_ENABLE_PIX_FOR_WEBGPU_EP
onnxruntime_ENABLE_DAWN_BACKEND_VULKAN
onnxruntime_ENABLE_DAWN_BACKEND_D3D12
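One way to set options like these is through the `--cmake_extra_defines` argument of `build.py`, which accepts `NAME=VALUE` pairs. The sketch below only formats such an argument list; the helper name and the specific values (`ON`) are assumptions for illustration, so check the build documentation for the values each option accepts.

```python
def webgpu_build_args(defines):
    """Format CMake cache entries as build.py arguments for a WebGPU build."""
    args = ["--use_webgpu", "--cmake_extra_defines"]
    args += [f"{name}={value}" for name, value in defines.items()]
    return args


args = webgpu_build_args({
    "onnxruntime_ENABLE_DAWN_BACKEND_VULKAN": "ON",  # assumed value
    "onnxruntime_ENABLE_PIX_FOR_WEBGPU_EP": "ON",    # assumed value
})
print(" ".join(args))
```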

Added cmake option onnxruntime_BUILD_QNN_EP_STATIC_LIB for building with QNN EP as a static library.
Removed cmake option onnxruntime_USE_PREINSTALLED_EIGEN.

Fixed a build issue with Visual Studio 2022 17.3 (#23911)

Modernized Build Tools

Now using VCPKG for most package builds.
Upgraded Gradle from 7.x to 8.x.
Updated JDK from 11 to 17.
Enabled onnxruntime_USE_CUDA_NHWC_OPS by default for CUDA builds.
Added support for WASM64 (build from source; no package published).

Dependency Cleanup

Removed Google’s nsync from dependencies.

Others

Updated Node.js installation script to support network proxy usage (#23231)

Web

No updates of note.

Contributors

Contributors to ONNX Runtime include members across teams at Microsoft, along with our community members:

Changming Sun, Yulong Wang, Tianlei Wu, Jian Chen, Wanming Lin, Adrian Lizarraga, Hector Li, Jiajia Qin, Yifan Li, Edward Chen, Prathik Rao, Jing Fang, shiyi, Vincent Wang, Yi Zhang, Dmitri Smirnov, Satya Kumar Jandhyala, Caroline Zhu, Chi Lo, Justin Chu, Scott McKay, Enrico Galli, Kyle, Ted Themistokleous, dtang317, wejoncy, Bin Miao, Jambay Kinley, Sushanth Rajasankar, Yueqing Zhang, amancini-N, ivberg, kunal-vaishnavi, liqun Fu, Corentin Maravat, Peishen Yan, Preetha Veeramalai, Ranjit Ranjan, Xavier Dupré, amarin16, jzm-intel, kailums, xhcao, A-Satti, Aleksei Nikiforov, Ankit Maheshkar, Javier Martinez, Jianhui Dai, Jie Chen, Jon Campbell, Karim Vadsariya, Michael Tyler, PARK DongHa, Patrice Vignola, Pranav Sharma, Sam Webster, Sophie Schoenmeyer, Ti-Tai Wang, Xu Xing, Yi-Hong Lyu, genmingz@AMD, junchao-zhao, sheetalarkadam, sushraja-msft, Akshay Sonawane, Alexis Tsogias, Ashrit Shetty, Bilyana Indzheva, Chen Feiyue, Christian Larson, David Fan, David Hotham, Dmitry Deshevoy, Frank Dong, Gavin Kinsey, George Wu, Grégoire, Guenther Schmuelling, Indy Zhu, Jean-Michaël Celerier, Jeff Daily, Joshua Lochner, Kee, Malik Shahzad Muzaffar, Matthieu Darbois, Michael Cho, Michael Sharp, Misha Chornyi, Po-Wei (Vincent), Sevag H, Takeshi Watanabe, Wu, Junze, Xiang Zhang, Xiaoyu, Xinpeng Dou, Xinya Zhang, Yang Gu, Yateng Hong, mindest, mingyue, raoanag, saurabh, shaoboyan091, sstamenk, tianf-fff, wonchung-microsoft, xieofxie, zz002

12 Feb 22:57

adrianlizarraga

v1.20.2

8608bf0

ONNX Runtime v1.20.2 [QNN-only]

What's new?

Build System & Packages

Merge Windows machine pools for Web CI pipeline to reduce maintenance costs (#23243) - @snnn
Update boost URL for React Native CI pipeline (#23281) - @jchen351
Move ORT Training pipeline to GitHub actions and enable CodeQL scan for the source code (#22543) - @snnn
Move Linux GitHub actions to a dedicated machine pool (#22566) - @snnn
Update Apple deployment target to iOS 15.1 and macOS 13.3 (#23308) - @snnn
Deprecate macOS 12 in packaging pipeline (#23017) - @mszhanyi
Remove net8.0-android MAUI target from MAUI test project (#23607) - @carzh

CUDA EP

Fixes use of numeric_limits that causes a compiler error in Visual Studio 2022 v17.12 Preview 5 (#22738, #22868) - @tianleiwu

QNN EP

Enable offloading graph input quantization and graph output dequantization to CPU by default. Improves inference latency by reducing the amount of I/O data copied between CPU and NPU. (#23368) - @adrianlizarraga
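A small worked example may help show why quantizing on the CPU reduces copied data: a quantized 8-bit tensor is a quarter the size of its float32 original. The input shape, scale, and zero point below are hypothetical, and the sketch assumes 8-bit quantization; it does not model the QNN EP itself.

```python
from math import prod


def quantize_linear(values, scale, zero_point, qmin=0, qmax=255):
    """QuantizeLinear to uint8: q = clamp(round(x / scale) + zero_point)."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def tensor_bytes(shape, bytes_per_element):
    """Size in bytes of a dense tensor with the given shape."""
    return prod(shape) * bytes_per_element


shape = (1, 3, 224, 224)       # hypothetical image-sized graph input
print(tensor_bytes(shape, 4))  # float32 copy: 602112
print(tensor_bytes(shape, 1))  # uint8 copy: 150528

# Quantizing on the CPU means only the uint8 values cross to the NPU.
print(quantize_linear([-1.0, 0.0, 1.0], scale=0.0078125, zero_point=128))
# [0, 128, 255]
```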

Contributors

snnn, mszhanyi, and 4 other contributors

21 Nov 22:20

sophies927

v1.20.1

5c1b7cc

ONNX Runtime v1.20.1

What's new?

Python Quantization Tool

Prevent int32 quantized bias from clipping by adjusting the weight's scale (#22020) - @adrianlizarraga
Update QDQ Pad, Slice, Softmax (#22676) - @adrianlizarraga
Introduce get_qdq_config() helper to get QDQ configurations (#22677) - @adrianlizarraga
Add reduce_range option to get_qdq_config() (#22782) - @adrianlizarraga
Flaky test due to Pad reflect bug (#22798) - @adrianlizarraga
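The bias clipping fix listed first above can be illustrated with a simplified scalar sketch. A quantized bias commonly uses the scale input_scale * weight_scale, so a very small weight scale can push the int32 bias value out of range. The functions below are assumptions written for this example and are not the quantization tool's actual code, which may differ in detail (for instance in per-channel handling).

```python
INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)


def quantize_bias(bias, input_scale, weight_scale):
    """Quantize a bias to int32 with scale input_scale * weight_scale."""
    q = round(bias / (input_scale * weight_scale))
    return max(INT32_MIN, min(INT32_MAX, q))  # out-of-range values are clipped


def adjust_weight_scale(bias, input_scale, weight_scale):
    """Raise the weight scale just enough for the bias to fit in int32."""
    needed = abs(bias) / (input_scale * INT32_MAX)
    return max(weight_scale, needed)


bias, input_scale, weight_scale = 10.0, 1e-3, 1e-9

# Unadjusted: the quantized bias clips and no longer represents 10.0.
clipped = quantize_bias(bias, input_scale, weight_scale)
print(clipped * input_scale * weight_scale)

# Adjusted: the bias round-trips to approximately its original value.
new_scale = adjust_weight_scale(bias, input_scale, weight_scale)
fitted = quantize_bias(bias, input_scale, new_scale)
print(fitted * input_scale * new_scale)
```

The trade-off is that a larger weight scale quantizes the weights more coarsely, which is usually acceptable when the original scale was tiny because the weights were close to zero.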

CPU EP

Refactor SkipLayerNorm implementation to address issues (#22719, #22862) - @amarin16, @liqunfu

QNN EP

Add QNN SDK v2.28.2 support (#22724, #22844) - @HectorSVC, @adrianlizarraga

TensorRT EP

Exclude DDS ops from running on TRT (#22875) - @chilo-ms

Packaging

Rework the native library usage so that a pre-built ORT native package can be easily used (#22345) - @skottmckay
Fix Maven Sha256 Checksum Issue (#22600) - @idiskyle

Contributions

Big thank you to the release manager @yf711, along with @adrianlizarraga, @HectorSVC, @jywu-msft, and everyone else who helped to make this patch release process a smooth one!

Contributors

skottmckay, liqunfu, and 7 other contributors

Releases: microsoft/onnxruntime
