Releases · NVIDIA/warp

07 Jun 03:53

c0d1f1ed

v1.2.0

e8f9caf

[1.2.0] - 2024-06-06

Add a not-a-number floating-point constant that can be used as wp.NAN or wp.nan.
Add wp.isnan(), wp.isinf(), and wp.isfinite() for scalars, vectors, matrices, etc.
Improve kernel cache reuse by hashing just the local module constants. Previously, a module's hash was affected by all wp.constant() variables declared in a Warp program.
Revised module compilation process to allow multiple processes to use the same kernel cache directory. Cached kernels will now be stored in a hash-specific subdirectory.
Add runtime checks for wp.MarchingCubes on field dimensions and size
Fix memory leak in wp.Mesh BVH (GH-225)
Use C++17 when building the Warp library and user kernels
Increase PTX target architecture up to sm_75 (from sm_70), enabling Turing ISA features
Extended NanoVDB support (see warp.Volume):
- Add support for data-agnostic index grids, allocation at voxel granularity
- New wp.volume_lookup_index(), wp.volume_sample_index() and generic wp.volume_sample()/wp.volume_lookup()/wp.volume_store() kernel-level functions
- Zero-copy aliasing of in-memory grids, support for multi-grid buffers
- Grid introspection and blind data access capabilities
- warp.fem can now work directly on NanoVDB grids using warp.fem.Nanogrid
- Fixed wp.volume_sample_v() and wp.volume_store_*() adjoints
- Prevent wp.volume_store() from overwriting grid background values
Improve validation of user-provided fields and values in warp.fem
Support headless rendering of wp.render.OpenGLRenderer via pyglet.options["headless"] = True
wp.render.RegisteredGLBuffer can fall back to CPU-bound copying if CUDA/OpenGL interop is not available
Clarify terms for external contributions; see CONTRIBUTING.md for details
Improve performance of wp.sparse.bsr_mm() by ~5x on benchmark problems
Fix for XPBD incorrectly indexing into the joint actuation joint_act arrays
Fix for mass matrix gradients computation in wp.sim.FeatherstoneIntegrator()
Fix for handling of --msvc_path in build scripts
Fix for wp.copy() params to record dest and src offset parameters on wp.Tape()
Fix for wp.randn() to ensure return values are finite
Fix for slicing of arrays with gradients in kernels
Fix for function overload caching, ensure module is rebuilt if any function overloads are modified
Fix for handling of bool types in generic kernels
Publish CUDA 12.5 binaries for Hopper support, see https://github.com/nvidia/warp?tab=readme-ov-file#installing for details
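
The kernel cache changes in this release (hashing only local module constants, and storing each build in a hash-specific subdirectory) can be illustrated with a small standalone sketch. This is not Warp's implementation: the function names, the substring-based detection of "local" constants, and the directory naming scheme are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def module_hash(source: str, local_constants: dict) -> str:
    """Hash a module's source plus only the constants that module uses."""
    h = hashlib.sha256()
    h.update(source.encode("utf-8"))
    # Sort names so the digest does not depend on dict insertion order.
    for name in sorted(local_constants):
        h.update(f"{name}={local_constants[name]!r}".encode("utf-8"))
    return h.hexdigest()


def cache_dir_for(cache_root: Path, module_name: str, digest: str) -> Path:
    """Hypothetical layout: one subdirectory per (module, hash) pair."""
    return cache_root / f"{module_name}_{digest[:7]}"


def local_constants_of(source: str, all_constants: dict) -> dict:
    # Crude stand-in for real analysis: a constant is "local" if its
    # name appears in the module source.
    return {k: v for k, v in all_constants.items() if k in source}


all_constants = {"GRAVITY": -9.81, "MAX_ITERS": 32}
source = "def step(x): return x + GRAVITY"

before = module_hash(source, local_constants_of(source, all_constants))

# Changing a constant this module never references leaves its hash unchanged,
# so the cached build can be reused.
all_constants["MAX_ITERS"] = 64
after = module_hash(source, local_constants_of(source, all_constants))

print(before == after)
print(cache_dir_for(Path("kernel_cache"), "sim", before))
```

Because every distinct hash maps to its own directory in this sketch, two processes building different versions of the same module would never write to the same files, which is one plausible reason for the per-hash layout.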

[1.1.1] - 2024-05-24

wp.init() is no longer required to be called explicitly and will be performed on first call to the API
Speed up omni.warp.core's startup time

08 May 15:54

c0d1f1ed

v1.1.0

e04b5c6

[1.1.0] - 2024-05-09

Support returning a value from @wp.func_native CUDA functions using type hints
Improved differentiability of the wp.sim.FeatherstoneIntegrator
Fix gradient propagation for rigid body contacts in wp.sim.collide()
Added support for event-based timing, see wp.ScopedTimer()
Added Tape visualization and debugging functions, see wp.Tape.visualize()
Support constructing Warp arrays from objects that define the __cuda_array_interface__ attribute
Support copying a struct to another device, use struct.to(device) to migrate struct arrays
Allow rigid shapes to not have any collisions with other shapes in wp.sim.Model
Change default test behavior to test redundant GPUs (up to 2x)
Test each example in an individual subprocess
Polish and optimize various examples and tests
Allow non-contiguous point arrays to be passed to wp.HashGrid.build()
Upgrade LLVM to 18.1.3 for from-source builds and Linux x86-64 builds
Build DLL source code as C++17 and require GCC 9.4 as a minimum
Array clone, assign, and copy are now differentiable
Use Ruff for formatting and linting
Various documentation improvements (infinity, math constants, etc.)
Improve URDF importer, handle joint armature
Allow builtins.bool to be used in Warp data structures
Use external gradient arrays in backward passes when passed to wp.launch()
Add Conjugate Residual linear solver, see wp.optim.linear.cr()
Fix propagation of gradients on aliased copy of variables in kernels
Facilitate debugging and speed up import warp by eliminating the raising of any exceptions
Improve support for nested vec/mat assignments in structs
Recommend Python 3.9 or higher, which is required for JAX and soon PyTorch.
Support gradient propagation for indexing sliced multi-dimensional arrays, i.e. a[i][j] vs. a[i, j]
Provide an informative message if setting DLL C-types failed, instructing to try rebuilding the library

[1.0.3] - 2024-04-17

Add a support_level entry to the configuration file of the extensions

22 Mar 20:42

c0d1f1ed

v1.0.2

276742b

[1.0.2] - 2024-03-22

Make examples runnable from any location
Fix the examples not running directly from their Python file
Add the example gallery to the documentation
Update README.md examples USD location
Update example_graph_capture.py description

15 Mar 17:32

c0d1f1ed

v1.0.1

6d6f5be

[1.0.1] - 2024-03-15

Document Device total_memory and free_memory
Documentation for allocators, streams, peer access, and generics
Changed example output directory to current working directory
Added python -m warp.examples.browse for browsing the examples folder
Print where the USD stage file is being saved
Added examples/optim/example_walker.py sample
Make the drone example not specific to USD
Reduce the time taken to run some examples
Optimise rendering points with a single colour
Clarify an error message around needing USD
Raise exception when module is unloaded during graph capture
Added wp.synchronize_event() for blocking the host thread until a recorded event completes
Flush C print buffers when ending stdout capture
Remove more unneeded CUTLASS files
Allow setting mempool release threshold as a fractional value
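
The fractional mempool release threshold mentioned above can be pictured as a small conversion step. The sketch below assumes that a value in (0, 1] is read as a fraction of the device's total memory and that anything larger is a byte count. That interpretation and the helper name `resolve_release_threshold` are assumptions for illustration, not Warp's code.

```python
def resolve_release_threshold(threshold, total_memory: int) -> int:
    """Convert a user-supplied threshold to a byte count.

    Assumed convention (hypothetical helper, not part of Warp):
      - 0 < threshold <= 1  -> fraction of total device memory
      - otherwise           -> absolute number of bytes
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    if 0 < threshold <= 1:
        return int(threshold * total_memory)
    return int(threshold)


total = 8 * 1024**3  # pretend the device has 8 GiB

print(resolve_release_threshold(0.25, total))   # a quarter of device memory
print(resolve_release_threshold(1024, total))   # an explicit byte count
```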

08 Mar 01:58

c0d1f1ed

v1.0.0

017f635

[1.0.0] - 2024-03-07

Add FeatherstoneIntegrator which provides more stable simulation of articulated rigid body dynamics in generalized coordinates (State.joint_q and State.joint_qd)
Introduce warp.sim.Control struct to store control inputs for simulations (optional, by default the Model control inputs are used as before); integrators now have a different simulation signature: integrator.simulate(model: Model, state_in: State, state_out: State, dt: float, control: Control)
joint_act can now behave in 3 modes: with joint_axis_mode set to JOINT_MODE_FORCE it behaves as a force/torque, with JOINT_MODE_VELOCITY it behaves as a velocity target, and with JOINT_MODE_POSITION it behaves as a position target; joint_target has been removed
Add adhesive contact to Euler integrators via Model.shape_materials.ka which controls the contact distance at which the adhesive force is applied
Improve handling of visual/collision shapes in URDF importer so visual shapes are not involved in contact dynamics
Experimental JAX kernel callback support
Improve module load exception message
Add wp.ScopedCapture
Remove enable_backward warning for callables
Copy docstrings and annotations from wrapped kernels, functions, structs
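
The three joint_act modes described in this release can be summarized with a toy, pure-Python model of a single joint axis. This is a sketch rather than the integrator code: the `JointMode` enum stands in for the JOINT_MODE_* constants, and the PD-style formulas and the gains `ke` and `kd` are assumed values chosen for illustration.

```python
from enum import Enum


class JointMode(Enum):
    # Hypothetical stand-ins for the JOINT_MODE_* constants named above.
    FORCE = 0
    VELOCITY = 1
    POSITION = 2


def joint_drive(mode: JointMode, act: float, q: float, qd: float,
                ke: float = 100.0, kd: float = 10.0) -> float:
    """Return the generalized force applied to one joint axis.

    q and qd are the joint coordinate and velocity; ke and kd are
    illustrative gains, not Warp defaults.
    """
    if mode is JointMode.FORCE:
        # act is applied directly as a force/torque.
        return act
    if mode is JointMode.VELOCITY:
        # act is a velocity target: push proportionally to the velocity error.
        return kd * (act - qd)
    # act is a position target: spring toward it, with velocity damping.
    return ke * (act - q) - kd * qd


for mode in JointMode:
    print(mode.name, joint_drive(mode, act=1.0, q=0.5, qd=0.5))
```

The same `act` value produces very different forces depending on the mode, which is why a single array can replace the separate joint_target input.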

06 Mar 02:49

c0d1f1ed

v0.15.1

525ff16

[0.15.1] - 2024-03-05

Add examples assets to the wheel packages
Fix broken image link in documentation
Fix codegen for custom grad functions calling their respective forward functions
Fix custom grad function handling for functions that have no outputs
Fix issues when wp.config.quiet = True

05 Mar 05:06

c0d1f1ed

v0.15.0

db3e8bb

[0.15.0] - 2024-03-04

Add thumbnails to examples gallery
Apply colored lighting to examples
Moved examples directory under warp/
Add example usage to python -m warp.tests --help
Add torch.autograd.function example and docs
Add error-checking to array shapes during creation
Add example_graph_capture
Add a Diffsim Example of a Drone
Fix verify_fp causing compiler errors and support CPU kernels
Fix to enable matmul to be called in CUDA graph capture
Enable mempools by default
Update wp.launch to support tuple args
Fix BiCGSTAB and GMRES producing NaNs when converging early
Fix warning about backward codegen being disabled in test_fem
Fix assert_np_equal when NaNs and tolerance are involved
Improve error message to discern between CUDA being disabled or not supported
Support cross-module functions with user-defined gradients
Suppress superfluous CUDA error when ending capture after errors
Make output during initialization atomic
Add warp.config.max_unroll, fix custom gradient unrolling
Support native replay snippets using @wp.func_native(snippet, replay_snippet=replay_snippet)
Look for the CUDA Toolkit in default locations if the CUDA_PATH environment variable or --cuda_path build option are not used
Added wp.ones() to efficiently create one-initialized arrays
Rename wp.config.graph_capture_module_load_default to wp.config.enable_graph_capture_module_load_by_default

[0.14.0] - 2024-02-19

Add support for CUDA pooled (stream-ordered) allocators
- Support memory allocation during graph capture
- Support copying non-contiguous CUDA arrays during graph capture
- Improved memory allocation/deallocation performance with pooled allocators
- Use wp.config.enable_mempools_at_init to enable pooled allocators during Warp initialization (if supported)
- wp.is_mempool_supported() - check if a device supports pooled allocators
- wp.is_mempool_enabled(), wp.set_mempool_enabled() - enable or disable pooled allocators per device
- wp.set_mempool_release_threshold(), wp.get_mempool_release_threshold() - configure memory pool release threshold
Add support for direct memory access between devices
- Improved peer-to-peer memory transfer performance if access is enabled
- Caveat: enabling peer access may impact memory allocation/deallocation performance and increase memory consumption
- wp.is_peer_access_supported() - check if the memory of a device can be accessed by a peer device
- wp.is_peer_access_enabled(), wp.set_peer_access_enabled() - manage peer access for memory allocated using default CUDA allocators
- wp.is_mempool_access_supported() - check if the memory pool of a device can be accessed by a peer device
- wp.is_mempool_access_enabled(), wp.set_mempool_access_enabled() - manage access for memory allocated using pooled CUDA allocators
Refined stream synchronization semantics
- wp.ScopedStream can synchronize with the previous stream on entry and/or exit (only sync on entry by default)
- Functions taking an optional stream argument do no implicit synchronization for max performance (e.g., wp.copy(), wp.launch(), wp.capture_launch())
Support for passing a custom deleter argument when constructing arrays
- Deprecation of owner argument - use deleter to transfer ownership
Optimizations for various core API functions (e.g., wp.zeros(), wp.full(), and more)
Fix wp.matmul() to always use the correct CUDA context
Fix memory leak in BSR transpose
Fix stream synchronization issues when copying non-contiguous arrays

[0.13.1] - 2024-02-22

Ensure that the results from the Noise Deform are deterministic across different Kit sessions

16 Feb 23:42

c0d1f1ed

v0.13.0

9b2a57f

[0.13.0] - 2024-02-16

Update the license to NVIDIA Software License, allowing commercial use (see LICENSE.md)
Add CONTRIBUTING.md guidelines (for NVIDIA employees)
Hash CUDA snippet and adj_snippet strings to fix caching
Fix build_docs.py on Windows
Add missing .py extension to warp/tests/walkthrough_debug
Allow wp.bool usage in vector and matrix types

[0.12.0] - 2024-02-05

Add a warning when the enable_backward setting is set to False upon calling wp.Tape.backward()
Fix kernels not being recompiled as expected when defined using a closure
Change the kernel cache appauthor subdirectory to just "NVIDIA"
Ensure that gradients attached to PyTorch tensors have compatible strides when calling wp.from_torch()
Add a Noise Deform node for OmniGraph that deforms points using Perlin/curl noise

23 Jan 21:39

c0d1f1ed

v0.11.0

875adca

[0.11.0] - 2024-01-23

Re-release 1.0.0-beta.7 as a non-pre-release 0.11.0 version so it gets selected by pip install warp-lang.
Introducing a new versioning and release process, detailed in PACKAGING.md and resembling that of Python itself:
- The 0.11 release(s) can be found on the release-0.11 branch.
- Point releases (if any) go on the same minor release branch and only contain bug fixes, not new features.
- The public branch, previously used to merge releases into and corresponding with the GitHub main branch, is retired.

[1.0.0-beta.7] - 2024-01-23

Ensure captures are always enclosed in try/finally
Only include .py files from the warp subdirectory into wheel packages
Fix an extension's sample node failing at parsing some version numbers
Allow examples to run without USD when possible
Add a setting to disable the main Warp menu in Kit
Add iterative linear solvers, see wp.optim.linear.cg, wp.optim.linear.bicgstab, wp.optim.linear.gmres, and wp.optim.linear.LinearOperator
Improve error messages around global variables
Improve error messages around mat/vec assignments
Support conversion of scalars to native/ctypes, e.g.: float(wp.float32(1.23)) or ctypes.c_float(wp.float32(1.23))
Add a constant for infinity, see wp.inf
Add a FAQ entry about array assignments
Add a mass spring cage diff simulation example, see examples/example_diffsim_mass_spring_cage.py
Add -s, --suite option for only running tests belonging to the given suites
Fix common spelling mistakes
Fix indentation of generated code
Show deprecation warnings only once
Improve wp.render.OpenGLRenderer
Create the extension's symlink to the core library at runtime
Fix some built-ins failing to compile the backward pass when nested inside if/else blocks
Update examples with the new variants of the mesh query built-ins
Fix type members that weren't zero-initialized
Fix missing adjoint function for wp.mesh_query_ray()

10 Jan 21:44

c0d1f1ed

v1.0.0-beta.6

20db9c6

v1.0.0-beta.6 Pre-release

[1.0.0-beta.6] - 2024-01-10

Do not create CPU copy of grad array when calling array.numpy()
Fix assert_np_equal() bug
Support Linux AArch64 platforms, including Jetson/Tegra devices
Add parallel testing runner (invoke with python -m warp.tests, use warp/tests/unittest_serial.py for serial testing)
Fix support for function calls in range()
matmul adjoints now accumulate
Expand available operators (e.g. vector @ matrix, scalar as dividend) and improve support for calling native built-ins
Fix multi-gpu synchronization issue in sparse.py
Add depth rendering to OpenGLRenderer, document warp.render
Make atomic_min, atomic_max differentiable
Fix error reporting using the exact source segment
Add user-friendly mesh query overloads, returning a struct instead of overwriting parameters
Address multiple differentiability issues
Fix backpropagation for returning array element references
Support passing the return value to adjoints
Add point basis space and explicit point-based quadrature for warp.fem
Support overriding the LLVM project source directory path using build_lib.py --build_llvm --llvm_source_path=
Fix the error message for accessing non-existing attributes
Flatten faces array for Mesh constructor in URDF parser
