Extend device data node binding API to not clone specified input tensors #9054

rpsilva-aws · 2025-04-29T02:23:26Z

In this PR, we extend the _get_tensors_xla_device_data_node binding API so that, for a specified set of unmutated input tensors, it returns those same tensors. The API currently returns a list of tensors that capture the XLATensor values of the graph inputs. However, each of these values is wrapped in a new ATen tensor, which is effectively a clone of the original tensor.

xla/torch_xla/csrc/init_python_bindings.cpp

Line 2919 in c4b45a9

torch_xla::XLATensor::Create(backend_data));

As a result, these tensors are not eligible to be aliased at the step barrier. Instead, we extend the API to let users specify a list of known input tensors, so that if a graph input matches one of them, the same ATen tensor is returned to the user instead of a clone.

>>> t4 = torch.tensor(50).to(device)
>>> t5 = t4 + 10
>>> t6 = t5 * 20
>>> [torch_xla._XLAC._xla_get_tensor_id(x) for x in [t4, t5, t6]]
[8, 10, 12]
>>> results_with_input = torch_xla._XLAC._get_tensors_xla_device_data_node([t4, t5, t6], [t4])
>>> [torch_xla._XLAC._xla_get_tensor_id(x) for x in results_with_input[1]]
[8, 18, 19]
>>> results_without_input = torch_xla._XLAC._get_tensors_xla_device_data_node([t4, t5, t6], [])
>>> [torch_xla._XLAC._xla_get_tensor_id(x) for x in results_without_input[1]]
[20, 21, 22]

The input tensors are kept as optional, ensuring that the API change is backwards compatible.
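
To illustrate the intended behavior without a torch_xla installation, here is a toy model in plain Python. It is only a sketch: FakeTensor, device_data_tensors, and the integer data_handle used for matching are hypothetical names invented for this example, and the real binding may match graph inputs to known inputs differently.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional, Sequence

# Hypothetical global counter standing in for tensor ID allocation.
_next_id = count(1)


@dataclass(eq=False)
class FakeTensor:
    """Hypothetical stand-in for an ATen tensor backed by device data."""
    data_handle: int  # assumption: identifies the backing device data
    tensor_id: int = field(default_factory=lambda: next(_next_id))


def device_data_tensors(
    graph_input_handles: Sequence[int],
    known_inputs: Optional[Sequence[FakeTensor]] = None,
) -> List[FakeTensor]:
    """Return one tensor per graph input.

    If a graph input matches a known input tensor, that same object is
    returned; otherwise a fresh wrapper (a "clone") is created.
    The second argument is optional, so old call sites keep working.
    """
    by_handle = {t.data_handle: t for t in (known_inputs or [])}
    results = []
    for handle in graph_input_handles:
        match = by_handle.get(handle)
        results.append(match if match is not None else FakeTensor(handle))
    return results


t4 = FakeTensor(data_handle=100)
with_input = device_data_tensors([100, 200, 300], [t4])
without_input = device_data_tensors([100, 200, 300])
```

In this model, with_input[0] is the very same object as t4, so its ID is preserved, while every entry of without_input is a new wrapper with a new ID. This mirrors the pattern in the session above, where only the first returned ID matches the original when t4 is passed in.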

cc: @mcuiaws

rpsilva-aws · 2025-04-30T20:53:53Z

@tengyifei @qihqi PTAL, as this will help us move away from explicit aliasing for gradient accumulation (needed for 2.7.1), and will also help with some internal nightly experimental workstreams.

tengyifei · 2025-05-01T07:16:21Z

torch_xla/csrc/init_python_bindings.cpp

-            }
+  m.def(
+      "_get_tensors_xla_device_data_node",
+      [](const std::vector<at::Tensor>& output_tensors,


Is the argument order swapped? Should input_tensors come first?

Extend device data node binding to not override tensor IDs

6edb8c4

rpsilva-aws marked this pull request as ready for review April 29, 2025 17:04

rpsilva-aws changed the title ~~Extend device data node binding to not override tensor IDs~~ Extend device data node binding API to not clone specified input tensors Apr 29, 2025

rpsilva-aws requested review from tengyifei and qihqi April 29, 2025 23:29

tengyifei reviewed May 1, 2025
