Enable end to end non-DPS testing #289

jhalakpatel · 2024-10-18T02:29:57Z

Implement Python binding changes so that the execute function can return
multiple results. Update tests to use the non-DPS calling convention.
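
For readers unfamiliar with the term, the difference between destination-passing style (DPS) and non-DPS calls can be sketched in plain C. This is only an illustration of the general idea, not the project's runtime API: `demoAddDps` and `demoAddNonDps` are hypothetical names, and the element-wise add is a stand-in for whatever the compiled function computes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* DPS: the caller allocates `out` and passes it in; the callee only writes
   into the destination and returns nothing. */
void demoAddDps(const float *lhs, const float *rhs, float *out, size_t n) {
  assert(lhs != NULL && rhs != NULL && out != NULL);
  for (size_t i = 0; i < n; ++i)
    out[i] = lhs[i] + rhs[i];
}

/* Non-DPS: the callee allocates the result and returns it. The caller takes
   ownership and must free it, so result lifetime has to be tracked. */
float *demoAddNonDps(const float *lhs, const float *rhs, size_t n) {
  assert(lhs != NULL && rhs != NULL);
  float *out = (float *)malloc(n * sizeof *out);
  if (out == NULL)
    return NULL;
  for (size_t i = 0; i < n; ++i)
    out[i] = lhs[i] + rhs[i];
  return out;
}
```

In the non-DPS form the result buffer outlives the call that created it, which is why returned memrefs need explicit lifetime handling.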

Also, enable end-to-end lowering by converting the closed alloc group op to the TensorRT dialect.

Miscellaneous fixes:

Add missing handling of CallAllocOp in EliminateShapeOps pass.
Skip function arguments that are not ranked tensor types while collecting host tensor arguments.
Temporarily add a pass to remove clone operation in MemRefToExecutor dialect conversion.
Relax memref creation for empty shape tensors.
Fix the lifetime of memrefs returned from Lua function results. This required the session allocator to track returned memrefs.
Return error status instead of silently erroring out.

mlir-tensorrt/executor/include/mlir-executor-c/Runtime/Runtime.h

mlir-tensorrt/compiler/lib/Conversion/TensorRTToTensorRTRuntime/TensorRTToTensorRTRuntime.cpp

mlir-tensorrt/executor/include/mlir-executor/Runtime/Backend/Lua/LuaRuntime.h

mlir-tensorrt/python/bindings/Runtime/RuntimePyBind.cpp

mlir-tensorrt/executor/lib/Conversion/MemRefToExecutor.cpp

mlir-tensorrt/executor/test/lib/BufferizationTestPass.cpp

mlir-tensorrt/tensorrt/lib/Target/TensorRTEncodingOpInterface/NetworkEncoder.cpp

mlir-tensorrt/compiler/lib/Dialect/Plan/Transforms/EliminateShapeOps.cpp

mlir-tensorrt/compiler/lib/Conversion/TensorRTRuntimeToExecutor/TensorRTRuntimeToExecutor.cpp

mlir-tensorrt/test/Conversion/TensorRTRuntimeToExecutor/tensorrt-runtime-to-executor.mlir

mlir-tensorrt/executor/lib/Runtime/API/API.cpp

mlir-tensorrt/executor/lib/CAPI/Runtime/Runtime.cpp

mlir-tensorrt/executor/include/mlir-executor-c/Runtime/Runtime.h

christopherbate · 2024-11-07T06:29:42Z

mlir-tensorrt/executor/include/mlir-executor-c/Runtime/Runtime.h

@@ -53,7 +53,7 @@ extern "C" {
 /// caller must be sure to delete errors via mtrtStatusDestroy.
 //===----------------------------------------------------------------------===//

-typedef struct MTRT_RuntimeClient MTRT_Runtimeclient;
+struct MTRT_RuntimeClient; // Forward declaration


This is a C header, not C++.

Without the typedef, we would fail to compile when including this header in a C library (e.g. we would need to rename all the types below to struct MTRT_RuntimeClient instead of just MTRT_RuntimeClient). The typedef is a requirement.

Any idea why we use MTRT_Runtimeclient instead of MTRT_RuntimeClient? Is it just a typo?
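
The typedef point can be reproduced with a small C sketch. The names here (`Demo_Client` and the `demoClient*` functions) are hypothetical and are not part of the MTRT C API; the sketch only shows why a bare forward declaration is not enough for C callers.

```c
#include <assert.h>
#include <stdlib.h>

/* In C, struct tags live in their own namespace. After this forward
   declaration alone, the type can only be spelled `struct Demo_Client`. */
struct Demo_Client;

/* The typedef introduces an ordinary type name, so C code can also write
   `Demo_Client`. C++ does this implicitly, C does not. */
typedef struct Demo_Client Demo_Client;

/* These prototypes use the short spelling; in C they compile only because
   of the typedef above. */
Demo_Client *demoClientCreate(int deviceCount);
int demoClientGetDeviceCount(const Demo_Client *client);
void demoClientDestroy(Demo_Client *client);

/* Definition, normally hidden in the implementation file. */
struct Demo_Client {
  int deviceCount;
};

Demo_Client *demoClientCreate(int deviceCount) {
  Demo_Client *client = (Demo_Client *)malloc(sizeof *client);
  if (client != NULL)
    client->deviceCount = deviceCount;
  return client;
}

int demoClientGetDeviceCount(const Demo_Client *client) {
  assert(client != NULL);
  return client->deviceCount;
}

void demoClientDestroy(Demo_Client *client) { free(client); }
```

Deleting the `typedef` line makes a C compiler reject every use of the bare `Demo_Client` name, while a C++ compiler would still accept the file.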

mlir-tensorrt/compiler/lib/Dialect/Plan/Transforms/OutlineClusters.cpp

mlir-tensorrt/executor/include/mlir-executor-c/Runtime/Runtime.h

christopherbate · 2024-11-07T06:42:23Z

mlir-tensorrt/executor/lib/Runtime/Backend/Lua/LuaRuntime.cpp


-    results.push_back(std::move(*memref));
+    (*client)->getAllocTracker().incrementExternalCount((*memRef)->getMemory());


I think this violates an invariant that we should be enforcing: the external reference count should be 0 for any "externally managed" pointer. Reference counts are only required for the object that owns the resource (which in this case is the session).

I've been meaning to erase the distinction between client/session trackers and just have RuntimeSession use the Client's tracker. But until we do that, what we need here actually is to "release" the pointer from the session ownership and have the client assume ownership.
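
A minimal sketch of the "release from the session, let the client assume ownership" idea follows. It assumes a toy fixed-size tracker; `DemoTracker`, `DemoEntry`, and `demoReleaseToClient` are hypothetical names that do not correspond to the real allocation tracker classes, and leaving the session entry in place as externally managed is one possible design, not necessarily the one the runtime uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAX_TRACKED 16

typedef struct DemoEntry {
  uintptr_t ptr;
  bool used;
  bool ownedInternally; /* true: this tracker is responsible for freeing */
  int externalRefCount; /* only meaningful on the owning tracker */
} DemoEntry;

typedef struct DemoTracker {
  DemoEntry entries[DEMO_MAX_TRACKED];
} DemoTracker;

void demoTrackerInit(DemoTracker *t) {
  assert(t != NULL);
  memset(t, 0, sizeof *t);
}

DemoEntry *demoFind(DemoTracker *t, uintptr_t ptr) {
  assert(t != NULL);
  for (size_t i = 0; i < DEMO_MAX_TRACKED; ++i)
    if (t->entries[i].used && t->entries[i].ptr == ptr)
      return &t->entries[i];
  return NULL;
}

bool demoTrack(DemoTracker *t, uintptr_t ptr, bool ownedInternally) {
  if (demoFind(t, ptr) != NULL)
    return false; /* already tracked */
  for (size_t i = 0; i < DEMO_MAX_TRACKED; ++i) {
    DemoEntry *e = &t->entries[i];
    if (!e->used) {
      e->ptr = ptr;
      e->used = true;
      e->ownedInternally = ownedInternally;
      e->externalRefCount = 0;
      return true;
    }
  }
  return false; /* tracker full */
}

/* Hand ownership of `ptr` from the session tracker to the client tracker.
   Afterwards the session sees the pointer as externally managed with a
   reference count of zero, which preserves the invariant. */
bool demoReleaseToClient(DemoTracker *session, DemoTracker *client,
                         uintptr_t ptr) {
  DemoEntry *e = demoFind(session, ptr);
  if (e == NULL || !e->ownedInternally)
    return false;
  if (!demoTrack(client, ptr, true))
    return false;
  e->ownedInternally = false;
  e->externalRefCount = 0;
  return true;
}
```

With this shape, only the client tracker ever carries a non-zero external reference count for a returned pointer, and a second release of the same pointer from the session is rejected.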

mlir-tensorrt/test/python/IntegrationTests/test_non_dps_cconv.py

christopherbate · 2024-11-07T06:45:50Z

Overall, looks good except for minor comments. Great to see it working end-to-end, and of course much kudos to you for getting this working!

Implement Python binding changes so that the execute function can return multiple results. Update tests to use the non-DPS calling convention.

Also, enable end-to-end lowering by converting the closed alloc group op to the TensorRT dialect.

Miscellaneous fixes:

1. Add missing handling of `CallAllocOp` in EliminateShapeOps pass.
2. Skip function arguments that are not ranked tensor types while collecting host tensor arguments.
3. Temporarily add a pass to remove clone operation in MemRefToExecutor dialect conversion.
4. Relax memref creation for empty shape tensors.
5. Fix the lifetime of memrefs returned from Lua function results. This required the session allocator to track returned memrefs.

Also, address:

Fix incorrect indexing into output memref results
Return error status instead of silently erroring out during TensorRT weight conversion
Address review comments

jhalakpatel · 2024-11-09T19:23:12Z

mlir-tensorrt/executor/lib/Runtime/Backend/Lua/LuaRuntime.cpp

@@ -650,9 +650,6 @@ parseResults(const sol::protected_function_result &pfr,
    if (!memref.isOk())
      return memref.getStatus();

-    // Increment external reference count since we are returning a memref
-    allocator.incrementExternalCount(info.ptr);


We do not need this. The reference count should only be incremented if we explicitly create a DLPack tensor on a memref.
Here the memref is internally owned and tracked.

jhalakpatel · 2024-12-19T07:50:40Z

@christopherbate were you able to merge changes from this PR?

jhalakpatel commented Oct 18, 2024

mlir-tensorrt/executor/include/mlir-executor-c/Runtime/Runtime.h

jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch from 406ef76 to ad0cdfb on November 4, 2024 21:06

jhalakpatel requested review from christopherbate and shelkesagar29 as code owners November 4, 2024 21:06

jhalakpatel commented Nov 4, 2024

mlir-tensorrt/compiler/lib/Conversion/TensorRTToTensorRTRuntime/TensorRTToTensorRTRuntime.cpp (outdated)

jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch from ad0cdfb to fcb0964 on November 4, 2024 21:47

jhalakpatel changed the title from "[Python/Bindings] Allow execute_function API to return values" to "Enable end to end non-DPS testing" on Nov 4, 2024

jhalakpatel commented Nov 4, 2024

mlir-tensorrt/executor/include/mlir-executor/Runtime/Backend/Lua/LuaRuntime.h

jhalakpatel commented Nov 4, 2024

mlir-tensorrt/python/bindings/Runtime/RuntimePyBind.cpp

jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch 2 times, most recently from 9fb24dd to 43d4c93 on November 5, 2024 01:05

jhalakpatel commented Nov 5, 2024

mlir-tensorrt/python/bindings/Runtime/RuntimePyBind.cpp (outdated)

jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch 4 times, most recently from 7254cbe to ac28e6c on November 6, 2024 06:12

christopherbate reviewed Nov 6, 2024

mlir-tensorrt/executor/lib/Conversion/MemRefToExecutor.cpp (outdated)

christopherbate reviewed Nov 6, 2024

mlir-tensorrt/executor/test/lib/BufferizationTestPass.cpp (outdated)

christopherbate reviewed Nov 6, 2024

mlir-tensorrt/tensorrt/lib/Target/TensorRTEncodingOpInterface/NetworkEncoder.cpp

christopherbate reviewed Nov 6, 2024

mlir-tensorrt/compiler/lib/Dialect/Plan/Transforms/EliminateShapeOps.cpp (outdated)

jhalakpatel force-pushed the jhalakp-python-exec-non-dps branch 4 times, most recently from e75953d to 070359d on November 6, 2024 22:04