
Initialize conversion passes from ONNX to Torch-MLIR backend contract #1731

Draft · wants to merge 1 commit into main

Conversation

qedawkins

Includes LLVM lit tests for ONNXAddOp and ONNXConstantOp. This PR is the result of the RFC discussed in #1639. The following overview is collected from there, but see the RFC for a more detailed discussion. The conversion pipeline consists of four passes (a sketch of invoking them together follows the list):

  1. "convert-onnx-to-torch" handles the op by op lowering to torch dialect
  2. "convert-function-types-to-torch-types" converts function arguments to torch types (e.g. torch.vtensor) that wasn't converted in the previous pass
  3. "finalize-torch-type-conversion" finalizes the conversion to torch types and appropriately removes any UnrealizedCastConversionOp's
  4. "erase-onnx-entry-point" right now this entry point op gets removed for compatibility with torch-mlir.

As requested, there is an option to disable the torch-mlir build with ONNX_MLIR_ENABLE_TORCH. One point of awkwardness, though, comes from torch-mlir including llvm-project as a submodule: when torch-mlir is added to onnx-mlir as a submodule, cloning recursively clones the entirety of llvm-project as well (it is not built, since only the needed parts of torch-mlir are compiled). Suggestions on how to manage the submodules in this case, as well as any comments on whether/how to break up this PR, are welcome.
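
For illustration only, disabling that path at configure time would presumably look something like the following; ONNX_MLIR_ENABLE_TORCH is the option named above, while the generator and build-directory layout are placeholder assumptions rather than details from the PR.

# Hypothetical configure invocation; everything except the option name is a placeholder.
cmake -G Ninja .. -DONNX_MLIR_ENABLE_TORCH=OFF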

If there is anything in particular needed to help with reviewing here please let me know.

@jenkins-droid
Collaborator

Can one of the admins verify this patch?

@ghost

ghost commented Sep 23, 2022

Thank you for the great work! You could probably configure a shallow submodule clone.

git config -f .gitmodules submodule.third_party/torch-mlir.shallow true

@qedawkins
Author

> Thank you for the great work! You could probably configure a shallow submodule clone.

Hmmm as far as I can tell a shallow submodule won't prevent the full llvm-project submodule in Torch-MLIR from being cloned when cloning with

git clone --recursive https://github.com/onnx/onnx-mlir.git

Correct me if I'm wrong, but it didn't work when I tried your suggestion. I opened an issue in Torch-MLIR about formal support for the use case here (llvm/torch-mlir#1411).
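
One possible workaround, sketched here as an assumption rather than something from the PR: clone non-recursively and initialize only the torch-mlir submodule itself, since a non-recursive submodule init does not fetch torch-mlir's nested llvm-project. The submodule path third_party/torch-mlir follows the earlier suggestion.

# Hypothetical workaround: skip --recursive so torch-mlir's nested llvm-project is not fetched
# (submodule path assumed; other onnx-mlir submodules omitted for brevity).
git clone https://github.com/onnx/onnx-mlir.git
cd onnx-mlir
git submodule update --init third_party/torch-mlir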

@tungld
Collaborator

tungld commented Sep 27, 2022

@jenkins-droid test this please

@qedawkins
Author

qedawkins commented Sep 27, 2022

It looks like CMake on Windows isn't able to find the Torch-MLIR external dialects (TMTensor). I'll have to set up a Windows environment to iron out the CI problems.

@jenkins-droid
Collaborator

Can one of the admins verify this patch?

@qedawkins
Author

I can convert this PR to a draft until I verify that the Windows build works, but style comments/reviews are still welcome at this point.

qedawkins marked this pull request as draft on September 27, 2022, 01:14
@AlexandreEichenberger
Collaborator

You need to abide by the DCO; see the Contributing section of the main README.md. Every commit needs to be signed off. The check can be overridden on this page (see the details next to DCO), but this should be used only in exceptional cases.
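
For reference, the sign-off is standard git usage rather than anything specific to this repository: the -s flag adds the Signed-off-by trailer from your configured name and email.

# Sign off a new commit (message shown is just the PR title, used as an example).
git commit -s -m "Initialize conversion passes from ONNX to Torch-MLIR backend contract"
# Add a sign-off to the most recent existing commit.
git commit --amend -s --no-edit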

@AlexandreEichenberger
Collaborator

@jenkins-droid test this please

@jenkins-droid
Collaborator

Can one of the admins verify this patch?

@AlexandreEichenberger
Collaborator

@jenkins-droid test this please

@qedawkins
Author

I pushed a new change that switches to a temporary fork of Torch-MLIR, which will be used for iterating on this PR without needing to create a PR in Torch-MLIR for each small change. Once the review is done here, we can switch back to the main branch of Torch-MLIR. With this I was able to build successfully in the Docker image used by the CI with all unit tests passing, but I haven't tested locally on all platforms.

With this, in case there are more CI failures, especially as this PR is reviewed, I am wondering whether there is a way to set things up so that running the CI doesn't require admin approval each time, allowing for faster iteration (perhaps on a temporary branch within ONNX-MLIR). @AlexandreEichenberger, if you or one of the other admins could help or comment here, that would be greatly appreciated! If this isn't possible then no problem, but I figured I would ask.

@jenkins-droid
Collaborator

Can one of the admins verify this patch?

@AlexandreEichenberger
Collaborator

@jenkins-droid test this please

…sts for ONNXAddOp and ONNXConstantOp

Signed-off-by: Quinn Dawkins <[email protected]>
@jenkins-droid
Collaborator

Can one of the admins verify this patch?

@gongsu832
Collaborator

@jenkins-droid test this please

1 similar comment
@AlexandreEichenberger
Collaborator

@jenkins-droid test this please
