[RFC] Adding an ONNX to Torch conversion #1639

Open
qedawkins opened this issue Aug 25, 2022 · 44 comments

@qedawkins

qedawkins commented Aug 25, 2022

Hello everyone! In a recent RFC in torch-mlir (llvm/torch-mlir#1255), a conversion from the ONNX dialect to the Torch dialect was proposed. Following feedback there, we are now looking at adding that conversion pass to onnx-mlir.

Proposal

  • Add torch-mlir as a submodule: https://github.com/llvm/torch-mlir
  • Add a new conversion pipeline from ONNX to Torch (e.g. "--convert-onnx-to-torch-pipeline") to manage both op and type conversion.
  • Conversion tests in "test/mlir/conversion/onnx_to_torch/"
  • Ideally, target the torch-mlir backend contract, which comes after shape inference and value-semantics maximization

Proof of Concept

A proof of concept for the conversion can be found at: https://github.com/nod-ai/onnx-mlir/tree/convert-to-torch
Conversions for ONNXAddOp and ONNXConstantOp are included, along with unit tests for each. The conversion pipeline comprises five passes (a before/after IR sketch follows the list):

  1. "convert-onnx-to-torch" handles the op by op lowering to torch dialect
  2. "convert-function-types-to-torch-types" converts function arguments to torch types (e.g. torch.tensor) that wasn't converted in the previous pass
  3. "finalize-torch-type-conversion" finalizes the conversion to torch types and appropriately removes any UnrealizedCastConversionOp's
  4. "erase-onnx-entry-point" right now this entry point op gets removed for compatibility with torch-mlir. If this poses a problem on the onnx-mlir side feedback would be much appreciated.
  5. "refine-torch-func-value-semantics" changes the types of tensors for public functions without uses to have value semantics. This is a compatibility requirement for torch-mlir but also a reflection of this PoC conversion targeting non-value semantic tensors. If tensors in onnx-mlir implicitly have value semantics then this pass can be removed and the type conversions handling in passes 1, 2, and 3 can be changed to match the type conversion in torch-mlir more closely.

Looking forward to any and all feedback!

cc @powderluv @silvasean @sstamenova @ashay

@AlexandreEichenberger
Collaborator

We are generally receptive, and we already have efforts from ONNX to MHLO and ONNX to TOSA. Have you considered piggybacking on these, namely going from MHLO or TOSA to torch-mlir?
Adding new conversions is fine, pending two conditions. First, that there is support behind it to carry it to a fully (or close to fully) supported pass; our goal here is to avoid semi-complete solutions that add code burden to the project without providing a useful service to our customers. Second, that we resolve our dependencies on LLVM in a somewhat concerted fashion. On that latter point, right now we attempt to synchronize our LLVM upgrades with MHLO on a weekly basis; you can check our pinned issue #1624.

@silvasean

We are generally receptive, and we already have efforts from ONNX to MHLO and ONNX to TOSA. Have you considered piggybacking on these, namely going from MHLO or TOSA to torch-mlir?

We have Torch -> {Linalg,MHLO,TOSA} conversions on the Torch-MLIR side, so ideally we can centralize all efforts on the ONNX -> Torch conversion and remove the need for ONNX -> {MHLO,TOSA} direct lowerings. Conversely, if significant effort is going to be invested in ONNX -> {MHLO,TOSA} then ONNX -> Torch makes less sense.

On that latter point, right now we attempt to synchronize our LLVM upgrades with MHLO on a weekly basis; you can check our pinned issue #1624.

We are all basing off of the green commit in llvm/torch-mlir#1178 so this should not be an issue.

5. "refine-torch-func-value-semantics" changes the types of tensors for public functions without uses to have value semantics. This is a compatibility requirement for torch-mlir but also a reflection of this PoC conversion targeting non-value semantic tensors. If tensors in onnx-mlir implicitly have value semantics then this pass can be removed and the type conversions handling in passes 1, 2, and 3 can be changed to match the type conversion in torch-mlir more closely.

This shouldn't be needed. Builtin tensors, which onnx-mlir appears to use, already have value semantics, so you should just directly convert builtin tensors to !torch.vtensor<...> appropriately.
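For example, the direct mapping would look like this (a sketch, including a dynamic dimension):

```mlir
// Before: builtin tensors, which already have value semantics.
func.func private @forward(tensor<?x3xf32>) -> tensor<?x3xf32>

// After: the same shapes and dtypes, as value-semantic torch tensors.
func.func private @forward(!torch.vtensor<[?,3],f32>) -> !torch.vtensor<[?,3],f32>
```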

@powderluv

Adding new conversions is fine, pending two conditions. First, that there is support behind it to carry it to a fully (or close to fully) supported pass; our goal here is to avoid semi-complete solutions that add code burden to the project without providing a useful service to our customers.

Sounds good. We have a few customers who would like to see this unified to avoid having two separate paths. The nod.ai team can contribute to support the required passes (as we do in torch-mlir).

If this sounds reasonable we can get @qedawkins to refine the RFC with specific commits and start putting them up for review.

@AlexandreEichenberger
Collaborator

Right now, the MHLO effort is making rapid progress, which would integrate ONNX with the TF ecosystem. If there is sufficient support on your end, I can see the benefit of having an ONNX -> Torch ecosystem as well. Can you comment on the technical merit of having a direct ONNX->Torch-MLIR vs ONNX->MHLO for your particular use?

@silvasean

Can you comment on the technical merit of having a direct ONNX->Torch-MLIR vs ONNX->MHLO for your particular use?

So for Torch-MLIR it would be a no-op technically. The idea is that since Torch-MLIR already has these lowerings written, then onnx-mlir could just use them. A few technical advantages of ONNX -> Torch -> MHLO for the ecosystem would be:

  1. as MHLO evolves to StableHLO for frontends to target, onnx-mlir would be shielded from all those changes (additionally, as an ecosystem, we would only need to update one place)
  2. onnx-mlir would benefit from all the e2e correctness testing that Torch-MLIR does for the lowerings
  3. I see that some of the contributors to ONNX -> MHLO lowerings are from the same teams as Torch -> MHLO lowerings @Connor-XY @yaochengji (cc @fortianyou and @ZihengJiang for visibility from the TorchToMhlo side). This may reduce the work for those teams.
  4. onnx-mlir would get TOSA and direct Linalg lowerings "for free" (see the sketch below).

A few potential downsides:

  1. onnx-mlir development would be more coupled to Torch-MLIR
  2. onnx-mlir developers would have to be more aware of PyTorch op semantics to write ONNX -> Torch lowerings
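To illustrate point 4 above: once ONNX -> Torch produces IR in the Torch "backend contract" form, the existing Torch-MLIR backend pipelines consume it as-is. A sketch (pipeline names are the ones registered in Torch-MLIR today and may evolve):

```mlir
// Torch "backend contract" IR, produced once by ONNX -> Torch:
func.func @forward(%arg0: !torch.vtensor<[2,3],f32>) -> !torch.vtensor<[2,3],f32> {
  %0 = torch.aten.tanh %arg0 : !torch.vtensor<[2,3],f32> -> !torch.vtensor<[2,3],f32>
  return %0 : !torch.vtensor<[2,3],f32>
}
// From here, the existing Torch-MLIR pipelines take over, e.g.:
//   torch-backend-to-linalg-on-tensors-backend-pipeline
//   torch-backend-to-tosa-backend-pipeline
//   torch-backend-to-mhlo-backend-pipeline
```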

@AlexandreEichenberger
Collaborator

Thanks for the feedback. We are definitely encouraging the MHLO onnx-mlir developers to migrate to StableHLO, and I believe they agreed to do so once it becomes stable... I think it is almost there.

I also agree that with a stronger connection to MLIR via MHLO and possibly Torch, onnx to tosa might become redundant; since I have not seen much progress on it lately (apologies if mischaracterized), that might be the better way to go.

Given the small teams working on these projects, if there is a way to have common infrastructure (reusing when possible) between the MHLO and Torch efforts, that would be great. Exploring structures to enable this kind of reuse would strengthen both efforts and reduce redundancy, as AI models (ONNX, Torch, TF) undergo a lot of incremental changes that all need to be maintained and upgraded over time. So more reuse means less redundant updating...

Along similar lines, we have a unified way to do shape inference to detect compile-time shapes, and we reuse the same code (in a different context) to generate the runtime code for runtime shapes. MHLO has started introducing custom code for dynamic shapes, and we are encouraging them to reuse the common infrastructure between compile-time and runtime shape inference.

@silvasean

Along similar lines, we have a unified way to do shape inference to detect compile-time shapes, and we reuse the same code (in a different context) to generate the runtime code for runtime shapes. MHLO has started introducing custom code for dynamic shapes, and we are encouraging them to reuse the common infrastructure between compile-time and runtime shape inference.

We have the exact same thing: https://github.com/llvm/torch-mlir/blob/main/docs/shape_lib.md 🤣

@AlexandreEichenberger
Collaborator

Good to know... it looks like we went through the same pain points: having two similar pieces of code for shape inference and later code gen, then unifying both for lower maintenance.

We also have not generated guards yet, though it is also in the back of our minds.

@tungld
Collaborator

tungld commented Sep 1, 2022

As I understand it, the proposal here is to integrate ONNX->Torch into onnx-mlir only, without utilizing onnx-mlir further (e.g. down to the LLVM level). Also, the erase-onnx-entry-point pass indicates that the onnx-mlir driver is not needed. In that case, onnx-mlir could be a third_party dependency somewhere for torch-mlir to import ONNX models into the torch-mlir dialect. In that sense, shape inference and other passes in onnx-mlir are not needed, and there is no need to unify them between torch-mlir and onnx-mlir.

@silvasean

The idea is that ONNX->Torch could save effort for the onnx-mlir community by providing ONNX->Torch->{Linalg,MHLO,TOSA}, which gives Linalg, MHLO, and TOSA "for free" (and eventually TCP, StableHLO, etc.). As Torch-MLIR, we don't really have any interest per se in importing ONNX, so we would not take a dependency on onnx-mlir. We are simply offering to collaborate and provide a service if you folks in the onnx-mlir community don't want to maintain the Linalg, MHLO, and TOSA (+ TCP, StableHLO, ...) lowerings yourselves. I'll let the onnx-mlir community decide if that is appealing; we are willing to collaborate if you want that.

@tungld
Collaborator

tungld commented Sep 1, 2022

The idea is that ONNX->Torch could save effort for the onnx-mlir community by providing ONNX->Torch->{Linalg,MHLO,TOSA}, which gives Linalg, MHLO, and TOSA "for free" (and eventually TCP, StableHLO, etc.)

Just want to see the big picture. Could you elaborate a bit on how end users would use this pipeline? For example, how would they prepare inputs for running the compiled model and get outputs? Is your vision to use the onnx-mlir driver for that purpose, to expect a new driver, or does the driver already exist?

We are simply offering to collaborate and provide a service if you folks in the onnx-mlir community don't want to maintain the Linalg, MHLO, and TOSA (+ TCP, StableHLO, ...) lowerings yourselves

Yes, sure, it makes sense.

@silvasean

Just want to see the big picture. Could you elaborate a bit on how end users would use this pipeline? For example, how would they prepare inputs for running the compiled model and get outputs? Is your vision to use the onnx-mlir driver for that purpose, to expect a new driver, or does the driver already exist?

I'm imagining that onnx-mlir would just link to Torch-MLIR and run a few of Torch-MLIR's passes as an internal implementation detail to get from ONNX -> {MHLO,TOSA,Linalg}, depending on what the user wants. It shouldn't touch things like preparing inputs/drivers/etc.
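For example, a lowering to TOSA might compose the pipelines roughly like this (a hypothetical invocation; whether this runs under onnx-mlir-opt or behind a dedicated driver flag is exactly the kind of detail the PRs would settle):

```
onnx-mlir-opt model.onnx.mlir \
  --convert-onnx-to-torch-pipeline \
  --torch-backend-to-tosa-backend-pipeline
```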

@powderluv

We (nod.ai) have a couple of customers that require this, and we plan to support them somehow. That is why we added the initial RFC with an implementation in torch-mlir (which included onnx / onnx-mlir as deps), and then this RFC + implementation with torch-mlir as a dep in onnx-mlir.

Basically, the customers want to avoid two paths to MHLO (which they care about) and two kinds/levels of op and shape support. The nod.ai team contributes a lot to torch-mlir to maintain the torch-dialect -> {MHLO, TOSA, Linalg} lowerings, and if we can leverage an ONNX entry point into the torch dialect and then reuse all the other lowerings, we get one set of backend lowerings with two "frontends": Torch and ONNX.

The end product would look like a compiler/binary that can consume ONNX modules as an ONNX Runtime EP, or consume Torch in some other serving solution like Triton Inference Server. The lowering to the torch dialect may be different, but after that it is all the same, allowing backend accelerators to easily consume from both frameworks.

@tungld
Collaborator

tungld commented Sep 1, 2022

@Connor-XY @yaochengji Could you please chime in? We would like to hear your comments on ONNX->Torch->MHLO vs ONNX->MHLO. Thanks!

@Connor-XY
Contributor

I think it would be great to have the conversion from ONNX to Torch, but I also believe that it is better to have the conversion from ONNX to MHLO directly. First, we can do the conversion directly, so it isn't necessary to use Torch as a bridge. Second, if we adopt ONNX->Torch->MHLO and there is a version upgrade, we need to modify both ONNX->Torch and Torch->MHLO. The conversion would also rely on Torch (onnx-mlir development would depend on Torch more).

@yaochengji
Member

To add to what @Connor-XY said: thanks to the help of @tungld, @AlexandreEichenberger, and all the other community members, we are currently applying onnx-mlir to our business models at ByteDance.

We'll definitely try the ONNX->Torch route once it is mature. Before that, we'll keep maintaining and developing the ONNX->MHLO route.

@Svoch

Svoch commented Sep 6, 2022

The idea of a robust conversion between the ONNX and Torch front ends seems interesting. Excited to see how this RFC evolves. It seems reasonable to avoid duplicating the effort to support intermediate dialects and features like shape inference.

@AlexandreEichenberger
Collaborator

AlexandreEichenberger commented Sep 7, 2022

Do you think you could present the benefits of this direct approach at our Tuesday evening meeting, either next week or in two weeks?
And if you could give us a good link to read about the MLIR/Torch effort, that would be great.

@qedawkins
Author

Do you think you could present the benefits of this direct approach at our Tuesday evening meeting, either next week or in two weeks? And if you could give us a good link to read about the MLIR/Torch effort, that would be great.

We can aim for a presentation + discussion at next week's meeting if that works for you. Torch-MLIR has documentation you can reference for the time being here: https://github.com/llvm/torch-mlir/tree/main/docs

The architecture document (https://github.com/llvm/torch-mlir/blob/main/docs/architecture.md) might be a good place to start.

@qedawkins
Author

@AlexandreEichenberger What is the status on this? Is tomorrow's meeting still a good time to discuss this?

@AlexandreEichenberger
Collaborator

Absolutely, you can have a 15-minute slot at our 8-9pm EST meeting.

https://github.com/onnx/onnx-mlir/wiki/Informal-meeting-agenda-and-notes

@AlexandreEichenberger
Collaborator

It would be great if you could explain the value-add of the ONNX->Torch lowering within the picture of the torch-mlir effort:
https://github.com/llvm/torch-mlir/blob/main/Torch-MLIR.png

@silvasean

There is no value-add for Torch-MLIR (it is actually a minor cost for us to support this). It is purely ONNX-MLIR reusing work already done for Torch-MLIR. It is a win for the ecosystem to maintain fewer lowerings, in theory.

@AlexandreEichenberger
Collaborator

Got it. I found your https://github.com/llvm/torch-mlir/blob/main/docs/architecture.md instructive. If you could give us an idea of where the ONNX dialect would intersect with the dialects in that doc, that would let us focus our attention a bit more on the relevant dialect, tx

@silvasean

silvasean commented Sep 13, 2022

ONNX-MLIR would produce the torch dialect in the "backend contract" form.

@qedawkins
Author

qedawkins commented Sep 13, 2022

This diagram shows a preview of how the conversion pass slots in on the Torch-MLIR side for reference before the meeting.
[diagram: how the ONNX -> Torch conversion pass slots into the Torch-MLIR pipeline]

The ONNX-MLIR side in this diagram leaves out some components that can be filled in later.

@AlexandreEichenberger
Collaborator

Is there a similar effort to have a "stable torch dialect", like StableHLO?

@silvasean

No, our dialect is only as stable as the torch op set (which is fairly stable, but not guaranteed)

@AlexandreEichenberger
Collaborator

@qedawkins would you be able to add a PDF of your presentation yesterday, for those who were not able to be present? Tx

@qedawkins
Author

Sure!
Here it is: ONNX To Torch Conversion.pdf

@AlexandreEichenberger
Collaborator

I have a summary of our Tuesday evening discussion in the wiki pages from this project here.

As a follow-on to this discussion, we would like to have feedback from the teams developing the onnx->tosa and onnx->mhlo converters, as this RFC may also be able to generate TOSA or MHLO output via the torch-mlir backends currently available.

I am listing below developers that I see associated with these two projects; if you know of any other, please add references here and/or contact them directly.

Thanks

@Connor-XY @yaochengji @chenchongsong @BraunPhilipp @sjarus Please provide feedback as you see fit within a week; it would be great to have it before the next meeting, where we will take a position on this RFC. You are obviously welcome to participate in the meeting too, which is now open to the whole onnx-mlir community. See the wiki page for info on how to join.

@sjarus
Contributor

sjarus commented Sep 14, 2022

Adding @eric-k256 @stephenneuendorffer and @ljfitz for input on ONNX-TOSA in the context of the likely presence of ONNX-Torch and the existing Torch-TOSA.

@yaochengji
Member

Please provide feedback as you see fit within a week

The ONNX->Torch route should be useful, especially as we are considering switching to StableHLO. The biggest concern we have is that the ONNX -> Torch -> MHLO route is more complex than ONNX -> MHLO, which introduces more potential issues.

And the ONNX->MHLO route is already applied to the business models at ByteDance. The model coverage is quite high (about 50%), considering we've only worked on this for two months. Therefore, we will continue to work on the ONNX->MHLO route to increase the model coverage at ByteDance. As for the ONNX -> Torch -> MHLO route, if we later find that it is mature enough, we will switch to it.

@AlexandreEichenberger
Collaborator

Thanks for your feedback @yaochengji, this is very useful. Our primary concern is to have mature implementations of dialects, so it is good to know that your team will continue working on MHLO. Hopefully you can also resolve the handling of dynamic shapes in a way that reuses the onnx-mlir infra, which would unblock some of the MHLO PRs that have been blocked for longer than I would like.

@yaochengji
Member

Hopefully you can also resolve the handling of dynamic shapes in a way that reuses the onnx-mlir infra, which would unblock some of the MHLO PRs that have been blocked for longer than I would like.

Yes, we're actively working on this. I've already discussed it with @Connor-XY offline and figured out how to use this correctly. I expect @Connor-XY can resolve this soon.

@AlexandreEichenberger
Collaborator

AlexandreEichenberger commented Sep 15, 2022

I mentioned that we have a way to annotate code/makefile/xxx and then gather annotations to generate a report of coverage. We do this for our support of lowering to CPU and to our NNPA-based accelerator. This same mechanism can be used to generate a coverage report for a converter from ONNX to TOSA/MHLO/Torch-MLIR.

PR #1475 describes the format and will let you know which file/makefile to edit. We added a link to the reports on the main README.md page. But check the current code, as we may have modified the format a bit.

@yaochengji
Member

I mentioned that we have a way to annotate code/makefile/xxx and then gather annotations to generate a report of coverage.

Thanks. We'll take a look at this.

@AlexandreEichenberger
Collaborator

Any additional feedback? So far, the feedback has been mostly positive.

One thing we learned from the MHLO effort is that it is nice to have a switch that disables building the MHLO part. The CIs all work with everything on, but certain users may elect not to build parts that are not needed. This would also be a good feature to have for the torch-mlir converter.

@silvasean

Yes, Torch-MLIR has a build flag to disable building the MHLO path.

@AlexandreEichenberger
Collaborator

Yes, Torch-MLIR has a build flag to disable building the MHLO path.

I was asking if you would also support a flag disabling the building of Torch-MLIR within onnx-mlir.

@silvasean

Ah yes, that's a good idea.

@AlexandreEichenberger
Collaborator

AlexandreEichenberger commented Sep 20, 2022

As previously stated, I believe that unless we hear any additional concerns by today, we should go ahead with your RFC.

@AlexandreEichenberger
Collaborator

I believe you should go ahead and start PRs to implement your converter. Good luck.

@qedawkins
Author

Thank you all for the feedback and helpful discussion. I've opened a PR #1731 where we can begin ironing out the details and get an initial version of this merged.
