nanoGPT FP8 compilation failed #353

Open
yizhuoz004 opened this issue Nov 8, 2024 · 2 comments

Comments

@yizhuoz004
Collaborator

debug.txt

Error:

(t5799)error: result at index 0 has type 'tensor<1x?x768xf8E4M3FN>', but decomposition has type 'tensor<?x?x?xf8E4M3FN>'
@yizhuoz004 yizhuoz004 added the mlir-tensorrt Pull request for the mlir-tensorrt project label Nov 8, 2024
@christopherbate christopherbate self-assigned this Nov 16, 2024
@christopherbate
Collaborator

This one is a simple fix. I'll make sure it gets synced here tomorrow.

@pranavm-nvidia
Collaborator

I have a draft PR to enable float8, but I'm still seeing the same error:

    (t4723)error: result at index 0 has type 'tensor<1x?x768xf8E4M3FN>', but decomposition has type 'tensor<?x?x?xf8E4M3FN>'

    This error occured while trying to compile the following FlatIR expression:
          |
          | t4723: [rank=(3), shape=((-1, -1, -1)), dtype=(float8), loc=(gpu:0)] = ConvertOp(t_inter1106)
          | 


    Note: This originated from the following expression:

    --> /tripy/tripy/frontend/module/linear.py:136 in __call__()
          |
      136 |                 q_x = quantize(x, self.input_scale, self.quant_dtype)
          |                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
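
To make the mismatch concrete: the two types in the error have the same rank and element type and differ only in how much static shape information they carry (`1x?x768` versus the fully dynamic `?x?x?`, which matches the `shape=((-1, -1, -1))` reported for the `ConvertOp` output). The message suggests the check expects the two types to be identical rather than merely compatible. The snippet below is a minimal illustrative sketch of that distinction in plain Python. It is not the mlir-tensorrt verifier, and the helper names `parse_tensor_type`, `types_equal`, and `shapes_compatible` are hypothetical.

```python
import re

# Matches simple ranked tensor types such as 'tensor<1x?x768xf8E4M3FN>'.
# Group 1 holds the dimension list (each dim followed by 'x'),
# group 2 holds the element type. Encodings and nested types are not handled.
_TENSOR_RE = re.compile(r"tensor<((?:(?:\d+|\?)x)*)([^<>]+)>")


def parse_tensor_type(type_str):
    """Return (dims, element_type); a dynamic '?' dimension becomes None."""
    match = _TENSOR_RE.fullmatch(type_str.strip())
    if match is None:
        raise ValueError(f"not a simple ranked tensor type: {type_str!r}")
    dims = [None if d == "?" else int(d) for d in match.group(1).split("x") if d]
    return dims, match.group(2)


def types_equal(a, b):
    """Strict check: rank, every dimension, and element type must be identical."""
    return parse_tensor_type(a) == parse_tensor_type(b)


def shapes_compatible(a, b):
    """Looser check: a dynamic dimension is allowed to match any extent."""
    dims_a, elem_a = parse_tensor_type(a)
    dims_b, elem_b = parse_tensor_type(b)
    if elem_a != elem_b or len(dims_a) != len(dims_b):
        return False
    return all(x is None or y is None or x == y for x, y in zip(dims_a, dims_b))


# The two types taken from the error message above.
result_type = "tensor<1x?x768xf8E4M3FN>"
decomposition_type = "tensor<?x?x?xf8E4M3FN>"

print("identical: ", types_equal(result_type, decomposition_type))
print("compatible:", shapes_compatible(result_type, decomposition_type))
```

Under these assumptions the two types are compatible but not identical, which is consistent with a failure that appears only once partially static shapes (`1` and `768`) meet a fully dynamic result.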
