Improve behavior of evaluation during compile #409

Open
pranavm-nvidia opened this issue Nov 26, 2024 · 2 comments

Comments

@pranavm-nvidia
Collaborator

When a tensor is evaluated during compile, we currently raise an error or print a warning. However, we could make Tripy functionally correct even in the case of evaluation by simply not updating the frontend tensor's op to Storage. This would preserve the computation graph and make compilation work correctly.

This might be as simple as adding one condition to Tensor.eval():

if not self.trace_tensor.is_compile_tracer:
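    # Only replace this tensor's op with Storage when it is not a compile tracer;
    # skipping the replacement during compile tracing keeps the graph intact.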
    Storage.build_internal([], [self.trace_tensor], data)
    ...

We will still need the warnings, though, since there may be cases where the evaluated result is erroneously used later in the graph, e.g.:

batch = int(x.shape[0]) # Eval happens here
tp.ones((batch, ...)) # Dynamic shapes broken here

We could also suppress warnings in some cases, e.g. if the tensor is only being printed. A simple way to achieve that would be to add a suppress_warnings parameter to eval and set it to True when calling it from __repr__. Since the flag would need to be passed through tolist, we will likely want a private _tolist_helper, because the public tolist method should not expose this option.
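
A minimal standalone sketch of that plumbing (the class below is a stripped-down stand-in for Tripy's Tensor; suppress_warnings and _tolist_helper are the names proposed above, not existing API):

import warnings

class Tensor:
    def __init__(self, data, is_compile_tracer=False):
        self.data = data
        self.is_compile_tracer = is_compile_tracer

    def eval(self, suppress_warnings=False):
        if self.is_compile_tracer and not suppress_warnings:
            warnings.warn("Tensor was evaluated while compiling; dynamic shapes may break.")
        return self.data

    def _tolist_helper(self, suppress_warnings=False):
        # Private helper so the public tolist() does not expose the flag.
        return list(self.eval(suppress_warnings=suppress_warnings))

    def tolist(self):
        return self._tolist_helper()

    def __repr__(self):
        # Printing is safe to leave silent: the printed value is not fed back into the graph.
        return f"Tensor({self._tolist_helper(suppress_warnings=True)})"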

We can, however, drop the errors that are thrown in compile, which means we can also drop the eval_stack_info field of TraceTensor.

Finally, to test it, we should verify the following are true when evaluating while compiling (a rough test sketch follows the list):

  1. We never raise an error
  2. The frontend tensor op is never updated in-place (i.e. it should not be turned into a Storage tensor)
  3. We do not emit warnings when the evaluation is triggered by __repr__ (i.e. we can safely assume the output is unused later in the graph)
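
A rough pytest sketch of these checks, assuming Tripy's tp.compile / tp.InputInfo compile API (exact signatures may differ). Condition 2 is only checked indirectly here, by confirming the compiled executable still computes the traced graph rather than a baked-in constant:

import warnings
import tripy as tp

def func(x):
    print(x)  # Evaluation triggered by __repr__ while tracing
    return x + tp.ones((2,), dtype=tp.float32)

def test_eval_during_compile():
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # 1. Compilation should not raise even though `func` evaluates a tensor.
        compiled = tp.compile(func, args=[tp.InputInfo((2,), dtype=tp.float32)])
    # 3. The __repr__-triggered evaluation should not emit a warning.
    assert not any("eval" in str(w.message).lower() for w in caught)
    # 2. The graph was not folded into a Storage op, so the compiled function
    #    still computes x + 1 instead of returning a constant captured at trace time.
    out = compiled(tp.ones((2,), dtype=tp.float32))
    assert out.tolist() == [2.0, 2.0]
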
@pranavm-nvidia
Collaborator Author

We may actually still want warnings in all cases, since the extra evaluations will trigger compilation, which could make tracing extremely slow.
A way to mitigate this could be to cache the evaluated tensors such that only other evaluated tensors consume their values, while the compiler still traces them like non-evaluated ops.

For example, if we have a graph like:

A -> B -> C -> D

and we print B and C, we should only need to compile the A->B part once during evaluation and compile C separately.
However, the final compiled executable should still contain the full A->B->C->D graph, not just the un-evaluated ->D remainder.
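
For instance, a hypothetical traced function corresponding to that graph might look like this (the letters map onto the nodes above; which sub-graphs are compiled during tracing is exactly the caching behavior being proposed):

def func(x):
    a = x * 2.0  # A
    b = a + 1.0  # B
    print(b)     # Evaluating B should compile and run only A->B ...
    c = b * 3.0  # C
    print(c)     # ... and C separately, reusing the cached value of B.
    d = c - 1.0  # D
    return d     # The final executable should still contain all of A->B->C->D.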

@slyubomirsky
Collaborator

Per discussion, this is probably not a priority, since we could not find cases where evaluating during compilation is useful, especially since it is already permitted for cases like printing (when the result is not used later). See #443.
