Add ability to associate phase & graph with ops & time #439

Open
jxtps opened this issue Jun 10, 2022 · 0 comments
Labels
enhancement New feature or request

Comments


jxtps commented Jun 10, 2022

The TensorBoard profiling capabilities are great, but it can be hard to tell precisely where in your computational graph the slowness comes from. To help with that, it would be fantastic if the profiling tool could include some additional categorization of the performance data:

Associate ops & the time spent in them with each phase (forward, loss, backward, optimizer) and each layer (Conv2d, BatchNorm, ReLU, etc.) in the graph hierarchy.

Ideally the block diagram of the graph would be captured hierarchically, so that you can view ops & timings for e.g. your ResNetBlock1, then drill down recursively into whatever constituent Conv/BN/ReLU/other blocks it contains.
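
To make the requested breakdown concrete, here is a minimal pure-Python sketch of the kind of data the profiler could collect: time accumulated per nested scope, with the phase as the top level and blocks/layers below it. `ScopeNode`, `ScopeProfiler`, and `scope` are hypothetical names made up for illustration; they are not part of TensorBoard or any existing profiler API. The sketch only measures host wall-clock time, so it is an assumption-laden stand-in for real op-level (and device-side) timings.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field


@dataclass
class ScopeNode:
    """One node in the (hypothetical) scope tree: a phase, block, or layer."""
    name: str
    total_s: float = 0.0
    calls: int = 0
    children: dict = field(default_factory=dict)

    def child(self, name):
        # Reuse an existing node so repeated steps accumulate in one entry.
        if name not in self.children:
            self.children[name] = ScopeNode(name)
        return self.children[name]


class ScopeProfiler:
    """Hypothetical helper that accumulates wall-clock time per nested scope."""

    def __init__(self, clock=time.perf_counter):
        self.root = ScopeNode("root")
        self._stack = [self.root]
        self._clock = clock

    @contextmanager
    def scope(self, name):
        # Entering a scope nests it under whichever scope is currently open.
        node = self._stack[-1].child(name)
        self._stack.append(node)
        start = self._clock()
        try:
            yield node
        finally:
            node.total_s += self._clock() - start
            node.calls += 1
            self._stack.pop()

    def report(self, node=None, depth=0):
        """Return indented lines, one per scope, parents before children."""
        if node is None:
            node = self.root
        lines = []
        for c in node.children.values():
            lines.append(f"{'  ' * depth}{c.name}: {c.total_s:.6f}s, {c.calls} call(s)")
            lines.extend(self.report(c, depth + 1))
        return lines


# Stand-in workloads; a real model would run its layers here instead.
prof = ScopeProfiler()
for _ in range(3):  # three "training steps"
    with prof.scope("forward"):
        with prof.scope("ResNetBlock1"):
            with prof.scope("Conv2d"):
                sum(range(20_000))
            with prof.scope("BatchNorm"):
                sum(range(2_000))
            with prof.scope("ReLU"):
                sum(range(1_000))
    with prof.scope("backward"):
        sum(range(30_000))

print("\n".join(prof.report()))
```

Each parent's total includes its children, so the view asked for above would amount to expanding a row like `forward` into `ResNetBlock1` and then into its Conv2d/BatchNorm/ReLU entries.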

Training & inference time consumption can be very non-uniform across a graph, and having this breakdown would allow practitioners to pinpoint specific problem blocks in their networks, hopefully helping them produce more efficient networks.

Thanks!

(coming from tensorflow/tensorboard#5749 - reposted here by request)

@Matt-Hurd Matt-Hurd added the enhancement New feature or request label Dec 18, 2024