
Op names aren't shown in tensorflow_stats #401

Open
vvolhejn opened this issue Mar 24, 2022 · 1 comment

vvolhejn commented Mar 24, 2022

Hi,
I'm trying to profile the inference performance of my model on CPU to figure out where the bottlenecks are. However, when I run the profiler and open tensorflow_stats, it groups all the operations by type (MatMul, BiasAdd, ...) and doesn't tell me how much time each layer takes.

Here is a minimal example:

import tensorflow as tf


class FakeModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.l1 = tf.keras.layers.Dense(16, activation=tf.nn.relu, name="bar")
        self.l2 = tf.keras.layers.Dense(8, name="baz")

    def call(self, x, *args, **kwargs):
        y = self.l1(x)
        y = self.l2(y)

        return y


fake_model = FakeModel()


with tf.profiler.experimental.Profile('logdir'):
    for i in range(10):
        fake_batch = tf.random.normal((8, 32))
        with tf.profiler.experimental.Trace("test", step_num=i):
            fake_model(fake_batch)

The console output doesn't report anything weird:

2022-03-24 17:54:20.546354: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-24 17:54:20.557142: I tensorflow/core/profiler/lib/profiler_session.cc:110] Profiler session initializing.
2022-03-24 17:54:20.557166: I tensorflow/core/profiler/lib/profiler_session.cc:125] Profiler session started.
2022-03-24 17:54:20.605772: I tensorflow/core/profiler/lib/profiler_session.cc:67] Profiler session collecting data.
2022-03-24 17:54:20.609004: I tensorflow/core/profiler/lib/profiler_session.cc:143] Profiler session tear down.
2022-03-24 17:54:20.616532: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: logdir/plugins/profile/2022_03_24_17_54_20

2022-03-24 17:54:20.619339: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for trace.json.gz to logdir/plugins/profile/2022_03_24_17_54_20/my-computer.trace.json.gz
2022-03-24 17:54:20.625353: I tensorflow/core/profiler/rpc/client/save_profile.cc:136] Creating directory: logdir/plugins/profile/2022_03_24_17_54_20

2022-03-24 17:54:20.625652: I tensorflow/core/profiler/rpc/client/save_profile.cc:142] Dumped gzipped tool data for memory_profile.json.gz to logdir/plugins/profile/2022_03_24_17_54_20/my-computer.memory_profile.json.gz
2022-03-24 17:54:20.628578: I tensorflow/core/profiler/rpc/client/capture_profile.cc:251] Creating directory: logdir/plugins/profile/2022_03_24_17_54_20
Dumped tool data for xplane.pb to logdir/plugins/profile/2022_03_24_17_54_20/my-computer.xplane.pb
Dumped tool data for overview_page.pb to logdir/plugins/profile/2022_03_24_17_54_20/my-computer.overview_page.pb
Dumped tool data for input_pipeline.pb to logdir/plugins/profile/2022_03_24_17_54_20/my-computer.input_pipeline.pb
Dumped tool data for tensorflow_stats.pb to logdir/plugins/profile/2022_03_24_17_54_20/my-computer.tensorflow_stats.pb
Dumped tool data for kernel_stats.pb to logdir/plugins/profile/2022_03_24_17_54_20/my-computer.kernel_stats.pb

When I view the results in TensorBoard, I get:

[Screenshot 2022-03-24 at 17:55:21]

Here I would like the two layers to show up as separate elements in the charts, rather than a single "MatMul" element covering both matrix multiplications. This is also what the guide shows in one of its screenshots. How can I achieve this?

Furthermore, when I run this code, I can only see tensorflow_stats and trace_viewer; some of the other tools say "No step marker observed and hence the step time is unknown". Am I using the profiler incorrectly? My model uses a custom training loop, so I am using the context managers rather than a callback, as per the guide.

Versions:
tensorflow 2.8.0
tensorboard 2.8.0
tensorboard_plugin_profile 2.5.0
macOS Catalina 10.15.7

zzzaries (Collaborator) commented Sep 22, 2022

Hi vvolhejn, sorry for the late reply. Have you figured this issue out yet?

For your 2nd question, some observations:

  • You seem to be missing the third argument, _r=1, when instrumenting the Python custom training loop (see the sketch after this list).
  • The guide mentions "Make sure that you include the dataset iterator within the tf.profiler.experimental.Trace context for accurate analysis of the input pipeline", so maybe move your fake_batch = tf.random.normal((8, 32)) line inside the tf.profiler.experimental.Trace scope.
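
Something like the following rough, untested sketch is what I mean, reusing the FakeModel and logdir names from your snippet:

import tensorflow as tf

fake_model = FakeModel()  # FakeModel as defined in your snippet above

with tf.profiler.experimental.Profile('logdir'):
    for i in range(10):
        # _r=1 marks this Trace as a step boundary so the step-based tools
        # can find step markers; the input is created inside the Trace scope.
        with tf.profiler.experimental.Trace("test", step_num=i, _r=1):
            fake_batch = tf.random.normal((8, 32))
            fake_model(fake_batch)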

For your 1st question, I'm not sure why your Type and Op columns show the same values. I noticed that you seem to use the context manager API to collect the profile data; be sure to use it only in local mode (i.e. call the API on the same machine where your model is running, not remotely from e.g. a cloud machine). If it's still not working for you, maybe try switching to the tf.profiler.experimental start/stop API instead and try again (see the sketch below).
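
For example (again just a rough sketch, assuming the same fake_model and logdir as above):

# Start and stop the profiler explicitly instead of using the Profile
# context manager; the trace scopes inside the loop stay the same.
tf.profiler.experimental.start('logdir')
for i in range(10):
    with tf.profiler.experimental.Trace("test", step_num=i, _r=1):
        fake_batch = tf.random.normal((8, 32))
        fake_model(fake_batch)
tf.profiler.experimental.stop()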
