Profiling for C++ backend #45

Open
wants to merge 1 commit into
base: master

RichardWarfield

I implemented this so that a function running on the C++ backend can be profiled. Some notes on the implementation are in the commit message. Usage should be the same as for the Python profiler, i.e. call cgt.profile.start(), cgt.profile.stop(), and cgt.profile.print_stats().

Currently I have only implemented this for the sequential interpreter, though it should be straightforward to make the parallel interpreter work as well.

To test this I made a few minor changes to the MNIST demo so the backend can be switched from the command line.
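
As a rough illustration of switching the backend from the command line, here is a minimal sketch of an argument parser. The flag names (--epochs, --profile, --devtype, --backend) and their values are taken from the example runs in this PR; the defaults, the function name parse_args, and the overall structure are assumptions and may not match what demo_mnist.py actually does.

```python
import argparse


def parse_args(argv=None):
    """Parse demo options. Flag names follow the runs shown in this PR;
    defaults and help strings are assumptions made for this sketch."""
    parser = argparse.ArgumentParser(description="MNIST demo options (sketch)")
    parser.add_argument("--epochs", type=int, default=10,
                        help="number of training epochs (default is assumed)")
    parser.add_argument("--profile", action="store_true",
                        help="collect and print per-op timing statistics")
    parser.add_argument("--devtype", choices=["cpu", "gpu"], default="cpu",
                        help="device type to run on")
    parser.add_argument("--backend", choices=["python", "native"],
                        default="python", help="interpreter backend")
    return parser.parse_args(argv)


# Same flags as the "C++ CPU backend" run in this PR.
args = parse_args(["--epochs=1", "--profile", "--devtype=cpu", "--backend=native"])
print(args.backend, args.devtype, args.epochs, args.profile)
```

Restricting --devtype and --backend with choices makes a mistyped backend name fail at startup rather than partway through training.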

Examples:

Python backend (original functionality):

richard@seoul-mint ~/workspace/cgt $ python ./examples/demo_mnist.py --epochs=1 --profile --devtype=cpu --backend=python
     Epoch |  Train NLL |  Train Err |   Test NLL |   Test Err | Epoch Time
         0 |   0.245118 |     0.0781 |   0.225886 |     0.0706 |    14.5864
Total time elapsed: 15.7 seconds

************************************************************
*************************  By Op  **************************
************************************************************
Instruction               Count         Time         Frac    Frac cumsum
----------------------  -------  -----------  -----------  -------------
Mul22{N,N}                 1413  5.53256      0.352893          0.352893
Mul22{T,N}                 1407  4.41996      0.281926          0.63482
Mul22{N,T}                  938  3.91324      0.249605          0.884425
multiply                   7510  0.382699     0.0244104         0.908835
Alloc{dtype=f4,ndim=2}    21592  0.283814     0.018103          0.926938
...

C++ CPU backend:

richard@seoul-mint ~/workspace/cgt $ python ./examples/demo_mnist.py --epochs=1 --profile --devtype=cpu --backend=native
using python impl for Argmax{1}
     Epoch |  Train NLL |  Train Err |   Test NLL |   Test Err | Epoch Time
         0 |   0.244724 |     0.0778 |    0.22509 |     0.0696 |    3.05734
Total time elapsed: 3.17 seconds

************************************************************
*************************  By Op  **************************
************************************************************
Instruction               Count         Time         Frac    Frac cumsum
----------------------  -------  -----------  -----------  -------------
Mul22{N,N}                 1413  0.598506     0.188843          0.188843
sqrt                       1407  0.595224     0.187807          0.37665
Mul22{T,N}                 1407  0.468959     0.147968          0.524617
multiply                   7510  0.386939     0.122088          0.646706
divide                     4229  0.341507     0.107754          0.754459
...

C++ GPU backend:

richard@seoul-mint ~/workspace/cgt $ python ./examples/demo_mnist.py --epochs=1 --profile --devtype=gpu --backend=native
using python impl for Argmax{1}
     Epoch |  Train NLL |  Train Err |   Test NLL |   Test Err | Epoch Time
         0 |   0.244901 |     0.0779 |   0.225598 |     0.0703 |    1.89208
Total time elapsed: 1.89 seconds

************************************************************
*************************  By Op  **************************
************************************************************
Instruction               Count         Time         Frac    Frac cumsum
----------------------  -------  -----------  -----------  -------------
Transport                 13160  1.64103      0.866082          0.866082
Mul22{N,N}                 1413  0.107393     0.0566784         0.922761
multiply                   7510  0.033405     0.0176301         0.940391
add                        3754  0.015549     0.00820627        0.948597
divide                     4700  0.0137958    0.00728097        0.955878
...
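
For readers of the tables above, the Frac and Frac cumsum columns can presumably be derived from the Time column as each op's share of the total profiled time and the running sum of those shares. The sketch below shows that arithmetic; the function name add_fractions and the sample rows are made up for illustration, and this is not the code that produced the output above.

```python
from itertools import accumulate


def add_fractions(rows, total=None):
    """rows: list of (op_name, count, seconds) tuples.

    Returns the rows sorted by time (descending), each extended with its
    fraction of `total` and the cumulative fraction so far. If `total` is
    not given, the sum of the listed times is used.
    """
    if total is None:
        total = sum(seconds for _, _, seconds in rows)
    ordered = sorted(rows, key=lambda row: row[2], reverse=True)
    fracs = [seconds / total for _, _, seconds in ordered]
    cumsums = list(accumulate(fracs))
    return [
        (name, count, seconds, frac, cum)
        for (name, count, seconds), frac, cum in zip(ordered, fracs, cumsums)
    ]


# Made-up rows, not measurements from this PR.
for row in add_fractions([("op_b", 5, 1.0), ("op_a", 10, 3.0)]):
    print(row)
```

Because the tables above are truncated with "...", passing only the visible rows would give fractions that differ from the printed ones; the optional total argument is there for supplying the full profiled time.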

@f0k

f0k commented Oct 19, 2015

You merged a bunch of foreign commits into your PR. You should be able to fix this by:

git fetch upstream
git reset --hard upstream/master
git cherry-pick 96b36ed
git push --force

(assuming you've got joschu/cgt added as a remote named "upstream")

…end SequentialInterpreter.

Native profiling involves several components:
- C++ NativeProfiler class in execution.cpp, which plays the role
  of compilation._Profiler
- Calls to update the profiler from SequentialInterpreter.run
- Python wrapper for NativeProfiler in cycgt.pyx (NativeProfilerWrapper)
- A reference to the wrapper is stored as a variable in the cycgt module
  (native_profiler)
- To display the profiling output consistently on a per-Op (rather than
  per-instruction) basis, we need a way to map each C++ Instruction
  instance back to the Python Instr instance from which it derives.  To
  make this possible I added a call in CppInterpreterWrapper.__init__
  that tells the NativeProfilerWrapper to store the mappings.
- A C++ Instruction cannot (easily/safely) refer back to the Python
  op, but we need the op description to properly display the profile
  results; therefore I have passed the op description as a string
  op_repr which is stored in the Instruction class and passed back to
  Python for display when

From a user perspective the profiling interface should remain unchanged,
i.e. cgt.profile.start, cgt.profile.print_stats, etc. should work as
before, but the results will now include native operations.  Note,
however, that cgt.profiler.instr2stats will NOT include native calls.

TODO:
- Make this work for ParallelInterpreter.
- Since the Instruction->Instr mapping may be useful outside of profiling
  I feel it may be better to refactor and store this information somewhere
  else.
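
The per-Op bookkeeping described in the commit message above can be sketched in pure Python as follows. This is only an illustration of the idea (each executed instruction carries an op description string, and timings are accumulated under that string), not the PR's C++ NativeProfiler or its Cython wrapper. The names OpProfiler, update, and run_instructions are hypothetical.

```python
import time
from collections import defaultdict


class OpProfiler:
    """Hypothetical stand-in that accumulates call counts and time per op
    description string (the role op_repr plays in the commit message)."""

    def __init__(self):
        self.enabled = False
        # op_repr -> [call count, total seconds]
        self.op_stats = defaultdict(lambda: [0, 0.0])

    def start(self):
        self.enabled = True

    def stop(self):
        self.enabled = False

    def clear(self):
        self.op_stats.clear()

    def update(self, op_repr, seconds):
        # Record one execution; ignored while profiling is stopped.
        if self.enabled:
            entry = self.op_stats[op_repr]
            entry[0] += 1
            entry[1] += seconds


def run_instructions(instructions, profiler):
    """Run (op_repr, callable) pairs in order, timing each one, much as a
    sequential interpreter loop might report to its profiler."""
    for op_repr, fn in instructions:
        t0 = time.perf_counter()
        fn()
        profiler.update(op_repr, time.perf_counter() - t0)


profiler = OpProfiler()
profiler.start()
run_instructions(
    [("multiply", lambda: 2 * 3), ("multiply", lambda: 4 * 5), ("add", lambda: 1 + 1)],
    profiler,
)
profiler.stop()
for op_repr, (count, seconds) in profiler.op_stats.items():
    print(op_repr, count, seconds)
```

Keying the statistics by a plain string means several instructions that share an op collapse into one row, which is what a per-Op table needs.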
@RichardWarfield
Author

Thanks... it should be fixed now.

@joschu
Owner

joschu commented Oct 19, 2015

Nice work -- I'll review it in the next day or two.

3 participants