queue length for MTIA (#202)
Summary:

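HTA computes the number of outstanding operations on each stream, represented as the queue length, and generates another trace that carries the queue length info. This change adds the MTIA kernel launch event to the runtime launch events query so that MTIA launches are counted as well.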

Differential Revision: D65774955
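
For readers unfamiliar with the metric, the idea of a per-stream queue length can be sketched as a running count: each launch adds one outstanding operation on its stream and each completion removes one. This is a minimal illustration under that assumption, not HTA's implementation; the function name and the `(timestamp, stream)` tuple layout are hypothetical.

```python
from collections import defaultdict


def queue_length_series(launches, completions):
    """Return {stream: [(timestamp, queue_length), ...]} in time order.

    launches, completions: iterables of (timestamp, stream) tuples
    (a hypothetical layout chosen for this sketch).
    """
    events = defaultdict(list)
    for ts, stream in launches:
        events[stream].append((ts, +1))  # one more outstanding operation
    for ts, stream in completions:
        events[stream].append((ts, -1))  # one operation finished

    series = {}
    for stream, deltas in events.items():
        depth = 0
        points = []
        # Sorting tuples puts a completion before a launch at the same timestamp.
        for ts, delta in sorted(deltas):
            depth += delta
            points.append((ts, depth))
        series[stream] = points
    return series


# Three launches on stream 7, completing later in the same order.
launches = [(0, 7), (2, 7), (3, 7)]
completions = [(5, 7), (6, 7), (9, 7)]
result = queue_length_series(launches, completions)
print(result[7])
```

The queue on stream 7 grows to three outstanding operations and then drains back to zero as the completions arrive.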
fenypatel99 authored and facebook-github-bot committed Nov 12, 2024
1 parent 553804a commit c8aec40
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions hta/common/trace_symbol_table.py
@@ -141,11 +141,13 @@ def get_runtime_launch_events_query(self) -> str:
         cuLaunchKernel_id = self.sym_index.get("cuLaunchKernel", self.NULL)
         cudaMemcpyAsync_id = self.sym_index.get("cudaMemcpyAsync", self.NULL)
         cudaMemsetAsync_id = self.sym_index.get("cudaMemsetAsync", self.NULL)
-
+        mtiaLaunchKernel_id = self.sym_index.get(
+            "runFunction - job_prep_and_submit_for_execution", self.NULL
+        )
         return (
             f"((name == {cudaMemsetAsync_id}) or (name == {cudaMemcpyAsync_id}) or "
             f" (name == {cudaLaunchKernel_id}) or (name == {cudaLaunchKernelExC_id})"
-            f" or (name == {cuLaunchKernel_id})) and (index_correlation > 0)"
+            f" or (name == {cuLaunchKernel_id}) or (name == {mtiaLaunchKernel_id})) and (index_correlation > 0)"
         )
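
To make the effect of the change concrete, here is a standalone sketch of how such a query string is assembled and what it selects. It is not the HTA class itself: `NULL = -1`, the `build_launch_query` and `is_launch_event` helpers, and the sample symbol table are assumptions for illustration, and `eval` on a single event dict merely stands in for evaluating the expression against a trace dataframe.

```python
NULL = -1  # assumed sentinel for names that do not occur in the trace

LAUNCH_EVENT_NAMES = [
    "cudaMemsetAsync",
    "cudaMemcpyAsync",
    "cudaLaunchKernel",
    "cudaLaunchKernelExC",
    "cuLaunchKernel",
    "runFunction - job_prep_and_submit_for_execution",  # MTIA launch event
]


def build_launch_query(sym_index):
    """Build a boolean expression over `name` and `index_correlation`."""
    ids = [sym_index.get(name, NULL) for name in LAUNCH_EVENT_NAMES]
    name_clause = " or ".join(f"(name == {i})" for i in ids)
    return f"({name_clause}) and (index_correlation > 0)"


def is_launch_event(query, event):
    """Evaluate the expression for one event (illustration only)."""
    return bool(eval(query, {"__builtins__": {}}, dict(event)))


# Hypothetical symbol table: event name -> integer id used in the trace.
sym_index = {
    "cudaLaunchKernel": 0,
    "cudaMemcpyAsync": 1,
    "runFunction - job_prep_and_submit_for_execution": 2,
    "aten::add": 3,
}
query = build_launch_query(sym_index)

mtia_launch = {"name": 2, "index_correlation": 5}
cpu_op = {"name": 3, "index_correlation": 5}
uncorrelated = {"name": 2, "index_correlation": -1}

print(is_launch_event(query, mtia_launch))   # selected: MTIA launch with a correlated kernel
print(is_launch_event(query, cpu_op))        # not a launch event
print(is_launch_event(query, uncorrelated))  # launch without a correlated kernel
```

Names missing from the symbol table fall back to the sentinel, so their clauses match no real event as long as no event carries that id.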

