Add task list based profiling #1337
Open
lroberts36 wants to merge 47 commits into develop from lroberts36/task-list-timing
Commits
db512c3 Add task list timing code (lroberts36)
62cfccc clean up (lroberts36)
9cd59aa remove unused (lroberts36)
351c813 Merge branch 'develop' into lroberts36/task-list-timing (lroberts36)
c43fb6a change build boundary buffers logic to pre-allocate a more thoughtful… (jonahm-LANL)
a00f508 pull out unified comms (jonahm-LANL)
eb76052 switch to weak pointers (lroberts36)
238ecec only allocate buffers for sprase vars that are allocated (jonahm-LANL)
f33eb49 comm buffer reset cadence (jonahm-LANL)
621429c reallocate -> reset (jonahm-LANL)
51f6ae5 why did the CI linter catch this but not make lint? (jonahm-LANL)
41fe15f add missing defaulted constructor to bnd_id (jonahm-LANL)
11fc6fb CHANGELOG (jonahm-LANL)
ea05725 CC (jonahm-LANL)
eb0b53a param docs (jonahm-LANL)
556aae7 what is this instrumentation crud doing here?? (jonahm-LANL)
b7a4a2a also bnd-info (jonahm-LANL)
ca643c1 and semicolon (jonahm-LANL)
853a956 come on (jonahm-LANL)
d753418 apparently kokkos_defaulted_function doesnt work for destructors on HIP? (jonahm-LANL)
cf36e1d Im grasping at straws here. wtf. (jonahm-LANL)
a15fd49 There we go now it works (jonahm-LANL)
e19c756 ok try this (jonahm-LANL)
1f9978e OK IT WORKS (jonahm-LANL)
7ef9638 put things where they belong (jonahm-LANL)
f26633d fix for index split for AMD (jonahm-LANL)
4d1b63a Update doc/sphinx/src/boundary_communication.rst (Yurlungur)
56761bb lroberts comments (jonahm-LANL)
4473cb9 parthenon enable gpu macro (jonahm-LANL)
64b6017 oops true -> false (jonahm-LANL)
0e3cd60 pgrete comments part 1 (jonahm-LANL)
33484a7 Add control over whether to include/exclude an output on final signal… (pgrete)
8b3c76d clean up annoying warning (jonahm-LANL)
3a6e0d8 use error handling macros (jonahm-LANL)
0633d0c changelog (jonahm-LANL)
6c24a6c a restart on the testing framework (lroberts36)
bfb16c8 move to list (lroberts36)
b0c26a1 working local sync (lroberts36)
132e506 add some comments (lroberts36)
ade0663 remove print statements (lroberts36)
9ae1c4a format and lint (lroberts36)
c0e0ce4 Merge branch 'develop' into lroberts36/task-list-timing (lroberts36)
273feae add timing unit test (lroberts36)
44cc4e5 Add json writing capability (lroberts36)
b3f0a5a Allow for globally turning off timing (lroberts36)
4e62298 changelog (lroberts36)
7917e4a Merge branch 'develop' into lroberts36/task-list-timing (lroberts36)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,109 @@ | ||
| //======================================================================================== | ||
| // (C) (or copyright) 2023-2025. Triad National Security, LLC. All rights reserved. | ||
| // | ||
| // This program was produced under U.S. Government contract 89233218CNA000001 for Los | ||
| // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC | ||
| // for the U.S. Department of Energy/National Nuclear Security Administration. All rights | ||
| // in the program are reserved by Triad National Security, LLC, and the U.S. Department | ||
| // of Energy/National Nuclear Security Administration. The Government is granted for | ||
| // itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide | ||
| // license in this material to reproduce, prepare derivative works, distribute copies to | ||
| // the public, perform publicly and display publicly, and to permit others to do so. | ||
| //======================================================================================== | ||
| | ||
| #include <algorithm> | ||
| #include <cstdio> | ||
| #include <fstream> | ||
| #include <iomanip> | ||
| #include <iostream> | ||
| #include <map> | ||
| #include <memory> | ||
| #include <regex> | ||
| #include <sstream> | ||
| #include <string> | ||
| #include <utility> | ||
| #include <vector> | ||
| | ||
| #include "task_timing.hpp" | ||
| #include "tasks.hpp" | ||
| | ||
| namespace parthenon { | ||
| | ||
| void TimingAccumulator::CollectTask(Task *task) { | ||
| ntasks++; | ||
| task->time_task = true; | ||
| task->timing_accumulators.push_back(shared_from_this()); | ||
| } | ||
| | ||
| void TimingAccumulator::CollectTaskIfCollecting(Task *task) { | ||
| if (collecting) CollectTask(task); | ||
| } | ||
| | ||
| Real TimingAccumulator::GetTotalTime() const { | ||
| Real total_time{0.0}; | ||
| for (auto &[start, end, status] : timings) | ||
| total_time += GetDurationInSeconds(start, end); | ||
| return total_time; | ||
| } | ||
| | ||
| std::shared_ptr<TimingAccumulator> | ||
| TimingAccumulatorDictionary::GetOrAddAndRegister(const std::string &label, TaskList &tl) { | ||
| if (dict_.count(label) == 0) dict_[label] = TimingAccumulator::create(); | ||
| tl.RegisterTimingAccumulator(dict_[label]); | ||
| return dict_[label]; | ||
| } | ||
| | ||
| void TimingAccumulatorDictionary::WriteToJSON(const std::string &filename) { | ||
| std::map<std::string, std::vector<std::pair<double, double>>> timings; | ||
| | ||
| // First, find the minimum time to set zero | ||
| TimingAccumulator::time_t min_time = std::chrono::steady_clock::now(); | ||
| for (auto &[name, taccum] : dict_) { | ||
| for (const auto &timing : taccum->GetTimings()) { | ||
| min_time = std::min(min_time, std::get<0>(timing)); | ||
| } | ||
| } | ||
| | ||
| // Now, go through and build the map that can be interpreted by python | ||
| for (auto &[name, taccum] : dict_) { | ||
| timings[name] = std::vector<std::pair<double, double>>(); | ||
| for (const auto &timing : taccum->GetTimings()) { | ||
| const double start = taccum->GetDurationInSeconds(min_time, std::get<0>(timing)); | ||
| const double end = taccum->GetDurationInSeconds(min_time, std::get<1>(timing)); | ||
| timings[name].push_back(std::make_pair(start, end)); | ||
| } | ||
| } | ||
| | ||
| std::ofstream file(filename); | ||
| file << "{"; | ||
| | ||
| bool firstKey = true; | ||
| for (const auto &[key, value] : timings) { | ||
| if (!firstKey) { | ||
| file << ","; | ||
| } | ||
| firstKey = false; | ||
| | ||
| file << "\"" << key << "\":["; | ||
| | ||
| bool firstPair = true; | ||
| for (const auto &pair : value) { | ||
| if (!firstPair) { | ||
| file << ","; | ||
| } | ||
| firstPair = false; | ||
| | ||
| // Write pair as JSON array [first, second] | ||
| // Use high precision to preserve double values | ||
| file << "[" << std::fixed << std::setprecision(15) << pair.first << "," | ||
| << pair.second << "]"; | ||
| } | ||
| | ||
| file << "]"; | ||
| } | ||
| | ||
| file << "}"; | ||
| file.close(); | ||
| } | ||
| | ||
| } // namespace parthenon | ||
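
The file layout that `WriteToJSON` emits is a single JSON object mapping each label to a list of `[start, end]` pairs in seconds. As a rough illustration, here is a minimal standalone sketch of the same serialization written against a string stream rather than a file. `EscapeJSONString`, `TimingsToJSON`, and the `Intervals` alias are hypothetical names that do not appear in this PR. The escaping step is also an assumption added for the sketch: the code above writes labels verbatim, and this helper only handles the most common special characters.

```cpp
#include <cassert>
#include <iomanip>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not in the PR): escape the characters that most
// commonly break a JSON string literal. Other control characters are
// passed through unchanged, so this is not a complete JSON escaper.
std::string EscapeJSONString(const std::string &in) {
  std::ostringstream out;
  for (const char c : in) {
    switch (c) {
    case '"':
      out << "\\\"";
      break;
    case '\\':
      out << "\\\\";
      break;
    case '\n':
      out << "\\n";
      break;
    case '\t':
      out << "\\t";
      break;
    default:
      out << c;
    }
  }
  return out.str();
}

// Hypothetical alias: one (start, end) pair per timed chunk, in seconds.
using Intervals = std::vector<std::pair<double, double>>;

// Serialize label -> [[start, end], ...] with the same separators and
// fixed-point formatting that WriteToJSON uses.
std::string TimingsToJSON(const std::map<std::string, Intervals> &timings,
                          int precision = 15) {
  std::ostringstream out;
  out << std::fixed << std::setprecision(precision) << "{";
  bool first_key = true;
  for (const auto &[key, intervals] : timings) {
    if (!first_key) out << ",";
    first_key = false;
    out << "\"" << EscapeJSONString(key) << "\":[";
    bool first_pair = true;
    for (const auto &[start, end] : intervals) {
      if (!first_pair) out << ",";
      first_pair = false;
      out << "[" << start << "," << end << "]";
    }
    out << "]";
  }
  out << "}";
  return out.str();
}
```

For a map holding the single label `a` with one interval from 0 to 0.5 seconds, calling `TimingsToJSON` with `precision` set to 1 should give `{"a":[[0.0,0.5]]}`. Returning a string keeps the formatting logic testable without touching the file system.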
| Original file line number | Diff line number | Diff line change | ||||||
|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,113 @@ | ||||||||
| //======================================================================================== | ||||||||
| // (C) (or copyright) 2023-2025. Triad National Security, LLC. All rights reserved. | ||||||||
| // | ||||||||
| // This program was produced under U.S. Government contract 89233218CNA000001 for Los | ||||||||
| // Alamos National Laboratory (LANL), which is operated by Triad National Security, LLC | ||||||||
| // for the U.S. Department of Energy/National Nuclear Security Administration. All rights | ||||||||
| // in the program are reserved by Triad National Security, LLC, and the U.S. Department | ||||||||
| // of Energy/National Nuclear Security Administration. The Government is granted for | ||||||||
| // itself and others acting on its behalf a nonexclusive, paid-up, irrevocable worldwide | ||||||||
| // license in this material to reproduce, prepare derivative works, distribute copies to | ||||||||
| // the public, perform publicly and display publicly, and to permit others to do so. | ||||||||
| //======================================================================================== | ||||||||
| #ifndef TASKS_TASK_TIMING_HPP_ | ||||||||
| #define TASKS_TASK_TIMING_HPP_ | ||||||||
| | ||||||||
| #include <algorithm> | ||||||||
| #include <array> | ||||||||
| #include <cassert> | ||||||||
| #include <chrono> | ||||||||
| #include <functional> | ||||||||
| #include <list> | ||||||||
| #include <map> | ||||||||
| #include <memory> | ||||||||
| #include <set> | ||||||||
| #include <string> | ||||||||
| #include <tuple> | ||||||||
| #include <unordered_map> | ||||||||
| #include <unordered_set> | ||||||||
| #include <utility> | ||||||||
| #include <vector> | ||||||||
| | ||||||||
| #include <basic_types.hpp> | ||||||||
| #include <parthenon_mpi.hpp> | ||||||||
| | ||||||||
| #include "utils/error_checking.hpp" | ||||||||
| | ||||||||
| namespace parthenon { | ||||||||
| | ||||||||
| class Task; | ||||||||
| class TimingAccumulator : public std::enable_shared_from_this<TimingAccumulator> { | ||||||||
| public: | ||||||||
| using time_t = std::chrono::time_point<std::chrono::steady_clock>; | ||||||||
| using timing_chunk_t = std::tuple<time_t, time_t, TaskStatus>; | ||||||||
| | ||||||||
| private: | ||||||||
| bool collecting{false}; | ||||||||
| std::vector<timing_chunk_t> timings; | ||||||||
| int ntasks{0}; | ||||||||
| | ||||||||
| class private_t {}; | ||||||||
| | ||||||||
| public: | ||||||||
| explicit TimingAccumulator(private_t) {} | ||||||||
| | ||||||||
| static std::shared_ptr<TimingAccumulator> create() { | ||||||||
| return std::make_shared<TimingAccumulator>(private_t()); | ||||||||
| } | ||||||||
| | ||||||||
| void AddTiming(const timing_chunk_t &timing) { timings.push_back(timing); } | ||||||||
| | ||||||||
| void StopCollectingTasks() { collecting = false; } | ||||||||
| void StartCollectingTasks() { collecting = true; } | ||||||||
| | ||||||||
| void CollectTask(Task *task); | ||||||||
| void CollectTaskIfCollecting(Task *task); | ||||||||
| | ||||||||
| double GetDurationInSeconds(time_t start, time_t end) const { | ||||||||
| return 1.e-9 * | ||||||||
| static_cast<double>( | ||||||||
| std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count()); | ||||||||
| } | ||||||||
| | ||||||||
| Real GetTotalTime() const; | ||||||||
| | ||||||||
| int GetTotalTasks() const { return ntasks; } | ||||||||
| | ||||||||
| const std::vector<timing_chunk_t> &GetTimings() const { return timings; } | ||||||||
| }; | ||||||||
| | ||||||||
| struct TimingAccumulatorGuard { | ||||||||
| explicit TimingAccumulatorGuard(std::shared_ptr<TimingAccumulator> timing_accumulator) | ||||||||
| : tidc(timing_accumulator) { | ||||||||
| tidc->StartCollectingTasks(); | ||||||||
| } | ||||||||
| ~TimingAccumulatorGuard() { tidc->StopCollectingTasks(); } | ||||||||
| std::shared_ptr<TimingAccumulator> tidc; | ||||||||
| }; | ||||||||
| | ||||||||
| class TaskList; | ||||||||
| class TimingAccumulatorDictionary { | ||||||||
lroberts36 marked this conversation as resolved.
| std::map<std::string, std::shared_ptr<TimingAccumulator>> dict_; | ||||||||
| | ||||||||
| public: | ||||||||
| std::shared_ptr<TimingAccumulator> GetOrAddAndRegister(const std::string &label, | ||||||||
| TaskList &tl); | ||||||||
| | ||||||||
| std::shared_ptr<TimingAccumulator> Get(const std::string &label) { | ||||||||
| PARTHENON_REQUIRE(dict_.count(label) > 0, "Asking for non-existent timing region."); | ||||||||
| return dict_[label]; | ||||||||
Review comment on lines +98 to +99
Collaborator: Why not just do this? (The comment carried a suggested change.)
Author: I like having the descriptive error message, and I think this isn't performance critical.
| } | ||||||||
| | ||||||||
| void clear() { dict_.clear(); } | ||||||||
| auto begin() { return dict_.begin(); } | ||||||||
| auto end() { return dict_.end(); } | ||||||||
| auto begin() const { return dict_.begin(); } | ||||||||
| auto end() const { return dict_.end(); } | ||||||||
| | ||||||||
| void WriteToJSON(const std::string &filename); | ||||||||
| }; | ||||||||
| | ||||||||
| } // namespace parthenon | ||||||||
| | ||||||||
| #endif // TASKS_TASK_TIMING_HPP_ | ||||||||
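
The header combines two idioms: a private tag type that steers construction through `create()` so that `shared_from_this()` is always valid, and an RAII guard that switches collection on for the lifetime of a scope. The sketch below reproduces both in isolation with simplified stand-in types. `Accumulator`, `AccumulatorGuard`, and `CountCollected` are hypothetical names used only for illustration, and the deleted copy operations on the guard are an assumption added here rather than something the PR does.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Simplified stand-in for TimingAccumulator: only the collecting flag and
// a task counter are modeled.
class Accumulator {
  // Tag type that is private to the class, so outside code cannot name it
  // and is steered toward create() instead of the public constructor.
  class private_t {};
  bool collecting_{false};
  int ntasks_{0};

 public:
  explicit Accumulator(private_t) {}

  // Factory that guarantees the object is owned by a shared_ptr.
  static std::shared_ptr<Accumulator> create() {
    return std::make_shared<Accumulator>(private_t());
  }

  void Start() { collecting_ = true; }
  void Stop() { collecting_ = false; }

  // Count a task only while collection is switched on.
  void CollectIfCollecting() {
    if (collecting_) ++ntasks_;
  }
  int GetTotalTasks() const { return ntasks_; }
};

// RAII guard: collection is on from construction until destruction.
struct AccumulatorGuard {
  explicit AccumulatorGuard(std::shared_ptr<Accumulator> acc)
      : acc_(std::move(acc)) {
    acc_->Start();
  }
  ~AccumulatorGuard() { acc_->Stop(); }
  // Assumption for this sketch: forbid copies so Stop() runs exactly once.
  AccumulatorGuard(const AccumulatorGuard &) = delete;
  AccumulatorGuard &operator=(const AccumulatorGuard &) = delete;

  std::shared_ptr<Accumulator> acc_;
};

// Hypothetical driver showing which calls are counted.
int CountCollected() {
  auto acc = Accumulator::create();
  acc->CollectIfCollecting();  // ignored: no guard is alive
  {
    AccumulatorGuard guard(acc);
    acc->CollectIfCollecting();  // counted
    acc->CollectIfCollecting();  // counted
  }
  acc->CollectIfCollecting();  // ignored: the guard has been destroyed
  return acc->GetTotalTasks();
}
```

Only the two calls made while the guard is alive are counted, so `CountCollected()` returns 2. Tying the flag to a scope means an early return or exception inside the block still switches collection off.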