added forall_with_streams and updated BenchmarkForall.cpp #232

Open: wants to merge 19 commits into base: develop
17 changes: 17 additions & 0 deletions benchmarks/BenchmarkForall.cpp
@@ -69,6 +69,23 @@ static void benchmark_gpu_loop(benchmark::State& state) {
// Register the function as a benchmark
BENCHMARK(benchmark_gpu_loop)->Range(1, INT_MAX);

static void benchmark_gpu_loop_streams(benchmark::State& state) {
   const int size = state.range(0);
   care::host_device_ptr<int> data(size, "data");

   for (auto _ : state) {
      RAJA::resources::Cuda res;
      care::forall_with_stream(care::gpu{}, res, "BenchmarkForall.cpp", 78, 0, size, [=] CARE_DEVICE (int i) {
         data[i] = i;
      });
   }

   data.free();
}

// Register the function as a benchmark
BENCHMARK(benchmark_gpu_loop_streams)->Range(1, INT_MAX);

#endif

// Run the benchmarks
41 changes: 41 additions & 0 deletions src/care/forall.h
@@ -189,6 +189,47 @@ namespace care {
#endif
}

////////////////////////////////////////////////////////////////////////////////
///
/// @author Neela Kausik
///
/// @brief If GPU is available, execute on the device. Otherwise, execute on
/// the host. This specialization is needed for clang-query.
///
/// @arg[in] gpu Used to choose this overload of forall
/// @arg[in] res Resource provided for execution
/// @arg[in] fileName The name of the file where this function is called
/// @arg[in] lineNumber The line number in the file where this function is called
/// @arg[in] start The starting index (inclusive)
/// @arg[in] end The ending index (exclusive)
/// @arg[in] body The loop body to execute at each index
///
////////////////////////////////////////////////////////////////////////////////

template <typename LB>
void forall_with_stream(gpu, RAJA::resources::Cuda res, const char * fileName, const int lineNumber,
                        const int start, const int end, LB&& body) {
#if CARE_ENABLE_PARALLEL_LOOP_BACKWARDS
   s_reverseLoopOrder = true;
#endif

#if CARE_ENABLE_GPU_SIMULATION_MODE
   forall(gpu_simulation{}, res, fileName, lineNumber, start, end, std::forward<LB>(body));
#elif defined(__CUDACC__)
   forall(RAJA::cuda_exec<CARE_CUDA_BLOCK_SIZE, CARE_CUDA_ASYNC>{},
          res, RAJA::RangeSegment(start, end), std::forward<LB>(body));
Member

I think you need an overload of forall that takes a resource, right?

Member Author

What do you mean?

Member

This is the function you are calling: https://github.com/LLNL/CARE/pull/232/files#diff-1df40e04088de0f82501a0065752487396b8abeb4c3d30780e79119cc63789a7R74

But it does not accept a resource argument. I'm confused at how this is working.
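For illustration, an overload that accepts a resource might look roughly like the sketch below. Everything in it is a hypothetical stand-in written for this example: `sketch::forall`, `seq_exec`, and `Resource` are not CARE's or RAJA's actual types or signatures, and the resource is passed by reference only so the example can count launches. It is a minimal sketch of the overload shape, not the implementation used in this PR.

```cpp
#include <cassert>
#include <utility>
#include <vector>

namespace sketch {
   // Stand-in for an execution policy tag (assumption, not a RAJA type).
   struct seq_exec {};

   // Stand-in for a stream/resource handle (assumption, not RAJA::resources::Cuda).
   struct Resource {
      int launches;
   };

   // Existing-style overload: policy, bookkeeping arguments, range, body.
   template <typename LB>
   void forall(seq_exec, const char* /*fileName*/, int /*lineNumber*/,
               int start, int end, LB&& body) {
      for (int i = start; i < end; ++i) {
         body(i);
      }
   }

   // Additional overload that also accepts a resource. Without it, a call
   // that passes a resource has no matching function in this namespace.
   template <typename LB>
   void forall(seq_exec, Resource& res, const char* fileName, int lineNumber,
               int start, int end, LB&& body) {
      // A real backend would enqueue the work on the resource's stream here;
      // this sketch only records that the resource was used.
      ++res.launches;

      // Qualified call, so the intended overload set is explicit.
      sketch::forall(seq_exec{}, fileName, lineNumber, start, end,
                     std::forward<LB>(body));
   }
}
```

With both overloads present, a wrapper such as `forall_with_stream` could forward its resource with a qualified call, and a mismatch in the argument list would surface as a compile error rather than resolving to some other function.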

Member Author

I thought it was calling RAJA::forall, but I will look into it further.

Member

It shouldn't be calling RAJA::forall - there's no overload that takes the fileName and lineNumber.

Member

Also, that's the main reason I dislike "using namespace..." statements - it's too easy to accidentally call the wrong function.
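The pitfall described here can be reproduced in a small, self-contained sketch. The namespaces `lib` and `app` and the tag types in `tags` are hypothetical stand-ins, not RAJA or CARE code; the tag types live in a separate namespace so that argument-dependent lookup plays no part and only the using-directives are responsible for the outcome. Whether this is what actually happens in the PR is not established by the thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Neutral tag types (assumption for this example).
namespace tags {
   struct Policy {};
   struct Resource {};
}

// Hypothetical stand-in for a third-party library.
namespace lib {
   template <typename Body>
   std::string forall(tags::Policy, tags::Resource, int begin, int end, Body&& body) {
      for (int i = begin; i < end; ++i) {
         body(i);
      }
      return "lib::forall";
   }
}

// Hypothetical stand-in for a wrapper layer with no resource overload.
namespace app {
   template <typename Body>
   std::string forall(tags::Policy, int begin, int end, Body&& body) {
      for (int i = begin; i < end; ++i) {
         body(i);
      }
      return "app::forall";
   }
}

using namespace lib;
using namespace app;

std::string call_unqualified(std::vector<int>& data) {
   // Both namespaces are visible, so overload resolution silently picks the
   // only overload whose parameters match: lib::forall, not app::forall.
   return forall(tags::Policy{}, tags::Resource{}, 0,
                 static_cast<int>(data.size()),
                 [&](int i) { data[i] = i; });
}

std::string call_qualified(std::vector<int>& data) {
   // Qualifying the name pins the overload set to app. Passing a
   // tags::Resource here would fail to compile instead of being redirected.
   return app::forall(tags::Policy{}, 0, static_cast<int>(data.size()),
                      [&](int i) { data[i] = 2 * i; });
}
```

The unqualified call compiles and runs, which is exactly why the mistake is easy to miss; the qualified call turns the same mistake into a compile-time error.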

#elif defined(__HIPCC__)
   forall(RAJA::hip_exec<CARE_CUDA_BLOCK_SIZE, CARE_CUDA_ASYNC>{},
          res, RAJA::RangeSegment(start, end), std::forward<LB>(body));
#else
   forall(RAJA::seq_exec{}, res, fileName, lineNumber, start, end, std::forward<LB>(body));
#endif

#if CARE_ENABLE_PARALLEL_LOOP_BACKWARDS
   s_reverseLoopOrder = false;
#endif
}

////////////////////////////////////////////////////////////////////////////////
///
/// @author Alan Dayton