You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Some of the new transport scheme kernels have logging functionality, as the algorithm can potentially fail if the conditions are too severe. Apparently, it is not trivial to ascertain this before calling the kernel. Thus these kernels fail gracefully by calling the logger with an Error. For Kernel execution on a GPU having, CPU functionality inside is not a good idea - basically, it won't work.
The solution is for the kernel to throw an error code. This means returning a scalar, which can then be tested higher up the call stack for success or failure - and halt execution gracefully on the CPU. Currently, this is not supported. One reason for this is it is not thread safe. If the kernel is called from a threaded region, then one column may fail and another may not. Which is correct? Well in this case if any column fails, execution should halt. If tested the PSy layer, then which column could be part of the error message. Perhaps, a better solution is to perform a local reduction on the scalar then this would trigger the exit in the Algorithm layer.
Specifically, if the scalar was set to zero before the horizontal domain looping, and set to one in a failing kernel then a reduction performed across threads after kernel execution would result in value one if any column failed and zero otherwise. Thus this could be tested in the Algorithm layer.
Example transport kernel is common/hori_dep_dist_ffsl_kernel_mod.F90
The text was updated successfully, but these errors were encountered:
Some of the new transport scheme kernels have logging functionality, as the algorithm can potentially fail if the conditions are too severe. Apparently, it is not trivial to ascertain this before calling the kernel. Thus these kernels fail gracefully by calling the logger with an Error. For Kernel execution on a GPU having, CPU functionality inside is not a good idea - basically, it won't work.
The solution is for the kernel to throw an error code. This means returning a scalar, which can then be tested higher up the call stack for success or failure - and halt execution gracefully on the CPU. Currently, this is not supported. One reason for this is it is not thread safe. If the kernel is called from a threaded region, then one column may fail and another may not. Which is correct? Well in this case if any column fails, execution should halt. If tested the PSy layer, then which column could be part of the error message. Perhaps, a better solution is to perform a local reduction on the scalar then this would trigger the exit in the Algorithm layer.
Specifically, if the scalar was set to zero before the horizontal domain looping, and set to one in a failing kernel then a reduction performed across threads after kernel execution would result in value one if any column failed and zero otherwise. Thus this could be tested in the Algorithm layer.
Example transport kernel is
common/hori_dep_dist_ffsl_kernel_mod.F90
The text was updated successfully, but these errors were encountered: