[GPU] Fix kernels synchronization in PagedAttention operation (#28645)

### Details:
- Fix synchronization in the PagedAttention operation when KV-cache rotation is enabled but skipped for the current iteration. Previously, `dep_events` was always replaced with `res_events` if `has_rotated_blocks=true`, even when the rotation kernel was skipped. This passed an empty events vector to the subsequent kernels, and the missing synchronization caused accuracy deviations when an out-of-order queue was used.
openvinotoolkit · Jan 23, 2025 · 485833c
1 parent ecdecf3
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/src/plugins/intel_gpu/src/graph/impls/ocl/paged_attention.cpp b/src/plugins/intel_gpu/src/graph/impls/ocl/paged_attention.cpp
@@ -346,7 +346,7 @@ struct paged_attention_impl : multi_stage_primitive<paged_attention> {
 
         std::vector<event::ptr> res_events;
         std::vector<event::ptr> dep_events = events;
-        if (has_rotated_blocks) {
+        if (has_rotated_blocks && !_kernels_data[Stage::KV_CACHE_ROTATE].kernels[0].skip_execution) {
             execute_stage(dep_events, instance, res_events, Stage::KV_CACHE_ROTATE, is_mixed_mode);
             dep_events = res_events;
         }
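
To illustrate why the extra condition matters, here is a minimal, self-contained sketch of the dependency chaining before and after the change. It does not use the real intel_gpu API: `Event`, `EventPtr`, the simplified `execute_stage`, and the helpers `deps_before_fix` / `deps_after_fix` are hypothetical stand-ins, and the sketch assumes that a skipped stage appends nothing to `res_events`, as the commit message implies.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-ins for the plugin's event types (not the real intel_gpu API).
struct Event { int id; };
using EventPtr = std::shared_ptr<Event>;

// Mock stage (assumption): a skipped kernel enqueues nothing, so `out` stays empty.
// Otherwise it produces one new event, given the arbitrary id 100 here.
void execute_stage(const std::vector<EventPtr>& deps,
                   std::vector<EventPtr>& out,
                   bool skip_execution) {
    (void)deps;  // a real stage would wait on these events
    if (skip_execution) return;
    out.push_back(std::make_shared<Event>(Event{100}));
}

// Old logic: dep_events is overwritten whenever rotation is enabled,
// even if the rotation kernel was skipped and res_events is empty.
std::vector<EventPtr> deps_before_fix(const std::vector<EventPtr>& events,
                                      bool has_rotated_blocks,
                                      bool skip_execution) {
    std::vector<EventPtr> res_events;
    std::vector<EventPtr> dep_events = events;
    if (has_rotated_blocks) {
        execute_stage(dep_events, res_events, skip_execution);
        dep_events = res_events;
    }
    return dep_events;  // events the next kernels would wait on
}

// New logic: the stage is only run, and dep_events only replaced,
// when the rotation kernel actually executes.
std::vector<EventPtr> deps_after_fix(const std::vector<EventPtr>& events,
                                     bool has_rotated_blocks,
                                     bool skip_execution) {
    std::vector<EventPtr> res_events;
    std::vector<EventPtr> dep_events = events;
    if (has_rotated_blocks && !skip_execution) {
        execute_stage(dep_events, res_events, skip_execution);
        dep_events = res_events;
    }
    return dep_events;
}
```

With one incoming event and a skipped rotation, `deps_before_fix` returns an empty vector, so the following kernels would have nothing to wait on in an out-of-order queue. `deps_after_fix` returns the original incoming event, so the dependency on the preceding work is preserved.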