Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

os: fix amp hang issue #6623

Merged
merged 2 commits into from
Jan 13, 2025
Merged

os: fix amp hang issue #6623

merged 2 commits into from
Jan 13, 2025

Conversation

gSahitya-samsung
Copy link
Contributor

To pause the cpu, cpu should call up_cpu_paused. The call to up_cpu_paused will get missed when g_cpu_irqlock is unlocked (means none of the cpu having access to irqlock), arm_pause_handler rely on enter_critical_section to pause the cpu, it must handle up_cpu_pausereq without any fail, otherwise other cpu will wait forever in spinlock.

Change while loop to do-while loop in irq_waitlock, which will ensure there is no pending up_cpu_pausereq before getting lock on g_cpu_irqlock otherwise pause request will get missed.

The SP_DMB is needed to ensure both cpu read correct spinlock value.

The spin_lock(&g_cpu_resumed[cpu]) is moved upward in code to avoid race condition between cpus.

@lhenry-realtek
Copy link
Contributor

Hi @gSahitya-samsung since your commit includes my change of including DMB in arm_pause_handler, is #6619 still required?

@sunghan-chang
Copy link
Contributor

Hi @gSahitya-samsung since your commit includes my change of including DMB in arm_pause_handler, is #6619 still required?

@gSahitya-samsung Let's split commits for @lhenry-realtek 's change. Could you cherry pick his commit here?

@gSahitya-samsung gSahitya-samsung changed the title os: fix amp hang issue [Do Not Merge] [Under Verification] os: fix amp hang issue Jan 9, 2025
@ewoodev
Copy link
Contributor

ewoodev commented Jan 9, 2025

As you know, I think, The rootcause is missed enter_critical_section in up_schedule_sigaction(). #6624

Could you please update if below commit is not needed, Please update it
os: fix amp hang issue

@gSahitya-samsung
Copy link
Contributor Author

As you know, I think, The rootcause is missed enter_critical_section in up_schedule_sigaction(). #6624

Could you please update if below commit is not needed, Please update it os: fix amp hang issue

Hi @ewoodev, this commit is required, the irq_waitlock should return false if there is pending pause request to cpu.

The enter_critical_section in up_schedule_sigaction just preventing the above case to occur.

@gSahitya-samsung gSahitya-samsung force-pushed the smp_hang branch 3 times, most recently from 7360022 to be35154 Compare January 9, 2025 11:23
@gSahitya-samsung gSahitya-samsung changed the title [Do Not Merge] [Under Verification] os: fix amp hang issue os: fix amp hang issue Jan 9, 2025
r-prabu
r-prabu previously approved these changes Jan 10, 2025
sunghan-chang
sunghan-chang previously approved these changes Jan 10, 2025
@@ -277,7 +278,7 @@ int arm_pause_handler(int irq, void *context, void *arg)
* interrupt by calling up_cpu_paused(). If the pause event has already
* been processed then g_cpu_paused[cpu] will not be locked.
*/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove this unnecessary change.

lhenry-realtek and others added 2 commits January 10, 2025 09:40
…spinlock variable is observed in ISR

This commit addresses a condition where the spinlock variable becomes un-synced between cores due to out-of-order memory access.

DMB is added to ensure the latest copy of the lock is retrieved before operating on it
To pause the cpu, cpu should call `up_cpu_paused`. The call to `up_cpu_paused` will get missed when `g_cpu_irqlock` is unlocked (means none of the cpu having access to irqlock), `arm_pause_handler` rely on `enter_critical_section` to pause the cpu, it must handle `up_cpu_pausereq` without any fail, otherwise other cpu will wait forever in spinlock.

Change `while` loop to `do-while` loop in `irq_waitlock`, which will ensure there is no pending `up_cpu_pausereq` before getting lock on `g_cpu_irqlock` otherwise pause request will get missed.

The `SP_DMB` is needed to ensure both cpu read correct spinlock value.

The `spin_lock(&g_cpu_resumed[cpu])` is moved upward in code to avoid race condition between cpus.
@kishore-sn kishore-sn merged commit 2f11cb5 into Samsung:master Jan 13, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants