Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

llext: add support for PTL DRAM detached sections with MMU enabled #9893

Merged
merged 3 commits into from
Mar 25, 2025

Conversation

abonislawski
Copy link
Member

@abonislawski abonislawski commented Mar 12, 2025

Set MMU permissions for detached sections
+
Enable CONFIG_COLD_STORE_EXECUTE_DRAM on PTL

Required patches:

Copy link
Member

@lgirdwood lgirdwood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question if map would ever fail.

@@ -71,6 +72,32 @@ static int llext_manager_align_unmap(void __sparse_cache *vma, size_t size)
return sys_mm_drv_unmap_region(aligned_vma, ALIGN_UP(pre_pad_size + size, PAGE_SZ));
}

static void llext_manager_detached_map(void __sparse_cache *vma, size_t size, uint32_t flags)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this ever fail ? do we need to return an int ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arch_mem_map returns void so it cant fail in a way we could handle, inside there is only one check for size but it will lead to panic anyway

if (size == 0) {
		LOG_ERR("Cannot map physical memory at 0x%08X: invalid "
			"zero size", (uint32_t)phys);
		k_panic();
	}

/* Use cached virtual address */
uintptr_t va = POINTER_TO_UINT(sys_cache_cached_ptr_get(aligned_vma));

arch_mem_map(aligned_vma, va, ALIGN_UP(pre_pad_size + size, PAGE_SZ), flags);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle you don't have to page-align the size, arch_mem_map() does it already, but doesn't hurt too much

void __sparse_cache *vma,
size_t size)
{
unsigned int i;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's at least place this whole function under #if CONFIG_MMU. And I'm also considering making this simpler and faster by reducing the generality and making it more restricted to our use-case: only handle 2 detached sections - executable .cold and data .coldrodata. I'd add two fields to struct lib_manager_module to identify those two sections, e.g. store their indices. Like in llext_manager_load_data_from_storage() check shdr.sh_flags to see if it's code or data and assign respective index. Then here just do

llext_get_section_info(ldr, ext, mctx->cold_idx, &shdr, &s_region, &s_offset);
llext_get_region_info(ldr, ext, s_region, NULL, &region_addr, NULL);
llext_manager_detached_unmap(...);

and similar for mctx->coldrodata_idx. Yes, you'd need to pass mctx to this function too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my plan at first, but then I realized we can simply check sections the same way we do at the beginning and make it much more generic: whatever we find, we will also find later to unmap.

  • added ifdef for MMU

@abonislawski abonislawski force-pushed the ptl_dram_llext branch 2 times, most recently from d7f87aa to b0e601e Compare March 17, 2025 15:46
@abonislawski
Copy link
Member Author

Rebased (all required PRs have been merged).

@abonislawski abonislawski force-pushed the ptl_dram_llext branch 3 times, most recently from 432d36e to d4fdfda Compare March 19, 2025 11:38
@lyakh
Copy link
Collaborator

lyakh commented Mar 20, 2025

no CI PTL results... let's re-test

@lyakh
Copy link
Collaborator

lyakh commented Mar 20, 2025

SOFCI TEST

if (s_region != region)
continue;

/* unmap detached sections (will be outside requested VMA area) */
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you have to update this PR, better replace "unmap" with "restore default flags"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets do this incrementally - CI passing

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spoke too soon - I reloaded and can see a CI timeout on some tests on PTL. Will rerun to be sure.

Copy link
Collaborator

@lyakh lyakh Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lgirdwood those IPC timeouts on PTL are also present in "main," seem to be caused by xruns. So, unrelated to this PR, but have to be fixed
UPDATE: sorry, I might have been wrong, upon closer examination these failures do look different

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was unstable with many timeouts in different tests each run, with reverted 'ipc: move most functions to run from DRAM' its now stable in linux CI, lets firstly merge base PTL DRAM enablement and later stabilize IPC

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abonislawski ok, so you're referring to jenkins failures, not to your local or "Intel Internal" tests. One of the failing jenkins runs was https://sof-ci.01.org/sofpr/PR9893/build11697/devicetest/index.html. In it we see the firmware running into an error:

[00:00:02.029,105] <err> ipc: ipc_comp_free: ipc_comp_free(): comp id: 0x3 state is 5 cannot be freed
[00:00:02.029,115] <err> ipc: ipc_pipeline_free: ipc_pipeline_free(): module free () failed
[00:00:02.029,128] <err> ipc: ipc_cmd: ipc4: FW_GEN_MSG failed with err 12

So that should be debuggable?

Copy link
Member Author

@abonislawski abonislawski Mar 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, only linux CI, not a problem for another one in both configurations. This specific fail with pass in most runs and I really dont want to debug all of this, we need DRAM for modules on platforms with DRAM functionality

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think reverting the DRAM IPC commit only hides the problem. I'd propose a test: update to the Zephyr version as in the PR (use the first commit as is), disable LLEXT and enable cold storage on PTL and run that through the CI. I expect that to pass. Then enable modules but remove all __cold attributes from them - that should pass too. So I suspect only when you enable __cold code in LLEXT modules, then your flag updating code breaks it. So maybe we still have some page overlaps? You're updating flags for sections, that shouldn't be touched?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately this scenario with __cold removed in src/audio also failed

@lgirdwood
Copy link
Member

SOFCI TEST

@lgirdwood
Copy link
Member

No results for PTL - retest

@lgirdwood
Copy link
Member

SOFCI TEST

1 similar comment
@abonislawski
Copy link
Member Author

SOFCI TEST

lyakh
lyakh previously requested changes Mar 24, 2025
Copy link
Collaborator

@lyakh lyakh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This revert wasn't here when I was approving this PR IIRC. Let's check why it's needed now

This patch will set permissions for detached sections
if MMU is enabled

Signed-off-by: Adrian Bonislawski <[email protected]>
Enable CONFIG_COLD_STORE_EXECUTE_DRAM for ACE 3.0

Signed-off-by: Adrian Bonislawski <[email protected]>
This reverts commit 0e53393.

Signed-off-by: Adrian Bonislawski <[email protected]>
@lyakh
Copy link
Collaborator

lyakh commented Mar 25, 2025

let's keep this test result - it was a rare all green PTL test https://sof-ci.01.org/sofpr/PR9893/build11747/devicetest/index.html

@abonislawski
Copy link
Member Author

SOFCI TEST

@lgirdwood
Copy link
Member

Test results are good.

@lyakh lyakh dismissed their stale review March 25, 2025 14:28

schedule to re-apply later

@lgirdwood lgirdwood merged commit 2b4d0ea into thesofproject:main Mar 25, 2025
44 of 49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants