-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Need a hook to limit scanning of the entire PCIe space for pre-silicon effectiveness #418
Comments
Hi @cdwilde, For running ACS at Pre-silicon, we have optimisations to ensure the user can change the values of the maximum bus, device and function according to their requirement.
Please let us know if you need more information. Thanks, |
This is insufficient. We may have a space where the Buses are not consecutive (Think 31, 63, 91, ... up to 255). Setting the MAX BUS is insufficient as we would need the entire range of 255, but only every 32 or so. The way you have this implemented requires that you go through the entire bus range for only a handful of valid buses. What I am proposing will allow the user to have control instead of relying on max values. In this case, we might say in a pal function: |
Hi @cdwilde, In scenarios where the buses are not consecutive, you can modify the number of ECAM to more than 1, with the same ECAM base address and make the corresponding changes of start and end bus. Please find the reference changes at https://github.com/ARM-software/bsa-acs/blob/main/pal/baremetal/target/RDN2/common/include/platform_override_fvp.h#L269. The corresponding change to also be done at https://github.com/ARM-software/bsa-acs/blob/main/pal/baremetal/target/RDN2/common/src/platform_cfg_fvp.c#L256 and https://github.com/ARM-software/bsa-acs/blob/main/pal/baremetal/target/RDN2/common/src/platform_cfg_fvp.c#L273. This will ensure reduced time for non-consecutive bus ranges. In addition, the tests mentioned
Please let us know if this helps. Thanks, |
Thanks for this idea. I am out of the office until mid next week but will give it a try when I get back. |
I'm back. The other issue I run into is that modifying MAX_FUNC and MAX_DEV causes address calculation problems since these values are used all over the place for address calculation. For example, in acs_pcie.c: I would need a way to tell the system that MAX_BUS, MAX_DEV, and MAX_FUNC are the full values for address calculations. I would then need a way to divide up the space to only iterate over certain addresses. I don't see a way to do this with the current implementation. Am I missing something here? |
Hi @cdwilde, Hoping that the scenario of enumeration for non-consecutive bus is taken care by modifying the number of ECAM to more than 1 as mentioned in the previous comment. This modification would have reduced the enumeration/BDF scan time considerably. For PCIE_MAX_DEV and PCIE_MAX_FUNC, These macros used in the tests, apart from configuration address calculation, are part of the test verification, where the entire Device and function space need to be scanned. Request you to please let us know how much time is it taking for scanning upto PCIE_MAX_DEV and PCIE_MAX_FUNC, given that the non-consecutive bus ranges are taken care. If the BSA test "test_pool/pcie/operating_system/test_os_p002.c" and SBSA test "test_pool/pcie/operating_system/test_p003.c" are taking long time, [as the entire BDF range to be tested for complete coverage], these tests can be commented out and run, if the tests have passed in the first run. Thanks, |
When I went to modify PCIE_MAX_DEV and PCIE_MAX_FUNC, my address calculations started failing all over the place. I can try to reduce only the buses and see if that runs in a reasonable time. |
In pre-silicon, the S/BSA setup and tests loop over all Bus devices and functions in multiple parts of the code:
BSA:
test_pool/exerciser/operating_system/test_os_e015.c
test_pool/pcie/operating_system/test_os_p002.c
test_pool/pcie/operating_system/test_os_p004.c
test_pool/pcie/operating_system/test_os_p005.c
val/common/src/acs_pcie.c
val/common/src/acs_peripherals.c
SBSA:
test_pool/pcie/operating_system/test_p003.c
test_pool/pmu/operating_system/test_pmu007.c
This is extremely costly in a pre-silicon environment. Changing the upper MAX value for each bus, device, and function and not always helpful either, especially if the active devices are non-contiguous. I would like to ask that we add a PAL hook to allow the user to cut down this space.
As an example (for acs_pcie.c):
/* Derive ecam specific information */
seg_num = (uint32_t)val_pcie_get_info(PCIE_INFO_SEGMENT, ecam_index);
start_bus = (uint32_t)val_pcie_get_info(PCIE_INFO_START_BUS, ecam_index);
end_bus = (uint32_t)val_pcie_get_info(PCIE_INFO_END_BUS, ecam_index);
Add these lines here to allow the user to skip the rest of this loop iteration:
/* Skip non-sequential BDFs to save time */
if (pal_pcie_check_device_valid(bdf))
continue;
Hopefully, pal_pcie_check_device_valid is a good function to use. As far as I can tell, it is only called in acs_pcie.c and that later call would no longer be needed. I would put buses, devices, and functions that I want skipped in this function and it does the job of not wasting inordinate amounts of time in pre-silicon.
The text was updated successfully, but these errors were encountered: