-
Notifications
You must be signed in to change notification settings - Fork 202
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: Update core assignment algorithm in benchexec/resources.py #892
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a preliminary review with some hints, mostly on code style and documentation. Please fix these issues and provide documentation, such that a full review becomes possible and makes sense.
As part of this (as first step actually), please format the source code with the formatter black and make sure to use it in the future for each commit. Consistent code formatting is a big help for readability.
And please also have a look at all the other CI failures. After each commit these checks will run automatically, so always check whether CI is green for your most recent commit. The check check-format
is fixed automatically if you use the code formatter, and reuse
only complains about the missing copyright header. But flake8
and pytype
provide hints about potential errors in your code. And of course the unit-tests
checks just execute our tests.
This method adds a "root" hierarchy level, if the system topology doesnt have one. Necessary for iterating through the whole topology.
now chooses the right starting core for the next thread
- Function name starting with "read" to indicate it reads from kernel. - Parameters in better order. - Identifier naming according to Python standard. - Actually use generic identifiers in a generic function and not names that are specific to one use case. - Also replace all trivial callers with a single function.
- Crucial constants should be present only once, documented, and defined in a central place. - Reading from the system and logic should be separate such that the latter is testable. - For reading from the system we can use an existing helper method. - Add tests.
Uses of plain dicts may catch errors in callers earlier. Furthermore, some of the functions even returned a defaultdict in some cases and a plain dict in other cases. The return type should be consistent. With dict.setdefault() the use of a plain dict is almost as convenient as a defaultdict.
…lity It is not really necessary.
We always want the user to allow us to use entire physical cores. This check was broken, because forbidden sibling cores were already removed from the data structure before the check. Furthermore, cores forbidden via cgroups and via the --allowedCores parameter were treated somehow differently, but the effect should be exactly the same.
So far we read the information about the hyperthreading hierarchy level differently from the other levels. This made the code more difficult to understand, and the way how the ids in the hierarchy_levels[0] dict were chosen differed from the other levels. But we can also read this information in the same way as for the other levels, so let's do this. We still also need to use the previous way of reading all siblings from a given list of cores, but we can also simplify that and the separation of concerns still provides an understandability benefit.
The allocation algorithm already supports an arbitrary number of levels, so we can future proof the allocation and read all information about cache levels that the kernel provides. We can also use the assumption that caches are named the same across all cores, and read the cache names only once instead of separately for every core.
This method actually has nothing to do with "sub" units (children), it just takes a set of cores and a level and groups the cores as appropriate for the level. So the names should reflect that.
To allow easier generation of new tests (where we ideally can automatically generate tests for a large number of (also weird) CPU configurations), it's desireable to be able to specify arbitrary layers more or less directly, without having to create new test classes. The final goal is to be able to generate machine configurations given a single argument (or two arguments: layer configuration as a list and total core count), so we can utilize pytest to write easily maintainable test cases.
As the new method already keeps the layers in the correct order and doesn't create duplicate layers, we can remove those parts from the code
After showing that the new layer generation code is equivalent to the existing one, we can remove the old code with the hardcoded layers and use the new code.
To be able to remove the fixed functions (and replace them with one which takes arbitrary numbers of cores), all calls to the existing methods are redirected to a new method. This removes all direct references to mainAssertValid, allowing us to refactor the code without having to modify each test at the same time.
Currently, there are two different kinds of assert statements in the code - assertValid, which is quite straightforward, and mainAssertValid, which performs additional checks. As the additional checks in mainAssertValid only further constrain when a test passes the assertion, we safely can change the existing calls to assertValid.
This reverts commit dd06fcb.
…rRun This also adds the required values which were inherited from TestCpuCoresPerRun_singleCPU With this, all classes are derivated from TestCpuCoresPerRun, allowing us to use decorators more easily later
Currently, all tests are limited to 1-4, 8 CPUs used with adding test cases taking a bit of work. This decorator dynamically adds a test case to test if the actual assignment matches the expected assignment to a test class. It doesn't use the fixed versions of xxxCoresPerRun, which allows us to remove them later entirely.
Note that one test is commented out, as the decorator unexpectedly fails where the actual test succeeds for at this time unknown reasons. This commit adds redundant tests - this is temporary, as commits in the near future commits will remove the old testing logic, which the decorator replaces.
In the existing testsuite, an overwritten test_fourCoresPerRun method actually used the values of the 3 cores version.
In this case, the preexisting testsuite had a bug, which explains why the results differed from the new decorator based one
As the decorator based tests are identical with the existing code and also all pass, we can remove the old testing logic
Referring to issue #748, the core assignment can now handle additional hierarchy layers (such as a shared L3 cache).
The addition of further layers can be implemented without knowing the exact topology of a machine - the hierarchy of the layers (CPUs, NUMA nodes, L3 caches, hyperthreading, etc) is determined by the algorithm.
Fixes #748
Fixes #850
Kernel documentation: