Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft: Update core assignment algorithm in benchexec/resources.py #892

Draft
wants to merge 93 commits into
base: main
Choose a base branch
from

Conversation

CGall42
Copy link

@CGall42 CGall42 commented Jan 19, 2023

Referring to issue #748, the core assignment can now handle additional hierarchy layers (such as a shared L3 cache).
The addition of further layers can be implemented without knowing the exact topology of a machine - the hierarchy of the layers (CPUs, NUMA nodes, L3 caches, hyperthreading, etc) is determined by the algorithm.

Fixes #748
Fixes #850

@CGall42 CGall42 added the resource allocation related to allocation of resources like CPU cores and memory label Jan 19, 2023
@CGall42 CGall42 self-assigned this Jan 19, 2023
@CGall42 CGall42 marked this pull request as draft January 19, 2023 21:39
Copy link
Member

@PhilippWendler PhilippWendler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a preliminary review with some hints, mostly on code style and documentation. Please fix these issues and provide documentation, such that a full review becomes possible and makes sense.

As part of this (as first step actually), please format the source code with the formatter black and make sure to use it in the future for each commit. Consistent code formatting is a big help for readability.

And please also have a look at all the other CI failures. After each commit these checks will run automatically, so always check whether CI is green for your most recent commit. The check check-format is fixed automatically if you use the code formatter, and reuse only complains about the missing copyright header. But flake8 and pytype provide hints about potential errors in your code. And of course the unit-tests checks just execute our tests.

benchexec/resources.py Outdated Show resolved Hide resolved
benchexec/resources.py Outdated Show resolved Hide resolved
benchexec/resources.py Outdated Show resolved Hide resolved
benchexec/resources.py Outdated Show resolved Hide resolved
benchexec/resources.py Outdated Show resolved Hide resolved
benchexec/resources.py Outdated Show resolved Hide resolved
benchexec/resources.py Outdated Show resolved Hide resolved
benchexec/resources.py Show resolved Hide resolved
benchexec/resources.py Outdated Show resolved Hide resolved
benchexec/resources.py Outdated Show resolved Hide resolved
Also turn value checks into asserts,
as this is an internal method.

This fixes complaints from Ruff.
This helps us to be more convinced that the data structures
created by the unit tests are the same as in real use.
This is redundant because hierarchy_levels[0] is the same
and there is only effort to keep the two in sync.
- First create the final list of hierarchy levels, then create the
  VirtualCore objects.
  Previously one level was potentially created afterwards and needed
  adjustments to the VirtualCore objects.

- Always insert root level and rely on the filter that we use anyway
  to remove it if necessary. This is the same strategy that we use for
  the siblings_of_core level and thus easier to follow.
All the information in these instances is also present in
hierarchy_levels, it is just a more convenient way to access it.
Creating it locally makes it easier to write tests.
Because the tests were changed significantly together with the
implementation of the core allocation in this branch,
it is hard to see whether the tests still test the same things.
But we want regression tests that ensure that the new allocation
behaves in the same way.
So we restore the old tests and keep the new tests in a separate file
until we are sure that there are no regressions,
and then we can unify the test suites.

For now the old tests fail due to API changes,
this will be adapted next.
This makes most of the tests succeed, but some still fail.
Potentially some further adjustments are needed,
but there is also one regression
(https://github.com/sosy-lab/benchexec/pull/892/files#r1229136970).
The new get_cpu_distribution method has no information about
partial physical cores anymore,
this is checked outside of it.
It is better to have CI green to be able to notice further regressions.
Also do not use "raise Exception", use assert to encode coding
assumptions.
- Function name starting with "read" to indicate it reads from kernel.
- Parameters in better order.
- Identifier naming according to Python standard.
- Actually use generic identifiers in a generic function
  and not names that are specific to one use case.
- Also replace all trivial callers with a single function.
- Crucial constants should be present only once, documented,
  and defined in a central place.
- Reading from the system and logic should be separate
  such that the latter is testable.
- For reading from the system we can use an existing helper method.
- Add tests.
Uses of plain dicts may catch errors in callers earlier.
Furthermore, some of the functions even returned a defaultdict
in some cases and a plain dict in other cases.
The return type should be consistent.

With dict.setdefault() the use of a plain dict
is almost as convenient as a defaultdict.
We always want the user to allow us to use entire physical cores.
This check was broken, because forbidden sibling cores
were already removed from the data structure before the check.
Furthermore, cores forbidden via cgroups and via the
--allowedCores parameter were treated somehow differently,
but the effect should be exactly the same.
So far we read the information about the hyperthreading hierarchy level
differently from the other levels.
This made the code more difficult to understand,
and the way how the ids in the hierarchy_levels[0] dict were chosen
differed from the other levels.
But we can also read this information in the same way as for the other
levels, so let's do this.

We still also need to use the previous way of reading all siblings from
a given list of cores, but we can also simplify that
and the separation of concerns still provides
an understandability benefit.
The allocation algorithm already supports an arbitrary number of levels,
so we can future proof the allocation
and read all information about cache levels
that the kernel provides.

We can also use the assumption that caches are named the same
across all cores, and read the cache names only once
instead of separately for every core.
This method actually has nothing to do with "sub" units (children),
it just takes a set of cores and a level and groups the cores
as appropriate for the level.
So the names should reflect that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
resource allocation related to allocation of resources like CPU cores and memory
2 participants