From 8fe6eee38587c4d4b1a095cb3635bde9573bbe50 Mon Sep 17 00:00:00 2001
From: Kailun Qin
Date: Fri, 6 Aug 2021 07:55:39 -0400
Subject: [PATCH] config-linux: add systemd cgroup path convention

The systemd cgroup path convention currently implemented in runtimes
like `runc/crun` should be added to the spec. For more information,
please kindly refer to e.g. runc systemd cgroup driver doc:
https://github.com/opencontainers/runc/blame/main/docs/systemd.md.

This patch adds the systemd cgroup convention for `Linux.CgroupsPath`
which is in the `slice:prefix:name` form and clarifies the detailed
usage.

Fixes https://github.com/opencontainers/runtime-spec/issues/1021

Signed-off-by: Kailun Qin
---
 config-linux.md    | 21 ++++++++++++++++++++-
 specs-go/config.go |  2 ++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/config-linux.md b/config-linux.md
index 178361f34..41f8446f5 100644
--- a/config-linux.md
+++ b/config-linux.md
@@ -186,11 +186,30 @@ containers.
 **`cgroupsPath`** (string, OPTIONAL) path to the cgroups.
 It can be used to either control the cgroups hierarchy for containers or to run a new process in an existing container.
 
-The value of `cgroupsPath` MUST be either an absolute path or a relative path.
+If the runtime manages cgroups on its own (i.e. works with cgroupfs directly), the value of `cgroupsPath` MUST be either an absolute path or a relative path.
 
 * In the case of an absolute path (starting with `/`), the runtime MUST take the path to be relative to the cgroups mount point.
 * In the case of a relative path (not starting with `/`), the runtime MAY interpret the path relative to a runtime-determined location in the cgroups hierarchy.
 
+If the runtime manages cgroups indirectly, via systemd, the value of `cgroupsPath` MUST be in the "slice:prefix:name" form (e.g. "system.slice:runtime:434234").
+This form names the transient systemd unit to create for the container and the slice that hosts it, so the systemd units map directly to objects in the cgroup tree.
+When these units are activated, they map directly to cgroup paths built from the unit names.
+
+This form specifies the following systemd cgroup properties, all of which are optional:
+
+* `slice` - name of the parent slice systemd unit, under which the container is placed.
+  Note that `slice` can contain dashes to denote a sub-slice (e.g. `user-1000.slice` is a correct
+  notation, meaning a sub-slice of `user.slice`), but it must not contain slashes (e.g.
+  `user.slice/user-1000.slice` is invalid). A `slice` of `-` represents the root slice.
+  If not specified, it can default to:
+  `system.slice` - the default place for all system services;
+  `user.slice` - the default place for all user sessions, used for cgroup v2 and rootless containers.
+* `prefix` - prefix of the scope systemd unit to create for the container.
+* `name` - name of the systemd unit to create.
+  When `name` has a `.slice` suffix, the unit being created is a slice; in that case `prefix` is
+  ignored and `name` is used as is. Otherwise, `prefix` and `name` are used to
+  compose the scope unit name, which is `<prefix>-<name>.scope`.
+
 If the value is specified, the runtime MUST consistently attach to the same place in the cgroups hierarchy given the same value of `cgroupsPath`.
 If the value is not specified, the runtime MAY define the default cgroups path.
 Runtimes MAY consider certain `cgroupsPath` values to be invalid, and MUST generate an error if this is the case.
diff --git a/specs-go/config.go b/specs-go/config.go
index 7e9122103..726c57ba2 100644
--- a/specs-go/config.go
+++ b/specs-go/config.go
@@ -171,6 +171,8 @@ type Linux struct {
 	// CgroupsPath specifies the path to cgroups that are created and/or joined by the container.
 	// The path is expected to be relative to the cgroups mountpoint.
 	// If resources are specified, the cgroups at CgroupsPath will be updated based on resources.
+	// If the systemd cgroup driver is used to create cgroups and set cgroup limits, the path must be
+	// in the "slice:prefix:name" form (e.g. "system.slice:runtime:434234").
 	CgroupsPath string `json:"cgroupsPath,omitempty"`
 	// Namespaces contains the namespaces that are created and/or joined by the container
 	Namespaces []LinuxNamespace `json:"namespaces,omitempty"`
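
Illustration only, not part of the patch: the "slice:prefix:name" convention described above could be handled along the following lines. This is a minimal sketch, not the code used by runc or crun. The type `systemdCgroup`, the helpers `parseSystemdCgroupsPath` and `unitName`, and the caller-supplied `defaultSlice` argument are all hypothetical names, and the sketch does not decide how an empty `prefix` or `name` should be treated.

```go
package main

import (
	"fmt"
	"strings"
)

// systemdCgroup holds the three fields of a "slice:prefix:name" cgroupsPath.
// Hypothetical type for illustration; it is not part of specs-go.
type systemdCgroup struct {
	Slice, Prefix, Name string
}

// parseSystemdCgroupsPath splits a cgroupsPath in "slice:prefix:name" form.
// An empty slice falls back to defaultSlice, which the caller chooses
// (for example "system.slice" or "user.slice"). This is an assumption of
// the sketch, not a rule taken from the spec text.
func parseSystemdCgroupsPath(path, defaultSlice string) (systemdCgroup, error) {
	parts := strings.Split(path, ":")
	if len(parts) != 3 {
		return systemdCgroup{}, fmt.Errorf("expected slice:prefix:name, got %q", path)
	}
	c := systemdCgroup{Slice: parts[0], Prefix: parts[1], Name: parts[2]}
	if c.Slice == "" {
		c.Slice = defaultSlice
	}
	// Sub-slices are written with dashes; slashes are invalid in a slice name.
	if strings.Contains(c.Slice, "/") {
		return systemdCgroup{}, fmt.Errorf("slice %q must not contain slashes", c.Slice)
	}
	return c, nil
}

// unitName returns the systemd unit name for the container: the name as is
// when it ends in ".slice", otherwise "<prefix>-<name>.scope".
func (c systemdCgroup) unitName() string {
	if strings.HasSuffix(c.Name, ".slice") {
		return c.Name
	}
	return c.Prefix + "-" + c.Name + ".scope"
}

func main() {
	c, err := parseSystemdCgroupsPath("system.slice:runtime:434234", "system.slice")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.Slice, c.unitName()) // prints "system.slice runtime-434234.scope"
}
```

Rejecting any value that does not split into exactly three fields is one way to satisfy the rule that runtimes MUST generate an error for `cgroupsPath` values they consider invalid.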