Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TOB-K8S-004: Pervasive world-accessible file permissions #81116

Open
cji opened this issue Aug 8, 2019 · 36 comments
Open

TOB-K8S-004: Pervasive world-accessible file permissions #81116

cji opened this issue Aug 8, 2019 · 36 comments
Assignees
Labels
area/security good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/security-audit Categorizes an issue or PR as relevant to WG Security Audit.

Comments

@cji
Copy link
Member

cji commented Aug 8, 2019

This issue was reported in the Kubernetes Security Audit Report

Description
Kubernetes uses files and directories to store information ranging from key-value data to certificate data to logs. However, a number of locations have world-writable directories:

cluster/images/etcd/migrate/rollback_v2.go:110:	if err :=
os.MkdirAll(path.Join(migrateDatadir, "member", "snap"), 0777); err != nil {
cluster/images/etcd/migrate/data_dir.go:49:		err := os.MkdirAll(path, 0777)
cluster/images/etcd/migrate/data_dir.go:87:	err = os.MkdirAll(backupDir, 0777)
third_party/forked/godep/save.go:472:	err := os.MkdirAll(filepath.Dir(dst), 0777)
third_party/forked/godep/save.go:585:	err := os.MkdirAll(filepath.Dir(name), 0777)
pkg/volume/azure_file/azure_util.go:34:	defaultFileMode = "0777"
pkg/volume/azure_file/azure_util.go:35:	defaultDirMode  = "0777"
pkg/volume/emptydir/empty_dir.go:41:const perm os.FileMode = 0777

Figure 7.1: World-writable (0777) directories and defaults

Other areas of the system use world-writable files as well:

cluster/images/etcd/migrate/data_dir.go:147:	return ioutil.WriteFile(v.path, data, 0666)
cluster/images/etcd/migrate/migrator.go:120:	err := os.Mkdir(backupDir, 0666)
third_party/forked/godep/save.go:589:	return ioutil.WriteFile(name, []byte(body), 0666)
pkg/kubelet/kuberuntime/kuberuntime_container.go:306:			if err := m.osInterface.Chmod(containerLogPath, 0666); err != nil {
pkg/volume/cinder/cinder_util.go:271:			ioutil.WriteFile(name, data, 0666)
pkg/volume/fc/fc_util.go:118:	io.WriteFile(fileName, data, 0666)
pkg/volume/fc/fc_util.go:128:			io.WriteFile(name, data, 0666)
pkg/volume/azure_dd/azure_common_linux.go:77:			if err = io.WriteFile(name, data, 0666); err != nil {
pkg/volume/photon_pd/photon_util.go:55:	ioutil.WriteFile(fileName, data, 0666)
pkg/volume/photon_pd/photon_util.go:65:			ioutil.WriteFile(name, data, 0666)

Figure 7.2: World-writable (0666) files

A number of locations in the code base also rely on world-readable directories and files. For example, Certificate Signing Requests (CSRs) are written to a directory with mode 0755 (world readable and browseable) with the actual CSR having mode 0644 (world-readable):

// WriteCSR writes the pem-encoded CSR data to csrPath.
// The CSR file will be created with file mode 0644.
// If the CSR file already exists, it will be overwritten.
// The parent directory of the csrPath will be created as needed with file mode 0755.
func WriteCSR(csrDir, name string, csr *x509.CertificateRequest) error {
    ... 
    if err := os.MkdirAll(filepath.Dir(csrPath), os.FileMode(0755)); err != nil {
        ...
    }   

    if err := ioutil.WriteFile(csrPath, EncodeCSRPEM(csr), os.FileMode(0644)); err != nil {
       ... 
    }   
    ...
}

Figure 7.3: Documentation and code from cmd/kubeadm/app/util/pkiutil/pki_helpers.go

Exploit Scenario
Alice wishes to migrate some etcd values during normal cluster maintenance. Eve has local access to the cluster’s filesystem, and modifies the values stored during the migration process, granting Eve further access to the cluster as a whole.

Recommendation
Short term, audit all locations that use world-accessible permissions. Revoke those that are unnecessary. Very few files truly need to be readable by any user on a system. Almost none should need to allow arbitrary system users write access.

Long term, use system groups and extended Access Control Lists (ACLs) to ensure that all files and directories created by Kuberenetes are accessible by only those users and groups that should be able to access them. This will ensure that only the appropriate users with the correct Unix-level groups may access data. Kubernetes may describe what these groups should be, or create a role-based system to which administrators may assign users and groups.

Anything else we need to know?:

See #81146 for current status of all issues created from these findings.

The vendor gave this issue an ID of TOB-K8S-004 and it was finding 8 of the report.

The vendor considers this issue Medium Severity.

To view the original finding, begin on page 32 of the Kubernetes Security Review Report

Environment:

  • Kubernetes version: 1.13.4
@cji cji added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 8, 2019
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 8, 2019
@neolit123
Copy link
Member

the kubeadm parts of this finding are tracked at:
kubernetes/kubeadm#1716

@joelsmith
Copy link
Contributor

/remove-kind feature
/kind bug

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. and removed kind/feature Categorizes issue or PR as related to a new feature. labels Aug 8, 2019
@joelsmith
Copy link
Contributor

/sig storage
/sig node
/area security

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/node Categorizes an issue or PR as relevant to SIG Node. area/security and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 8, 2019
@liggitt liggitt added the wg/security-audit Categorizes an issue or PR as relevant to WG Security Audit. label Aug 8, 2019
@liggitt liggitt changed the title Pervasive world-accessible file permissions TOB-K8S-004: Pervasive world-accessible file permissions Aug 8, 2019
@joelsmith
Copy link
Contributor

/priority important-longterm
I reviewed most occurrences. The ones I reviewed fell into 2 categories:

  1. Test code
  2. Files or directories which themselves live in a directory with restricted access

I’m most familiar with the emptydir one (which also applies to secrets and CMs) and that one might be tricky to fix without breaking existing workloads.

Tricky or not, it would be nice if we could excise these from the code for 2 reasons: an extra layer of security (defense in depth), and not showing false positive red flags on audits. On the first point, an extra layer of security via more restrictive DACs may make an attack more difficult when an attacker has broken through one control and is looking to parlay that access into something more. Regarding the second point, when cluster operators do security audits (either with automated tools or manual audits), these will turn up as red flags (even if relatively harmless) and will be annoying for auditors and auditees alike since they’ll have to justify their existence or fail the audit, etc.

@k8s-ci-robot k8s-ci-robot added the priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. label Aug 9, 2019
@anguslees
Copy link
Member

anguslees commented Aug 9, 2019

I want to point out somewhere here also that it is normal (and secure!) Unix practice to (eg) create a regular file with 0644 and then rely on the inherited external umask to apply the desired group/other restrictions. The standard mkdir command, for example, also calls mkdir(path, 0777) and this is desirable and good.

I appreciate that kubelet/containers are a special case here because we're often running in a context with uid=0 and an unclear/incomplete parent process hierarchy. I sort of feel it's incorrect to attempt to anticipate the desired permissions policy within regular tools (eg: etcd migration) however, and an automated report that highlights these examples in the code without mentioning the interaction with umask is a bit unfair.

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 7, 2019
@cji
Copy link
Member Author

cji commented Nov 7, 2019

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Nov 7, 2019
@neolit123
Copy link
Member

the kubeadm parts are resolved kubernetes/kubeadm#1716

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 5, 2020
@fejta-bot
Copy link

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 6, 2020
@joelsmith
Copy link
Contributor

/lifecycle frozen

@mohag
Copy link

mohag commented Jul 19, 2021

Group read permissions might be useful to keep? It allows potential integration to be given read access by adding it to a group. (Especially for log files, where the option to run a collector as a non-root user is useful)

skrobul added a commit to skrobul/kubernetes that referenced this issue Jan 18, 2022
Currently, kubelet creates a world-readable and world-writeable empty files in
`/var/lib/kubelet/pods/{podUID}/containers/busysleep/{containerId}`. These are
meant to be written by the process in containers when container is terminated.

Originally, this file was created with `0644`, then despite security concerns,
it was changed to `0666` in
kubernetes#31839. This was completed to
allow containers running as non-root to write termination messages. Later on, in
2019 this has been highlighted as a security vulnerability in Kubernetes
Security Audir Report in kubernetes#81116.

This commit changes termination log file mode to `0660` which is best of both
worlds - it removes world-writeable file, yet still allows the container's user
to write the termination message.
@mr-anderson86
Copy link

Any updates regarding this?
Is there anyway to change the file permissions there to 0660 or 0644?
(I specifically install k8s using RKE)

@cailynse
Copy link

cailynse commented Sep 6, 2022

/assign

@cailynse
Copy link

cailynse commented Sep 6, 2022

@QiWang19 - Are there any updates on this?

@cailynse
Copy link

cailynse commented Sep 11, 2022

Don't hesitate to request help... I am in EMEA region but asynchronously we should be able to coordinate. Or you can ask on slack sig-node

Hello @matthyx - I'm going to start working on this one file at a time, would you be up for pairing with me on the first one or two? Specifically I'd like to chat about the blockers that impacted the last PR!

I've opened a draft PR here for one of the less risky ones.

@matthyx
Copy link
Contributor

matthyx commented Sep 13, 2022

@cailynse sure let me have a look...

@cailynse
Copy link

cailynse commented Sep 13, 2022

PR file mode location
#112384 0666 staging/src/k8s.io/client-go/tools/clientcmd/loader.go
#114856 0666 pkg/volume/fc/fc_util.go
0666 pkg/volume/cinder/cinder_util.go
0666 pkg/volume/azuredd/azure_common_linux.go
0777 and 0666 cluster/images/etcd/migrate/data_dir.go

@ShivamTyagi12345
Copy link
Member

PR file mode location
#112384 0666 staging/src/k8s.io/client-go/tools/clientcmd/loader.go
0666 pkg/volume/fc/fc_util.go
0666 pkg/volume/cinder/cinder_util.go
0666 pkg/volume/azuredd/azure_common_linux.go
0777 and 0666 cluster/images/etcd/migrate/data_dir.go

Hey @cailynse Is this issue still available to work on?

@cailynse
Copy link

cailynse commented Nov 4, 2022

Hello @ShivamTyagi12345! I am waiting on approval for the first PR and then was going to open the next. But you are welcome to get started on any of the subsequent ones if you would like. You can also find a bunch of issues that still need contributors here in the blog post summarizing the 2019 Third Party Security Audit.

@DeeptanshuDas
Copy link

/assign

@neolit123
Copy link
Member

@DeeptanshuDas i closed your PR
#122040

as the kubeadm parts are already fixed, you can send PRs for other areas that need fixes, though.

@QiWang19 QiWang19 removed their assignment Nov 28, 2023
@matthias-herrmann
Copy link

matthias-herrmann commented Dec 4, 2023

@neolit123 I'd like to contribute to start out with a first issue.

Does it make sense for me to change these to 0644 and assign myself?

pkg/volume/fc/fc_util.go:118:	io.WriteFile(fileName, data, 0666)
pkg/volume/fc/fc_util.go:128: io.WriteFile(name, data, 0666)

io.WriteFile(fileName, data, 0666)

io.WriteFile(name, data, 0666)

@neolit123
Copy link
Member

@neolit123 I'd like to contribute to start out with a first issue.

Does it make sense for me to change these to 0644 and assign myself?

pkg/volume/fc/fc_util.go:118:	io.WriteFile(fileName, data, 0666)
pkg/volume/fc/fc_util.go:128: io.WriteFile(name, data, 0666)

io.WriteFile(fileName, data, 0666)

io.WriteFile(name, data, 0666)

you can try sending a PR. unclear whether the maintainers will accept it.

@walterrowe
Copy link

walterrowe commented Mar 11, 2024

These permissions issues cause worker nodes hosting pods to fail Federal IT Security scans for world writeable files. At NIST we are required to adhere to all of the FIMSA and SP 800-53 controls. This FAILS.

Why can't the system simply honor the umask of the user or system as set by the organization's IT security policies?

Hard coding world write permissions is an absolute worst practice. What is the justification? This deserves its own CVE.

@anguslees
Copy link
Member

anguslees commented Mar 24, 2024

Why can't the system simply honor the umask of the user or system as set by the organization's IT security policies?

I'm confused here too. Afaics the io.WriteFile(...) call should use 0666, and we should turn that into an 0640 (or whatever) by setting umask correctly in the environment.

The original bug - and some of the later comments and proposals - are phrased as if umask doesn't exist. Umask does exist 😅, and controlling permissions in the code itself (ie: pretending umask is 0) is incorrect / wrong / not common unix practice.

Hard coding world write permissions is an absolute worst practice. What is the justification? This deserves its own CVE.

Great example - it's not a CVE because these are not what controls the final file permissions. If io.WriteFile(...., 0666) turns into an actual 0666 file in your environment, then your umask is incorrect (and that might warrant a CVE) - io.WriteFile's 0666 is not the issue.

@mohag
Copy link

mohag commented Mar 24, 2024

Hard coding world write permissions is an absolute worst practice. What is the justification? This deserves its own CVE.

The permissions are not usable at least - the directories along the way tends to be limited enough that the files can't be accessed. (but they do mess with security audits)

@walterrowe
Copy link

walterrowe commented Mar 25, 2024

@mohag @anguslees

The io.WriteFile method appears to be deprecated? Also the documentation says the file permissions are before umask. It isn't clear if umask is applied or not.

https://pkg.go.dev/io/ioutil#WriteFile

@mohag
Copy link

mohag commented Apr 2, 2024

@mohag @anguslees

The io.WriteFile method appears to be deprecated? Also the documentation says the file permissions are before umask. It isn't clear if umask is applied or not.

https://pkg.go.dev/io/ioutil#WriteFile

I'm reading "before umask" as implying that umask will be applied...

io.WriteFile calls os.WriteFile, which calls os.OpenFile, which calls openFileNolog. That seems to eventually call the OS's open() syscall, which applies the umask...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/security good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/storage Categorizes an issue or PR as relevant to SIG Storage. triage/accepted Indicates an issue or PR is ready to be actively worked on. wg/security-audit Categorizes an issue or PR as relevant to WG Security Audit.
Projects