A Storage Vulnerability Deep Dive
Kubernetes security is constantly evolving, keeping pace with enhanced functionality, usability, and flexibility while also balancing the security needs of a wide and diverse set of use cases.
Recently, the GKE Security team discovered a high-severity vulnerability that allowed workloads to access parts of the host filesystem outside the boundaries of their mounted volumes. Although the vulnerability was patched back in September, we thought it would be beneficial to write up a more in-depth analysis of the issue to share with the community.
We assessed the impact of the vulnerability as described in vulnerability management in open-source Kubernetes and worked closely with the GKE Storage team and the Kubernetes Security Response Committee to find a fix. In this post we’ll give some background on how the subpath storage system works, an overview of the vulnerability, the steps taken to find the root cause and the fix, and finally some recommendations for GKE and Anthos users.
Kubernetes Filesystems: Intro to Volume Subpath
The vulnerability, CVE-2021-25741, was caused by a race condition during the creation of a subpath bind mount inside a container, and allowed an attacker to gain unauthorized access to the underlying node filesystem and its sensitive files. We’ll describe how that system is supposed to work, and then talk about the vulnerability.
The volume subpath feature in Kubernetes enables sharing a volume among multiple containers inside a Pod. For example, we could create a Pod with an InitContainer that creates directories with pre-populated data in a mounted filesystem volume. These directories can then be used by containers in the same Pod by mounting the same volume and optionally specifying a subpath field to limit what’s visible inside the container.
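To make the InitContainer example concrete, here is a minimal sketch of such a Pod, written as a Python dictionary that serializes to a JSON manifest. The field names (`initContainers`, `volumeMounts`, `subPath`, `emptyDir`) are standard Pod spec fields; the Pod, container, and volume names, the `busybox` image, and the `app-data` directory are illustrative assumptions, not taken from the original post.

```python
import json


def build_subpath_pod() -> dict:
    """Return a Pod manifest where an init container prepares a directory
    and the main container mounts only that directory via subPath.

    All names and the image are hypothetical, chosen for illustration.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "subpath-demo"},
        "spec": {
            # One volume shared by both containers.
            "volumes": [{"name": "shared-data", "emptyDir": {}}],
            "initContainers": [{
                "name": "prepare",
                "image": "busybox",
                # Pre-populate a directory inside the shared volume.
                "command": [
                    "sh", "-c",
                    "mkdir -p /mnt/app-data && echo hello > /mnt/app-data/greeting",
                ],
                "volumeMounts": [{"name": "shared-data", "mountPath": "/mnt"}],
            }],
            "containers": [{
                "name": "app",
                "image": "busybox",
                "command": ["sleep", "3600"],
                "volumeMounts": [{
                    "name": "shared-data",
                    "mountPath": "/data",
                    # Only this directory of the volume is visible at /data.
                    "subPath": "app-data",
                }],
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_subpath_pod(), indent=2))
```

The printed JSON could be saved to a file and applied to a test cluster. The point to notice is that the `subPath` value names a location that the workload itself created, which is why the kubelet cannot trust it without validation.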
While there are some great use cases for this feature, it’s an area that has had vulnerabilities discovered in the past. The kubelet must be extra careful when handling user-owned subpaths because it operates with privileges on the host.
One previously discovered vulnerability involved a malicious workload in which an InitContainer would create a symlink pointing to any location on the host. For example, the InitContainer could mount a volume at /mnt and create a symlink /mnt/attack inside the container pointing to /etc. Later in the Pod lifecycle, another container would attempt to mount the same volume with subpath attack. While preparing the volumes for the container, the kubelet would end up following the symlink to the host’s /etc instead of the container’s /etc, unknowingly exposing the host filesystem to the container.
A previous fix made sure that the subpath mount location is resolved and validated to point to a location within the base volume, and that it cannot be changed by the user between the time the path is validated and the time the container runtime bind mounts it. This class of race condition is known as time of check to time of use (TOCTOU): the subject being validated changes after it has been validated.
These validations and others are summarized in the following container lifecycle sequence diagram.
Volume subpath validations before the container startup
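The "resolve and validate" half of that earlier fix can be illustrated with a short sketch. This is not the kubelet's code (the kubelet is written in Go and opens the path component by component); it only shows the containment check, using a temporary directory to stand in for the volume and for the host's /etc. The function and directory names are hypothetical.

```python
import os
import tempfile


def resolve_inside_volume(volume_root: str, subpath: str) -> str:
    """Resolve subpath under volume_root, following symlinks, and reject
    any result that lands outside the volume."""
    root = os.path.realpath(volume_root)
    candidate = os.path.realpath(os.path.join(root, subpath))
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"subpath {subpath!r} escapes volume {volume_root!r}")
    return candidate


def demo():
    """Build a fake volume containing one honest directory and one
    malicious symlink, then validate both."""
    with tempfile.TemporaryDirectory() as tmp:
        volume = os.path.join(tmp, "volume")
        host_etc = os.path.join(tmp, "host-etc")  # stands in for the host's /etc
        os.makedirs(os.path.join(volume, "data"))
        os.makedirs(host_etc)
        os.symlink(host_etc, os.path.join(volume, "attack"))

        accepted = os.path.basename(resolve_inside_volume(volume, "data"))
        try:
            resolve_inside_volume(volume, "attack")
            rejected = False
        except ValueError:
            rejected = True
        return accepted, rejected


if __name__ == "__main__":
    print(demo())
```

A check like this is necessary but not sufficient on its own: it returns a path name, and whatever that name refers to can still change before the mount happens. That gap is what the second half of the fix, and the vulnerability discussed next, are about.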
A New TOCTOU Vulnerability: CVE-2021-25741
The latest vulnerability was discovered by performing a symlink attack similar to the one explained above, with the difference that the attacker constantly swapped the symlink with a directory in a tight loop, using the RENAME_EXCHANGE flag of renameat2(2). If the timing is just right, the kubelet will see the path as a directory and pass the validation check. Then the mount utility may find that the path is a symlink pointing to the host and follow it, exposing the host filesystem to the container. This is visualized in the following diagram:
The expectation and the attack result
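The window between the check and the use can be reproduced deterministically in a few lines. The sketch below is a simplification under stated assumptions: the real attack swaps the two entries atomically with RENAME_EXCHANGE in a tight loop and relies on timing, whereas this example performs a single two-step swap with `os.rename` at exactly the wrong moment. All paths and function names are hypothetical.

```python
import os
import tempfile


def is_safe_directory(path: str) -> bool:
    """Time of check: accept only a real directory, never a symlink."""
    return os.path.isdir(path) and not os.path.islink(path)


def demo_race():
    """Return (passed_check, escaped) for a path swapped after validation."""
    with tempfile.TemporaryDirectory() as tmp:
        volume = os.path.join(tmp, "volume")
        host_etc = os.path.join(tmp, "host-etc")  # stands in for the host's /etc
        subpath = os.path.join(volume, "attack")
        decoy = os.path.join(volume, "decoy")
        os.makedirs(subpath)            # honest-looking directory
        os.makedirs(host_etc)
        os.symlink(host_etc, decoy)     # symlink waiting to be swapped in

        # Time of check: the path is a plain directory, so validation passes.
        passed_check = is_safe_directory(subpath)

        # Attacker's move between check and use: put the symlink where the
        # directory was (two renames here; one atomic exchange in the attack).
        os.rename(subpath, os.path.join(volume, "parked"))
        os.rename(decoy, subpath)

        # Time of use: anything that resolves the path by name again now
        # follows the symlink out of the volume.
        escaped = os.path.realpath(subpath) == os.path.realpath(host_etc)
        return passed_check, escaped


if __name__ == "__main__":
    print(demo_race())
```

Both values come back true: the validation was correct when it ran, and the later use still ended up outside the volume. Any component that resolves the path by name after validation reopens the race.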
The GKE Security and Storage teams worked closely together to revisit the previous fix and find a solution. The previous fix takes several steps to ensure that the directory being mounted is safely opened and validated. After the file is opened and validated, the kubelet uses the magic-link path under the /proc/[pid]/fd directory for all subsequent operations to ensure the file remains unchanged. However, we found that all of these efforts were undone by the mount(8) Linux utility, which was dereferencing the procfs magic-link by default. Once the problem was understood, the fix involved making sure that the mount utility doesn’t dereference the magic-links, by passing the --no-canonicalize flag in the mount command.
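The idea of pinning the validated directory through a file descriptor can be sketched as follows. This is a Linux-only illustration that assumes /proc is mounted; it is not the kubelet's implementation. It opens the directory with `O_PATH`, swaps the path name afterwards, and shows that the /proc/[pid]/fd entry still refers to the original directory. The mount command is only constructed, not executed (running it requires root), and its argument order and target path are assumptions for illustration.

```python
import os
import tempfile


def bind_mount_argv(pid: int, fd: int, target: str) -> list:
    """Build (but do not run) a bind-mount command whose source is the
    procfs magic-link for an already-opened directory."""
    source = f"/proc/{pid}/fd/{fd}"
    # --no-canonicalize tells mount(8) not to resolve the source path into
    # a name again, so the magic-link is handed to the kernel as-is.
    return ["mount", "--no-canonicalize", "-o", "bind", source, target]


def demo_pinned_fd():
    """Return (same_object, name_now_escapes, argv)."""
    with tempfile.TemporaryDirectory() as tmp:
        subpath = os.path.join(tmp, "attack")
        host_etc = os.path.join(tmp, "host-etc")  # stands in for the host's /etc
        os.makedirs(subpath)
        os.makedirs(host_etc)

        # Pin the validated directory; O_NOFOLLOW refuses a symlink here.
        fd = os.open(subpath, os.O_PATH | os.O_DIRECTORY | os.O_NOFOLLOW)
        try:
            pinned = os.fstat(fd)

            # Attacker swaps the name for a symlink after validation.
            os.rename(subpath, os.path.join(tmp, "parked"))
            os.symlink(host_etc, subpath)

            magic = f"/proc/{os.getpid()}/fd/{fd}"
            via_magic = os.stat(magic)
            same_object = (via_magic.st_ino, via_magic.st_dev) == (
                pinned.st_ino, pinned.st_dev)
            name_now_escapes = (
                os.path.realpath(subpath) == os.path.realpath(host_etc))
            argv = bind_mount_argv(os.getpid(), fd, "/hypothetical/target")
            return same_object, name_now_escapes, argv
        finally:
            os.close(fd)


if __name__ == "__main__":
    print(demo_pinned_fd())
```

The descriptor keeps pointing at the directory that was validated, while the original name now leads to the symlink. If a tool turns the magic-link back into a path name and resolves that name, it lands on whatever the attacker has swapped in, which is why the flag matters.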
The fix is in
GKE released a Google Kubernetes Engine security bulletin on this vulnerability, which detailed what customers can do to immediately remediate this issue across GKE and Anthos. We also provided guidance to customers who manually manage their node versions, ensuring that fixed releases were available in every region for our Static and Release Channels.
Moving forward
Google continues to invest heavily in the security of GKE and Kubernetes. We encourage users interested in finding vulnerabilities to participate in the Kubernetes bug bounty program and in the Google Vulnerability Rewards Program (VRP), which was recently expanded to cover GKE vulnerabilities. For the latest guidance on security issues, please follow our GKE Security Bulletins.