The Current State of Kubernetes Threat Modelling
If you are planning on using Kubernetes in production, one of the key things to consider from a security perspective is your threat model.
But where can someone start when tasked with “threat modelling” such a complex system, made up of numerous components? Starting from scratch is rarely feasible (unless you have countless hours available), so I had a look at what’s already publicly available.
So far, three main initiatives (each run by an independent organisation) have taken on the challenge of threat modelling a Kubernetes cluster.
In this blog post, I will summarise the outcome produced by each one of those initiatives, so that anyone can use them as a starting point for their own (custom) threat modelling exercise.
NCC released, in November 2017, their approach to Kubernetes threat modelling with a blog post titled “Kubernetes security: Consider your threat model”. There, they highlighted three main groups of threats to a Kubernetes cluster.
- Threat Actor: People who have no access to the cluster apart from being able to reach the applications running on it and/or the management port(s) over a network.
- Controls: Ensure that management services (e.g., the API server, kubelet and etcd) are not exposed to untrusted networks without authentication controls in place.
- Threat Actor: An attacker has access to a single container (likely through some application vulnerability) and would like to expand their access to take over the whole cluster.
- Caveat: Several Kubernetes distributions have made the decision that they don’t consider malicious containers part of their threat model. As such, once an attacker has that level of access then there are minimal controls, by default, stopping them from getting full cluster-admin rights.
- Ensure that all management ports visible on the cluster network require authentication for all users.
- Ensure that service accounts are either not mounted in containers or have restricted rights (i.e., not cluster admin).
- Use Network Policies to restrict access between namespaces and pods.
- Threat Actor: An attacker has valid credentials to execute commands against the Kubernetes API, as well as network access.
- Ensure that RBAC policies are in place for all users, providing “least privilege” access to cluster resources.
- Ensure that Pod Security Policies are in place for all users to restrict the rights of pods that can be created, paying particular attention to high risk items such as privileged containers.
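Several of the controls above can be checked mechanically against a pod manifest. The following is a minimal sketch, assuming the manifest has already been parsed into a Python dictionary (for example from JSON output of the cluster API). The function name `find_risky_settings` and the selection of checks are illustrative assumptions and are not taken from NCC's post.

```python
def find_risky_settings(pod: dict) -> list:
    """Return human-readable findings for a parsed Pod manifest."""
    findings = []
    spec = pod.get("spec", {})

    # The token is mounted by default. A ServiceAccount can also opt out,
    # which this sketch does not check (assumption: pod-level field only).
    if spec.get("automountServiceAccountToken", True):
        findings.append("service account token is automounted")

    # Sharing host namespaces weakens isolation between the pod and its node.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if spec.get(field, False):
            findings.append(f"{field} is enabled")

    # Privileged containers are the high-risk item called out above.
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        security_context = container.get("securityContext") or {}
        if security_context.get("privileged", False):
            findings.append(f"container {container.get('name', '?')} is privileged")

    return findings


# Hypothetical manifest used only to exercise the checks.
example_pod = {
    "metadata": {"name": "demo"},
    "spec": {
        "hostNetwork": True,
        "containers": [
            {"name": "app", "securityContext": {"privileged": True}},
            {"name": "sidecar"},
        ],
    },
}

for finding in find_risky_settings(example_pod):
    print(finding)
```

A check like this only reports what a manifest asks for; it does not replace admission-time enforcement in the cluster.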
The CNCF Financial User Group released, in January 2020, documentation and outcomes of an in-depth threat modelling exercise performed against a generic Kubernetes cluster. The aim of this work was to provide a detailed view of threats and mitigations that can be used as a checklist to identify common attack vectors for the platform and how an attacker could exploit configuration vulnerabilities within Kubernetes to achieve specific goals.
Specifically, each component of the Kubernetes architecture was analysed using the STRIDE methodology to identify potential security issues at the trust boundaries within the platform.
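As a rough illustration of what a per-component STRIDE pass involves, the sketch below builds an empty worksheet with one entry per component and threat category. The six STRIDE categories are standard. The component list (borrowed from the management services mentioned earlier) and the helper name `build_stride_worksheet` are assumptions for illustration and do not reproduce the working group's actual scope.

```python
from itertools import product

# The six STRIDE threat categories.
STRIDE = (
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information disclosure",
    "Denial of service",
    "Elevation of privilege",
)


def build_stride_worksheet(components):
    """Return one blank entry per (component, STRIDE category) pair."""
    return [
        {"component": component, "category": category, "threats": [], "mitigations": []}
        for component, category in product(components, STRIDE)
    ]


# Hypothetical component list, for illustration only.
worksheet = build_stride_worksheet(["API server", "kubelet", "etcd"])
print(len(worksheet))  # 3 components x 6 categories = 18 entries
```

Each entry is then filled in by asking how that category of threat could apply at the trust boundaries the component sits on.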
Main Attack Vectors
|Service Token||By default, a service token is automatically mounted into each pod. If a container is compromised, the attacker can use those credentials as a means of further exploitation. Strict RBAC policies and disabling the automatic service token mounting are key mitigations here.|
|Compromised Container||Major focal point within the cluster, as this provides a remote execution point for an attacker. Besides the service token attack mentioned above, attack vectors of note include the default network exposure of the control plane to all running containers.|
|Network Endpoints||Each Kubernetes endpoint should be secured from internal malicious actors, preventing an easy attack vector. Note that if an attacker is able to compromise a container, they gain access to the endpoints if the pod's network policy permits.|
|Denial of Service||Up until the 1.14 release there were relatively few mitigations against denial of service attacks.|
|RBAC Issues||Many attack vectors rely on mis-configuration of RBAC policies. Mitigations should rely on automated tooling to validate such policies.|
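To make the idea of automated RBAC validation concrete, here is a minimal sketch that flags wildcard rules in a Role or ClusterRole parsed into a dictionary. The `rules`, `apiGroups`, `resources` and `verbs` fields are part of the Kubernetes RBAC API. The function name `find_wildcard_rules` and the decision to treat any `*` as a finding are assumptions for this example, and a real tool would cover far more cases.

```python
def find_wildcard_rules(role: dict) -> list:
    """Report every rule field in a Role/ClusterRole that contains '*'."""
    findings = []
    name = role.get("metadata", {}).get("name", "<unnamed>")
    for index, rule in enumerate(role.get("rules") or []):
        for field in ("apiGroups", "resources", "verbs"):
            # A missing or null field is treated as an empty list.
            if "*" in (rule.get(field) or []):
                findings.append(f"{name}: rule {index} uses a wildcard in {field}")
    return findings


# Hypothetical role used only to exercise the check.
example_role = {
    "kind": "ClusterRole",
    "metadata": {"name": "too-broad"},
    "rules": [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]},
        {"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]},
    ],
}

for finding in find_wildcard_rules(example_role):
    print(finding)
```

Running a check of this kind in CI, before roles are applied, is one way to catch over-broad policies early.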
The working group’s effort resulted in a set of attack trees, created to trace the path from an initial foothold in the cluster to the attacker’s ultimate goal. Two approaches were taken to create this work:
- Bottom up Approach: This approach shows entry points throughout the Kubernetes platform with the aim of satisfying the stated goal. Useful to map security controls and standards against threats in order to understand their coverage.
- Scenario Approach: Scenario based view, identifying attack vectors open to an attacker in certain scenarios. This approach leverages much of the detail in the first approach, but in a more realistic form that can be used to provide focus on more prevalent attack vectors.
The initial set of attack trees is open source and available on GitHub. Below is a summary:
|Malicious Code Execution||Bottom Up||The aim of this attack tree is to execute malicious code on a cluster. The initial foothold is primarily gained through a compromised application providing access to a container. Once an attacker gains access to a container, the next step is to move towards loading additional malicious code into the environment. Alternatively, if the image pull secret can be obtained, the attacker could potentially poison the repository to distribute the malicious code from there.|
|Establish Persistence||Bottom Up||The aim of this tree is to discover the several ways an attacker can attempt to gain persistence in the cluster with differing periods of longevity. One branch focuses on reading secrets from within the cluster in order to exploit other vulnerable areas, whereas a second branch focuses on threats where an attacker has gained container access and leverages misconfigurations to establish persistence resilient to container/pod/node restarts.|
|Access Sensitive Data||Bottom Up||The major approaches focus on being able to read secret data from the cluster directly by exploiting misconfigured RBAC permissions. Other approaches include viewing sensitive data stored within logs and eavesdropping on network traffic.|
|Denial Of Service||Bottom Up||This tree examines approaches where an attacker can attempt to instigate a denial of service attack on the cluster. The first approach is from a container compromise scenario, where the attacker could attempt to DoS the cluster from within by exhausting its resources. The second approach focuses on an attacker with network access to the cluster control plane, who might attempt to flood the network at the appropriate endpoints to exhaust resources.|
|Compromised application leads to foothold in container||Scenario||This scenario details potential vectors open to an attacker once they have exploited an application running in a container. This would lead to remote code execution within the container via programmatic or shell access mechanisms.|
|Attacker on the Network||Scenario||This scenario focuses on an internal attacker with access to the networks hosting the Kubernetes cluster. This would likely be a more privileged user but without direct cluster access. Note, the majority of these threats can be mitigated with firewalls and appropriate Kubernetes configuration.|
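One way to work with attack trees like these in your own exercise is to capture them as data and enumerate the paths an attacker could follow. The sketch below is a simplified illustration: it models only "any child is sufficient" (OR) branches, the `AttackNode` class and `attack_paths` helper are hypothetical names, and the node wording is paraphrased from the Malicious Code Execution summary above rather than copied from the published trees.

```python
from dataclasses import dataclass, field


@dataclass
class AttackNode:
    """A step in an attack tree; children are alternative next steps."""
    description: str
    children: list = field(default_factory=list)


def attack_paths(node, prefix=()):
    """Yield every root-to-leaf path as a tuple of step descriptions."""
    path = prefix + (node.description,)
    if not node.children:
        yield path
        return
    for child in node.children:
        yield from attack_paths(child, path)


# Paraphrase of the Malicious Code Execution tree, for illustration only.
tree = AttackNode(
    "Execute malicious code on the cluster",
    [
        AttackNode(
            "Compromise an application to gain access to a container",
            [
                AttackNode("Load additional malicious code into the environment"),
                AttackNode(
                    "Obtain the image pull secret",
                    [AttackNode("Poison the repository to distribute malicious code")],
                ),
            ],
        ),
    ],
)

for path in attack_paths(tree):
    print(" -> ".join(path))
```

Listing the paths this way makes it easier to map each step to a control and to spot steps that have no mitigation at all.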
Kubernetes Security Audit Working Group
The Security Audit Working Group (wg-security-audit) commissioned Trail of Bits, in Q2 2019, to perform a security audit of Kubernetes and to produce, as artifacts, a threat model and a whitepaper outlining everything found during the audit. These documents focus on the specific parts that make up a Kubernetes cluster, and are a very good read that I strongly recommend investing some time in.
Going into the specifics of this threat model, it reviewed Kubernetes’ components across six control families:
- Secrets Management
In addition, given that Kubernetes itself is a large system spanning from API gateways to container orchestration to networking and beyond, eight components were selected to be in scope for this exercise:
- Container Runtime
In total, the assessment team found 17 issues across the various components, ranging in severity from
I’ll let you skim through the report if you are interested in the findings, but what is most interesting here is the methodology adopted.
Starting with a dataflow for the selected components, the working group modified Mozilla’s Rapid Risk Assessment (RRA) template to focus on the selected controls. From there, the RRA was used to perform a risk assessment of each component in scope, which was then validated by the community.