While publicly disclosed Kubernetes-related security breaches have thankfully been infrequent, we’ve seen environments where an attacker who gained access to the cluster and persisted using non-destructive means would likely go unnoticed for a long time. Implementing defense-in-depth, thorough logging, and detailed metrics can be complex and time consuming. In the meantime, what if there was an easy, high-confidence way to be alerted if a malicious entity was present?
If you are not familiar with the concept of Canary Tokens, they are “a free, quick, painless way to help defenders discover they’ve been breached (by having attackers announce themselves).” Lenny Zeltser blogged about setting up and using honeytokens, and it really helped solidify the concept and approach. That got us thinking about how we might apply this concept to Kubernetes clusters.
We came up with the following criteria to help guide our thinking about this problem:

- It should be quick and simple to implement and maintain
- It should carry very low risk and operational overhead for the cluster
- It should produce a high-confidence, low-noise signal when triggered
- It should be attractive enough that an attacker is likely to interact with it

Therefore, we propose using an artisanally crafted Kubernetes Service Account token as a Honeytoken.
Referencing the Kubernetes Threat Matrix blog post from the Azure Security Center team, in combination with our own security testing and CTF experience working inside Kubernetes clusters, the Credential Access > List K8s Secrets and Credential Access > Access container service account techniques are both highly likely to be used or attempted during most post-exploitation activities. Our other blog post on the power of the Kubernetes RBAC LIST verb highlighted the interaction between Kubernetes secrets access and Kubernetes serviceaccount token storage, so we can potentially cover both via a single canary serviceaccount token. Here is how the concept maps against our desired criteria:
- Creating the serviceaccount takes seconds, and the controller will generate the never-expiring JWT token for us as a secret in the desired namespace (a quick sketch of this follows the list).
- An unused serviceaccount incurs extremely low risk and requires very low overhead.
- Because it lives in kube-system, and administrators are not in the habit of randomly attaching serviceaccounts to workloads in kube-system without peer review, any in-cluster use would be highly suspect.
- A namespace where a lot of system workloads with higher privileges exist is a natural target for enumeration at a minimum and active exploitation in most cases.
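To illustrate that first point, here is a minimal sketch of how the token controller materializes the JWT shortly after the serviceaccount is created (canary-sa is a placeholder name, and this assumes a cluster version that still auto-creates serviceaccount token secrets):

$ kubectl create sa -n kube-system canary-sa
serviceaccount/canary-sa created
# The token controller attaches a token secret to the serviceaccount automatically
$ kubectl get secrets -n kube-system --field-selector type=kubernetes.io/service-account-token
# Decode the JWT an attacker would find inside that secret
$ kubectl get secret -n kube-system $(kubectl get sa -n kube-system canary-sa -o jsonpath='{.secrets[0].name}') -o jsonpath='{.data.token}' | base64 -d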
So, which serviceaccount token would an attacker go for if they found it? Our first thoughts were common workloads that are typically granted cluster-admin permissions to do what they need to do and that have a history of allowing a path for easy escalation. Helm v2 is at the top of that list because it leverages an in-cluster component named Tiller to install Helm charts. Most admins grant cluster-admin to Tiller’s locally-mounted serviceaccount token so that it can interact with the API server, and Tiller listens on TCP/44134 without requiring authentication by default. So, if an attacker compromised a pod or a lower-privileged credential to a cluster, looking for and leveraging the serviceaccount token attached to that Tiller deployment would be an attractive approach.
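To make that path concrete, here is a rough sketch of what an attacker landing in a pod might try against a default Tiller install (this assumes the stock tiller-deploy service name in kube-system and that a Helm v2 client is available to the attacker):

# Probe for the unauthenticated Tiller gRPC endpoint from inside the cluster
$ helm --host tiller-deploy.kube-system.svc.cluster.local:44134 version
# Or, from a pod already running with the tiller serviceaccount attached, read the mounted token directly
$ cat /var/run/secrets/kubernetes.io/serviceaccount/token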
Note: If you are still running Helm v2 with Tiller, you should read this blog post and migrate to Helm v3 ASAP.
Installing a realistic Tiller deployment with a dedicated serviceaccount mounted takes a few seconds:

$ kubectl create sa -n kube-system tiller
serviceaccount/tiller created
$ helm init --service-account tiller --tiller-namespace kube-system --stable-repo-url https://charts.helm.sh/stable
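To sanity-check that the decoy looks like a normal Tiller install, something like the following (using the Helm v2 default names) should show the deployment, the serviceaccount, and the controller-generated token secret:

$ kubectl get deployment,sa -n kube-system | grep tiller
$ kubectl get secrets -n kube-system | grep tiller-token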
Removing the tiller deployment and the serviceaccount is just as fast:

$ helm reset --force
$ kubectl delete sa -n kube-system tiller
Detection requires that your cluster is configured to send the proper audit logs to a central location and that the logging facility can parse the subject of any successful API server request. For GKE, this requires project data access logging to be enabled and metrics and alerts to be set up that fire when logs matching the serviceaccount name arrive. For EKS, this requires the EKS Control Plane Audit Logs to be enabled and sent to CloudWatch Logs, along with similar filters and alerts.
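As a rough sketch of what those queries and filters might look like (CLUSTER_NAME is a placeholder, and exact field names can vary by platform version):

# GKE: search the Kubernetes audit logs for the canary subject
$ gcloud logging read 'resource.type="k8s_cluster" AND protoPayload.authenticationInfo.principalEmail="system:serviceaccount:kube-system:tiller"' --limit 10
# EKS: search the control plane audit log streams in CloudWatch Logs
$ aws logs filter-log-events --log-group-name "/aws/eks/CLUSTER_NAME/cluster" --log-stream-name-prefix kube-apiserver-audit --filter-pattern '{ $.user.username = "system:serviceaccount:kube-system:tiller" }'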
Now, any time the serviceaccount token is used to authenticate successfully against the API server, we know that someone has already obtained access with sufficient permissions to read the contents of a secret in the kube-system namespace. Because we are capturing the correct audit logs and have filters in place, that use sends a high-priority alert to the right team.
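For reference, the kind of audit event such a filter keys on looks roughly like this (an illustrative, heavily truncated audit.k8s.io/v1 entry, not captured from a real cluster):

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "verb": "list",
  "user": {
    "username": "system:serviceaccount:kube-system:tiller",
    "groups": ["system:serviceaccounts", "system:serviceaccounts:kube-system", "system:authenticated"]
  },
  "objectRef": { "resource": "secrets", "namespace": "kube-system" },
  "responseStatus": { "code": 200 }
}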
Depending on your threat model, exposing the unauthenticated gRPC port on tcp/44134 inside the cluster may not be desirable, as it may potentially give an unauthenticated attacker a path to a valid credential in the system:authenticated RBAC group. Implementing client certificate authentication or a NetworkPolicy preventing ingress to the tiller pod should mitigate this avenue, but it is a trade-off: blocking that path might prevent an attacker from gaining access to the serviceaccount token altogether, letting them avoid our custom detection mechanism.
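If you go the NetworkPolicy route, a minimal sketch might look like the following (this assumes the default app: helm / name: tiller labels that helm init applies, and a CNI that actually enforces NetworkPolicy):

$ kubectl apply -f - <<EOF
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-tiller-ingress
  namespace: kube-system
spec:
  # Select the Tiller pod by its default labels and allow no ingress peers
  podSelector:
    matchLabels:
      app: helm
      name: tiller
  policyTypes:
  - Ingress
EOF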
Finally, it’s important to state that this approach is experimental, and you should fully validate it in a test environment before holding it to your definition of “production ready”. That said, we hope this fosters discussion and other creative solutions along these lines. We’d love to hear what you think, so feel free to reach out to us.