Darkbit has joined Aqua Security!


Kubernetes Honey Token

19 January 2021

While publicly disclosed Kubernetes-related security breaches have thankfully been infrequent, we’ve seen environments where an attacker who gained access to the cluster and persisted using non-destructive means would likely go unnoticed for a long time. Implementing defense-in-depth, thorough logging, and detailed metrics can be complex and time-consuming. In the meantime, what if there were an easy, high-confidence way to be alerted when a malicious entity is present?

Honeytokens?

If you are familiar with the concept of Canary Tokens, they are “a free, quick, painless way to help defenders discover they’ve been breached (by having attackers announce themselves.)”. Lenny Zeltser blogged about setting up and using honeytokens, and it really helped solidify the concept and approach. That got us thinking about how we might be able to apply this concept to Kubernetes clusters.

We came up with the following criteria to help guide our thinking about this problem:

  1. The “token” should be extremely quick and easy to implement and also to remove cleanly. Under 5 minutes is the goal.
  2. Introducing the “token” into an environment should not interact negatively with scale, availability, or operations.
  3. Its use should indicate with extremely high confidence that malicious activity just happened, so it should be configured to avoid false positives and accidental triggers.
  4. Its placement should catch the reconnaissance, discovery, and enumeration techniques most frequently used during post-exploitation efforts.
  5. The malicious entity should ideally never know that they triggered an alert by using the “token”.

Therefore, we propose using an artisanally crafted Kubernetes Service Account token as a Honeytoken.

Kubernetes Service Accounts

Referencing the Kubernetes Threat Matrix blog post from the Azure Security Center team in combination with our own security testing and CTF experience working inside Kubernetes clusters:

Kubernetes Threat Matrix

The Credential Access > List K8s Secrets and Credential Access > Access container service account tactics are both highly likely to be used or attempted during most post-exploitation activities. Our other blog post on the power of Kubernetes RBAC LIST highlighted the interaction between Kubernetes secrets access and Kubernetes serviceaccount token storage, so we can potentially cover both via a single, canary serviceaccount token.

Revisiting the serviceaccount token concept for our desired criteria:

  1. Quick and easy to implement: Creating a new serviceaccount takes seconds, and the controller will generate the never-expiring JWT token for us as a secret in the desired namespace.
  2. No negative effects: Creating a single serviceaccount incurs extremely low risk and requires very low overhead.
  3. High confidence of malicious activity: Assuming it’s run in a namespace like kube-system and administrators are not in the habit of randomly attaching serviceaccounts to workloads in kube-system without peer review, any in-cluster use would be highly suspect.
  4. Common techniques and pathways: Running it in the kube-system namespace where a lot of system workloads with higher privileges exist is a natural target for enumeration at a minimum and active exploitation in most cases.
  5. Silent alerting: Shipping the Kubernetes Audit Logs from interactions with the API server to an off-cluster location happens silently in the background, so there would be no direct feedback loop to an attacker.

“TillerPot”

Which serviceaccount token would an attacker go for if they found it? Our first thoughts were common workloads that are typically granted cluster-admin permissions to do their job and have a history of offering an easy escalation path. Helm v2 is at the top of that list because it leverages an in-cluster deployment named Tiller to install Helm charts. Most admins grant cluster-admin to Tiller’s locally-mounted serviceaccount token so that it can interact with the API server, and Tiller listens on TCP/44134 without requiring authentication by default. So, if an attacker compromised a pod or a lower-privileged credential to a cluster, looking for and leveraging the serviceaccount token attached to that Tiller deployment would be an attractive approach.
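To make that path concrete, the steps an attacker might take look roughly like the following. This is a hypothetical sketch: the `tiller-token-x7f2p` secret name suffix is generated by the cluster and will differ, and the commands assume the legacy serviceaccount token secrets that Helm v2-era clusters create.

```shell
# Step 1: enumerate secrets in kube-system (requires LIST on secrets)
kubectl get secrets -n kube-system | grep tiller
# e.g. tiller-token-x7f2p   kubernetes.io/service-account-token

# Step 2: extract the JWT from the token secret (suffix will differ)
TOKEN=$(kubectl get secret -n kube-system tiller-token-x7f2p \
  -o jsonpath='{.data.token}' | base64 --decode)

# Step 3: authenticate as the tiller serviceaccount -- this request
# is what appears in the audit logs and trips the honeytoken alert
kubectl --token="$TOKEN" get pods -n kube-system
```

Every request made with that token is attributed to the tiller serviceaccount in the audit log, which is exactly the signal we will alert on below.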

Note: If you are still running Helm v2 with Tiller, you should read this blog and migrate to Helm v3 ASAP.

Installing a realistic Tiller deployment with a dedicated serviceaccount mounted takes a few seconds:

$ kubectl create sa -n kube-system tiller
serviceaccount/tiller created
$ helm init --service-account tiller --tiller-namespace kube-system --stable-repo-url https://charts.helm.sh/stable
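As a quick sanity check that the bait is in place, the deployment and the auto-generated token secret can be listed. This assumes the legacy token controller behavior, where a `kubernetes.io/service-account-token` secret is created for the serviceaccount automatically:

```shell
# Confirm the tiller deployment and serviceaccount exist
kubectl get deploy,sa -n kube-system | grep tiller

# Confirm the controller generated the token secret for the bait account
kubectl get secrets -n kube-system \
  --field-selector type=kubernetes.io/service-account-token | grep tiller
```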

To remove tiller and the serviceaccount:

$ helm reset --force
$ kubectl delete sa -n kube-system tiller

Logging and Alerting

This requires that your cluster is configured to send the proper audit logs to a central location and that the logging facility can parse the subject of any successful API server request. For GKE, this means enabling Data Access audit logging on the project and setting up log-based metrics and alerts that match the tiller serviceaccount name. For EKS, this means enabling the EKS Control Plane Audit Logs, sending them to CloudWatch Logs, and configuring similar filters and alerts.
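As a sketch of the EKS side, a CloudWatch Logs metric filter plus alarm could look like the following. The log group name, metric namespace, and SNS topic ARN are placeholder assumptions you would replace with your own; the `user.username` field is where the Kubernetes audit event records the authenticated subject.

```shell
# Match any successful authentication by the honeytoken serviceaccount
# (log group name assumes the default /aws/eks/<cluster>/cluster layout)
aws logs put-metric-filter \
  --log-group-name /aws/eks/prod-cluster/cluster \
  --filter-name TillerHoneytokenUse \
  --filter-pattern '{ $.user.username = "system:serviceaccount:kube-system:tiller" }' \
  --metric-transformations \
      metricName=TillerHoneytokenUse,metricNamespace=HoneyTokens,metricValue=1

# Fire a high-priority alert on any single occurrence
aws cloudwatch put-metric-alarm \
  --alarm-name TillerHoneytokenUse \
  --metric-name TillerHoneytokenUse \
  --namespace HoneyTokens \
  --statistic Sum --period 60 --threshold 0 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 1 \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:security-alerts
```

A log-based metric and alerting policy on GKE follows the same shape: filter on the serviceaccount username in the audit entry, then alert on any non-zero count.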

Now, any time the tiller serviceaccount token is used to authenticate successfully against the API server, it indicates that a malicious entity has already obtained sufficient permissions to read the contents of a secret in the kube-system namespace. Because we are capturing the correct audit logs and have filters in place, a high-priority alert goes straight to the right team.

Additional Thoughts

Depending on your threat model, exposing Tiller's unauthenticated gRPC port (TCP/44134) inside the cluster may not be desirable, as it may effectively hand an unauthenticated attacker the power of a valid credential in the system:authenticated RBAC group. Implementing client certificate authentication or a NetworkPolicy preventing ingress to the tiller pod mitigates this avenue, but there is a trade-off: an attacker blocked at the network layer never touches Tiller's serviceaccount token, so that path would no longer trip our custom detection mechanism.
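One way to close off the unauthenticated gRPC path while leaving the token secret itself discoverable is a deny-all-ingress NetworkPolicy scoped to the Tiller pod. This sketch assumes the default labels applied by `helm init` (`app: helm`, `name: tiller`) and a CNI plugin that actually enforces NetworkPolicy:

```shell
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: tiller-deny-ingress
  namespace: kube-system
spec:
  # Select only the Tiller pod; labels assume the default helm init deployment
  podSelector:
    matchLabels:
      app: helm
      name: tiller
  # Declaring Ingress with no rules denies all inbound traffic
  policyTypes:
    - Ingress
EOF
```

This embodies the trade-off above: an attacker who would have abused TCP/44134 directly is blocked and never uses the token, so the remaining detection path is an attacker reading the token secret itself.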

Finally, it’s important to state that this approach is experimental and that you should fully validate it in a test environment before considering it to meet your definition of “production ready”. That said, we hope this fosters discussion and other creative solutions along these lines. We’d love to hear what you think, so feel free to reach out to us.

Blog photo: Unsplash