...

Simple DLP for AWS S3

18 February 2020

When discussing the risk S3 buckets pose to organizations, most of the discussion centers on public buckets and inadvertently exposed data. While this is certainly a common threat vector, it can be addressed in a number of policy-driven ways. Blocking the ability to accidentally expose buckets at the organization or account level (for example, with S3 Block Public Access) is much more practical now, and probably a more scalable and sound approach than trying to implement a reactive solution.

But what about data exfiltration from attackers that may have gained access through some other attack vector?

Background

If we refer to the MITRE ATT&CK knowledge base, technique T1537 illustrates how attackers can leverage cloud provider APIs to exfiltrate data undetected.

An adversary may exfiltrate data by transferring the data […] to another cloud account they control on the same service to avoid typical file transfers/downloads and network-based exfiltration detection.

Two common ways to move data from one S3 bucket to another are to use either copy or sync. This post explores ways we can detect when data is copied out of our account to an external account we don't control.

Approach

If we generate test events using S3 copy and sync, we can see that both produce CopyObject events in CloudTrail. S3 sync first calls ListObjectsV2, then copies any objects it finds in the source bucket that aren't yet in the destination bucket. S3 copy, unsurprisingly, also generates CopyObject events.
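For reference, a copy and a sync that would each produce CopyObject events look like this (the bucket names are placeholders):

aws s3 cp s3://source-bucket/file1 s3://dest-bucket/file1
aws s3 sync s3://source-bucket s3://dest-bucket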

If we look at these CloudTrail events, we can see there is some extra metadata in the detail.resources field. In the broader AWS API, this field is optional, and it is up to the service generating the event to decide what gets populated there. In the case of S3, CopyObject events list the buckets, objects, and accounts involved in the operation.

[
  {
    "type": "AWS::S3::Object",
    "ARN": "arn:aws:s3:::exfil-1234567890/file1"
  },
  {
    "accountId": "1234567890",
    "type": "AWS::S3::Bucket",
    "ARN": "arn:aws:s3:::exfil-1234567890"
  },
  {
    "accountId": "9876543210",
    "type": "AWS::S3::Bucket",
    "ARN": "arn:aws:s3:::my-secrets-9876543210"
  },
  {
    "type": "AWS::S3::Object",
    "ARN": "arn:aws:s3:::my-secrets-9876543210/file1"
  }
]

We can see that this field gives us the building blocks to detect data exfiltration: we have the source, destination, and objects (files) involved. All we need to do is determine whether the files are being copied to a bucket in an account we don't own or trust.
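A minimal sketch of that check in Node.js might look like the following. This is an illustration, not the exact code from the repo, and it assumes the CloudTrail event has already been parsed (when delivered via SNS, the event actually arrives as a JSON string in Records[0].Sns.Message and must be unwrapped first).

// Sketch: flag CopyObject events whose destination bucket lives in an
// account we don't trust. Assumes `cloudTrailEvent` is the parsed event.
const authorized = (process.env.AUTHORIZED_ACCOUNTS || '')
  .split(',')
  .map((id) => id.trim());

function findUnauthorizedBuckets(cloudTrailEvent) {
  const resources = cloudTrailEvent.detail.resources || [];
  // Keep only bucket entries whose owning account isn't on our list.
  return resources.filter(
    (r) =>
      r.type === 'AWS::S3::Bucket' &&
      r.accountId &&
      !authorized.includes(r.accountId)
  );
}

Any bucket returned by a check like this is outside our trusted accounts and should trigger a notification.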

Solution

We can leverage CloudWatch Events and SNS to deliver S3 API events and process them on demand with a Lambda function, giving us a lightweight, event-driven solution that provides a DLP (Data Loss Prevention) capability for our data stored in S3.

As a final step, we can notify a shared ops Slack channel whenever an unauthorized transfer occurs, giving us near-realtime notification of a possible data breach. The basic event flow looks something like this:

S3 DLP flow

Prerequisites

Before setting up the AWS components, make sure you have a Slack webhook URL for the channel to which you want notifications posted.

https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX
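You can sanity-check the webhook with a quick curl before wiring anything up:

curl -X POST -H 'Content-type: application/json' \
  --data '{"text":"s3-dlp test"}' \
  https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX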

Next, clone this GitHub repo locally for a copy of the Lambda function code we’ll be using.

git clone https://github.com/darkbitio/aws-s3-dlp
Cloning into 'aws-s3-dlp'...
remote: Enumerating objects: 15, done.
remote: Counting objects: 100% (15/15), done.
remote: Compressing objects: 100% (13/13), done.
remote: Total 15 (delta 2), reused 13 (delta 2), pack-reused 0
Unpacking objects: 100% (15/15), done.

CloudTrail Setup

Make sure CloudTrail is enabled, with write-type data events captured for all S3 buckets.

CloudTrail with data events

Note that additional charges do apply for enabling S3 data events.
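If you prefer the CLI, something like the following should enable write-only S3 data events for all buckets on an existing trail (the trail name is a placeholder; double-check the selector syntax against the CloudTrail docs):

aws cloudtrail put-event-selectors --trail-name my-trail \
  --event-selectors '[{"ReadWriteType": "WriteOnly", "IncludeManagementEvents": true, "DataResources": [{"Type": "AWS::S3::Object", "Values": ["arn:aws:s3"]}]}]'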

SNS Setup

Next, create an SNS topic so the CloudWatch rule we create in the next step has a target to send events to.

SNS topic
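From the CLI, creating the topic is a one-liner (the topic name is just an example):

aws sns create-topic --name s3-dlp-events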

Do not create any subscriptions yet. We’ll add a Lambda subscription shortly.

CloudWatch Setup

Next, create a CloudWatch rule that triggers whenever a matching S3 API write event is generated. We can create this rule through either CloudWatch or EventBridge, since EventBridge uses the same underlying API as CloudWatch Events. We'll use the EventBridge interface, since its workflow is a little more streamlined than the CloudWatch console's.

CloudWatch rule

Next, define the rule pattern. Select Pre-defined pattern by service, AWS as the service provider, S3 as the service name, Object Level Operations as the event type, and CopyObject as the specific operation to trigger on.

CloudWatch rule pattern
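The resulting event pattern should look roughly like this (the console generates it for you, but it's useful to know what's underneath):

{
  "source": ["aws.s3"],
  "detail-type": ["AWS API Call via CloudTrail"],
  "detail": {
    "eventSource": ["s3.amazonaws.com"],
    "eventName": ["CopyObject"]
  }
}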

Leave the default event bus selected, and choose the SNS topic created earlier as the target. Note that we could send events directly from CloudWatch to a Lambda function, but using SNS as an intermediate step gives us more flexibility for debugging and extending functionality later.

CloudWatch rule target

Lambda Function Setup

Next, create a Lambda function. Name the function and select Node.js 12.x as the runtime.

Lambda function

Now we have a Lambda function with the default Node.js code and no triggers.

default Lambda function

default Lambda code

Let’s update the Lambda function code with the function from the GitHub repo.

aws lambda update-function-code --function-name s3-dlp --zip-file fileb://lambda/function.zip
{
    "FunctionName": "s3-dlp",
    "FunctionArn": "arn:aws:lambda:us-east-1:9876543210:function:s3-dlp",
    "Runtime": "nodejs12.x",
    "Role": "arn:aws:iam::9876543210:role/service-role/s3-dlp-role-www4klu5",
    "Handler": "index.handler",
    "CodeSize": 1879,
    "Description": "",
    "Timeout": 3,
    "MemorySize": 128,
    "LastModified": "2020-02-18T15:50:14.749+0000",
    "CodeSha256": "stU/bkX0XZ6hK0K34gfWY/+q7Bm4q1JpsvPUaDiTIKI=",
    "Version": "$LATEST",
    "TracingConfig": {
        "Mode": "PassThrough"
    },
    "RevisionId": "bc7d0bf6-d7b6-4c6a-be7d-b06386ebbcd6",
    "State": "Active",
    "LastUpdateStatus": "Successful"
}

You should now see the updated code in the Lambda function.

updated Lambda code

Next, add a trigger to invoke the Lambda function from SNS.

Lambda trigger

Select the SNS topic we created earlier as the trigger.

Lambda SNS trigger

Lambda trigger success
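If you'd rather script this step, the console's "add trigger" is roughly equivalent to creating an SNS subscription and granting SNS permission to invoke the function (the ARNs below are placeholders):

aws sns subscribe --topic-arn arn:aws:sns:us-east-1:9876543210:s3-dlp-events \
  --protocol lambda \
  --notification-endpoint arn:aws:lambda:us-east-1:9876543210:function:s3-dlp

aws lambda add-permission --function-name s3-dlp \
  --statement-id sns-invoke --action lambda:InvokeFunction \
  --principal sns.amazonaws.com \
  --source-arn arn:aws:sns:us-east-1:9876543210:s3-dlp-events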

The final step to configure our Lambda function is to define the AUTHORIZED_ACCOUNTS and SLACK_WEBHOOK environment variables that the function needs to evaluate events and send Slack notifications.

Lambda env vars

The AUTHORIZED_ACCOUNTS variable should be a comma-separated list of your trusted AWS account IDs. Any object copied to a bucket in an account not on this list will be considered unauthorized and will generate a Slack notification.
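These can also be set from the CLI (the values below are placeholders):

aws lambda update-function-configuration --function-name s3-dlp \
  --environment 'Variables={AUTHORIZED_ACCOUNTS=9876543210,SLACK_WEBHOOK=https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXXXXXXXXXXXXXXXXXX}'

If you list multiple account IDs, the JSON form of --environment is easier, since the shorthand syntax splits on commas.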

Final Test

To test the full workflow, copy an object from an authorized S3 bucket to an unauthorized one.

aws s3 cp s3://secret-9876543210/file1 s3://exfil-1234567890

Within a few seconds, you should see a Slack notification with metadata about which object was copied, by whom, and into which account.

Slack message

Note that if you have just turned on CloudTrail for the first time, it can take 5-10 minutes for the trail to start triggering CloudWatch events. If you run into any other issues or have suggestions, open an issue in the GitHub repo.