Security Nuances of the AWS Metadata Service in Container Workloads

Published in Simply CloudSec, Jun 22, 2022 (11 min read)

A deep dive into AWS metadata services on container orchestration platforms.

Background

I recently found an excellent collection of cloud security breaches and vulnerabilities from the past year. According to this report, using Server-Side Request Forgery (SSRF) vulnerabilities to access the AWS metadata service is still a common attack vector and has been used by threat actors to compromise AWS environments. With the prevalence of container workloads today, I wanted to do a deep dive into how the AWS metadata service functions on container-based compute platforms and figure out the security implications in each scenario. The guiding question is as follows: how important is it to restrict access to the metadata service in container-based workloads?

This post will describe different ways to attack and restrict access to the metadata service in different AWS compute platforms, specifically focusing on ECS, by walking through three scenarios. As we are coming up on three years from the (in)famous Capital One data breach, we will use the SSRF attack seen in the breach as a model for attacker behavior in AWS environments.

This is not intended to serve as a post on mitigating SSRF attacks in AWS, as there’s tons of great material out there already. Instead, we will focus on exploring the nuances of the metadata service on different compute offerings and the security implications of their functionality.

A Quick Refresher on Concepts

Metadata Service

The metadata service is a local service on AWS compute platforms that workloads query for metadata about themselves, most notably the credentials for the role assigned to them. The Instance Metadata Service (IMDS) is the service that runs on EC2 virtual machines, and the Container Metadata Service (CMDS) is the service present on containers running in ECS and EKS.

The IMDS has two versions: version 1, which is the default and original version, and version 2, which was created to add some security and help mitigate SSRF attacks by using a session-oriented method of fetching credentials. The details of v2 are outside the scope of this post, but if you are interested, you can read the AWS announcement — for now, just consider it as a tool that prevents most common SSRF attacks from being effective against it.

Elastic Container Service

The Elastic Container Service (ECS) is one of Amazon’s offerings to help run container workloads. There are two types of workloads in ECS: a serverless offering called Fargate, where the container agent and infrastructure are managed for you, and a self-managed offering, where you configure your own cluster of virtual machines and Amazon is only responsible for managing the container agent. The self-managed offering uses the EC2 launch type.

In the Fargate offering, you do not have any access to or control of the underlying host, whereas in the self-managed offering, you have complete control of the EC2 instances on which your containers run. This is an important distinction which we’ll come back to.

Scenarios

Now that we’re familiar with the relevant concepts, let’s walk through three different common configurations for running workloads on AWS and understand how the metadata service functions in them.

Scenario 1: Metadata Service on EC2

This is the most basic configuration on AWS, where you have a running EC2 and you deploy your application directly on top of it. We won’t spend too much time on this since we’re focused on container workloads, but the AWS documentation has plenty of helpful examples and snippets to reference. The EC2 IMDS runs on a local endpoint of 169.254.169.254. In version 1, you can issue raw curl requests to the endpoint and get back metadata about the EC2 instance, as well as credentials. For example:

curl --silent 169.254.169.254/latest/meta-data/iam/security-credentials/{role-name} | jq

will give you back some JSON containing access keys and a token. This is exactly how the Capital One data breach occurred — an SSRF vulnerability led to credentials being exfiltrated, which subsequently were used to pull data from S3.

To combat this, AWS released IMDS version 2 to require a little more effort, since you have to get a session token using a PUT request with custom headers first. This helps address most common SSRF attacks, which are usually achieved through GET requests, by requiring the attacker to also have the ability to control HTTP requests and add custom headers.

TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"`
curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/iam/security-credentials/{role-name}

The contents of the response will be identical to the response in v1.

Scenario 2: Metadata Service on ECS (Serverless)

In this scenario, we are running a serverless container workload using the ECS service with the Fargate launch type. We do not have access to the host and are only able to operate from within the context of our container.

To test this out, I spun up an ECS cluster and a Fargate task in awsvpc networking mode, which deployed a simple Python script that looped forever so I could get a shell onto the container and poke around. To get into the Fargate environment, I took advantage of the AWS ecs-exec feature, which was extremely handy for a scenario like this.

Starting from my local machine, I ran the following to get a shell in my container:

aws ecs execute-command --region us-west-2 --cluster serverless-test --task {task-id} --container hello-world --command bash --interactive

Now that I’m on the container, I can start hitting the Container Metadata Service and see what I can find. The service location differs from IMDS and can be found at 169.254.170.2. Let’s take a look at what we can see if we hit the metadata endpoint:
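As a rough sketch of that request: ECS injects an ECS_CONTAINER_METADATA_URI_V4 variable into the container, pointing at a per-container path under 169.254.170.2, and appending /task returns task-level metadata. The helper name task_metadata_url is made up for illustration, and the usage line assumes curl and jq exist in the image.

```shell
# Build the task-level metadata URL from the variable ECS injects.
# task_metadata_url is a hypothetical helper name, not an AWS tool.
task_metadata_url() {
  if [ -z "${ECS_CONTAINER_METADATA_URI_V4:-}" ]; then
    echo "ECS_CONTAINER_METADATA_URI_V4 is not set (not in an ECS task?)" >&2
    return 1
  fi
  printf '%s/task\n' "$ECS_CONTAINER_METADATA_URI_V4"
}

# Usage from inside the container (assumes curl and jq are installed):
#   curl --silent "$(task_metadata_url)" | jq
```

Keeping the URL construction separate from the curl call makes it easy to see that nothing here is guessable from outside: the path comes entirely from the container's own environment.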

We get back lots of information about the running task, but nothing which seems to point in the direction of credentials. Sifting through the AWS documentation, we find that there’s a variety of environment variables exposed to the running container, one of which is AWS_CONTAINER_CREDENTIALS_RELATIVE_URI. That looks really promising, so let’s dump the environment variables. Note that AWS_EXECUTION_ENV reflects that we’re using Fargate.
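One way to dump just the relevant variables is to filter the environment by prefix. This is only a sketch: ecs_env is a hypothetical helper name, and the AWS_ and ECS_ prefixes are an assumption about which variables are interesting here.

```shell
# Print only the AWS_* and ECS_* variables, sorted for readability.
# ecs_env is a hypothetical helper name.
ecs_env() {
  env | grep -E '^(AWS|ECS)_' | sort
}

# Usage from inside the container:
#   ecs_env
```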

It’s a little hard to see, but I highlighted the second variable in the list in a red box, which contains a path to credentials for this specific task. The CMDS will dynamically generate a random UUID for each task, which means that there is no static HTTP endpoint that an ECS task can consistently hit in order to get its own credentials.
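To make the dynamic endpoint concrete, here is a minimal sketch of how a process inside the task could turn that variable into a credentials request. The helper name container_creds_url is hypothetical, and the usage line assumes curl and jq are present.

```shell
# Combine the fixed CMDS address with the per-task path from the environment.
# container_creds_url is a hypothetical helper name.
container_creds_url() {
  if [ -z "${AWS_CONTAINER_CREDENTIALS_RELATIVE_URI:-}" ]; then
    echo "AWS_CONTAINER_CREDENTIALS_RELATIVE_URI is not set" >&2
    return 1
  fi
  printf 'http://169.254.170.2%s\n' "$AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"
}

# Usage from inside the container (assumes curl and jq are installed):
#   curl --silent "$(container_creds_url)" | jq
```

Without the variable the function has nothing to work with, which is the same position an attacker with SSRF alone would be in.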

This raises the level of effort for an attacker, since even if they were able to successfully exploit an SSRF vulnerability, they would not know the endpoint to hit to access the container’s credentials without first getting access to the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable. The attacker would need to find and exploit one more vulnerability, such as local file inclusion or remote code execution, to combine with SSRF in order to get a set of credentials from a Fargate task. Fargate tasks do not have access to the EC2 IMDS.

From the perspective of the Capital One attack, if the workloads were running on Fargate, it’s entirely possible that the breach never would have occurred unless the attacker had also found a way to leak the environment variables. In my opinion, the CMDS does a great job of attempting to mitigate SSRF with the dynamic credential endpoint, especially since SSRF is an application level vulnerability which falls under the customer’s responsibility in the shared responsibility model.

Scenario 3: Metadata Service on ECS (Self-Managed)

This is the most convoluted scenario. Since the ECS tasks are being run using the EC2 launch type, the containers have access to the underlying host. Therefore, both the EC2 IMDS and the container’s metadata service are present in this configuration. To demonstrate this, I launched another ECS cluster, this time with tasks running with the EC2 launch type configuration and in awsvpc networking mode.

Since we have access to the EC2 running the container in this scenario, I don’t need to use ecs-exec and can simply SSH into the box. After SSH’ing into the EC2 from my local machine, we can see the running containers with docker ps, and then grab a shell onto our running task with docker exec:

Let’s find our container metadata endpoint so we can get a set of credentials for the role that the container is operating with. We’ll look for the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI variable again. Also note that the AWS_EXECUTION_ENV reflects the EC2 launch type.

As expected, we are able to hit this dynamic endpoint to get a set of credentials.

We know that our container workload will always have access to the container metadata service. Since we have access to the underlying host in the EC2 launch type configuration, can we hit the EC2 IMDS as well?

As expected, the call returns successfully and we now are able to get a set of credentials with the context of the EC2 role — a privilege escalation.

So far, we have been using IMDSv1 on the EC2, which is the default set by AWS unless explicitly configured otherwise. Let’s force the EC2 to use IMDSv2 and see if anything changes in the container’s ability to access the EC2’s metadata service. First, from my local machine with proper AWS credentials, I modified the EC2.
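A change like this can be made with the aws ec2 modify-instance-metadata-options command. The sketch below is not necessarily the exact invocation used here: require_imdsv2 is a hypothetical wrapper, the instance ID is a placeholder, and the DRY_RUN switch is only there so the command can be inspected without touching AWS.

```shell
# Require session tokens (IMDSv2) on a given instance.
# require_imdsv2 is a hypothetical wrapper; pass your own instance ID.
# With DRY_RUN=1 the command is printed instead of being sent to AWS.
require_imdsv2() {
  instance_id="$1"
  set -- aws ec2 modify-instance-metadata-options \
    --instance-id "$instance_id" \
    --http-tokens required \
    --http-endpoint enabled
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage (placeholder instance ID):
#   DRY_RUN=1 require_imdsv2 i-0123456789abcdef0
```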

Now, let’s try hitting IMDS from the container again.

The call fails. Because we are now using IMDSv2, the raw curl request fails, demonstrating how it protects against common SSRF attacks. By adapting the call to first get a session token and setting it as the header for the call to IMDSv2, we can successfully hit the EC2 IMDS again:

This demonstrates a very interesting point. Although enforcing IMDSv2 is a security upgrade and helps protect against SSRF, a process running in the context of a container in this configuration can still access it, providing an easy path to privilege escalation if code execution is obtained in a compromised container. Therefore, access to the EC2 IMDS from the ECS task should be removed to help maintain isolation between host and container.

Picturing this configuration in the Capital One breach, there are some variables to consider when figuring out if this type of ECS workload would provide any protection against the SSRF attack. If IMDSv1 is used on the EC2, and access to IMDS is not restricted, then an SSRF attack on a vulnerable container would lead to exfiltrated credentials with elevated privileges. However, if access to IMDS from the ECS task was disabled, then the attacker would be back to trying to access the container’s metadata service, which would require leaking an environment variable containing the dynamic credential endpoint.

Restricting Access to Metadata Service

In this context, we are referring to restricting access to the IMDS on the EC2 in environments where containers have some access to the underlying host (such as with the EC2 launch type in ECS and EKS). Restricting access to the container metadata service wouldn’t make sense, since the container needs this endpoint to get a set of valid credentials for itself to make AWS API calls.

In ECS, the recommendations from AWS depend on the networking mode being used. If the awsvpc networking mode is configured, you can set ECS_AWSVPC_BLOCK_IMDS=true in the ECS configuration file in /etc/ecs/ecs.config. If bridge mode is being used, then the recommendation is to use iptables to block network traffic from the docker0 bridge.
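Here is a minimal sketch of both options. The helper name block_imds_for_awsvpc is hypothetical, and the iptables rule in the comment is one form that appears in AWS guidance; treat it as an assumption to check against the current documentation for your AMI. Both need root on the container instance, and the agent most likely needs a restart to pick up the config change.

```shell
# awsvpc mode: add the agent setting once, without duplicating it.
# block_imds_for_awsvpc is a hypothetical helper; the path argument
# defaults to the agent config file named in the text.
block_imds_for_awsvpc() {
  conf="${1:-/etc/ecs/ecs.config}"
  if ! grep -q '^ECS_AWSVPC_BLOCK_IMDS=' "$conf" 2>/dev/null; then
    echo 'ECS_AWSVPC_BLOCK_IMDS=true' >> "$conf"
  fi
}

# bridge mode: drop traffic from the Docker bridge to IMDS
# (shown as a comment, not executed here):
#   iptables --insert DOCKER-USER 1 --in-interface docker+ \
#     --destination 169.254.169.254/32 --jump DROP
```

The function is idempotent on purpose, so it can sit in instance user data or a configuration management run without stacking duplicate lines.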

In EKS, the recommendations are to require the underlying EC2 instance to use IMDSv2 only and to set http-put-response-hop-limit to 1, which will result in a network timeout if the pod tries to hit the IMDS service on the EC2. It’s also recommended to use IAM Roles for Service Accounts (IRSA) to delegate IAM access to specific pods, rather than relying on the permissions of the worker node.
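The hop limit works because the response to the token PUT is sent with an IP TTL equal to that limit. With a value of 1, the response cannot survive the extra network hop into a pod's network namespace, so the pod never receives a token. A sketch of applying this per instance follows. limit_imds_hops is a hypothetical wrapper, the instance ID is a placeholder, and in practice you would probably set these options in the node group's launch template instead.

```shell
# Require IMDSv2 and cap the token response at a single network hop.
# limit_imds_hops is a hypothetical wrapper; pass your own instance ID.
# With DRY_RUN=1 the command is printed instead of being sent to AWS.
limit_imds_hops() {
  instance_id="$1"
  set -- aws ec2 modify-instance-metadata-options \
    --instance-id "$instance_id" \
    --http-tokens required \
    --http-put-response-hop-limit 1
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage (placeholder instance ID):
#   DRY_RUN=1 limit_imds_hops i-0123456789abcdef0
```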

These actions are not required if the Fargate offerings for these container orchestration systems are used, as serverless workloads are not able to interact with the host they are running on.

Takeaways

In this post, we took a deeper look at the metadata services running on traditional EC2 machines and on container services. The guiding question we started with was: How important is it to restrict access to the metadata service in container-based workloads? Based on our walkthrough, there are several key takeaways to consider:

There is a distinct difference between the metadata service on an EC2 (IMDS) and the metadata service present on containers. IMDS provides a static endpoint, with some differences in constructing the request depending on whether v1 or v2 is being used. The container metadata service provides a dynamic endpoint which is injected as an environment variable at runtime.
Serverless workloads don’t have the ability to access IMDS on the host they run on, eliminating this path to privilege escalation. However, self-managed container workloads (ECS/EKS) using the EC2 launch type have access to both their container metadata service and their host’s IMDS.
Both IMDSv2 and the container metadata service provide some mitigations against a common SSRF vulnerability. IMDSv2 leverages a session with a token received from a PUT request, while the container metadata service requires leaking of an environment variable. Both approaches raise the barrier for exfiltrating credentials from the metadata service.
IMDSv2 is not a silver bullet to prevent IAM privilege escalations from container to host. Containers should have their access to their host’s IMDS removed when applicable.

To answer our original question of whether it’s important to lock down interactions with metadata services in a running container, it all depends on the types of workloads that we are trying to secure. If our workloads are all running on Fargate, the attack surface is smaller and we can spend more of our time focusing on fixing vulnerabilities in the application layer. If our workloads aren’t serverless and we have to manage hosts, the attack surface grows and we need to carefully consider how we restrict access to IMDS to contain the blast radius of a successful exploit.

Thanks for reading! Please drop any questions or notes in the comment section, and stay tuned for future releases from our blog.

References

2021 Cloud Breach Review: https://blog.christophetd.fr/cloud-security-breaches-and-vulnerabilities-2021-in-review/
Threat Actors Leveraging IMDS: https://sysdig.com/blog/teamtnt-aws-credentials/
Capital One Data Breach Deep Dive: https://blog.appsecco.com/an-ssrf-privileged-aws-keys-and-the-capital-one-breach-4c3c2cded3af
EC2 IMDS: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
Container Metadata Service: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-metadata-endpoint-v4.html
AWS announcement of IMDSv2: https://aws.amazon.com/blogs/security/defense-in-depth-open-firewalls-reverse-proxies-ssrf-vulnerabilities-ec2-instance-metadata-service/
AWS ecs-exec: https://aws.amazon.com/blogs/containers/new-using-amazon-ecs-exec-access-your-containers-fargate-ec2/
Restricting IMDS on ECS: https://aws.amazon.com/premiumsupport/knowledge-center/ecs-container-ec2-metadata/
EKS IMDS Best Practices: https://aws.github.io/aws-eks-best-practices/security/docs/iam/#restrict-access-to-the-instance-profile-assigned-to-the-worker-node
Awesome writeup on attacking the metadata service: https://pumasecurity.io/resources/blog/cloud-security-instance-metadata/