Important AWS EKS Terminologies

Understanding the key terms in Amazon EKS is essential for working with the platform effectively. For each term below, we describe an edge case you might encounter in real-world use, how to handle it, and a possible attack scenario.

  • 1. Cluster

    An EKS Cluster is the core of your EKS environment, containing all the resources needed to run your Kubernetes workloads.

    • Edge Case: Cluster Not Accessible:
      • If the control plane is down or misconfigured, the cluster might become inaccessible. You won’t be able to interact with Kubernetes resources using the kubectl command, and the Kubernetes API may not respond.
    • Explanation:
      • Always configure proper network access (public or private endpoints) to your cluster.
      • Ensure that IAM roles are properly set up for kubectl access.
    • Possible Attack Scenario: Misconfigured network access to the control plane, or misconfigured IAM roles, can expose the cluster to unauthorized users.
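
The endpoint advice above can be turned into a quick check. The following is a minimal sketch, assuming the input dictionary has the shape of the resourcesVpcConfig block that the EKS DescribeCluster API returns (endpointPublicAccess, endpointPrivateAccess, publicAccessCidrs). The function name and the warning texts are illustrative, not part of any AWS tooling.

```python
from typing import Dict, List


def endpoint_findings(vpc_config: Dict) -> List[str]:
    """Return warnings for an EKS endpoint access configuration (illustrative helper)."""
    findings = []
    public = vpc_config.get("endpointPublicAccess", False)
    private = vpc_config.get("endpointPrivateAccess", False)
    cidrs = vpc_config.get("publicAccessCidrs", [])

    # A public endpoint without a CIDR allow-list is reachable from anywhere.
    if public and "0.0.0.0/0" in cidrs:
        findings.append("public endpoint is open to 0.0.0.0/0")
    # Without the private endpoint, in-VPC clients also use the public one.
    if public and not private:
        findings.append("private endpoint disabled: in-VPC traffic uses the public endpoint")
    return findings


example = {
    "endpointPublicAccess": True,
    "endpointPrivateAccess": False,
    "publicAccessCidrs": ["0.0.0.0/0"],
}
for finding in endpoint_findings(example):
    print(finding)
```

Network reachability is only half of the picture: a caller who can reach the endpoint still needs valid IAM credentials that the cluster maps to a Kubernetes identity.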

  • 2. Node

    A Node is a worker machine, typically an EC2 instance, that runs your application pods.

    • Edge Case: Node Not Joining the Cluster:

      • Sometimes, a node might fail to join the cluster, leading to resource shortages. This can happen due to misconfigured security groups, missing IAM roles, or incorrect bootstrap scripts.
    • Explanation:

      • Ensure correct IAM roles and security groups for worker nodes.
      • Always validate the bootstrap process for each node.
    • Possible Attack Scenario: If IAM roles are not attached properly or security groups block communication between nodes and the control plane, nodes won't join.
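
One of the checks above, missing IAM permissions on the node role, can be sketched as a simple set comparison. This example assumes the node role is expected to carry the AWS managed policies AmazonEKSWorkerNodePolicy, AmazonEC2ContainerRegistryReadOnly, and AmazonEKS_CNI_Policy. Your setup may differ (for example, the CNI policy is sometimes attached to a separate role instead), so treat the required set as an assumption to adjust.

```python
# Assumed baseline for a node role; adjust to match your own setup.
REQUIRED_NODE_POLICIES = {
    "AmazonEKSWorkerNodePolicy",
    "AmazonEC2ContainerRegistryReadOnly",
    "AmazonEKS_CNI_Policy",
}


def missing_node_policies(attached_policy_names):
    """Return the required policy names that are not attached, sorted."""
    return sorted(REQUIRED_NODE_POLICIES - set(attached_policy_names))


# Hypothetical role with only one of the expected policies attached.
print(missing_node_policies(["AmazonEKSWorkerNodePolicy"]))
```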

  • 3. Pod

    A Pod is the smallest deployable unit in Kubernetes and can contain one or more containers.

    • Edge Case: Pod Not Scheduling:

      • A pod may not schedule on a node due to insufficient resources (CPU or memory), taints, or affinity rules.
    • Explanation:

      • Always monitor resource utilization and ensure enough capacity for new pods.
      • Check taints and tolerations that might block certain pods from being scheduled on nodes.

    • Possible Attack Scenario: An attacker who exhausts node resources or tampers with scheduling settings (taints, affinity rules) can prevent pods from running.
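
To make the taint and toleration check concrete, here is a simplified sketch of the matching rule. It is not the scheduler's actual code: it only covers the Equal and Exists operators and treats NoSchedule and NoExecute taints as blocking. The dictionaries mirror the fields found in node and pod specs; the function names are illustrative.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint (simplified)."""
    # An empty effect on the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # An empty key with Exists tolerates every taint.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    return (toleration.get("key") == taint["key"]
            and toleration.get("value", "") == taint.get("value", ""))


def blocking_taints(pod_tolerations, node_taints):
    """Return the NoSchedule/NoExecute taints that no toleration covers."""
    return [
        taint for taint in node_taints
        if taint["effect"] in ("NoSchedule", "NoExecute")
        and not any(tolerates(tol, taint) for tol in pod_tolerations)
    ]


node_taints = [
    {"key": "dedicated", "value": "gpu", "effect": "NoSchedule"},
    {"key": "spot", "effect": "PreferNoSchedule"},  # soft preference, never blocks
]
print(blocking_taints([], node_taints))
```

A pod with no tolerations is blocked by the dedicated=gpu taint in this example, while the PreferNoSchedule taint only lowers the node's priority.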

  • 4. Control Plane

    The Control Plane in EKS is managed by AWS and includes critical components like the API server, scheduler, and etcd.

    • Edge Case: Control Plane Not Accessible:

      • You might face scenarios where the control plane is not accessible due to network configuration issues or IAM role misconfigurations. This can make your cluster unreachable.
    • Explanation:

      • Always ensure that the control plane endpoint (public or private) is configured correctly.
      • Verify IAM roles and policies to allow access to the Kubernetes API server.
    • Possible Attack Scenario: An attacker who controls the VPC or endpoint access settings, or who modifies IAM policies to deny access, can cut you off from the control plane.
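
When the control plane looks unreachable, it helps to separate network problems from IAM problems. The sketch below only tests whether a TCP connection to the API endpoint can be opened, using Python's standard socket module. It says nothing about authentication, and the helper name is an assumption made for illustration.

```python
import socket


def can_reach(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

You would call it with the hostname of your own cluster endpoint, for example can_reach("<your-cluster-endpoint-hostname>"). If it returns False, look at endpoint access settings, security groups, and routing first. If it returns True but kubectl still fails, IAM or RBAC is the more likely cause.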

  • 5. Kubelet

    The kubelet is an agent that runs on each node, ensuring the containers are running as expected.

    • Edge Case: Kubelet Not Communicating with the API Server:

      • If the kubelet fails to communicate with the API server, the node might become NotReady, meaning it won’t accept new pods.
    • Explanation:

      • Check network connectivity between the node and the control plane.
      • Ensure the kubelet service is running and has sufficient permissions.
    • Possible Attack Scenario: The kubelet serves its API on port 10250 (HTTPS) and, if enabled, a read-only API on port 10255 (HTTP, no authentication). If a node has a public IP and its security group allows 0.0.0.0/0, these ports may be reachable from the internet. Separately, if the nodes/proxy permission is granted, an attacker holding it can talk to the kubelet API and access any pod on that node.
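
The exposure described above can be spotted by scanning a node security group's inbound rules. This is a minimal sketch, assuming the rules are dictionaries shaped like the IpPermissions entries returned by the EC2 DescribeSecurityGroups API (IpProtocol, FromPort, ToPort, IpRanges, Ipv6Ranges). The function name is illustrative.

```python
KUBELET_PORTS = (10250, 10255)
OPEN_CIDRS = ("0.0.0.0/0", "::/0")


def exposed_kubelet_ports(ip_permissions):
    """Return the kubelet ports that the inbound rules open to the whole internet."""
    exposed = set()
    for perm in ip_permissions:
        cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
        if not any(cidr in OPEN_CIDRS for cidr in cidrs):
            continue
        if perm.get("IpProtocol") == "-1":
            # "-1" means all protocols and all ports.
            exposed.update(KUBELET_PORTS)
            continue
        if perm.get("IpProtocol") != "tcp":
            continue
        low, high = perm.get("FromPort", 0), perm.get("ToPort", 65535)
        exposed.update(port for port in KUBELET_PORTS if low <= port <= high)
    return sorted(exposed)


rules = [
    {"IpProtocol": "tcp", "FromPort": 10250, "ToPort": 10250,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
    {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
     "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
]
print(exposed_kubelet_ports(rules))
```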

  • 6. Kubernetes API Server

    The API Server is the central communication hub of the Kubernetes cluster.

    • Edge Case: API Server Rate Limits:

      • If there are too many requests, the API server might throttle clients, leading to HTTP 429 (Too Many Requests) errors or delayed responses.
    • Explanation:

      • Use rate limiting and caching for monitoring tools to avoid overloading the API server.
      • Monitor API server logs for signs of rate limiting.
    • Possible Attack Scenario: If the cluster config (kubeconfig) is leaked or a user is compromised, an attacker can use a publicly exposed API server to access the cluster.
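
On the client side, one way to avoid piling onto an overloaded API server is to retry with exponential backoff. The sketch below assumes throttled requests come back with HTTP 429 (Too Many Requests) and that request is any zero-argument callable returning a (status_code, body) tuple. Both the helper and its parameters are illustrative and not part of a Kubernetes client library.

```python
import random
import time


def call_with_backoff(request, max_attempts=5, base_delay=0.5,
                      retry_on=(429,), sleep=time.sleep):
    """Call request() until its status is not in retry_on or attempts run out."""
    for attempt in range(max_attempts):
        status, body = request()
        if status not in retry_on:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with jitter: 0.5s, 1s, 2s, ... plus up to 100 ms.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
    # Every attempt was throttled; return the last response to the caller.
    return status, body
```

The sleep parameter exists so the delay can be replaced in tests. The jitter keeps many clients from retrying at the same instant.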

  • 7. IAM (Identity and Access Management)

    IAM is used to manage who can access your cluster and what actions they can perform.

    • Edge Case: IAM Role Misconfiguration:

      • If an IAM role is missing required permissions, users might not be able to access the EKS cluster or resources within it, leading to access denied errors.
    • Explanation:

      • Always ensure that IAM roles are properly configured with the correct policies.
      • Regularly audit your IAM roles to prevent over-permissive access.
    • Possible Attack Scenario: Incorrectly configured IAM policies can lead to privilege escalation. This is the risk from the AWS perspective.
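
A basic audit for over-permissive access is to look for Allow statements that combine wildcard actions with a wildcard resource. The following sketch works on a standard IAM policy document parsed into a dictionary. The helper names are illustrative, and a real audit would also need to consider conditions, NotAction, and resource-level scoping.

```python
def as_list(value):
    """IAM allows a single string or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def overly_broad_statements(policy_document):
    """Return the Sid of each Allow statement with wildcard actions on Resource '*'."""
    findings = []
    for stmt in as_list(policy_document.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = as_list(stmt.get("Action", []))
        resources = as_list(stmt.get("Resource", []))
        wildcard_actions = [a for a in actions if a == "*" or a.endswith(":*")]
        if wildcard_actions and "*" in resources:
            findings.append(stmt.get("Sid", "<no Sid>"))
    return findings


policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "AdminEverything", "Effect": "Allow", "Action": "*", "Resource": "*"},
        {"Sid": "ReadCluster", "Effect": "Allow",
         "Action": ["eks:DescribeCluster"], "Resource": "*"},
    ],
}
print(overly_broad_statements(policy))
```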

  • 8. RBAC (Role-Based Access Control)

    RBAC controls access to Kubernetes resources based on the roles assigned to users and applications.

    • Edge Case: Over-Permissioned Roles:

      • A common misconfiguration is assigning overly broad permissions through ClusterRoleBindings, leading to privilege escalation risks.
    • Explanation:

      • Implement the principle of least privilege when configuring RBAC.
      • Regularly review RBAC configurations to ensure that roles are not over-permissioned.
    • Possible Attack Scenario: Misconfigured RBAC grants excessive access to service accounts or users. This is the risk from the cluster perspective.
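
An RBAC review can start with the most dangerous case: ClusterRoleBindings that hand out cluster-admin. The sketch below assumes its input is the items list from `kubectl get clusterrolebindings -o json`, parsed into Python dictionaries. Which subjects count as risky is a judgment call; here service accounts and the built-in groups system:authenticated and system:unauthenticated are flagged as an illustration.

```python
RISKY_GROUPS = {"system:authenticated", "system:unauthenticated"}


def risky_cluster_admin_bindings(bindings):
    """Return (binding name, subject kind, subject name) for risky cluster-admin grants."""
    findings = []
    for binding in bindings:
        if binding.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        # "subjects" may be missing or null on a binding with no subjects.
        for subject in binding.get("subjects") or []:
            kind, name = subject.get("kind"), subject.get("name")
            if kind == "ServiceAccount" or (kind == "Group" and name in RISKY_GROUPS):
                findings.append((binding["metadata"]["name"], kind, name))
    return findings


# Hypothetical bindings for illustration only.
bindings = [
    {"metadata": {"name": "ci-admin"},
     "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
     "subjects": [{"kind": "ServiceAccount", "name": "ci", "namespace": "build"}]},
    {"metadata": {"name": "viewers"},
     "roleRef": {"kind": "ClusterRole", "name": "view"},
     "subjects": [{"kind": "Group", "name": "system:authenticated"}]},
]
print(risky_cluster_admin_bindings(bindings))
```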

  • 9. Fargate

    Fargate allows you to run containers in EKS without managing the underlying infrastructure.

    • Edge Case: Fargate Pod Limits:

      • Fargate has certain limitations, such as pod memory and CPU limits. If a pod exceeds these limits, it may not schedule or could be evicted.
    • Explanation:

      • Ensure your pod's resource requests and limits are within Fargate's allowed range.
      • Monitor resource usage to avoid evictions.
    • Possible Attack Scenario: Because Fargate is serverless, resource requests that exceed its limits cannot be satisfied. An attacker with remote code execution (RCE) on ECS Fargate can query the http://169.254.170.2/v2/metadata/ endpoint (source: Stack Overflow).
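
Checking requests against the Fargate ceiling before deploying can be sketched as follows. The default maximums (16 vCPU and 120 GiB) are assumptions to verify against the current AWS documentation, and the check ignores the fixed vCPU and memory combinations Fargate rounds up to. The parsers only handle the common Kubernetes quantity forms shown in the comments.

```python
MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity such as '500m' or '2' to vCPUs."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory_gib(quantity):
    """Convert a memory quantity such as '512Mi', '2Gi', or plain bytes to GiB."""
    for suffix, factor in MEMORY_UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor / 2**30
    return int(quantity) / 2**30


def exceeds_fargate(cpu, memory, max_vcpu=16.0, max_memory_gib=120.0):
    """Return a list of problems; the default maximums are assumptions."""
    problems = []
    vcpu, mem = parse_cpu(cpu), parse_memory_gib(memory)
    if vcpu > max_vcpu:
        problems.append(f"cpu request {vcpu} vCPU exceeds {max_vcpu}")
    if mem > max_memory_gib:
        problems.append(f"memory request {mem} GiB exceeds {max_memory_gib}")
    return problems


print(exceeds_fargate("500m", "1Gi"))
print(exceeds_fargate("32", "256Gi"))
```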

Credits:

  • https://aws.amazon.com/eks/
  • https://securitylabs.datadoghq.com/articles/amazon-eks-attacking-securing-cloud-identities/