Kubernetes Networking — Services, Ingress & Network Policies
Where This Fits
You are the central platform team at an enterprise bank. Your EKS/GKE clusters sit in
Workload Account VPCs that are spokes off the Network Hub. Every pod gets an IP from
the VPC CIDR (EKS) or alias IP range (GKE). All north-south traffic passes through the
Network Hub for inspection. This page covers everything inside and at the edge of the cluster.
1. The Kubernetes Networking Model
Kubernetes imposes three fundamental networking requirements:
- Every pod gets its own IP address — no NAT between pods
- All pods can communicate with all other pods without NAT (flat network)
- All nodes can communicate with all pods without NAT
The Container Networking Interface (CNI) plugin implements these requirements on each cloud.
How the CNI Works
EKS — AWS VPC CNI
The AWS VPC CNI gives each pod a real VPC IP — a secondary IP on one of the node's ENIs, allocated from your VPC subnet.
How it works:
- Each node gets a primary ENI + additional ENIs based on instance type
- Each ENI gets multiple secondary IPs from the subnet CIDR
- Pods are assigned these secondary IPs — they are real VPC IPs
- The aws-node DaemonSet manages the IP warm pool
IP capacity per node:
| Instance Type | Max ENIs | IPs/ENI | Max Pods |
|---|---|---|---|
| m5.large | 3 | 10 | 29 |
| m5.xlarge | 4 | 15 | 58 |
| m5.2xlarge | 4 | 15 | 58 |
| m5.4xlarge | 8 | 30 | 234 |
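The Max Pods column is not arbitrary — it follows from the ENI math. A quick sanity check of the table (a sketch, not official AWS tooling; each ENI's primary IP is reserved, and 2 is added for host-network pods):

```python
def max_pods(max_enis: int, ips_per_eni: int) -> int:
    # Each ENI's primary IP is reserved for the ENI itself; the +2 covers
    # host-network pods (aws-node, kube-proxy) that don't consume VPC IPs.
    return max_enis * (ips_per_eni - 1) + 2

assert max_pods(3, 10) == 29   # m5.large
assert max_pods(4, 15) == 58   # m5.xlarge / m5.2xlarge
assert max_pods(8, 30) == 234  # m5.4xlarge
```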
GKE — VPC-native Networking
GKE uses alias IP ranges to give pods IPs from a secondary range on the VPC subnet.
How it works:
- Each node gets a primary IP from the node subnet
- Pods get IPs from a secondary IP range (alias range) on the same subnet
- These are real VPC IPs — no overlay, no encapsulation
- Google’s network fabric routes alias IPs natively
VPC-native cluster IP ranges — key settings:
- Maximum pods per node: 110 (default) / 64 / 32, configurable at node-pool creation
- Pod IP range per node: /24 (for 110 max pods) or /26 (for 32 max pods)

Dataplane V2 (Cilium-based):
- eBPF-based networking — faster than iptables
- Built-in NetworkPolicy enforcement (no Calico needed)
- Native visibility and observability
- Enabled by default on new Autopilot clusters
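The max-pods setting drives how much of the pods secondary range each node consumes: GKE reserves roughly twice max-pods per node, rounded up to a power of two. A rough sizing sketch (illustrative helper names, not a GKE API):

```python
import math

def node_pod_cidr_size(max_pods: int) -> int:
    """Smallest prefix length whose block holds 2 * max_pods addresses."""
    return 32 - math.ceil(math.log2(2 * max_pods))

def max_nodes(pods_range_prefix: int, per_node_prefix: int) -> int:
    """How many per-node pod CIDRs fit in the cluster's pods secondary range."""
    return 2 ** (per_node_prefix - pods_range_prefix)

assert node_pod_cidr_size(110) == 24  # matches the /24 above
assert node_pod_cidr_size(32) == 26   # matches the /26 above
# A /16 pods secondary range with a /24 per node caps the cluster at 256 nodes
assert max_nodes(16, 24) == 256
```

This is why the pods secondary range is usually the largest CIDR you allocate to a cluster.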
```yaml
# Enable prefix delegation — assigns /28 prefixes instead of individual IPs
# Each prefix = 16 IPs, dramatically increases pod density
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-node
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: aws-node
          env:
            - name: ENABLE_PREFIX_DELEGATION
              value: "true"
            - name: WARM_PREFIX_TARGET
              value: "1"
# With prefix delegation, m5.large goes from 29 to 110 pods
```

```yaml
# ENIConfig — route pod traffic through a separate subnet
# Use case: nodes in public subnet, pods in private subnet
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1a
spec:
  securityGroups:
    - sg-0a1b2c3d4e5f6a7b8 # Pod security group
  subnet: subnet-0abcd1234pod # Dedicated pod subnet
---
# Set on the aws-node DaemonSet:
#   AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true
#   ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone
```

```yaml
# SecurityGroupPolicy — assign SGs directly to pods (not just nodes)
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: payments-sg-policy
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-processor
  securityGroups:
    groupIds:
      - sg-0payments1234abcd # SG that allows port 5432 to RDS
      - sg-0baseline5678efgh # Baseline SG for all pods
```

```hcl
# EKS add-on for VPC CNI with prefix delegation
resource "aws_eks_addon" "vpc_cni" {
  cluster_name                = aws_eks_cluster.main.name
  addon_name                  = "vpc-cni"
  addon_version               = "v1.18.5-eksbuild.1"
  resolve_conflicts_on_update = "OVERWRITE"

  configuration_values = jsonencode({
    env = {
      ENABLE_PREFIX_DELEGATION = "true"
      WARM_PREFIX_TARGET       = "1"
    }
    enableNetworkPolicy = "true" # Native network policy support (since 2023)
  })
}
```

```hcl
resource "google_container_cluster" "main" {
  name     = "prod-gke-01"
  location = "me-central1"
  project  = var.workload_project_id

  # VPC-native networking (alias IPs)
  networking_mode = "VPC_NATIVE"
  network         = data.google_compute_network.shared_vpc.self_link
  subnetwork      = data.google_compute_subnetwork.gke_subnet.self_link

  ip_allocation_policy {
    cluster_secondary_range_name  = "pods"     # Name of secondary range
    services_secondary_range_name = "services" # Name of secondary range
  }

  # Dataplane V2 — eBPF-based networking (includes NetworkPolicy)
  datapath_provider = "ADVANCED_DATAPATH" # Enables Dataplane V2 (Cilium)

  # Private cluster — nodes have no public IPs
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = true # Control plane also private
    master_ipv4_cidr_block  = "172.16.0.0/28"
  }

  # Authorized networks for control plane access
  master_authorized_networks_config {
    cidr_blocks {
      cidr_block   = "10.0.0.0/8" # Internal network only
      display_name = "internal"
    }
  }
}
```

2. Kubernetes Services
Services provide stable endpoints for a set of pods. There are four types:
ClusterIP (default)
```yaml
# Internal-only service — accessible within the cluster
apiVersion: v1
kind: Service
metadata:
  name: payment-service
  namespace: payments
spec:
  type: ClusterIP # Default — internal only
  selector:
    app: payment-processor
  ports:
    - name: http
      port: 8080       # Service port (what clients connect to)
      targetPort: 8080 # Container port
    - name: grpc
      port: 9090
      targetPort: 9090
```

NodePort
```yaml
# Exposes the service on each node's IP at a static port (30000-32767)
apiVersion: v1
kind: Service
metadata:
  name: payment-service-np
  namespace: payments
spec:
  type: NodePort
  selector:
    app: payment-processor
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30080 # Fixed port on every node
```

LoadBalancer
```yaml
# Creates a cloud load balancer (NLB on EKS, TCP/UDP LB on GKE)
apiVersion: v1
kind: Service
metadata:
  name: payment-service-lb
  namespace: payments
  annotations:
    # EKS — creates NLB (default with AWS LB Controller)
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
    # GKE — creates internal TCP load balancer
    # cloud.google.com/l4-rbs: "enabled"
    # networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: payment-processor
  ports:
    - port: 443
      targetPort: 8080
```

ExternalName
```yaml
# DNS alias to an external service — no proxying, just a CNAME
apiVersion: v1
kind: Service
metadata:
  name: external-postgres
  namespace: payments
spec:
  type: ExternalName
  externalName: prod-db.cluster-abc123.us-east-1.rds.amazonaws.com
# Any pod calling external-postgres.payments.svc.cluster.local
# gets a CNAME response pointing to the RDS endpoint
```

Headless Services
```yaml
# Headless service — no ClusterIP, DNS returns individual pod IPs
# Used for StatefulSets where clients need to address specific pods
apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
  namespace: streaming
spec:
  clusterIP: None # This makes it headless
  selector:
    app: kafka
  ports:
    - port: 9092
      targetPort: 9092
# DNS records created:
# kafka-headless.streaming.svc.cluster.local → returns all pod IPs
# kafka-0.kafka-headless.streaming.svc.cluster.local → 10.0.1.15
# kafka-1.kafka-headless.streaming.svc.cluster.local → 10.0.2.22
# kafka-2.kafka-headless.streaming.svc.cluster.local → 10.0.3.18
```

3. DNS in Kubernetes — CoreDNS
CoreDNS is the cluster DNS server. Every pod’s /etc/resolv.conf points to CoreDNS.
Pod /etc/resolv.conf:
```
nameserver 10.96.0.10  # ← CoreDNS ClusterIP
search payments.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```

DNS Resolution Flow
Pod tries to resolve “payment-service”:
```
1. ndots:5 means any name with fewer than 5 dots gets the search domains appended
2. Tries: payment-service.payments.svc.cluster.local ← FOUND (if same namespace)
3. Tries: payment-service.svc.cluster.local
4. Tries: payment-service.cluster.local
5. Tries: payment-service (absolute lookup)
```
```
For external names like "api.stripe.com" (2 dots < 5):
1. Tries: api.stripe.com.payments.svc.cluster.local ← NOT FOUND
2. Tries: api.stripe.com.svc.cluster.local ← NOT FOUND
3. Tries: api.stripe.com.cluster.local ← NOT FOUND
4. Tries: api.stripe.com ← FOUND (external DNS)
```

CoreDNS Configuration
```yaml
# CoreDNS Corefile (stored in ConfigMap "coredns" in kube-system)
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
    # Forward bank internal domains to on-prem DNS
    bank.internal:53 {
        errors
        cache 30
        forward . 10.100.0.53 10.100.0.54 {
            force_tcp
        }
    }
```

Service DNS Names
Section titled “Service DNS Names”| Service Type | DNS Name | Resolves To |
|---|---|---|
| ClusterIP | svc.ns.svc.cluster.local | ClusterIP (virtual IP) |
| Headless | svc.ns.svc.cluster.local | Set of Pod IPs (A records) |
| Headless (StatefulSet) | pod-0.svc.ns.svc.cluster.local | Individual pod IP |
| ExternalName | svc.ns.svc.cluster.local | CNAME to external host |
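The ndots:5 behavior described above can be sketched as a simple candidate-list expansion (a simplified model of the resolver's search logic, with hypothetical helper names):

```python
def candidates(name, search, ndots=5):
    """Simplified: names with >= ndots dots (or a trailing dot) go straight
    to an absolute lookup; everything else walks the search list first."""
    if name.endswith(".") or name.count(".") >= ndots:
        return [name.rstrip(".")]
    return [f"{name}.{d}" for d in search] + [name]

SEARCH = ["payments.svc.cluster.local", "svc.cluster.local", "cluster.local"]

# A same-namespace service resolves on the first try
assert candidates("payment-service", SEARCH)[0] == \
    "payment-service.payments.svc.cluster.local"

# An external name (2 dots < 5) wastes three cluster-internal lookups first
assert candidates("api.stripe.com", SEARCH) == [
    "api.stripe.com.payments.svc.cluster.local",
    "api.stripe.com.svc.cluster.local",
    "api.stripe.com.cluster.local",
    "api.stripe.com",
]
```

Appending a trailing dot ("api.stripe.com.") or lowering ndots skips the wasted lookups entirely.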
4. Ingress
Ingress exposes HTTP/HTTPS routes from outside the cluster to services inside.
EKS Ingress — AWS Load Balancer Controller
On EKS, the AWS Load Balancer Controller creates ALBs (for Ingress) and NLBs (for LoadBalancer Services). It is not installed by default — deploy it via its Helm chart with an IRSA role.
GKE Ingress — Google Cloud Load Balancing
GKE natively integrates with Google Cloud Load Balancing. No separate controller is needed.
NEG-backed services (Network Endpoint Groups) are GKE’s killer feature — the load balancer sends traffic directly to pod IPs, skipping the extra hop through NodePorts and kube-proxy. This gives you lower latency and more accurate health checking. Always enable NEGs for production services.
```hcl
# IRSA role for the LB Controller
module "aws_lb_controller_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "5.39.0"

  role_name = "${var.cluster_name}-aws-lb-controller"

  attach_load_balancer_controller_policy = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-load-balancer-controller"]
    }
  }
}

# Helm release
resource "helm_release" "aws_lb_controller" {
  name       = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"
  namespace  = "kube-system"
  version    = "1.8.1"

  set {
    name  = "clusterName"
    value = module.eks.cluster_name
  }
  set {
    name  = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
    value = module.aws_lb_controller_irsa.iam_role_arn
  }
  set {
    name  = "vpcId"
    value = module.vpc.vpc_id
  }
}
```

```yaml
# Ingress with Google Cloud LB
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  namespace: api
  annotations:
    kubernetes.io/ingress.class: "gce" # External HTTP(S) LB
    # kubernetes.io/ingress.class: "gce-internal" # Internal HTTP(S) LB
    kubernetes.io/ingress.global-static-ip-name: "api-bank-ip"
    networking.gke.io/managed-certificates: "api-bank-cert"
    networking.gke.io/v1beta1.FrontendConfig: "api-frontend-config"
spec:
  rules:
    - host: api.bank.com
      http:
        paths:
          - path: /payments
            pathType: Prefix
            backend:
              service:
                name: payment-service
                port:
                  number: 8080
          - path: /accounts
            pathType: Prefix
            backend:
              service:
                name: account-service
                port:
                  number: 8080
---
# Managed SSL certificate
apiVersion: networking.gke.io/v1
kind: ManagedCertificate
metadata:
  name: api-bank-cert
  namespace: api
spec:
  domains:
    - api.bank.com
---
# Frontend config (SSL policy, redirects)
apiVersion: networking.gke.io/v1beta1
kind: FrontendConfig
metadata:
  name: api-frontend-config
  namespace: api
spec:
  sslPolicy: bank-ssl-policy
  redirectToHttps:
    enabled: true
    responseCodeName: MOVED_PERMANENTLY_DEFAULT
```

```yaml
# NEG annotation — load balancer targets pod IPs directly, bypassing kube-proxy
apiVersion: v1
kind: Service
metadata:
  name: payment-service
  namespace: payments
  annotations:
    cloud.google.com/neg: '{"ingress": true}' # Create NEG for Ingress
    # cloud.google.com/neg: '{"exposed_ports":{"8080":{"name":"payment-neg"}}}'
spec:
  type: ClusterIP # NEG works with ClusterIP — no NodePort needed
  selector:
    app: payment-processor
  ports:
    - port: 8080
      targetPort: 8080
```

5. Gateway API — The Future of Ingress
Gateway API is the successor to Ingress, with richer routing, a role-oriented resource model, and multi-protocol support. As of 2025 it is supported by managed controllers on both EKS and GKE.
```yaml
# GatewayClass — managed by platform team
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: aws-alb
spec:
  controllerName: gateway.k8s.aws/alb # AWS LB Controller v2.7+
---
# Gateway — one shared gateway for multiple teams
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-external-gateway
  namespace: gateway-system
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:111111111111:certificate/abc123
spec:
  gatewayClassName: aws-alb
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: api-bank-cert
      allowedRoutes:
        namespaces:
          from: Selector
          selector:
            matchLabels:
              gateway-access: "true" # Only labelled namespaces can attach
---
# HTTPRoute — defined by payment team in their namespace
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: payment-routes
  namespace: payments
spec:
  parentRefs:
    - name: shared-external-gateway
      namespace: gateway-system
  hostnames:
    - "api.bank.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /payments
      backendRefs:
        - name: payment-service
          port: 8080
          weight: 100
    # Header-based routing to v2 (canary)
    - matches:
        - path:
            type: PathPrefix
            value: /payments
          headers:
            - name: x-canary
              value: "true"
      backendRefs:
        - name: payment-service-v2
          port: 8080
```

```yaml
# GatewayClass — GKE provides several built-in classes:
#   gke-l7-global-external-managed   — Global external Application LB
#   gke-l7-regional-external-managed — Regional external Application LB
#   gke-l7-rilb                      — Regional internal Application LB
#   gke-l7-gxlb                      — Classic global external (legacy)
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-external-gateway
  namespace: gateway-system
  annotations:
    networking.gke.io/certmap: bank-cert-map # Certificate Manager cert map
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      allowedRoutes:
        namespaces:
          from: Selector
          selector:
            matchLabels:
              gateway-access: "true"
  addresses:
    - type: NamedAddress
      value: api-bank-global-ip # Pre-provisioned static IP
---
# HTTPRoute — payment team
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: payment-routes
  namespace: payments
spec:
  parentRefs:
    - name: shared-external-gateway
      namespace: gateway-system
  hostnames:
    - "api.bank.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /payments
      backendRefs:
        - name: payment-service
          port: 8080
      filters:
        - type: ResponseHeaderModifier
          responseHeaderModifier:
            add:
              - name: Strict-Transport-Security
                value: "max-age=31536000; includeSubDomains"
    # Traffic splitting for canary
    - matches:
        - path:
            type: PathPrefix
            value: /payments/v2
      backendRefs:
        - name: payment-service
          port: 8080
          weight: 95
        - name: payment-service-v2
          port: 8080
          weight: 5
```

Ingress vs Gateway API — When to Migrate
| Aspect | Ingress | Gateway API |
|---|---|---|
| Role separation | Single resource, one team manages all | GatewayClass/Gateway (platform), Routes (app teams) |
| Protocol support | HTTP/HTTPS only | HTTP, HTTPS, gRPC, TCP, TLS, UDP |
| Traffic splitting | Not native (annotation hacks) | Native weight-based routing |
| Header matching | Not native | Native header/query-param matching |
| Cross-namespace | Requires annotation workarounds | Native parentRef across namespaces |
| Maturity | Stable, widely adopted | GA since Gateway API v1.0 (2023), cloud support solid |
| Recommendation | Legacy clusters, simple use cases | All new deployments |
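To make the “native weight-based routing” row concrete: here is a sketch of smooth weighted round-robin, one common way a data plane can honor backendRef weights (illustrative only — actual controller implementations vary):

```python
def smooth_wrr(weights: dict, rounds: int) -> list:
    """Each round: add every backend's weight to its credit, pick the
    backend with the highest credit, subtract the total from the winner."""
    current = {b: 0 for b in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(rounds):
        for b in current:
            current[b] += weights[b]
        best = max(current, key=current.get)
        current[best] -= total
        picks.append(best)
    return picks

# A 95/5 canary split: over 100 requests, exactly 5 reach v2
picks = smooth_wrr({"payment-service": 95, "payment-service-v2": 5}, 100)
assert picks.count("payment-service") == 95
assert picks.count("payment-service-v2") == 5
```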
6. Network Policies
Network Policies are Kubernetes-native firewall rules that control pod-to-pod and pod-to-external traffic.
Default Deny — The Mandatory Baseline
```yaml
# ALWAYS start with deny-all in every namespace
# Without this, all traffic is allowed by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {} # Applies to ALL pods in the namespace
  policyTypes:
    - Ingress
    - Egress
# No ingress/egress rules = deny everything
```

Allow Specific Traffic
```yaml
# Allow payment-processor to receive traffic from api-gateway
# and make outbound calls to postgres and external payment APIs
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-processor
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-processor
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow from api-gateway namespace
    - from:
        - namespaceSelector:
            matchLabels:
              name: api-gateway
          podSelector:
            matchLabels:
              app: api-gateway
      ports:
        - protocol: TCP
          port: 8080
  egress:
    # Allow to postgres in the same namespace
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    # Allow to the external payment processor (CIDR)
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24 # Payment processor IP range
      ports:
        - protocol: TCP
          port: 443
    # Allow DNS (always needed with deny-all egress)
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

Multi-Tenant Isolation Pattern
```yaml
# Each team's namespace gets:
# 1. Deny all by default
# 2. Allow intra-namespace communication
# 3. Allow ingress from the shared ingress namespace
# 4. Allow egress to DNS and specific shared services

# Step 1: Deny all (see above)

# Step 2: Allow intra-namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {} # Empty selector = all pods in this namespace
---
# Step 3: Allow from the ingress controller namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-system
---
# Step 4: Allow DNS + shared services egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-baseline
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
      ports:
        - protocol: UDP
          port: 53
    - to:
        - namespaceSelector:
            matchLabels:
              name: monitoring
      ports:
        - protocol: TCP
          port: 9090 # Prometheus scraping
```

7. cert-manager — Automated TLS
cert-manager automates TLS certificate issuance and renewal in Kubernetes.
```yaml
# ClusterIssuer — Let's Encrypt for external, AWS Private CA for internal
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: platform-team@bank.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - dns01: # DNS-01 is required for wildcard certificates
          route53:
            region: us-east-1
---
# For internal services — AWS Private CA, via the aws-privateca-issuer plugin
# (cert-manager has no built-in ACM PCA issuer type)
apiVersion: awspca.cert-manager.io/v1beta1
kind: AWSPCAClusterIssuer
metadata:
  name: acm-pca-issuer
spec:
  arn: arn:aws:acm-pca:us-east-1:111111111111:certificate-authority/abc123
  region: us-east-1
---
# Certificate request
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-bank-cert
  namespace: api
spec:
  secretName: api-bank-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - api.bank.com
    - "*.api.bank.com"
  duration: 2160h   # 90 days
  renewBefore: 360h # Renew 15 days before expiry
```

8. ExternalDNS — Automatic DNS Records
ExternalDNS watches Ingress/Gateway/Service resources and creates DNS records automatically.
```yaml
# ExternalDNS Deployment (simplified)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: kube-system
spec:
  template:
    spec:
      serviceAccountName: external-dns # Needs IRSA/WI for Route53/Cloud DNS
      containers:
        - name: external-dns
          image: registry.k8s.io/external-dns/external-dns:v0.14.2
          args:
            - --source=ingress
            - --source=service
            - --source=gateway-httproute
            - --provider=aws # or --provider=google
            - --domain-filter=bank.com
            - --aws-zone-type=private # Only manage private hosted zones
            - --policy=upsert-only # Never delete records (safety)
            - --registry=txt
            - --txt-owner-id=eks-prod-cluster
```

Interview Scenarios
Section titled “Interview Scenarios”Scenario 1: “Explain how traffic flows from the internet to a pod in EKS vs GKE”
Answer:
EKS:
Internet → Route53 → CloudFront (optional CDN) → WAF (in Network Hub Account, associated with ALB) → ALB (created by AWS LB Controller from Ingress/Gateway) → Target Group (IP mode — targets pod IPs directly via VPC CNI) → Pod (real VPC IP, e.g., 10.0.3.47)
Detailed packet trace:
```
1. Client resolves api.bank.com → Route53 returns the ALB DNS name
2. Client connects to the ALB (HTTPS on port 443)
3. ALB terminates TLS (ACM certificate)
4. ALB evaluates Ingress rules — matches the /payments path
5. ALB forwards to the Target Group (IP target type)
6. Target Group health-checks the pod at /healthz:8080
7. Packet goes to the ENI on the node hosting the pod
8. VPC CNI routes the packet directly to the pod's veth interface
9. Pod receives the request on port 8080
```
Key points:
- With IP target type, the ALB sends traffic directly to the pod IP — no kube-proxy hop
- The pod's security group must allow traffic from the ALB's security group
- VPC CNI means the pod IP is a real VPC IP — traceable in VPC Flow Logs

GKE:
Internet → Cloud DNS → Global External Application LB → Cloud Armor (WAF — attached to the backend service) → Backend Service (NEG-backed — targets pod IPs directly) → Pod (alias IP from secondary range, e.g., 10.4.12.88)
Detailed packet trace:
```
1. Client resolves api.bank.com → Cloud DNS returns the LB's global anycast IP
2. Client connects to the nearest Google POP (anycast)
3. Google's global network routes to the region hosting the cluster
4. Application LB terminates TLS (managed certificate)
5. LB evaluates the URL map (generated from the Ingress/HTTPRoute)
6. LB forwards to the Network Endpoint Group (NEG)
7. NEG targets pod IPs directly (no kube-proxy hop)
8. Packet arrives at the pod via Google's VPC network fabric
9. Pod receives the request on port 8080
```
Key points:
- NEG-backed means LB → pod directly, no NodePort needed
- A global LB gives you anycast — clients connect to the nearest edge
- Cloud Armor policies are evaluated before traffic reaches the cluster
- The pod IP is a VPC alias IP — visible in VPC Flow Logs

Scenario 2: “Design network policies for a multi-tenant cluster — each team isolated”
Answer:
“I would implement a defense-in-depth approach with namespace-level isolation as the baseline.”
Design:
Implementation — applied via GitOps for every team namespace:
```yaml
# 1. Default deny-all (see Section 6 above)
# 2. Allow intra-namespace (see Section 6 above)
# 3. Allow from ingress (see Section 6 above)
# 4. Allow DNS egress (see Section 6 above)
```
```yaml
# 5. If Team A needs to call Team B's API (explicit cross-namespace)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-team-a
  namespace: team-b
spec:
  podSelector:
    matchLabels:
      app: shared-api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: team-a
          podSelector:
            matchLabels:
              app: team-a-client
      ports:
        - protocol: TCP
          port: 8080
```

Key interview points:
- Default deny is a must — without it, NetworkPolicy is additive only (allow rules)
- Always allow DNS egress or nothing works
- Cross-namespace access should be explicitly requested and approved via PR
- Monitoring namespace needs ingress to all namespaces for Prometheus scraping
- Enforce policies via OPA Gatekeeper — “every namespace must have default-deny”
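The additive-allow semantics in the first bullet can be sketched as a toy evaluator (heavily simplified — selectors reduced to namespace/app pairs, hypothetical helper, ignores podSelector and Egress):

```python
def ingress_allowed(policies, src, dst_ns, port):
    """src = (namespace, app). If no policy selects the destination,
    Kubernetes allows everything; otherwise only explicit rules allow."""
    selected = [p for p in policies if p["ns"] == dst_ns]
    if not selected:
        return True  # no policy -> default allow-all
    return any(
        rule["from"] == src and port in rule["ports"]
        for p in selected
        for rule in p["ingress"]
    )

policies = [
    {"ns": "team-b", "ingress": []},  # default deny-all (selects pods, allows nothing)
    {"ns": "team-b", "ingress": [     # explicit cross-namespace allow
        {"from": ("team-a", "team-a-client"), "ports": [8080]},
    ]},
]

assert ingress_allowed(policies, ("team-a", "team-a-client"), "team-b", 8080)
assert not ingress_allowed(policies, ("team-c", "anything"), "team-b", 8080)
assert not ingress_allowed(policies, ("team-a", "team-a-client"), "team-b", 9090)
assert ingress_allowed(policies, ("team-x", "x"), "team-a", 80)  # no policy -> open
```

The last assertion is the trap: a namespace with no default-deny is wide open, which is why the baseline policy is mandatory.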
Scenario 3: “What’s the difference between Ingress and Gateway API? When would you migrate?”
Answer:
“Ingress is the first-generation traffic routing API. Gateway API is its successor with a role-oriented design that matches how platform teams and app teams actually work.”
Why migrate:
- Role separation — With Ingress, a single resource controls everything; the platform team and app teams fight over the same YAML. Gateway API splits this: the platform team owns GatewayClass + Gateway, app teams own HTTPRoutes. This maps naturally to enterprise RBAC.
- Multi-protocol — Ingress only does HTTP/HTTPS. Gateway API handles gRPC, TCP, and TLS passthrough natively. For a bank with gRPC microservices, this is essential.
- Traffic splitting — Canary deployments with Ingress require annotation hacks that differ between controllers. Gateway API has a native weight field in backendRefs — standard across all implementations.
- Cross-namespace routing — With 20 teams, each team deploys its HTTPRoute in its own namespace, all attaching to a single shared Gateway. No need for shared Ingress resources.
When NOT to migrate:
- Simple clusters with few services — Ingress works fine
- If your Ingress controller (NGINX, Traefik) does not support Gateway API yet
- If your GitOps tooling has not been updated to handle Gateway API CRDs
Migration approach:
- Run both in parallel — Gateway API and Ingress can coexist
- Migrate one service at a time, validate with traffic mirroring
- Platform team creates Gateway resources first, then app teams add HTTPRoutes
Scenario 4: “Pods can’t resolve external DNS. How do you debug?”
Answer:
“This is almost always a CoreDNS issue, an egress network policy blocking DNS, or an ndots misconfiguration. Here is my systematic debugging approach.”
Step 1: Verify DNS from the pod
```sh
# Run a debug pod
kubectl run dnstest --image=busybox:1.36 -it --rm -- sh

# Test internal DNS
nslookup payment-service.payments.svc.cluster.local

# Test external DNS
nslookup api.stripe.com

# Test with an explicit DNS server
nslookup api.stripe.com 10.96.0.10 # CoreDNS ClusterIP
```

Step 2: Check CoreDNS health
```sh
# Is CoreDNS running?
kubectl get pods -n kube-system -l k8s-app=kube-dns

# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50

# Check CoreDNS metrics
kubectl port-forward -n kube-system svc/kube-dns 9153:9153
curl localhost:9153/metrics | grep coredns_dns_requests_total
```

Step 3: Check Network Policies
```sh
# Is there a deny-all egress policy that blocks DNS?
kubectl get networkpolicies -n payments

# Look for egress rules allowing UDP 53 to kube-system
kubectl get networkpolicy allow-egress-baseline -n payments -o yaml
```

Step 4: Check /etc/resolv.conf in the pod
```sh
kubectl exec -it payment-pod-abc123 -n payments -- cat /etc/resolv.conf
# Verify:
# - nameserver points to the CoreDNS ClusterIP (10.96.0.10)
# - search domains include svc.cluster.local
# - ndots:5 is present
```

Step 5: Common fixes
```yaml
# Fix 1: External DNS fails but internal works — CoreDNS upstream issue
# Check the CoreDNS Corefile — is 'forward' configured correctly?

# Fix 2: ndots causing slow external resolution — add to the pod spec:
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2" # Reduces search-domain attempts for external names

# Fix 3: Network Policy blocks DNS
# Ensure egress allows UDP 53 to kube-system (see the Network Policy section)
```

Scenario 5: “Design service-to-service communication for 50 microservices”
Answer:
“At 50 microservices, you need a structured approach. I would evaluate three tiers of communication patterns based on the nature of the interaction.”
For a bank with 50 microservices, my specific recommendation:
- Service mesh (Istio ambient mode) — mTLS everywhere, required for regulatory compliance. Ambient mode avoids the sidecar overhead (no 50 extra containers per node).
- Direct ClusterIP for synchronous — payment-service calls fraud-detection-service via gRPC ClusterIP. The service mesh handles retries and circuit breaking transparently.
- SQS/Pub-Sub for async — after a payment is processed, publish a PaymentCompleted event. The notification, ledger, and analytics services all subscribe independently.
- Network Policies enforce boundaries — each team’s namespace has deny-all, with explicit policies for allowed service-to-service communication. This creates a documented, auditable communication map.
- DNS conventions — {service}.{team}.svc.cluster.local. CoreDNS resolves it; teams do not need to know pod IPs.
```yaml
# Example: payment service calling fraud-detection
# Direct service call via ClusterIP + Istio mTLS
apiVersion: v1
kind: Service
metadata:
  name: fraud-detection
  namespace: risk
spec:
  type: ClusterIP
  selector:
    app: fraud-detection
  ports:
    - name: grpc
      port: 9090
      targetPort: 9090
# Payment service calls: fraud-detection.risk.svc.cluster.local:9090
# Istio handles: mTLS, retries (3x with backoff), circuit breaking (5xx > 50%)
```

Monitoring at 50-service scale:
- Kiali for service mesh visualization — see the full communication graph
- Distributed tracing (Tempo/Jaeger) — trace requests across all 50 services
- Golden signals per service: latency, traffic, errors, saturation
Quick Reference
Service Type Decision Tree
Common Annotations Cheatsheet
| Purpose | EKS Annotation | GKE Annotation |
|---|---|---|
| Internal LB | service.beta.kubernetes.io/aws-load-balancer-scheme: internal | networking.gke.io/load-balancer-type: Internal |
| IP target mode | alb.ingress.kubernetes.io/target-type: ip | cloud.google.com/neg: '{"ingress": true}' |
| SSL cert | alb.ingress.kubernetes.io/certificate-arn: ... | networking.gke.io/managed-certificates: ... |
| WAF | alb.ingress.kubernetes.io/wafv2-acl-arn: ... | Cloud Armor via BackendConfig securityPolicy |
| Health check path | alb.ingress.kubernetes.io/healthcheck-path: /healthz | Via BackendConfig healthCheck.requestPath |
| Shared LB | alb.ingress.kubernetes.io/group.name: shared | Single Ingress/Gateway with multiple rules |