
Kubernetes Networking — Services, Ingress & Network Policies

Kubernetes Networking — Where This Fits

You are the central platform team at an enterprise bank. Your EKS/GKE clusters sit in Workload Account VPCs that are spokes off the Network Hub. Every pod gets an IP from the VPC CIDR (EKS) or an alias IP range (GKE). All north-south traffic passes through the Network Hub for inspection. This page covers everything inside and at the edge of the cluster.


Kubernetes imposes three fundamental networking requirements:

  1. Every pod gets its own IP address — no NAT between pods
  2. All pods can communicate with all other pods without NAT (flat network)
  3. All nodes can communicate with all pods without NAT

The Container Networking Interface (CNI) plugin implements these requirements on each cloud.

Kubernetes Flat Network Model

CNI Plugin — Pod Creation Flow

The AWS VPC CNI gives each pod a real ENI secondary IP from your VPC subnet.

How it works:

  1. Each node gets a primary ENI + additional ENIs based on instance type
  2. Each ENI gets multiple secondary IPs from the subnet CIDR
  3. Pods are assigned these secondary IPs — they are real VPC IPs
  4. aws-node DaemonSet manages the IP warm pool

IP capacity per node:

| Instance Type | Max ENIs | IPs/ENI | Max Pods |
|---------------|----------|---------|----------|
| m5.large      | 3        | 10      | 29       |
| m5.xlarge     | 4        | 15      | 58       |
| m5.2xlarge    | 4        | 15      | 58       |
| m5.4xlarge    | 8        | 30      | 234      |
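The Max Pods column follows directly from the ENI math: each ENI's first IP is its primary address (not usable for pods), and two host-network pods (aws-node, kube-proxy) are counted on top. A quick sanity check of the table:

```shell
# EKS max pods without prefix delegation:
#   max_pods = max_enis * (ips_per_eni - 1) + 2
# -1: each ENI's primary IP is not assignable to pods
# +2: aws-node and kube-proxy run with host networking
max_pods() { echo $(( $1 * ($2 - 1) + 2 )); }

max_pods 3 10   # m5.large   -> 29
max_pods 8 30   # m5.4xlarge -> 234
```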

GKE uses alias IP ranges to give pods IPs from a secondary range on the VPC subnet.

How it works:

  1. Each node gets a primary IP from the node subnet
  2. Pods get IPs from a secondary IP range (alias range) on the same subnet
  3. These are real VPC IPs — no overlay, no encapsulation
  4. Google’s network fabric routes alias IPs natively

VPC-native cluster IP ranges: GKE VPC-native Subnet IP Ranges

Key settings:

Maximum pods per node: 110 (Standard default; Autopilot defaults to 32) — configurable per node pool at creation
Pod IP range per node: /24 (for 110 max-pods) or /26 (for 32 max-pods)
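The per-node range sizing is not arbitrary: GKE reserves roughly twice as many alias IPs as max pods per node, so freed IPs are not reused immediately after pod churn. Checking the range sizes:

```shell
# Addresses in a CIDR range of prefix length $1 (bit shift avoids bash-only **)
range_size() { echo $(( 1 << (32 - $1) )); }

range_size 24   # -> 256 (>= 2 * 110 max pods)
range_size 26   # -> 64  (>= 2 * 32 max pods)
```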

Dataplane V2 (Cilium-based):

  • eBPF-based networking — faster than iptables
  • Built-in NetworkPolicy enforcement (no Calico needed)
  • Native visibility and observability
  • Enabled by default on new Autopilot clusters
```yaml
# Enable prefix delegation — assigns /28 prefixes instead of individual IPs
# Each prefix = 16 IPs, dramatically increases pod density
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-node
  namespace: kube-system
spec:
  template:
    spec:
      containers:
        - name: aws-node
          env:
            - name: ENABLE_PREFIX_DELEGATION
              value: "true"
            - name: WARM_PREFIX_TARGET
              value: "1"
# With prefix delegation, m5.large goes from 29 to 110 pods
```
```yaml
# ENIConfig — route pod traffic through a separate subnet
# Use case: nodes in public subnet, pods in private subnet
apiVersion: crd.k8s.amazonaws.com/v1alpha1
kind: ENIConfig
metadata:
  name: us-east-1a
spec:
  securityGroups:
    - sg-0a1b2c3d4e5f6a7b8      # Pod security group
  subnet: subnet-0abcd1234pod   # Dedicated pod subnet
---
# Set on aws-node DaemonSet:
# AWS_VPC_K8S_CNI_CUSTOM_NETWORK_CFG=true
# ENI_CONFIG_LABEL_DEF=topology.kubernetes.io/zone
```
```yaml
# SecurityGroupPolicy — assign SGs directly to pods (not just nodes)
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: payments-sg-policy
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-processor
  securityGroups:
    groupIds:
      - sg-0payments1234abcd   # SG that allows port 5432 to RDS
      - sg-0baseline5678efgh   # Baseline SG for all pods
```

Services provide stable endpoints for a set of pods. There are four types:

```yaml
# Internal-only service — accessible within the cluster
apiVersion: v1
kind: Service
metadata:
  name: payment-service
  namespace: payments
spec:
  type: ClusterIP          # Default — internal only
  selector:
    app: payment-processor
  ports:
    - name: http
      port: 8080           # Service port (what clients connect to)
      targetPort: 8080     # Container port
    - name: grpc
      port: 9090
      targetPort: 9090
```

How ClusterIP Service Works

```yaml
# Exposes service on each node's IP at a static port (30000-32767)
apiVersion: v1
kind: Service
metadata:
  name: payment-service-np
  namespace: payments
spec:
  type: NodePort
  selector:
    app: payment-processor
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 30080      # Fixed port on every node
```
```yaml
# Creates a cloud load balancer (NLB on EKS, TCP/UDP LB on GKE)
apiVersion: v1
kind: Service
metadata:
  name: payment-service-lb
  namespace: payments
  annotations:
    # EKS — creates NLB (default with AWS LB Controller)
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internal"
    # GKE — creates internal TCP load balancer
    # cloud.google.com/l4-rbs: "enabled"
    # networking.gke.io/load-balancer-type: "Internal"
spec:
  type: LoadBalancer
  selector:
    app: payment-processor
  ports:
    - port: 443
      targetPort: 8080
```
```yaml
# DNS alias to an external service — no proxying, just CNAME
apiVersion: v1
kind: Service
metadata:
  name: external-postgres
  namespace: payments
spec:
  type: ExternalName
  externalName: prod-db.cluster-abc123.us-east-1.rds.amazonaws.com
# Any pod calling external-postgres.payments.svc.cluster.local
# gets a CNAME response pointing to the RDS endpoint
```
```yaml
# Headless service — no ClusterIP, DNS returns individual pod IPs
# Used for StatefulSets where clients need to address specific pods
apiVersion: v1
kind: Service
metadata:
  name: kafka-headless
  namespace: streaming
spec:
  clusterIP: None          # This makes it headless
  selector:
    app: kafka
  ports:
    - port: 9092
      targetPort: 9092
# DNS records created:
# kafka-headless.streaming.svc.cluster.local → returns all pod IPs
# kafka-0.kafka-headless.streaming.svc.cluster.local → 10.0.1.15
# kafka-1.kafka-headless.streaming.svc.cluster.local → 10.0.2.22
# kafka-2.kafka-headless.streaming.svc.cluster.local → 10.0.3.18
```

CoreDNS is the cluster DNS server. Every pod’s /etc/resolv.conf points to CoreDNS.

Pod /etc/resolv.conf:

```
nameserver 10.96.0.10   ← CoreDNS ClusterIP
search payments.svc.cluster.local svc.cluster.local cluster.local
options ndots:5
```

Pod tries to resolve "payment-service":

  1. ndots:5 means any name with fewer than 5 dots gets the search domains appended first
  2. Tries: payment-service.payments.svc.cluster.local ← FOUND (if same namespace)
  3. Tries: payment-service.svc.cluster.local
  4. Tries: payment-service.cluster.local
  5. Tries: payment-service (absolute lookup)

For external names like "api.stripe.com" (2 dots < 5):

  1. Tries: api.stripe.com.payments.svc.cluster.local ← NOT FOUND
  2. Tries: api.stripe.com.svc.cluster.local ← NOT FOUND
  3. Tries: api.stripe.com.cluster.local ← NOT FOUND
  4. Tries: api.stripe.com ← FOUND (external DNS)
```yaml
# CoreDNS Corefile (stored in ConfigMap "coredns" in kube-system)
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30
        }
        prometheus :9153
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }
        cache 30
        loop
        reload
        loadbalance
    }
    # Forward bank internal domains to on-prem DNS
    bank.internal:53 {
        errors
        cache 30
        forward . 10.100.0.53 10.100.0.54 {
            force_tcp
        }
    }
```
| Service Type | DNS Name | Resolves To |
|--------------|----------|-------------|
| ClusterIP | `svc.ns.svc.cluster.local` | ClusterIP (virtual IP) |
| Headless | `svc.ns.svc.cluster.local` | Set of pod IPs (A records) |
| Headless (StatefulSet) | `pod-0.svc.ns.svc.cluster.local` | Individual pod IP |
| ExternalName | `svc.ns.svc.cluster.local` | CNAME to external host |

Ingress exposes HTTP/HTTPS routes from outside the cluster to services inside.

Ingress Traffic Flow

EKS Ingress — AWS Load Balancer Controller


On EKS, the AWS Load Balancer Controller creates ALBs (for Ingress) and NLBs (for LoadBalancer Services). It is not installed by default — deploy it via Helm with an IRSA role so it can call the AWS load balancing APIs.
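As a sketch, a minimal Ingress the controller would reconcile into an internet-facing ALB with IP-mode targets might look like this (hostname, namespace, and service name are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: payment-ingress
  namespace: payments
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip   # Targets are pod IPs (VPC CNI)
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
spec:
  ingressClassName: alb
  rules:
    - host: api.bank.com
      http:
        paths:
          - path: /payments
            pathType: Prefix
            backend:
              service:
                name: payment-service
                port:
                  number: 8080
```

The controller turns this into an ALB listener on 443 plus an IP-mode target group pointing at the pods behind `payment-service`.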

GKE Ingress — Google Cloud Load Balancing


GKE natively integrates with Google Cloud Load Balancing. No separate controller needed.

NEG-backed services (Network Endpoint Groups) are GKE’s killer feature — the load balancer sends traffic directly to pod IPs, skipping the extra hop through kube-proxy. This gives you lower latency and more accurate health checking. Always enable NEGs for production services.
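A sketch of enabling NEGs on a Service (on recent VPC-native clusters GKE applies container-native load balancing by default for Ingress backends; the annotation makes the behavior explicit — service and namespace names are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: payment-service
  namespace: payments
  annotations:
    cloud.google.com/neg: '{"ingress": true}'   # LB targets pod IPs via NEGs
spec:
  type: ClusterIP
  selector:
    app: payment-processor
  ports:
    - port: 8080
      targetPort: 8080
```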

```hcl
# IRSA role for LB Controller
module "aws_lb_controller_irsa" {
  source  = "terraform-aws-modules/iam/aws//modules/iam-role-for-service-accounts-eks"
  version = "5.39.0"

  role_name                              = "${var.cluster_name}-aws-lb-controller"
  attach_load_balancer_controller_policy = true

  oidc_providers = {
    main = {
      provider_arn               = module.eks.oidc_provider_arn
      namespace_service_accounts = ["kube-system:aws-load-balancer-controller"]
    }
  }
}

# Helm release
resource "helm_release" "aws_lb_controller" {
  name       = "aws-load-balancer-controller"
  repository = "https://aws.github.io/eks-charts"
  chart      = "aws-load-balancer-controller"
  namespace  = "kube-system"
  version    = "1.8.1"

  set {
    name  = "clusterName"
    value = module.eks.cluster_name
  }
  set {
    name  = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
    value = module.aws_lb_controller_irsa.iam_role_arn
  }
  set {
    name  = "vpcId"
    value = module.vpc.vpc_id
  }
}
```

Gateway API is the successor to Ingress, with richer routing, a role-based resource model, and multi-protocol support. It is GA in both EKS and GKE as of 2025.

Gateway API — Role Separation

Gateway API Resource Model

```yaml
# GatewayClass — managed by platform team
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: aws-alb
spec:
  controllerName: gateway.k8s.aws/alb   # AWS LB Controller v2.7+
---
# Gateway — one shared gateway for multiple teams
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-external-gateway
  namespace: gateway-system
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:us-east-1:111111111111:certificate/abc123
spec:
  gatewayClassName: aws-alb
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: api-bank-cert
      allowedRoutes:
        namespaces:
          from: Selector
          selector:
            matchLabels:
              gateway-access: "true"   # Only labelled namespaces can attach
---
# HTTPRoute — defined by payment team in their namespace
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: payment-routes
  namespace: payments
spec:
  parentRefs:
    - name: shared-external-gateway
      namespace: gateway-system
  hostnames:
    - "api.bank.com"
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /payments
      backendRefs:
        - name: payment-service
          port: 8080
          weight: 100
    # Canary: header-matched traffic goes to v2
    - matches:
        - path:
            type: PathPrefix
            value: /payments
          headers:
            - name: x-canary
              value: "true"
      backendRefs:
        - name: payment-service-v2
          port: 8080
```

Ingress vs Gateway API — When to Migrate

| Aspect | Ingress | Gateway API |
|--------|---------|-------------|
| Role separation | Single resource, one team manages all | GatewayClass/Gateway (platform), Routes (app teams) |
| Protocol support | HTTP/HTTPS only | HTTP, HTTPS, gRPC, TCP, TLS, UDP |
| Traffic splitting | Not native (annotation hacks) | Native weight-based routing |
| Header matching | Not native | Native header/query-param matching |
| Cross-namespace | Requires annotation workarounds | Native parentRef across namespaces |
| Maturity | Stable, widely adopted | GA (v1.0) since late 2023; cloud support solid |
| Recommendation | Legacy clusters, simple use cases | All new deployments |

Network Policies are Kubernetes-native firewalls that control pod-to-pod and pod-to-external traffic.

```yaml
# ALWAYS start with deny-all in every namespace
# Without this, all traffic is allowed by default
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: payments
spec:
  podSelector: {}   # Applies to ALL pods in namespace
  policyTypes:
    - Ingress
    - Egress
  # No ingress/egress rules = deny everything
---
# Allow payment-processor to receive traffic from api-gateway
# and make outbound calls to postgres and external payment APIs
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-processor
  namespace: payments
spec:
  podSelector:
    matchLabels:
      app: payment-processor
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Allow from api-gateway namespace
    - from:
        - namespaceSelector:
            matchLabels:
              name: api-gateway
          podSelector:
            matchLabels:
              app: api-gateway
      ports:
        - protocol: TCP
          port: 8080
  egress:
    # Allow to postgres in same namespace
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    # Allow to external payment processor (CIDR)
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24   # Payment processor IP range
      ports:
        - protocol: TCP
          port: 443
    # Allow DNS (always needed with deny-all egress)
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```
```yaml
# Each team's namespace gets:
# 1. Deny all by default
# 2. Allow intra-namespace communication
# 3. Allow ingress from shared ingress namespace
# 4. Allow egress to DNS and specific shared services

# Step 1: Deny all (see above)
# Step 2: Allow intra-namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}   # Same namespace (empty selector = all pods in this NS)
---
# Step 3: Allow from ingress controller namespace
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-ingress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: ingress-system
---
# Step 4: Allow DNS + shared services egress
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-baseline
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              name: kube-system
      ports:
        - protocol: UDP
          port: 53
    - to:
        - namespaceSelector:
            matchLabels:
              name: monitoring
      ports:
        - protocol: TCP
          port: 9090   # Prometheus scraping
```

cert-manager automates TLS certificate issuance and renewal in Kubernetes.

cert-manager Flow

```yaml
# ClusterIssuer — Let's Encrypt for external, AWS Private CA for internal
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: platform-team@bank.com
    privateKeySecretRef:
      name: letsencrypt-prod-key
    solvers:
      - dns01:             # dns01 is required for wildcard certificates
          route53:
            region: us-east-1
---
# For internal services — AWS Private CA via the aws-privateca-issuer plugin
# (core cert-manager has no ACM PCA issuer type)
apiVersion: awspca.cert-manager.io/v1beta1
kind: AWSPCAClusterIssuer
metadata:
  name: acm-pca-issuer
spec:
  arn: arn:aws:acm-pca:us-east-1:111111111111:certificate-authority/abc123
  region: us-east-1
---
# Certificate request
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: api-bank-cert
  namespace: api
spec:
  secretName: api-bank-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
    - api.bank.com
    - "*.api.bank.com"
  duration: 2160h     # 90 days
  renewBefore: 360h   # Renew 15 days before expiry
```

ExternalDNS watches Ingress/Gateway/Service resources and creates DNS records automatically.

```yaml
# ExternalDNS Deployment (simplified)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: external-dns
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: external-dns
  template:
    metadata:
      labels:
        app: external-dns
    spec:
      serviceAccountName: external-dns   # Needs IRSA/WI for Route53/Cloud DNS
      containers:
        - name: external-dns
          image: registry.k8s.io/external-dns/external-dns:v0.14.2
          args:
            - --source=ingress
            - --source=service
            - --source=gateway-httproute
            - --provider=aws               # or --provider=google
            - --domain-filter=bank.com
            - --aws-zone-type=private      # Only manage private hosted zones
            - --policy=upsert-only         # Never delete records (safety)
            - --registry=txt
            - --txt-owner-id=eks-prod-cluster
```

Scenario 1: “Explain how traffic flows from the internet to a pod in EKS vs GKE”


Answer:

```
Internet → Route53 → CloudFront (optional CDN)
         → WAF (in Network Hub Account, associated with ALB)
         → ALB (created by AWS LB Controller from Ingress/Gateway)
         → Target Group (IP mode — targets pod IPs directly via VPC CNI)
         → Pod (real VPC IP, e.g., 10.0.3.47)
```

Detailed packet trace:

  1. Client resolves api.bank.com → Route53 returns ALB DNS name
  2. Client connects to ALB (HTTPS on port 443)
  3. ALB terminates TLS (ACM certificate)
  4. ALB evaluates Ingress rules — matches /payments path
  5. ALB forwards to Target Group (IP target type)
  6. Target Group health-checks pod at /healthz:8080
  7. Packet goes to ENI on the node hosting the pod
  8. VPC CNI routes packet directly to pod's veth interface
  9. Pod receives request on port 8080

Key points:

  • With IP target type, ALB sends directly to pod IP — no kube-proxy hop
  • Pod's security group must allow traffic from ALB's security group
  • VPC CNI means pod IP is a real VPC IP — traceable in VPC Flow Logs
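The trace above covers EKS; a comparable GKE path (component names per Google Cloud, IPs illustrative) looks like:

```
Internet → Cloud DNS → Global External Application Load Balancer (from Ingress/Gateway)
         → Cloud Armor policy (WAF)
         → Backend Service with NEG backends (container-native load balancing)
         → Pod (real VPC alias IP, e.g., 10.4.1.23)
```

Key differences: no in-cluster controller is needed (GKE provisions the load balancer itself), NEGs target pod IPs directly, and LB health checks arrive from Google's ranges 130.211.0.0/22 and 35.191.0.0/16, which firewall rules must allow.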

Scenario 2: “Design network policies for a multi-tenant cluster — each team isolated”


Answer:

“I would implement a defense-in-depth approach with namespace-level isolation as the baseline.”

Design:

Network Policy Design — 5 Team Namespaces

Implementation — applied via GitOps for every team namespace:

```yaml
# 1. Default deny-all (see Section 6 above)
# 2. Allow intra-namespace (see Section 6 above)
# 3. Allow from ingress (see Section 6 above)
# 4. Allow DNS egress (see Section 6 above)
# 5. If Team A needs to call Team B's API (explicit cross-namespace)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-from-team-a
  namespace: team-b
spec:
  podSelector:
    matchLabels:
      app: shared-api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              name: team-a
          podSelector:
            matchLabels:
              app: team-a-client
      ports:
        - protocol: TCP
          port: 8080
```

Key interview points:

  • Default deny is a must — without it, NetworkPolicy is additive only (allow rules)
  • Always allow DNS egress or nothing works
  • Cross-namespace access should be explicitly requested and approved via PR
  • Monitoring namespace needs ingress to all namespaces for Prometheus scraping
  • Enforce policies via OPA Gatekeeper — “every namespace must have default-deny”

Scenario 3: “What’s the difference between Ingress and Gateway API? When would you migrate?”


Answer:

“Ingress is the first-generation traffic routing API. Gateway API is its successor with a role-oriented design that matches how platform teams and app teams actually work.”

Why migrate:

  1. Role separation — With Ingress, a single resource controls everything. The platform team and app teams fight over the same YAML. Gateway API splits this: platform team owns GatewayClass + Gateway, app teams own HTTPRoutes. This maps to enterprise RBAC naturally.

  2. Multi-protocol — Ingress only does HTTP/HTTPS. Gateway API handles gRPC, TCP, TLS passthrough natively. For a bank with gRPC microservices, this is essential.

  3. Traffic splitting — Canary deployments with Ingress require annotation hacks that differ between controllers. Gateway API has native weight in backendRefs — standard across all implementations.

  4. Cross-namespace routing — With 20 teams, each team deploys their HTTPRoute in their namespace, all pointing to a single shared Gateway. No need for shared Ingress resources.

When NOT to migrate:

  • Simple clusters with few services — Ingress works fine
  • If your Ingress controller (NGINX, Traefik) does not support Gateway API yet
  • If your GitOps tooling has not been updated to handle Gateway API CRDs

Migration approach:

  • Run both in parallel — Gateway API and Ingress can coexist
  • Migrate one service at a time, validate with traffic mirroring
  • Platform team creates Gateway resources first, then app teams add HTTPRoutes

Scenario 4: “Pods can’t resolve external DNS. How do you debug?”


Answer:

“This is almost always a CoreDNS issue, an egress network policy blocking DNS, or an ndots misconfiguration. Here is my systematic debugging approach.”

Step 1: Verify DNS from the pod

```shell
# Run a debug pod
kubectl run dnstest --image=busybox:1.36 -it --rm -- sh

# Test internal DNS
nslookup payment-service.payments.svc.cluster.local

# Test external DNS
nslookup api.stripe.com

# Test with explicit DNS server
nslookup api.stripe.com 10.96.0.10   # CoreDNS ClusterIP
```

Step 2: Check CoreDNS health

```shell
# Is CoreDNS running?
kubectl get pods -n kube-system -l k8s-app=kube-dns

# Check CoreDNS logs
kubectl logs -n kube-system -l k8s-app=kube-dns --tail=50

# Check CoreDNS metrics
kubectl port-forward -n kube-system svc/kube-dns 9153:9153
curl localhost:9153/metrics | grep coredns_dns_requests_total
```

Step 3: Check Network Policies

```shell
# Is there a deny-all egress that blocks DNS?
kubectl get networkpolicies -n payments

# Look for egress rules allowing UDP 53 to kube-system
kubectl get networkpolicy allow-egress-baseline -n payments -o yaml
```

Step 4: Check /etc/resolv.conf in the pod

Terminal window
kubectl exec -it payment-pod-abc123 -n payments -- cat /etc/resolv.conf
# Verify:
# - nameserver points to CoreDNS ClusterIP (10.96.0.10)
# - search domains include svc.cluster.local
# - ndots:5 is present

Step 5: Common fixes

Fix 1: If external DNS fails but internal works, it is a CoreDNS upstream issue — check that `forward` is configured correctly in the Corefile.

Fix 2: If ndots is causing slow external resolution, lower it in the pod spec:

```yaml
spec:
  dnsConfig:
    options:
      - name: ndots
        value: "2"   # Reduces search-domain attempts for external names
```

Fix 3: If a Network Policy blocks DNS, ensure egress allows UDP 53 to kube-system (see the Network Policy section).

Scenario 5: “Design service-to-service communication for 50 microservices”


Answer:

“At 50 microservices, you need a structured approach. I would evaluate three tiers of communication patterns based on the nature of the interaction.”

Communication Design for 50 Microservices

For a bank with 50 microservices, my specific recommendation:

  1. Service mesh (Istio ambient mode) — mTLS everywhere, required for regulatory compliance. Ambient mode avoids the sidecar overhead (no 50 extra containers per node).

  2. Direct ClusterIP for synchronous — payment-service calls fraud-detection-service via gRPC ClusterIP. Service mesh handles retries and circuit breaking transparently.

  3. SQS/Pub-Sub for async — after payment is processed, publish a PaymentCompleted event. Notification service, ledger service, and analytics service all subscribe independently.

  4. Network Policies enforce boundaries — each team’s namespace has deny-all, with explicit policies for allowed service-to-service communication. This creates a documented, auditable communication map.

  5. DNS conventions — {service}.{team}.svc.cluster.local. CoreDNS resolves it. Teams do not need to know pod IPs.

```yaml
# Example: Payment service calling fraud-detection
# Direct service call via ClusterIP + Istio mTLS
apiVersion: v1
kind: Service
metadata:
  name: fraud-detection
  namespace: risk
spec:
  type: ClusterIP
  selector:
    app: fraud-detection
  ports:
    - name: grpc
      port: 9090
      targetPort: 9090
# Payment service calls: fraud-detection.risk.svc.cluster.local:9090
# Istio handles: mTLS, retries (3x with backoff), circuit breaking (5xx > 50%)
```

Monitoring at 50-service scale:

  • Kiali for service mesh visualization — see the full communication graph
  • Distributed tracing (Tempo/Jaeger) — trace requests across all 50 services
  • Golden signals per service: latency, traffic, errors, saturation

Service Type Decision Tree

| Purpose | EKS Annotation | GKE Annotation |
|---------|----------------|----------------|
| Internal LB | `service.beta.kubernetes.io/aws-load-balancer-scheme: internal` | `networking.gke.io/load-balancer-type: Internal` |
| IP target mode | `alb.ingress.kubernetes.io/target-type: ip` | `cloud.google.com/neg: '{"ingress": true}'` |
| SSL cert | `alb.ingress.kubernetes.io/certificate-arn: ...` | `networking.gke.io/managed-certificates: ...` |
| WAF | `alb.ingress.kubernetes.io/wafv2-acl-arn: ...` | Cloud Armor via BackendPolicy |
| Health check path | `alb.ingress.kubernetes.io/healthcheck-path: /healthz` | Via BackendConfig `healthCheck.requestPath` |
| Shared LB | `alb.ingress.kubernetes.io/group.name: shared` | Single Ingress/Gateway with multiple rules |