API Gateway & Management
Where This Fits
API Gateway sits between the Network Hub (WAF / CloudFront / Cloud Armor) and Workload Accounts (EKS / GKE / Cloud Run). The central infra team manages the gateway infrastructure, rate-limiting policies, authentication, and security. Tenant teams register their APIs and receive per-tenant throttling, keys, and usage plans.
API Gateway Fundamentals
An API gateway is the single entry point for all client requests. It decouples clients from backend microservices, handling cross-cutting concerns centrally rather than in each service.
Core Responsibilities
| Responsibility | What It Does | Why It Matters |
|---|---|---|
| Routing | Maps URL paths to backend services | Clients use one domain, services evolve independently |
| Authentication | Validates JWT / OAuth2 tokens, API keys | Centralized auth instead of per-service implementation |
| Rate Limiting | Throttles requests per client, per API, per plan | Protects backends from abuse and noisy neighbors |
| Request Transform | Modifies headers, body, query params | Adapts client format to backend format |
| Response Caching | Caches responses at the gateway | Reduces backend load for read-heavy APIs |
| Observability | Access logs, latency metrics, error rates | Single point of visibility for all API traffic |
| Versioning | Routes to different backend versions | Allows API evolution without breaking clients |
API-First Design Principles
For an enterprise bank, APIs are products. The central infra team enforces:
- Design-first: OpenAPI spec before any code
- Consistent naming: `/v1/accounts/{id}/transactions`, not `/getTransactionsByAccountId`
- Standard error format: RFC 7807 Problem Details across all APIs
- Pagination: cursor-based for large datasets, not offset-based
- Idempotency: `Idempotency-Key` header for payment APIs (critical for banking)
- Correlation IDs: `X-Request-ID` propagated through the entire chain
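The idempotency principle can be illustrated with a minimal in-memory sketch. The `IdempotencyStore` class and the `create_payment` handler are hypothetical names introduced for illustration. A real deployment would presumably use a shared store with a TTL and locking rather than a process-local dict.

```python
import uuid
from typing import Callable, Dict


class IdempotencyStore:
    """Hypothetical store mapping an Idempotency-Key to the first response."""

    def __init__(self) -> None:
        self._responses: Dict[str, dict] = {}

    def run_once(self, key: str, handler: Callable[[], dict]) -> dict:
        # Replay the stored response if this key was already processed.
        if key in self._responses:
            return self._responses[key]
        response = handler()
        self._responses[key] = response
        return response


def create_payment() -> dict:
    # Stand-in for the payment backend; every call creates a new payment.
    return {"payment_id": str(uuid.uuid4()), "status": "accepted"}


store = IdempotencyStore()
first = store.run_once("key-123", create_payment)
retry = store.run_once("key-123", create_payment)  # client retry, same key
other = store.run_once("key-456", create_payment)  # different key, new payment
```

The retry returns the original payment instead of creating a second one, which is the property a payment API needs when a client times out and resends.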
Authentication Patterns
OAuth2 / JWT Flow (External Clients)
Authentication by Consumer Type
| Consumer Type | Auth Method | Implementation |
|---|---|---|
| Mobile / SPA | OAuth2 + PKCE | Authorization code flow, short-lived tokens, refresh tokens |
| Third-party partners | OAuth2 Client Credentials + API Key | Usage plan for rate limiting, API key for identification |
| Internal services | Mutual TLS or IAM Auth | Service mesh (mTLS) or cloud-native IAM |
| B2B integrations | OAuth2 Client Credentials | Scoped to specific API operations |
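For the Mobile / SPA row, the PKCE part of the flow can be sketched with the standard library only. This assumes the S256 challenge method from RFC 7636. The function names are illustrative and not taken from any particular OAuth2 library.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    # 32 random bytes encode to a 43-character URL-safe string
    # (RFC 7636 allows 43 to 128 characters).
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    # S256 method: BASE64URL(SHA256(ASCII(verifier))) without padding.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify_pkce(verifier: str, stored_challenge: str) -> bool:
    # Check the authorization server would run at the token endpoint.
    return hmac.compare_digest(make_code_challenge(verifier), stored_challenge)


verifier = make_code_verifier()             # kept secret by the app
challenge = make_code_challenge(verifier)   # sent with the authorization request
```

The client sends `challenge` up front and reveals `verifier` only when exchanging the authorization code, so an intercepted code is useless without the verifier.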
Rate Limiting & Throttling
Strategy Layers
Token Bucket Algorithm
Most API gateways use token bucket for rate limiting:
- Bucket capacity = burst limit (e.g., 500 requests)
- Refill rate = steady-state limit (e.g., 100 requests/second)
- Each request consumes one token
- When bucket is empty, requests get HTTP 429 Too Many Requests
- Bucket refills at the steady rate
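The steps above can be sketched as a small class. This is a single-process illustration using the example numbers from the list (capacity 500, refill 100 per second). The injectable `clock` parameter is an assumption added so the demo is deterministic. Managed gateways implement this internally and may differ in detail.

```python
import time
from typing import Callable


class TokenBucket:
    def __init__(self, capacity: float, refill_rate: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity        # burst limit
        self.refill_rate = refill_rate  # tokens added per second
        self._clock = clock
        self._tokens = capacity         # bucket starts full
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Refill for the elapsed time, never exceeding capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1           # each request consumes one token
            return True
        return False                    # caller would respond with HTTP 429


# Deterministic demo with a fake clock.
fake_now = [0.0]
bucket = TokenBucket(capacity=500, refill_rate=100, clock=lambda: fake_now[0])
burst = sum(bucket.allow() for _ in range(600))             # 600 requests at t=0
fake_now[0] = 1.0                                           # one second later
after_one_second = sum(bucket.allow() for _ in range(600))
```

In the demo the first 500 requests pass and the remaining 100 are rejected. One second later exactly 100 more tokens are available.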
AWS API Gateway Throttling
AWS API Gateway throttling is three-tiered:
| Level | Default | Configurable |
|---|---|---|
| Account level | 10,000 req/sec across all APIs | Yes, via support ticket |
| Stage level | Inherits account | Yes, per stage |
| Method level | Inherits stage | Yes, per method + resource |
Usage Plans + API Keys for per-tenant throttling:
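A minimal provisioning sketch, assuming the boto3 `apigateway` client and its `create_usage_plan`, `create_api_key`, and `create_usage_plan_key` operations. The tier names, rate values, and burst values in `TIERS` are illustrative assumptions, not AWS defaults. The block does not import boto3, so the client is passed in by the caller.

```python
from typing import Any, Dict

# Illustrative tier limits (assumptions, tune per environment).
TIERS: Dict[str, Dict[str, int]] = {
    "gold": {"rateLimit": 1000, "burstLimit": 2000},
    "silver": {"rateLimit": 100, "burstLimit": 200},
    "bronze": {"rateLimit": 10, "burstLimit": 20},
}


def create_tier_plan(client: Any, tier: str, api_id: str, stage: str) -> str:
    """Create one usage plan per tier; returns the plan id."""
    limits = TIERS[tier]
    plan = client.create_usage_plan(
        name=f"{tier}-plan",
        apiStages=[{"apiId": api_id, "stage": stage}],
        throttle={
            "rateLimit": float(limits["rateLimit"]),  # steady-state req/s
            "burstLimit": limits["burstLimit"],       # bucket capacity
        },
    )
    return plan["id"]


def onboard_tenant(client: Any, tenant: str, plan_id: str) -> str:
    """Create an API key for a tenant and attach it to a tier plan."""
    key = client.create_api_key(name=f"{tenant}-key", enabled=True)
    client.create_usage_plan_key(
        usagePlanId=plan_id, keyId=key["id"], keyType="API_KEY"
    )
    return key["id"]


# Usage (requires AWS credentials):
#   import boto3
#   client = boto3.client("apigateway")
#   plan_id = create_tier_plan(client, "gold", "<rest-api-id>", "prod")
#   onboard_tenant(client, "partner-a", plan_id)
```

Throttle limits in a usage plan are applied per API key, so several tenants can share one tier plan while still being throttled independently.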
Apigee Rate Limiting Policies
Apigee rate limiting uses policies:
| Policy | Purpose | Scope |
|---|---|---|
| SpikeArrest | Smooths traffic bursts | Prevents sudden spikes (e.g., 10ps = ~1 request every 100ms) |
| Quota | Enforces usage limits over time | Per-developer app, per API product (e.g., 10K/day) |
| ConcurrentRateLimit | Limits concurrent connections | Protects slow backends |
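The smoothing idea behind SpikeArrest can be modelled in a few lines. This is a simplified sketch of interval-based smoothing, not Apigee's actual implementation. The `SpikeArrest` class here is a hypothetical local model, and only the `ps` (per second) and `pm` (per minute) rate suffixes are handled.

```python
from typing import Optional


def spike_arrest_interval_ms(rate: str) -> float:
    """Convert a rate such as '10ps' or '30pm' into the spacing between requests."""
    value, unit = int(rate[:-2]), rate[-2:]
    window_ms = {"ps": 1_000, "pm": 60_000}[unit]
    return window_ms / value


class SpikeArrest:
    def __init__(self, rate: str) -> None:
        self.interval_ms = spike_arrest_interval_ms(rate)
        self._last_allowed_ms: Optional[float] = None

    def allow(self, now_ms: float) -> bool:
        # Reject anything arriving sooner than one interval after the last accepted call.
        if (self._last_allowed_ms is not None
                and now_ms - self._last_allowed_ms < self.interval_ms):
            return False
        self._last_allowed_ms = now_ms
        return True


arrest = SpikeArrest("10ps")  # one request per 100 ms
decisions = [arrest.allow(t) for t in (0, 50, 100, 120, 200)]
```

Unlike a quota, this model never lets ten requests through in the same instant. It spreads them across the second, which is why the two policies are typically combined.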
Apigee API Products for per-tenant management:
Developers register apps, subscribe to API Products, and receive client credentials. Apigee tracks usage per app against the product quotas.
API Versioning Strategies
| Strategy | Example | Pros | Cons |
|---|---|---|---|
| URI path | /v1/accounts, /v2/accounts | Simple, explicit, cacheable | URL changes, pollutes resource path |
| Query param | /accounts?version=2 | Easy to add | Easy to forget, not RESTful |
| Header | Accept: application/vnd.bank.v2+json | Clean URLs | Hidden, harder to test in browser |
| Content negotiation | Accept: application/json; version=2 | Standard HTTP | Complex to implement |
Version Lifecycle Management
Internal vs External Gateway Patterns
Pattern 1: External API Gateway (North-South Traffic)
For traffic entering from the internet to backend services.
Pattern 2: Internal API Gateway (East-West Traffic)
For service-to-service communication within the enterprise.
When to Use Which
| Criterion | Service Mesh | Internal API Gateway |
|---|---|---|
| Environment | Kubernetes (EKS/GKE) | Mixed (ECS, Lambda, VMs, cross-account) |
| Latency | Sub-millisecond (in-pod sidecar) | 5-15ms (extra network hop) |
| Auth | mTLS automatic | IAM auth or JWT |
| Observability | Built-in (Envoy metrics) | CloudWatch / Cloud Logging |
| Rate limiting | Per-service policies | Usage plans |
| Complexity | High (Istio control plane) | Low (managed service) |
API Gateway Services
AWS API Gateway Types
| Feature | REST API | HTTP API | WebSocket API |
|---|---|---|---|
| Protocol | REST | HTTP | WebSocket |
| Auth | IAM, Lambda, Cognito | JWT, IAM, Lambda | IAM, Lambda |
| Usage Plans | Yes | No | No |
| API Keys | Yes | No | No |
| Request Validation | Yes | No | No |
| Request Transform | VTL templates | Parameter mapping | Route selection |
| Caching | Yes (per stage) | No | No |
| WAF | Yes | No | No |
| Price | $3.50/million | $1.00/million | $1.00/million + connection |
| Latency | ~29ms overhead | ~10ms overhead | Persistent connection |
VPC Link (Private Integration)
Connect API Gateway to resources in private subnets:
GCP API Management — Apigee
Apigee is Google’s enterprise API management platform with analytics, monetization, and developer portal.
Apigee Editions
| Feature | Apigee X (Standard) | Apigee X (Enterprise) | Apigee Hybrid |
|---|---|---|---|
| Hosting | Google-managed | Google-managed | Runtime in your GKE |
| Environments | 2 | 4+ | Unlimited |
| SLA | 99.9% | 99.99% | Your infra SLA |
| Use case | Mid-size | Enterprise | Data residency / hybrid |
| Networking | Peering to your VPC | Peering + PSC | Runs in your VPC |
Multi-Tenant API Platform Architecture
The central infra team operates a shared API gateway platform that multiple tenant teams consume.
Key design decisions:
- Single API Gateway with path-based routing to backend ALBs, not one gateway per service. Simplifies partner integration (one base URL, one API key).
- REST API (not HTTP API) because we need usage plans, API keys, and WAF integration for partner-facing APIs.
- Usage Plans by partner tier: Gold (1000 req/s), Silver (100 req/s), Bronze (10 req/s). Each partner gets an API key mapped to their usage plan.
- Lambda Authorizer that validates the partner’s OAuth2 token AND checks the API key, returning an IAM policy that scopes access to only their permitted API paths.
- Custom domain (`api.bank.com`) with ACM certificate, Route 53 alias record.
- Versioning via URI path (`/v1/`, `/v2/`). When `v2` launches, `v1` continues working with a `Sunset` header. Partners get 6 months to migrate.
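The Lambda Authorizer decision can be made more concrete with a sketch of the response it would return. This assumes a REST API authorizer with the API key source set to `AUTHORIZER`, in which case the key is passed back via `usageIdentifierKey`. Token validation is omitted, and the `build_authorizer_response` helper and its `allowed_paths` argument are hypothetical.

```python
from typing import Dict, List


def build_authorizer_response(principal_id: str, api_key: str, method_arn: str,
                              allowed_paths: List[str]) -> Dict:
    # method_arn has the shape
    # arn:aws:execute-api:{region}:{account}:{apiId}/{stage}/{verb}/{path}
    arn_prefix, stage = method_arn.split("/")[:2]
    resources = [
        f"{arn_prefix}/{stage}/*/{path.strip('/')}" for path in allowed_paths
    ]
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [{
                "Action": "execute-api:Invoke",
                "Effect": "Allow",
                "Resource": resources,   # only the partner's permitted paths
            }],
        },
        # Lets API Gateway map the request to the partner's usage plan.
        "usageIdentifierKey": api_key,
    }


response = build_authorizer_response(
    principal_id="partner-a",
    api_key="example-key-value",  # placeholder, not a real key
    method_arn="arn:aws:execute-api:eu-west-1:123456789012:abc123/prod/GET/v1/accounts/42",
    allowed_paths=["/v1/accounts/*"],
)
```

Scoping `Resource` to path patterns rather than `*` means a cached policy still cannot be used to reach APIs the partner has not been granted.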
On GCP, I would use Apigee X instead — it provides the usage plans, developer portal, and analytics out of the box. Partners self-register through the developer portal.
Scenario 2: Per-Tenant Rate Limiting
Q: “How do you implement per-tenant rate limiting so one partner can’t impact another?”
A: This is the noisy neighbor problem applied to APIs.
AWS approach — Usage Plans:
Each usage plan is an independent throttle bucket. Partner A hitting their 1,000 req/s limit does NOT affect Partner B’s 1,000 req/s allocation.
Beyond API Gateway throttling, I would add:
- WAF rate rules as an outer layer (per-IP, catches distributed attacks)
- Backend circuit breaker (Istio or app-level) to protect databases
- DynamoDB/Redis counter for custom business logic limits (e.g., max 100 payment initiations per partner per hour)
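The custom business limit in the last bullet can be sketched as a fixed-window counter. The `HourlyLimiter` class is an in-memory stand-in introduced for illustration. With Redis the same idea is commonly expressed as an `INCR` on a per-window key plus an `EXPIRE`, which is an assumption about the eventual implementation rather than something the design above specifies.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple


class HourlyLimiter:
    """In-memory stand-in for a shared counter keyed by (partner, hour window)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self._counts: DefaultDict[Tuple[str, int], int] = defaultdict(int)

    def try_acquire(self, partner_id: str, now_epoch_s: float) -> bool:
        window = int(now_epoch_s // 3600)   # fixed one-hour window
        key = (partner_id, window)
        if self._counts[key] >= self.limit:
            return False                    # business limit reached for this hour
        self._counts[key] += 1
        return True


limiter = HourlyLimiter(limit=100)  # max 100 payment initiations per partner per hour
accepted = sum(limiter.try_acquire("partner-a", 10.0) for _ in range(150))
other_partner_ok = limiter.try_acquire("partner-b", 10.0)
next_hour_ok = limiter.try_acquire("partner-a", 3600.0)
```

A fixed window allows up to twice the limit across a window boundary. If that matters for payments, a sliding window is the usual refinement.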
GCP approach — Apigee:
Use Quota policies per API Product. Each developer app subscribes to a product and gets its own quota counter. Apigee tracks this automatically and returns 429 with a Retry-After header when exceeded.
Scenario 3: Internal vs Service Mesh
Q: “Internal APIs: should you use an API Gateway or service mesh for service-to-service communication?”
A: It depends on the environment.
Use service mesh (Istio) when:
- All services run in Kubernetes (EKS/GKE)
- You need mutual TLS without application changes
- You want circuit breaking, retries, and timeouts as infrastructure
- Latency sensitivity is high (sidecar adds less than 1ms, gateway adds 5-15ms)
Use internal API Gateway when:
- Services span multiple compute platforms (EKS + ECS + Lambda)
- You need cross-account API access via VPC Link
- You want usage tracking and throttling between internal teams
- You need request/response transformation (different internal formats)
Our enterprise recommendation:
- Within a K8s cluster: service mesh (Istio). No API Gateway needed.
- Between K8s clusters: service mesh with multi-cluster Istio or internal API Gateway.
- K8s to non-K8s (Lambda, ECS): internal API Gateway with IAM auth via VPC Link.
- Cross-account: internal API Gateway is the cleanest option — no VPC peering needed for the API path, just VPC Link.
Scenario 4: API Authentication Design
Q: “Design API authentication for a mobile banking app, third-party partners, and internal services.”
A: Three distinct auth flows for three consumer types:
Lambda Authorizer handles the complexity — it inspects the request and determines which auth flow to apply:
- `Authorization: Bearer <JWT>`: validate JWT signature, claims, scopes
- `x-api-key` header present: look up usage plan, validate client credentials
- AWS SigV4 signed request: IAM auth for internal services
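The dispatch step can be sketched as a pure function over the request headers. The flow labels (`sigv4`, `partner`, `jwt`, `deny`) are hypothetical names. The check order is an assumption based on partners sending both a bearer token and an API key. The actual validation for each flow is left out.

```python
from typing import Mapping


def classify_auth_flow(headers: Mapping[str, str]) -> str:
    # HTTP header names are case-insensitive, so normalise first.
    h = {k.lower(): v for k, v in headers.items()}
    authorization = h.get("authorization", "")

    if authorization.startswith("AWS4-HMAC-SHA256"):
        return "sigv4"     # internal service, IAM auth
    if "x-api-key" in h:
        return "partner"   # API key plus client-credentials token
    if authorization.startswith("Bearer "):
        return "jwt"       # mobile / SPA user token
    return "deny"          # nothing recognisable: fail closed


flows = [
    classify_auth_flow({"Authorization": "Bearer eyJ..."}),
    classify_auth_flow({"X-Api-Key": "abc", "Authorization": "Bearer eyJ..."}),
    classify_auth_flow({"Authorization": "AWS4-HMAC-SHA256 Credential=..."}),
    classify_auth_flow({}),
]
```

Falling through to `deny` keeps the authorizer fail-closed when a request matches none of the three consumer types.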
Scenario 5: API Versioning Strategy
Q: “How do you handle API versioning without breaking existing clients? You have 50 partner integrations.”
A: With 50 partners, breaking changes are expensive — each partner has their own development cycle.
Strategy: URI path versioning with parallel deployment
Implementation:
- Parallel deployment: v1 and v2 run simultaneously as separate target groups behind the same API Gateway
- Sunset header: `Sunset: Mon, 15 Mar 2027 00:00:00 GMT` on all v1 responses starting 6 months before removal
- Deprecation notice: `Deprecation: true` header plus `Link` header pointing to migration guide
- Usage tracking: monitor which partners still use v1 via API key analytics
- Partner communication: automated email when a partner’s v1 usage exceeds threshold after deprecation
- Graceful sunset: v1 returns `410 Gone` with a response body containing the v2 equivalent endpoint
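The header part of this plan can be generated rather than hand-written, which avoids malformed HTTP dates. A minimal sketch follows, assuming the v1 backend or a gateway response mapping attaches these headers. The migration guide URL path and the `rel="deprecation"` link relation are illustrative choices.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from typing import Dict


def deprecation_headers(sunset_at: datetime, migration_guide_url: str) -> Dict[str, str]:
    return {
        # usegmt=True emits the HTTP-date form ending in "GMT".
        "Sunset": format_datetime(sunset_at.astimezone(timezone.utc), usegmt=True),
        "Deprecation": "true",
        "Link": f'<{migration_guide_url}>; rel="deprecation"',
    }


headers = deprecation_headers(
    datetime(2027, 3, 15, tzinfo=timezone.utc),
    "https://api.bank.com/docs/migrate-v2",  # hypothetical URL
)
```

Deriving the string from a `datetime` keeps the weekday and date consistent, which some HTTP clients check when parsing the header.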
API Gateway routing:
```hcl
# v1 routes to old backend
resource "aws_apigatewayv2_route" "accounts_v1" {
  api_id    = aws_apigatewayv2_api.main.id
  route_key = "GET /v1/accounts/{id}"
  target    = "integrations/${aws_apigatewayv2_integration.accounts_v1.id}"
}

# v2 routes to new backend
resource "aws_apigatewayv2_route" "accounts_v2" {
  api_id    = aws_apigatewayv2_api.main.id
  route_key = "GET /v2/accounts/{id}"
  target    = "integrations/${aws_apigatewayv2_integration.accounts_v2.id}"
}
```
Scenario 6: API Under Attack
Q: “Your API is being hammered by a single client — 10x their normal traffic. Backend latency is spiking for all clients. How do you respond?”
A: This is an incident with clear containment, mitigation, and prevention phases.
Immediate (0-5 minutes):
- Identify the client — check API Gateway access logs for the API key or source IP generating the spike
- Throttle the specific client — if they are on a usage plan, reduce their rate limit immediately. If not, add a WAF rate-based rule for their IP
- Verify it is not an attack — is the traffic from a known partner (misconfigured retry loop) or an unknown source (DDoS/scraping)?
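The first step, identifying the client, usually comes down to counting requests per key and source IP in the access logs. The sketch below assumes JSON access logs with `apiKeyId` and `ip` fields. Those field names depend on the access log format configured on the stage and are assumptions here, as is the sample data.

```python
import json
from collections import Counter
from typing import Iterable, List, Tuple


def top_clients(log_lines: Iterable[str], n: int = 3) -> List[Tuple[Tuple[str, str], int]]:
    counts: Counter = Counter()
    for line in log_lines:
        entry = json.loads(line)
        # Missing fields are grouped under "-" instead of raising.
        counts[(entry.get("apiKeyId", "-"), entry.get("ip", "-"))] += 1
    return counts.most_common(n)


# Synthetic sample: one client sends ten times the traffic of another.
sample = (
    ['{"apiKeyId": "key-a", "ip": "203.0.113.7"}'] * 50
    + ['{"apiKeyId": "key-b", "ip": "198.51.100.2"}'] * 5
)
worst = top_clients(sample, n=1)
```

Grouping by both key and IP helps with the third step: one key arriving from a single IP points to a misconfigured partner, while many IPs without a key points to an unknown source.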
Containment (5-30 minutes):
If known partner (API key identified):
→ Reduce usage plan rate limit temporarily
→ Contact partner to fix their client
→ Check for retry storms (are they retrying 5xx responses in a tight loop?)
If unknown source:
→ WAF rate-based rule: block IP if > 2000 req/5min
→ If distributed IPs: enable AWS Shield Advanced / Cloud Armor adaptive protection
→ Geo-blocking if traffic from unexpected region
Backend protection (parallel):
- Enable API Gateway response caching for read endpoints — serve cached responses instead of hitting the backend
- Circuit breaker — if using Istio, configure outlier detection to eject unhealthy backends
- Scale backends — trigger HPA if pods are at capacity, but this treats the symptom not the cause
Prevention (post-incident):
- Mandatory usage plans for all API consumers — no unthrottled access
- WAF rate-based rules as a secondary throttle layer (catches traffic before it hits API Gateway)
- Alerting on per-client request rate anomalies (CloudWatch alarm on usage plan throttle count)
- Retry guidance in API docs — exponential backoff with jitter, not immediate retry
- Circuit breaker at client side — require partners to implement client-side circuit breakers (include in API contract)
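The retry guidance can be shipped to partners as a reference snippet. This sketch uses the "full jitter" variant of exponential backoff. The base delay, cap, and attempt count are illustrative defaults, and `call` is a hypothetical callable that returns an HTTP status code.

```python
import random
import time
from typing import Callable, Optional


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0,
                  rng: Optional[random.Random] = None) -> float:
    # Full jitter: pick uniformly between 0 and the capped exponential ceiling.
    ceiling = min(cap, base * (2 ** attempt))
    return (rng or random).uniform(0, ceiling)


def call_with_retries(call: Callable[[], int], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Retry only on 429 and 5xx; return the last status code seen."""
    status = call()
    for attempt in range(max_attempts - 1):
        if status != 429 and status < 500:
            break                        # success or non-retryable client error
        sleep(backoff_delay(attempt))    # wait before the next try
        status = call()
    return status


# Demo with canned responses and a recording sleep function (no real waiting).
responses = iter([503, 429, 200])
slept = []
final_status = call_with_retries(lambda: next(responses), sleep=slept.append)
```

The jitter spreads retries from many clients over time, so a brief backend failure does not turn into a synchronized retry storm.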
The key insight is defense in depth: WAF rate rules (L7 edge) + API Gateway usage plan throttling (per-client) + backend circuit breaker (per-service). No single layer handles everything.