Network Security — Firewall, IPS/IDS, WAF
Where This Fits
Network security lives in the Network Hub Account. All traffic (inbound, outbound, and east-west) is routed through the centralized inspection VPC before reaching workload accounts. The central infra team owns firewall rules, IPS signatures, and WAF policies. Workload teams manage only their application-level security groups.
Defense in Depth — Network Layer
Enterprise network security uses multiple layers. No single control is sufficient; each layer catches what the previous one missed.
NACLs vs Security Groups vs Network Firewall
This is one of the most frequently asked interview questions. Understanding the differences, and when each control is appropriate, is essential.
Network ACLs (NACLs)
- Layer: subnet boundary (Layer 3/4)
- Stateless: must define BOTH inbound and outbound rules separately. A request coming IN on port 443 needs an outbound rule for ephemeral ports (1024-65535) for the response.
- Rules: numbered, processed in order (lowest number first). First match wins.
- Default: the default NACL allows all inbound and outbound traffic; a custom NACL denies everything until you add rules
- Scope: applies to ALL traffic entering/leaving the subnet — cannot target specific instances
- Use case: broad subnet-level controls (e.g., block a known-bad CIDR range)
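The stateless, first-match behaviour described above can be sketched as a small Python model. This is an illustration only, not an AWS API: the `NaclRule` type, the `evaluate` helper, and the client port are hypothetical, and rules are matched on port alone for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NaclRule:
    number: int      # rule number; lower numbers are evaluated first
    action: str      # "allow" or "deny"
    port_from: int
    port_to: int


def evaluate(rules, port):
    """Return the action of the first rule (by number) that covers the port."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port_from <= port <= rule.port_to:
            return rule.action
    return "deny"  # models the final catch-all deny entry


inbound = [NaclRule(100, "allow", 443, 443)]
outbound_missing = []                                  # no ephemeral-port rule
outbound_ok = [NaclRule(100, "allow", 1024, 65535)]

client_port = 51514  # hypothetical ephemeral port chosen by the client

request = evaluate(inbound, 443)                       # request reaches the subnet
reply_blocked = evaluate(outbound_missing, client_port)
reply_allowed = evaluate(outbound_ok, client_port)
print(request, reply_blocked, reply_allowed)
```

Because nothing tracks the connection, the reply is evaluated as a brand-new outbound packet, which is why the ephemeral-port rule is needed.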
Security Groups
- Layer: ENI (Elastic Network Interface) level, attached to instances, ALBs, RDS, Lambda, etc.
- Stateful: if you allow inbound TCP 443, the response is automatically allowed outbound (no need for ephemeral port rules)
- Allow-only: you can only create ALLOW rules. There are no DENY rules. Anything not explicitly allowed is denied.
- Reference other SGs: rules can reference security group IDs instead of CIDRs — “allow port 5432 from sg-app” — this is far more maintainable than CIDR-based rules
- Default: deny all inbound, allow all outbound (can be restricted)
- Scope: per-ENI — different instances in the same subnet can have different security groups
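A minimal Python sketch of the allow-only, reference-by-group semantics listed above. The `SecurityGroup` class, the `inbound_allowed` helper, and the group names are hypothetical, and the model ignores protocols and CIDR sources to stay short.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityGroup:
    name: str
    # Each inbound rule is (port, source_group_name); there is no deny form.
    inbound: list = field(default_factory=list)


def inbound_allowed(dest_groups, port, source_groups):
    """Allow if ANY rule in ANY attached group matches (union of allows)."""
    for sg in dest_groups:
        for rule_port, rule_source in sg.inbound:
            if rule_port == port and rule_source in source_groups:
                return True
    return False  # nothing matched: implicit deny


sg_app = SecurityGroup("sg-app", inbound=[(8080, "sg-alb")])
sg_rds = SecurityGroup("sg-rds", inbound=[(5432, "sg-app")])

app_to_db = inbound_allowed([sg_rds], 5432, {"sg-app"})
alb_to_db = inbound_allowed([sg_rds], 5432, {"sg-alb"})
print(app_to_db, alb_to_db)
```

Referencing a group rather than a CIDR means the rule keeps working when instances are replaced and their IP addresses change.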
Security Group Example — 3-Tier Application:

    sg-alb:
      Inbound:  443 from 0.0.0.0/0 (or from CloudFront prefix list)
      Outbound: 8080 to sg-app

    sg-app (EKS pods / EC2):
      Inbound:  8080 from sg-alb
      Outbound: 5432 to sg-rds
                6379 to sg-redis
                443 to 0.0.0.0/0 (for external API calls, via NAT)

    sg-rds:
      Inbound:  5432 from sg-app
      Outbound: (none needed; statefulness covers responses)

    sg-redis:
      Inbound:  6379 from sg-app
      Outbound: (none needed)

AWS Network Firewall
- Layer: VPC level, deployed in a dedicated subnet within the inspection VPC
- Deep packet inspection: examines packet headers AND payload
- Stateful + stateless rule groups: stateless rules for simple allow/deny, stateful rules for protocol-aware inspection
- Suricata-compatible: IPS/IDS rules written in Suricata syntax — detects malware, C2, exploits
- Managed rule groups: AWS provides pre-built threat intelligence rules
- Logging: flow logs and alert logs (metadata, not full packet capture), sent to S3, CloudWatch Logs, or Kinesis Data Firehose
- Scope: centralized in inspection VPC, processes all traffic routed through it via TGW
When to Use Each
| Control | Use When | Example |
|---|---|---|
| NACL | Broad subnet isolation, emergency IP blocking | Block a CIDR during incident response |
| Security Group | Application-level access control (primary tool) | Allow app → database on port 5432 |
| Network Firewall | Deep inspection, IPS/IDS, domain filtering, compliance | Detect malware C2 callbacks, block non-TLS egress |
GCP uses a different firewall model — VPC-level firewall rules with priority ordering and network tags or service accounts as targets.
VPC Firewall Rules
- Scope: VPC level (not subnet level like NACLs)
- Stateful: connections are tracked; response traffic is automatically allowed
- Priority-based: rules have a priority (0-65535), lowest number evaluated first
- Direction: ingress or egress (not both in one rule)
- Targets: all instances in VPC, instances with specific network tags, or instances using specific service accounts
- Default rules: implied deny-all-ingress, implied allow-all-egress (priority 65535)
- Allow and Deny: unlike AWS SGs, GCP firewall rules support both ALLOW and DENY actions
GCP Firewall Rules Example:

    Priority 100:   DENY ingress from 198.51.100.0/24 (known-bad range)
                    Target: all instances
                    Direction: ingress

    Priority 1000:  ALLOW ingress TCP 443 from 0.0.0.0/0
                    Target: tag "web-frontend"
                    Direction: ingress

    Priority 1000:  ALLOW ingress TCP 8080 from tag "web-frontend"
                    Target: tag "app-backend"
                    Direction: ingress

    Priority 1000:  ALLOW ingress TCP 5432 from tag "app-backend"
                    Target: tag "database"
                    Direction: ingress

    Priority 65535: DENY ingress from 0.0.0.0/0 (implied)

Hierarchical Firewall Policies
- Applied at organization or folder level
- Evaluated BEFORE VPC firewall rules (higher precedence)
- Central security team creates org-level policies that cannot be overridden by project owners
- Actions: ALLOW, DENY, or GOTO_NEXT (delegate to next level)
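The delegation behaviour of GOTO_NEXT can be sketched in Python as follows. This is a simplified model, not a GCP API: `evaluate_level`, `evaluate_hierarchy`, the packet dictionary, and the sample rules are all hypothetical.

```python
def evaluate_level(rules, packet):
    """rules: list of (priority, predicate, action); lowest number first."""
    for _, predicate, action in sorted(rules, key=lambda r: r[0]):
        if predicate(packet):
            return action
    return "goto_next"  # no rule matched at this level


def evaluate_hierarchy(levels, packet):
    """levels: rule lists ordered organization -> folder -> VPC."""
    for rules in levels:
        action = evaluate_level(rules, packet)
        if action != "goto_next":
            return action   # an allow or deny at a higher level is final
    return "deny"           # stands in for the implied deny-ingress rule


org_policy = [
    (100, lambda p: p["port"] == 22 and p["src"] != "bastion", "deny"),
    (65000, lambda p: True, "goto_next"),
]
vpc_rules = [
    (1000, lambda p: p["port"] == 22, "allow"),
    (1000, lambda p: p["port"] == 443, "allow"),
]

levels = [org_policy, vpc_rules]
ssh_from_internet = evaluate_hierarchy(levels, {"port": 22, "src": "internet"})
ssh_from_bastion = evaluate_hierarchy(levels, {"port": 22, "src": "bastion"})
https_from_internet = evaluate_hierarchy(levels, {"port": 443, "src": "internet"})
print(ssh_from_internet, ssh_from_bastion, https_from_internet)
```

The project-level allow for port 22 never gets a chance to run for internet sources, because the organization-level deny has already decided the outcome.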
GCP Cloud NGFW Enterprise (IPS/IDS)
- Next-Gen Firewall with built-in intrusion prevention and detection
- Powered by Palo Alto Networks threat intelligence (Google partnership)
- Deploys as a firewall endpoint in a zone — inspects traffic passing through
- Supports TLS inspection (decrypt, inspect, re-encrypt — requires CA certificate)
- Integrated with threat intelligence: auto-updated malware signatures, C2 domain feeds
- Configured via security profiles and security profile groups
AWS Security Groups vs GCP Firewall Rules — Comparison
This is a frequent interview question because the two models differ significantly in philosophy. AWS uses a purely allow-based model scoped to individual network interfaces, while GCP uses a priority-based allow/deny model scoped to the entire VPC with hierarchical policy inheritance. Understanding these differences is essential for multi-cloud architects and for answering “which model is better and why” questions.
| Aspect | AWS Security Groups | GCP VPC Firewall Rules |
|---|---|---|
| Statefulness | Stateful | Stateful |
| Default behavior | Deny all inbound, allow all outbound | Default network: pre-populated allow-internal + SSH/RDP/ICMP rules; custom VPC: implied deny ingress, implied allow egress |
| Actions | Allow only (implicit deny) | Allow AND Deny |
| Scope | Per ENI (network interface) | Per VPC (via targets) |
| Rule targets | SG ID or CIDR | Network tags, service accounts, or all instances |
| Priority | No priority (all rules evaluated, union of allows) | Priority 0-65535 (lower number = higher priority) |
| Cross-reference | Reference other SG IDs | Reference network tags or service accounts as sources |
| Hierarchy | No hierarchy (flat per-VPC) | Hierarchical: Org Policy → Folder Policy → VPC Rules |
| Limits | 60 inbound + 60 outbound rules per SG, 5 SGs per ENI | 500 rules per project (quota, can be increased) |
| Best practice | Reference SG IDs instead of CIDRs | Use service accounts instead of network tags |
| Deny rules | Not possible (must remove allow rule) | Explicit deny with higher priority than allow |
| Logging | VPC Flow Logs (separate feature) | Firewall Rules Logging (per-rule toggle) |
Key architectural implications:
AWS’s allow-only model is simpler but less flexible. If you need to block a specific IP that was previously allowed by a broad CIDR rule, you cannot add a deny rule to the security group — you must narrow the CIDR range or use NACLs (which are stateless and operate at the subnet level, adding complexity). In practice, emergency IP blocking in AWS requires NACLs or AWS Network Firewall, not security groups.
GCP’s priority-based allow/deny model is more powerful. You can create a high-priority deny rule (e.g., priority 100: deny traffic from 198.51.100.0/24) that overrides a lower-priority allow rule (e.g., priority 1000: allow TCP 443 from 0.0.0.0/0). This makes incident response easier — block a bad actor without touching existing allow rules.
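The override described above can be sketched as a short Python model of priority-ordered evaluation. The `evaluate` function and the rule tuples are hypothetical, and the tie-break (deny before allow at equal priority) reflects the documented GCP behaviour as I understand it.

```python
import ipaddress


def evaluate(rules, src_ip, port):
    """rules: (priority, action, cidr, port or None). Lowest number wins."""
    src = ipaddress.ip_address(src_ip)
    # Sort by priority; on a tie, evaluate deny rules before allow rules.
    ordered = sorted(rules, key=lambda r: (r[0], r[1] != "deny"))
    for _, action, cidr, rule_port in ordered:
        if src in ipaddress.ip_network(cidr) and rule_port in (None, port):
            return action
    return "deny"  # implied deny-ingress


rules = [
    (1000, "allow", "0.0.0.0/0", 443),
    (100, "deny", "198.51.100.0/24", None),   # added during the incident
]

bad_actor = evaluate(rules, "198.51.100.7", 443)
normal_user = evaluate(rules, "203.0.113.9", 443)
print(bad_actor, normal_user)
```

The existing allow rule is untouched; removing the priority-100 entry after the incident restores the previous behaviour exactly.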
GCP’s hierarchical firewall policies are a significant enterprise advantage. The central security team can create organization-level policies (e.g., “deny all ingress on port 22 except from bastion subnet”) that CANNOT be overridden by project-level rules. In AWS, security groups have no built-in hierarchy: each account’s security groups are independent, and central control comes from SCPs (Service Control Policies) that restrict who can create or change security groups, or from AWS Firewall Manager security group policies. Neither gives you an inherited rule chain that is evaluated ahead of the workload’s own rules.
GCP’s service account targeting is more secure than network tags. Tags are just strings: anyone with the compute.instances.setTags IAM permission can add a tag to a VM and potentially match firewall rules they should not. Service accounts are IAM-controlled: attaching one to a VM requires the iam.serviceAccounts.actAs permission on that service account, and changing it requires stopping the VM, so a firewall rule keyed on a service account is much harder to match by accident or abuse.
AWS Network Firewall — Deep Dive with IPS/IDS
AWS Network Firewall is a managed stateful firewall service powered by Suricata. It inspects traffic at the VPC level, including headers, payloads, and protocol behavior.
Architecture in the Inspection VPC
Suricata IPS/IDS Rules
Network Firewall uses Suricata syntax for stateful rules. Suricata is an open-source IDS/IPS engine. You write rules that match specific traffic patterns (protocol, source/dest, content, keywords) and take actions (pass, drop, alert, reject).
Suricata Rule Syntax:

    action protocol source_ip source_port -> dest_ip dest_port (options;)
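To make the rule anatomy concrete, here is a rough Python sketch that splits a single-line rule into the header fields and the option list. It is not a Suricata parser: `RULE_RE` and `parse_rule` are hypothetical helpers, and the naive split on `;` breaks if an option value itself contains a semicolon.

```python
import re

RULE_RE = re.compile(
    r"^(?P<action>\w+)\s+(?P<protocol>\S+)\s+"
    r"(?P<src_ip>\S+)\s+(?P<src_port>\S+)\s+->\s+"
    r"(?P<dst_ip>\S+)\s+(?P<dst_port>\S+)\s+"
    r"\((?P<options>.*)\)\s*$"
)


def parse_rule(text):
    """Split one rule into its header fields and a list of options."""
    match = RULE_RE.match(text.strip())
    if match is None:
        raise ValueError("not a single-line rule in the expected shape")
    fields = match.groupdict()
    # Naive split: good enough for simple rules, not for escaped semicolons.
    fields["options"] = [o.strip() for o in fields["options"].split(";") if o.strip()]
    return fields


rule = (
    'alert tcp $HOME_NET any -> $EXTERNAL_NET 22 '
    '(msg:"Outbound SSH detected"; flow:established,to_server; sid:1000004; rev:1;)'
)
parsed = parse_rule(rule)
print(parsed["action"], parsed["protocol"], parsed["dst_port"])
print(parsed["options"])
```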
Example Rules for Enterprise Bank:

    # Block traffic to known C2 (command-and-control) servers
    drop tls $HOME_NET any -> $EXTERNAL_NET any \
      (tls.sni; content:"malware-c2.example.com"; \
      msg:"C2 callback blocked"; sid:1000001; rev:1;)

    # Alert on SQL injection attempts in HTTP traffic
    alert http $EXTERNAL_NET any -> $HOME_NET any \
      (http.uri; content:"UNION"; nocase; content:"SELECT"; nocase; \
      msg:"Possible SQL injection in URI"; sid:1000002; rev:1;)

    # Block unauthorized DNS-over-HTTPS (DoH) to enforce internal DNS
    drop tls $HOME_NET any -> $EXTERNAL_NET 443 \
      (tls.sni; content:"dns.google"; \
      msg:"DNS-over-HTTPS blocked - use internal DNS"; sid:1000003; rev:1;)

    # Detect outbound SSH tunneling (data exfiltration risk)
    alert tcp $HOME_NET any -> $EXTERNAL_NET 22 \
      (msg:"Outbound SSH detected - review for tunneling"; \
      flow:established,to_server; sid:1000004; rev:1;)

    # Block known bad TLS certificate fingerprints
    drop tls $HOME_NET any -> $EXTERNAL_NET any \
      (tls.cert_fingerprint; content:"ab:cd:ef:..."; \
      msg:"Known malicious TLS certificate"; sid:1000005; rev:1;)

    # Allow only HTTPS egress (block HTTP, non-standard ports)
    pass tls $HOME_NET any -> $EXTERNAL_NET 443 \
      (msg:"HTTPS egress allowed"; sid:1000010; rev:1;)
    drop tcp $HOME_NET any -> $EXTERNAL_NET any \
      (msg:"Non-HTTPS egress blocked"; sid:1000011; rev:1;)

Managed Rule Groups
AWS provides pre-built managed rule groups updated by AWS threat intelligence:
| Managed Rule Group | What It Detects |
|---|---|
| `AbusedLegitMalwareDomainsActionOrder` | Domains hosting malware on legitimate services |
| `MalwareDomainsActionOrder` | Known malware distribution domains |
| `BotNetCommandAndControlDomainsActionOrder` | Known botnet C2 domains |
| `ThreatSignaturesDoSActionOrder` | Denial of service attack patterns |
| `ThreatSignaturesExploitsActionOrder` | Known exploit signatures |
| `ThreatSignaturesMalwareActionOrder` | Malware traffic signatures |
| `ThreatSignaturesWebAttacksActionOrder` | Web application attack patterns |
IDS vs IPS — Mode Configuration
- IPS (Intrusion Prevention System): inline inspection, where traffic passes THROUGH the firewall. Can DROP malicious packets before they reach the workload. This is our bank’s configuration.
- IDS (Intrusion Detection System): passive monitoring — traffic is mirrored to the firewall. Can ALERT but cannot block. Useful for initial deployment to evaluate false positives before switching to IPS mode.
AWS Network Firewall always sits inline in the traffic path, so the effective mode is controlled by the rule action:
- `drop` = IPS (blocks traffic)
- `alert` = IDS (logs but allows traffic)
- `reject` = IPS, and additionally sends a TCP RST to the sender (TCP traffic only)
Recommended rollout: deploy with alert actions first (IDS mode) for 2-4 weeks. Review alerts. Tune rules to eliminate false positives. Then change actions to drop (IPS mode).
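One way to handle the final step of that rollout is to promote only the rules whose alerts have been reviewed. The Python sketch below rewrites `alert` to `drop` for a chosen set of SIDs; `promote_to_drop` and the sample rules are hypothetical, and a real workflow would also bump each rule's `rev`.

```python
import re

SID_RE = re.compile(r"\bsid:\s*(\d+)\s*;")


def promote_to_drop(rules_text, reviewed_sids):
    """Rewrite 'alert' to 'drop' only for rules whose sid was reviewed."""
    out = []
    for line in rules_text.splitlines():
        sid = SID_RE.search(line)
        if line.startswith("alert ") and sid and int(sid.group(1)) in reviewed_sids:
            line = "drop " + line[len("alert "):]
        out.append(line)
    return "\n".join(out)


rules = "\n".join([
    'alert tls $HOME_NET any -> $EXTERNAL_NET any (tls.sni; content:"malware-c2.example.com"; msg:"C2 callback"; sid:1000001; rev:1;)',
    'alert tcp $HOME_NET any -> $EXTERNAL_NET 22 (msg:"Outbound SSH"; flow:established,to_server; sid:1000004; rev:1;)',
])

# Only the C2 rule has been tuned; outbound SSH stays in detection mode.
promoted = promote_to_drop(rules, reviewed_sids={1000001})
print(promoted)
```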
WAF — Web Application Firewall
WAF protects web applications at Layer 7 (HTTP/HTTPS). It inspects request bodies, headers, URIs, and query strings for attack patterns.
AWS WAF attaches to CloudFront, ALB, API Gateway, or AppSync. It evaluates Web ACL rules against HTTP requests.
Key concepts:
- Web ACL: collection of rules with a default action (allow or block)
- Rule Group: reusable set of rules (managed or custom)
- Managed Rule Groups: pre-built by AWS or marketplace vendors
- Custom Rules: match on IP, geo, rate, string match, regex, size, SQL injection, XSS
- Rule actions: ALLOW, BLOCK, COUNT (monitor without blocking), CAPTCHA, CHALLENGE
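How a Web ACL arrives at a decision can be sketched with a small Python model. It covers only ALLOW, BLOCK, and COUNT (CAPTCHA and CHALLENGE are left out), and `evaluate_web_acl`, the rule names, and the request fields are hypothetical.

```python
def evaluate_web_acl(rules, default_action, request):
    """rules: (priority, name, predicate, action). Returns (action, counted)."""
    counted = []
    for _, name, predicate, action in sorted(rules, key=lambda r: r[0]):
        if not predicate(request):
            continue
        if action == "COUNT":
            counted.append(name)   # non-terminating: record it and keep going
            continue
        return action, counted     # ALLOW or BLOCK ends evaluation
    return default_action, counted


rules = [
    (0, "GeoBlock", lambda r: r["country"] not in {"US", "GB"}, "BLOCK"),
    (1, "LargeBody", lambda r: r["body_bytes"] > 8192, "COUNT"),
    (10, "AdminPath", lambda r: r["path"].startswith("/admin"), "BLOCK"),
]

upload = {"country": "US", "body_bytes": 10000, "path": "/upload"}
foreign = {"country": "FR", "body_bytes": 100, "path": "/"}

upload_result = evaluate_web_acl(rules, "ALLOW", upload)
foreign_result = evaluate_web_acl(rules, "ALLOW", foreign)
print(upload_result, foreign_result)
```

COUNT is what makes safe rollouts possible: a new rule can be observed against production traffic without changing the outcome of any request.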
Essential Managed Rule Groups for Enterprise:
| Rule Group | Purpose |
|---|---|
| `AWSManagedRulesCommonRuleSet` | OWASP Top 10: SQLi, XSS, LFI, RFI, path traversal |
| `AWSManagedRulesKnownBadInputsRuleSet` | Log4j, Spring4Shell, known bad patterns |
| `AWSManagedRulesSQLiRuleSet` | SQL injection (dedicated, deeper than Common) |
| `AWSManagedRulesLinuxRuleSet` | Linux-specific exploits (for EC2/EKS workloads) |
| `AWSManagedRulesBotControlRuleSet` | Bot detection: scrapers, scanners, credential stuffers |
| `AWSManagedRulesATPRuleSet` | Account Takeover Prevention: credential stuffing detection |
| `AWSManagedRulesAmazonIpReputationList` | IPs flagged by Amazon threat intelligence: bots, reconnaissance, DDoS sources |
| `AWSManagedRulesAnonymousIpList` | VPN, proxy, Tor exit node, and hosting provider IPs |
Rate-Based Rules:
- Automatically block IPs that exceed a request threshold (e.g., 2000 requests in 5 minutes)
- Essential for DDoS mitigation at the application layer
- Can scope by URI, header, or query string (e.g., rate limit `/api/login` separately)
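The idea behind a scoped rate-based rule can be sketched as a sliding-window counter in Python. This is a conceptual model only: `RateBasedRule` is hypothetical, and the real service tracks rates approximately and with some delay rather than per request.

```python
from collections import defaultdict, deque


class RateBasedRule:
    def __init__(self, limit, window_seconds=300, path_prefix=None):
        self.limit = limit
        self.window = window_seconds
        self.path_prefix = path_prefix      # optional scope-down condition
        self.hits = defaultdict(deque)      # ip -> timestamps of counted requests

    def should_block(self, ip, path, now):
        if self.path_prefix and not path.startswith(self.path_prefix):
            return False                    # out of scope: not counted at all
        window = self.hits[ip]
        while window and window[0] <= now - self.window:
            window.popleft()                # forget requests older than the window
        window.append(now)
        return len(window) > self.limit


rule = RateBasedRule(limit=3, window_seconds=300, path_prefix="/api/login")
decisions = [rule.should_block("203.0.113.9", "/api/login", t) for t in range(5)]
other_path = rule.should_block("203.0.113.9", "/home", 5)
after_window = rule.should_block("203.0.113.9", "/api/login", 1000)
print(decisions, other_path, after_window)
```

Scoping the counter to the login path lets the threshold be far stricter there than would be tolerable for the site as a whole.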
GCP Cloud Armor provides DDoS protection and WAF capabilities. It is attached to Global External HTTP(S) Load Balancers, making it edge-deployed by default.
Key features:
- Security policies: ordered rules evaluated against incoming requests
- Pre-configured WAF rules: OWASP ModSecurity CRS (Core Rule Set) based
- Adaptive protection: ML-based anomaly detection — automatically detects and mitigates L7 DDoS
- Bot management: reCAPTCHA Enterprise integration for bot mitigation
- Threat intelligence: Google Threat Intelligence feed for known-bad IPs
- Rate limiting: per-client rate limiting with flexible key definitions (IP, header, path)
- Custom rules: CEL (Common Expression Language) for complex matching
Pre-configured WAF Rules:
| Rule ID | Protection |
|---|---|
| `sqli-v33-stable` | SQL injection |
| `xss-v33-stable` | Cross-site scripting |
| `lfi-v33-stable` | Local file inclusion |
| `rfi-v33-stable` | Remote file inclusion |
| `rce-v33-stable` | Remote code execution |
| `methodenforcement-v33-stable` | HTTP method enforcement |
| `scannerdetection-v33-stable` | Scanner/bot detection |
| `protocolattack-v33-stable` | Protocol-level attacks |
| `php-v33-stable` | PHP injection attacks |
| `sessionfixation-v33-stable` | Session fixation |
| `java-v33-stable` | Java-based attacks (Log4j, etc.) |
| `nodejs-v33-stable` | Node.js attacks |
Cloud Armor + Adaptive Protection:
DDoS Protection — Shield & Cloud Armor
AWS Shield Standard (free, automatic):
- Protects against L3/L4 volumetric attacks (SYN floods, UDP reflection, amplification)
- Active on ALL AWS accounts by default — no configuration needed
- Protects CloudFront, Route 53, Global Accelerator, ALB, NLB, EC2 Elastic IPs
Shield Advanced ($3,000/month per organization):
- Everything in Standard plus:
- L7 DDoS protection (requires WAF)
- Shield Response Team (SRT, formerly the DDoS Response Team): AWS experts help during active attacks
- Cost protection: AWS credits for scale-up costs during DDoS
- Real-time metrics and attack forensics
- Automatic application-layer mitigations (creates WAF rules based on attack patterns)
- Health-based detection (uses Route 53 health checks to detect impact)
- Proactive engagement: the SRT contacts you when it detects an attack targeting your resources
When to use Shield Advanced: regulated workloads (banks, healthcare), internet-facing applications with revenue impact from downtime, compliance requirements mandating DDoS mitigation documentation.
GCP Cloud Armor Standard tier (included with Cloud Armor):
- L3/L4 DDoS protection on Global External LBs
- Always-on, no additional cost beyond Cloud Armor pricing
- Absorbs volumetric attacks at Google’s edge network (one of the largest networks in the world)
Cloud Armor Enterprise (formerly Managed Protection Plus):
- L7 DDoS protection with Adaptive Protection (ML-based)
- DDoS billing protection: credits for excess traffic costs during attacks
- Threat intelligence: Google’s threat intel feeds for IP reputation
- Named IP lists: Google-curated lists of known-bad IPs
- Advanced rate limiting and bot management
GCP advantage: Google’s global network handles massive DDoS attacks at the edge. The 2023 record-breaking 398M RPS DDoS attack was mitigated by Cloud Armor at Google’s edge before reaching any customer infrastructure.
Centralized Inspection Architecture — Full Design
This is the complete network security architecture for our enterprise bank. All traffic (egress, ingress, and east-west) is inspected at the Network Hub Account.
Traffic Flows Through the Architecture
Egress (workload → internet):
- Pod in payments-prod VPC sends HTTPS to `api.stripe.com`
- Private subnet route table: `0.0.0.0/0 → TGW`
- TGW prod-rt: `0.0.0.0/0 → inspection VPC attachment`
- Inspection VPC TGW subnet RT: `0.0.0.0/0 → Network Firewall endpoint`
- Network Firewall inspects:
  - Stateless: check against deny-list CIDRs
  - Stateful: verify TLS to allowed domain (`api.stripe.com` in allowlist)
  - IPS: scan for C2 patterns, malware signatures
  - Result: PASS
- Firewall subnet RT: `0.0.0.0/0 → NAT GW`
- NAT GW translates private IP → Elastic IP → Internet → Stripe
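The stateful allowlist step in the egress flow comes down to matching the requested hostname against a list of domain entries. A minimal Python sketch, assuming the convention that an entry with a leading dot covers the apex domain and all of its subdomains; `domain_allowed` and `ALLOWLIST` are hypothetical names.

```python
def domain_allowed(hostname, allowlist):
    """Leading-dot entries match the apex and any subdomain; others match exactly."""
    host = hostname.lower().rstrip(".")
    for entry in allowlist:
        entry = entry.lower()
        if entry.startswith("."):
            if host == entry[1:] or host.endswith(entry):
                return True
        elif host == entry:
            return True
    return False


ALLOWLIST = [".stripe.com", ".amazonaws.com", ".bank.com"]

stripe_api = domain_allowed("api.stripe.com", ALLOWLIST)
lookalike = domain_allowed("evilstripe.com", ALLOWLIST)
suffix_trick = domain_allowed("api.stripe.com.attacker.net", ALLOWLIST)
print(stripe_api, lookalike, suffix_trick)
```

Matching on the dotted suffix rather than a plain substring is what keeps lookalike domains out.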
Ingress (internet → workload):
- Customer hits `payments.bank.com`
- Route 53 → CloudFront (Shield absorbs L3/L4 DDoS)
- CloudFront → WAF inspects (OWASP rules, bot check, rate limit)
- CloudFront → ALB in inspection VPC public subnet (origin)
- ALB → Network Firewall inspects the request (and the response on the return path)
- After inspection → TGW → workload VPC → internal ALB → EKS pod
East-west (workload → workload):
- payments-prod pod calls trading-prod API at `10.11.1.50:8080`
- payments VPC route table: `10.11.0.0/16` matches `10.0.0.0/8 → TGW`
- TGW prod-rt: `10.11.0.0/16 → trading-prod attachment`
- If east-west inspection required: static route `10.0.0.0/8 → inspection VPC` in prod-rt
  - The more specific `10.11.0.0/16` route must also be removed from prod-rt; otherwise longest-prefix match keeps sending traffic directly to trading-prod
  - Traffic goes through Network Firewall before reaching trading-prod
  - Adds an extra hop and measurable latency, so only enable it for high-sensitivity workloads
- trading-prod VPC: security group allows 8080 from `10.10.0.0/16`
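Route selection in the east-west flow follows longest-prefix match, which is easy to get wrong when adding the inspection route. A small Python sketch using the standard `ipaddress` module; the `lookup` helper and the attachment names are hypothetical.

```python
import ipaddress


def lookup(route_table, dest_ip):
    """Return the target of the most specific route that contains dest_ip."""
    dest = ipaddress.ip_address(dest_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


prod_rt = {
    "0.0.0.0/0": "inspection-vpc-attachment",
    "10.11.0.0/16": "trading-prod-attachment",
}
direct = lookup(prod_rt, "10.11.1.50")

# Adding the static /8 alone changes nothing: the /16 is still more specific.
prod_rt["10.0.0.0/8"] = "inspection-vpc-attachment"
still_direct = lookup(prod_rt, "10.11.1.50")

# Only once the /16 is gone does the traffic detour through inspection.
del prod_rt["10.11.0.0/16"]
inspected = lookup(prod_rt, "10.11.1.50")
print(direct, still_direct, inspected)
```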
Terraform — Network Firewall & WAF

    # ─── Firewall Policy ───────────────────────────────
    resource "aws_networkfirewall_firewall_policy" "main" {
      name = "bank-inspection-policy"

      firewall_policy {
        stateless_default_actions          = ["aws:forward_to_sfe"]
        stateless_fragment_default_actions = ["aws:drop"]

        # Stateless rule group: fast-path deny-lists
        stateless_rule_group_reference {
          priority     = 1
          resource_arn = aws_networkfirewall_rule_group.stateless_deny.arn
        }

        # Stateful rule groups: IPS/IDS
        stateful_engine_options {
          rule_order = "STRICT_ORDER"
        }

        stateful_rule_group_reference {
          priority     = 1
          resource_arn = aws_networkfirewall_rule_group.ips_custom.arn
        }

        stateful_rule_group_reference {
          priority     = 2
          resource_arn = aws_networkfirewall_rule_group.domain_allowlist.arn
        }

        # AWS managed threat intelligence rules.
        # A STRICT_ORDER policy can only reference the StrictOrder variants
        # of the managed groups, not the ActionOrder ones.
        stateful_rule_group_reference {
          priority     = 10
          resource_arn = "arn:aws:network-firewall:eu-west-1:aws-managed:stateful-rulegroup/AbusedLegitMalwareDomainsStrictOrder"
        }

        stateful_rule_group_reference {
          priority     = 11
          resource_arn = "arn:aws:network-firewall:eu-west-1:aws-managed:stateful-rulegroup/BotNetCommandAndControlDomainsStrictOrder"
        }

        stateful_rule_group_reference {
          priority     = 12
          resource_arn = "arn:aws:network-firewall:eu-west-1:aws-managed:stateful-rulegroup/ThreatSignaturesMalwareStrictOrder"
        }

        stateful_rule_group_reference {
          priority     = 13
          resource_arn = "arn:aws:network-firewall:eu-west-1:aws-managed:stateful-rulegroup/ThreatSignaturesExploitsStrictOrder"
        }
      }
    }

    # ─── Stateless Deny-List Rule Group ─────────────────
    resource "aws_networkfirewall_rule_group" "stateless_deny" {
      name     = "stateless-deny-list"
      capacity = 100
      type     = "STATELESS"

      rule_group {
        rules_source {
          stateless_rules_and_custom_actions {
            # Drop traffic from known bad CIDRs
            stateless_rule {
              priority = 1
              rule_definition {
                actions = ["aws:drop"]
                match_attributes {
                  source {
                    address_definition = "198.51.100.0/24" # Example bad range
                  }
                }
              }
            }

            # Drop all inbound ICMP from internet (anti-reconnaissance)
            stateless_rule {
              priority = 2
              rule_definition {
                actions = ["aws:drop"]
                match_attributes {
                  source {
                    address_definition = "0.0.0.0/0"
                  }
                  destination {
                    address_definition = "10.0.0.0/8"
                  }
                  protocols = [1] # ICMP
                }
              }
            }
          }
        }
      }
    }

    # ─── Custom IPS Rules (Suricata) ────────────────────
    resource "aws_networkfirewall_rule_group" "ips_custom" {
      name     = "bank-ips-rules"
      capacity = 500
      type     = "STATEFUL"

      rule_group {
        rule_variables {
          ip_sets {
            key = "HOME_NET"
            ip_set {
              definition = ["10.0.0.0/8"]
            }
          }
          ip_sets {
            key = "EXTERNAL_NET"
            ip_set {
              definition = ["0.0.0.0/0"]
            }
          }
        }

        rules_source {
          rules_string = <<-RULES
            # Block C2 callbacks to known threat domains
            drop tls $HOME_NET any -> $EXTERNAL_NET any (tls.sni; content:"malicious-domain.com"; nocase; msg:"C2 callback blocked"; sid:2000001; rev:1;)

            # Detect outbound SSH (potential tunneling/exfiltration)
            alert tcp $HOME_NET any -> $EXTERNAL_NET 22 (msg:"Outbound SSH - review for tunneling"; flow:established,to_server; sid:2000002; rev:1;)

            # Block DNS over HTTPS (enforce internal DNS)
            drop tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; content:"dns.google"; msg:"DoH blocked"; sid:2000003; rev:1;)
            drop tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; content:"cloudflare-dns.com"; msg:"DoH blocked"; sid:2000004; rev:1;)
            drop tls $HOME_NET any -> $EXTERNAL_NET 443 (tls.sni; content:"dns.quad9.net"; msg:"DoH blocked"; sid:2000005; rev:1;)

            # Alert on large outbound data transfers (exfiltration detection).
            # dsize measures a single packet's payload, so a 10 MB threshold can never match;
            # stream_size counts bytes per TCP stream instead. Verify the keyword against
            # your engine version before relying on it.
            alert tcp $HOME_NET any -> $EXTERNAL_NET any (msg:"Large outbound transfer >10MB"; flow:established,to_server; stream_size:client,>,10000000; sid:2000006; rev:1;)

            # Block non-TLS HTTP egress (enforce encryption)
            drop tcp $HOME_NET any -> $EXTERNAL_NET 80 (msg:"Unencrypted HTTP egress blocked"; flow:established,to_server; sid:2000007; rev:1;)

            # Allow HTTPS egress (explicit pass after inspection)
            pass tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"HTTPS egress allowed"; flow:established,to_server; sid:2000099; rev:1;)
          RULES
        }

        # Must match the rule_order of the firewall policy that references this group
        stateful_rule_options {
          rule_order = "STRICT_ORDER"
        }
      }
    }

    # ─── Domain Allowlist ───────────────────────────────
    resource "aws_networkfirewall_rule_group" "domain_allowlist" {
      name     = "domain-allowlist"
      capacity = 200
      type     = "STATEFUL"

      rule_group {
        rule_variables {
          ip_sets {
            key = "HOME_NET"
            ip_set {
              definition = ["10.0.0.0/8"]
            }
          }
        }

        rules_source {
          rules_source_list {
            generated_rules_type = "ALLOWLIST"
            target_types         = ["TLS_SNI", "HTTP_HOST"]
            targets = [
              ".amazonaws.com",         # AWS services
              ".aws.amazon.com",        # AWS console
              ".docker.io",             # Container images
              ".docker.com",
              ".github.com",            # Source control
              ".githubusercontent.com",
              ".stripe.com",            # Payment processor
              ".twilio.com",            # Communications
              ".datadoghq.com",         # Monitoring (if used)
              ".grafana.net",           # Grafana Cloud
              ".bank.com",              # Our own domains
            ]
          }
        }

        # Must match the rule_order of the firewall policy that references this group
        stateful_rule_options {
          rule_order = "STRICT_ORDER"
        }
      }
    }

    # ─── Firewall Instance ──────────────────────────────
    resource "aws_networkfirewall_firewall" "main" {
      name                = "bank-inspection-firewall"
      firewall_policy_arn = aws_networkfirewall_firewall_policy.main.arn
      vpc_id              = aws_vpc.inspection.id

      dynamic "subnet_mapping" {
        for_each = aws_subnet.firewall[*].id
        content {
          subnet_id = subnet_mapping.value
        }
      }

      tags = {
        Name = "bank-inspection-firewall"
      }
    }

    # ─── Logging Configuration ──────────────────────────
    resource "aws_networkfirewall_logging_configuration" "main" {
      firewall_arn = aws_networkfirewall_firewall.main.arn

      logging_configuration {
        # Alert logs → CloudWatch (for real-time alerting)
        log_destination_config {
          log_destination = {
            logGroup = aws_cloudwatch_log_group.fw_alerts.name
          }
          log_destination_type = "CloudWatchLogs"
          log_type             = "ALERT"
        }

        # Flow logs → S3 (for compliance and forensics)
        log_destination_config {
          log_destination = {
            bucketName = aws_s3_bucket.fw_logs.id
            prefix     = "network-firewall/flow"
          }
          log_destination_type = "S3"
          log_type             = "FLOW"
        }
      }
    }

    resource "aws_cloudwatch_log_group" "fw_alerts" {
      name              = "/aws/network-firewall/alerts"
      retention_in_days = 90
    }

    # CloudWatch alarm for firewall alerts
    resource "aws_cloudwatch_metric_alarm" "fw_alerts" {
      alarm_name          = "network-firewall-ips-alerts"
      comparison_operator = "GreaterThanThreshold"
      evaluation_periods  = 1
      metric_name         = "DroppedPackets"
      namespace           = "AWS/NetworkFirewall"
      period              = 300
      statistic           = "Sum"
      threshold           = 100
      alarm_description   = "Network Firewall IPS dropped >100 packets in 5 min"
      alarm_actions       = [aws_sns_topic.security_alerts.arn]

      dimensions = {
        FirewallName = aws_networkfirewall_firewall.main.name
      }
    }

AWS WAF (Web ACL):

    resource "aws_wafv2_web_acl" "main" {
      name        = "bank-web-acl"
      description = "Enterprise bank WAF - OWASP + bot control"
      scope       = "REGIONAL" # or "CLOUDFRONT" for CloudFront

      default_action {
        allow {}
      }

      # ─── AWS Managed Rules: OWASP Common ──────────────
      rule {
        name     = "AWSManagedRulesCommonRuleSet"
        priority = 1

        override_action {
          none {} # Use rule group's actions (block/count)
        }

        statement {
          managed_rule_group_statement {
            name        = "AWSManagedRulesCommonRuleSet"
            vendor_name = "AWS"

            # Exclude rules that cause false positives (tune per app)
            rule_action_override {
              name = "SizeRestrictions_BODY"
              action_to_use {
                count {} # Count instead of block (file uploads)
              }
            }
          }
        }

        visibility_config {
          cloudwatch_metrics_enabled = true
          metric_name                = "AWSCommonRules"
          sampled_requests_enabled   = true
        }
      }

      # ─── SQL Injection Rules ──────────────────────────
      rule {
        name     = "AWSManagedRulesSQLiRuleSet"
        priority = 2

        override_action {
          none {}
        }

        statement {
          managed_rule_group_statement {
            name        = "AWSManagedRulesSQLiRuleSet"
            vendor_name = "AWS"
          }
        }

        visibility_config {
          cloudwatch_metrics_enabled = true
          metric_name                = "AWSSQLiRules"
          sampled_requests_enabled   = true
        }
      }

      # ─── Known Bad Inputs (Log4j, etc.) ───────────────
      rule {
        name     = "AWSManagedRulesKnownBadInputsRuleSet"
        priority = 3

        override_action {
          none {}
        }

        statement {
          managed_rule_group_statement {
            name        = "AWSManagedRulesKnownBadInputsRuleSet"
            vendor_name = "AWS"
          }
        }

        visibility_config {
          cloudwatch_metrics_enabled = true
          metric_name                = "AWSBadInputRules"
          sampled_requests_enabled   = true
        }
      }

      # ─── Bot Control ──────────────────────────────────
      rule {
        name     = "AWSManagedRulesBotControlRuleSet"
        priority = 4

        override_action {
          none {}
        }

        statement {
          managed_rule_group_statement {
            name        = "AWSManagedRulesBotControlRuleSet"
            vendor_name = "AWS"

            managed_rule_group_configs {
              aws_managed_rules_bot_control_rule_set {
                inspection_level = "COMMON" # or "TARGETED" for more aggressive
              }
            }
          }
        }

        visibility_config {
          cloudwatch_metrics_enabled = true
          metric_name                = "AWSBotControl"
          sampled_requests_enabled   = true
        }
      }

      # ─── IP Reputation ────────────────────────────────
      rule {
        name     = "AWSManagedRulesAmazonIpReputationList"
        priority = 5

        override_action {
          none {}
        }

        statement {
          managed_rule_group_statement {
            name        = "AWSManagedRulesAmazonIpReputationList"
            vendor_name = "AWS"
          }
        }

        visibility_config {
          cloudwatch_metrics_enabled = true
          metric_name                = "AWSIpReputation"
          sampled_requests_enabled   = true
        }
      }

      # ─── Rate Limiting (Custom) ───────────────────────
      rule {
        name     = "RateLimitLogin"
        priority = 10

        action {
          block {}
        }

        statement {
          rate_based_statement {
            limit              = 100 # 100 requests per 5 minutes
            aggregate_key_type = "IP"

            scope_down_statement {
              byte_match_statement {
                search_string         = "/api/auth/login"
                positional_constraint = "STARTS_WITH"
                field_to_match {
                  uri_path {}
                }
                text_transformation {
                  priority = 0
                  type     = "LOWERCASE"
                }
              }
            }
          }
        }

        visibility_config {
          cloudwatch_metrics_enabled = true
          metric_name                = "RateLimitLogin"
          sampled_requests_enabled   = true
        }
      }

      # ─── Geo-Blocking (if required by compliance) ────
      rule {
        name     = "GeoBlock"
        priority = 0 # Highest priority

        action {
          block {
            custom_response {
              response_code = 403
            }
          }
        }

        statement {
          not_statement {
            statement {
              geo_match_statement {
                country_codes = ["AE", "GB", "US", "IN", "DE", "FR", "SG"]
              }
            }
          }
        }

        visibility_config {
          cloudwatch_metrics_enabled = true
          metric_name                = "GeoBlock"
          sampled_requests_enabled   = true
        }
      }

      visibility_config {
        cloudwatch_metrics_enabled = true
        metric_name                = "BankWebACL"
        sampled_requests_enabled   = true
      }
    }

    # Attach WAF to ALB
    resource "aws_wafv2_web_acl_association" "alb" {
      resource_arn = aws_lb.api.arn
      web_acl_arn  = aws_wafv2_web_acl.main.arn
    }

    # WAF logging to S3 via Kinesis Firehose
    resource "aws_wafv2_web_acl_logging_configuration" "main" {
      log_destination_configs = [aws_kinesis_firehose_delivery_stream.waf_logs.arn]
      resource_arn            = aws_wafv2_web_acl.main.arn

      logging_filter {
        default_behavior = "DROP" # Only log blocked/counted requests

        filter {
          behavior    = "KEEP"
          requirement = "MEETS_ANY"

          condition {
            action_condition {
              action = "BLOCK"
            }
          }
          condition {
            action_condition {
              action = "COUNT"
            }
          }
        }
      }
    }

GCP firewall policies and Cloud NGFW:

    # ─── Hierarchical Firewall Policy (Org Level) ──────
    resource "google_compute_firewall_policy" "org" {
      short_name  = "bank-org-firewall-policy"
      parent      = "organizations/${var.org_id}"
      description = "Organization-level firewall policy - cannot be overridden"
    }

    # Block traffic from known-malicious IPs (org-wide).
    # This matches a Google threat intelligence list, not countries;
    # to block embargoed regions, match on src_region_codes instead.
    resource "google_compute_firewall_policy_rule" "block_embargoed" {
      firewall_policy = google_compute_firewall_policy.org.id
      priority        = 100
      action          = "deny"
      direction       = "INGRESS"
      description     = "Block traffic from known-malicious IPs"

      match {
        src_threat_intelligences = ["iplist-known-malicious-ips"]
        layer4_configs {
          ip_protocol = "all"
        }
      }
    }

    # Allow GCP health check probes (required for load balancers)
    resource "google_compute_firewall_policy_rule" "allow_health_checks" {
      firewall_policy = google_compute_firewall_policy.org.id
      priority        = 200
      action          = "allow"
      direction       = "INGRESS"
      description     = "Allow GCP health check ranges"

      match {
        src_ip_ranges = ["130.211.0.0/22", "35.191.0.0/16"]
        layer4_configs {
          ip_protocol = "tcp"
        }
      }
    }

    # Allow IAP (Identity-Aware Proxy) for SSH/RDP
    resource "google_compute_firewall_policy_rule" "allow_iap" {
      firewall_policy = google_compute_firewall_policy.org.id
      priority        = 300
      action          = "allow"
      direction       = "INGRESS"
      description     = "Allow IAP for secure access"

      match {
        src_ip_ranges = ["35.235.240.0/20"] # IAP range
        layer4_configs {
          ip_protocol = "tcp"
          ports       = ["22", "3389"]
        }
      }
    }

    # Delegate everything else to folder/VPC policies
    resource "google_compute_firewall_policy_rule" "goto_next" {
      firewall_policy = google_compute_firewall_policy.org.id
      priority        = 65000
      action          = "goto_next"
      direction       = "INGRESS"

      match {
        src_ip_ranges = ["0.0.0.0/0"]
        layer4_configs {
          ip_protocol = "all"
        }
      }
    }

    # Associate org policy
    resource "google_compute_firewall_policy_association" "org" {
      firewall_policy   = google_compute_firewall_policy.org.id
      attachment_target = "organizations/${var.org_id}"
      name              = "bank-org-policy-association"
    }

    # ─── Network Firewall Policy (VPC Level + IPS) ─────
    resource "google_compute_network_firewall_policy" "vpc" {
      project     = var.host_project_id
      name        = "bank-vpc-firewall-policy"
      description = "VPC-level firewall policy with IPS inspection"
    }

    # IPS inspection rule: inspect all egress for threats
    resource "google_compute_network_firewall_policy_rule" "ips_egress" {
      project         = var.host_project_id
      firewall_policy = google_compute_network_firewall_policy.vpc.name
      priority        = 100
      action          = "apply_security_profile_group"
      direction       = "EGRESS"
      description     = "IPS inspection on all egress traffic"

      match {
        dest_ip_ranges = ["0.0.0.0/0"]
        layer4_configs {
          ip_protocol = "tcp"
        }
      }

      # The API expects a fully qualified resource URL here, not the bare Terraform id
      security_profile_group = "//networksecurity.googleapis.com/${google_network_security_security_profile_group.ips.id}"
    }

    # Allow internal traffic (east-west within VPC)
    resource "google_compute_network_firewall_policy_rule" "allow_internal" {
      project         = var.host_project_id
      firewall_policy = google_compute_network_firewall_policy.vpc.name
      priority        = 1000
      action          = "allow"
      direction       = "INGRESS"
      description     = "Allow internal VPC communication"

      match {
        src_ip_ranges = ["10.0.0.0/8"]
        layer4_configs {
          ip_protocol = "all"
        }
      }
    }

    # Deny all other ingress
    resource "google_compute_network_firewall_policy_rule" "deny_all_ingress" {
      project         = var.host_project_id
      firewall_policy = google_compute_network_firewall_policy.vpc.name
      priority        = 65000
      action          = "deny"
      direction       = "INGRESS"
      description     = "Default deny all ingress"

      match {
        src_ip_ranges = ["0.0.0.0/0"]
        layer4_configs {
          ip_protocol = "all"
        }
      }
    }

    # Associate with Shared VPC
    resource "google_compute_network_firewall_policy_association" "vpc" {
      project           = var.host_project_id
      name              = "bank-vpc-policy-association"
      firewall_policy   = google_compute_network_firewall_policy.vpc.name
      attachment_target = google_compute_network.main.id
    }

# ─── Cloud NGFW Enterprise — IPS Security Profile ──
resource "google_network_security_security_profile" "ips" {
  name        = "bank-ips-profile"
  parent      = "organizations/${var.org_id}"
  type        = "THREAT_PREVENTION"
  description = "IPS profile for enterprise bank"

  threat_prevention_profile {
    severity_overrides {
      action   = "DENY"
      severity = "CRITICAL"
    }
    severity_overrides {
      action   = "DENY"
      severity = "HIGH"
    }
    severity_overrides {
      action   = "ALERT"
      severity = "MEDIUM"
    }
    severity_overrides {
      action   = "ALERT"
      severity = "LOW"
    }
  }
}

resource "google_network_security_security_profile_group" "ips" {
  name                      = "bank-ips-profile-group"
  parent                    = "organizations/${var.org_id}"
  threat_prevention_profile = google_network_security_security_profile.ips.id
  description               = "IPS security profile group"
}

# Firewall endpoint (per zone): where IPS inspection runs
resource "google_network_security_firewall_endpoint" "zone_b" {
  name               = "bank-fw-endpoint-ew1-b"
  parent             = "organizations/${var.org_id}"
  location           = "europe-west1-b"
  billing_project_id = var.host_project_id
}

resource "google_network_security_firewall_endpoint" "zone_c" {
  name               = "bank-fw-endpoint-ew1-c"
  parent             = "organizations/${var.org_id}"
  location           = "europe-west1-c"
  billing_project_id = var.host_project_id
}

# Associate endpoints with VPC
resource "google_network_security_firewall_endpoint_association" "zone_b" {
  name              = "bank-fw-assoc-ew1-b"
  parent            = "projects/${var.host_project_id}"
  location          = "europe-west1-b"
  firewall_endpoint = google_network_security_firewall_endpoint.zone_b.id
  network           = google_compute_network.main.id
}

resource "google_network_security_firewall_endpoint_association" "zone_c" {
  name              = "bank-fw-assoc-ew1-c"
  parent            = "projects/${var.host_project_id}"
  location          = "europe-west1-c"
  firewall_endpoint = google_network_security_firewall_endpoint.zone_c.id
  network           = google_compute_network.main.id
}
# ─── Cloud Armor Security Policy ────────────────────
resource "google_compute_security_policy" "main" {
  project     = var.host_project_id
  name        = "bank-cloud-armor-policy"
  description = "DDoS + WAF protection for internet-facing services"

  # Enable Adaptive Protection (ML-based DDoS detection)
  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable          = true
      rule_visibility = "STANDARD"
    }
  }

  # Geo-blocking: lowest priority number, so it is evaluated first
  rule {
    action   = "deny(403)"
    priority = 500
    match {
      expr {
        expression = "!origin.region_code.matches('AE|GB|US|IN|DE|FR|SG')"
      }
    }
    description = "Geo-blocking - allow only approved regions"
  }

  # OWASP SQL Injection
  rule {
    action   = "deny(403)"
    priority = 1000
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('sqli-v33-stable')"
      }
    }
    description = "Block SQL injection"
  }

  # OWASP XSS
  rule {
    action   = "deny(403)"
    priority = 1001
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('xss-v33-stable')"
      }
    }
    description = "Block XSS"
  }

  # OWASP LFI
  rule {
    action   = "deny(403)"
    priority = 1002
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('lfi-v33-stable')"
      }
    }
    description = "Block local file inclusion"
  }

  # OWASP RCE
  rule {
    action   = "deny(403)"
    priority = 1003
    match {
      expr {
        expression = "evaluatePreconfiguredExpr('rce-v33-stable')"
      }
    }
    description = "Block remote code execution"
  }

  # Rate limiting: 100 requests per 5 minutes per client IP on auth endpoints
  rule {
    action   = "throttle"
    priority = 2000
    match {
      expr {
        expression = "request.path.matches('/api/auth/.*')"
      }
    }
    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"
      rate_limit_threshold {
        count        = 100
        interval_sec = 300
      }
      enforce_on_key = "IP"
    }
    description = "Rate limit auth endpoints"
  }

  # Default allow
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "Default allow"
  }
}

WAF — Enterprise Architecture & Terraform
Beyond the WAF basics covered above, this section focuses on enterprise WAF architecture patterns, Terraform implementation, and interview-ready scenarios for designing WAF rules at scale.
Why WAF — OWASP Top 10 Protection at Layer 7
WAF inspects HTTP/HTTPS requests at Layer 7 — examining request bodies, headers, URIs, query strings, and cookies for attack patterns. Without WAF, your ALB and application are directly exposed to SQL injection, XSS, SSRF, path traversal, and other OWASP Top 10 attacks. Network firewalls (L3/L4) cannot inspect HTTP payloads — they see encrypted TLS traffic as opaque bytes.
AWS WAF Architecture — Web ACL, Rules, Rule Groups
Integration points: CloudFront (edge — recommended for internet-facing), ALB (regional), API Gateway (REST APIs), AppSync (GraphQL), Cognito (user pools).
Best practice: Deploy WAF at CloudFront (edge) for internet-facing apps. This blocks attacks before they reach your region, reducing load on ALBs and origin servers. For internal APIs exposed via ALB, attach WAF directly to the ALB.
Enterprise WAF Architecture
In this architecture:
- CloudFront terminates TLS at the edge, caches static content, absorbs volumetric attacks
- AWS WAF (attached to CloudFront) inspects all HTTP requests against rule groups
- Shield Standard protects CloudFront from L3/L4 DDoS automatically
- ALB performs target group routing to EKS pods
- API Gateway handles REST API requests with its own throttling and auth
Logging: WAF logs to CloudWatch Logs (real-time, expensive at scale), S3 (cost-effective, query with Athena), or Kinesis Data Firehose (streaming to SIEM/Splunk/Datadog).
Terraform — AWS WAF with Managed + Custom Rules
resource "aws_wafv2_web_acl" "main" {
  name        = "enterprise-waf"
  description = "Enterprise WAF for production workloads"
  scope       = "CLOUDFRONT" # or "REGIONAL" for ALB

  default_action {
    allow {}
  }

  # Rule 1: Rate-based, block IPs exceeding 2000 req/5min
  rule {
    name     = "rate-limit-per-ip"
    priority = 1

    action {
      block {}
    }

    statement {
      rate_based_statement {
        limit              = 2000
        aggregate_key_type = "IP"
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitPerIP"
    }
  }

  # Rule 2: AWS Managed, IP Reputation
  rule {
    name     = "aws-ip-reputation"
    priority = 2

    override_action {
      none {} # Use managed rule actions (BLOCK)
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesAmazonIpReputationList"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "AWSIPReputation"
    }
  }

  # Rule 3: AWS Managed, Core Rule Set (OWASP Top 10)
  rule {
    name     = "aws-common-rules"
    priority = 3

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"

        # Override specific rules that cause false positives
        rule_action_override {
          name = "SizeRestrictions_BODY"
          action_to_use {
            count {} # Count instead of block for large file uploads
          }
        }
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "AWSCommonRules"
    }
  }

  # Rule 4: AWS Managed, SQL Injection
  rule {
    name     = "aws-sqli"
    priority = 4

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesSQLiRuleSet"
        vendor_name = "AWS"
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "AWSSQLi"
    }
  }

  # Rule 5: AWS Managed, Bot Control
  rule {
    name     = "aws-bot-control"
    priority = 5

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesBotControlRuleSet"
        vendor_name = "AWS"

        managed_rule_group_configs {
          aws_managed_rules_bot_control_rule_set {
            inspection_level = "COMMON" # or "TARGETED" for advanced
          }
        }
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "AWSBotControl"
    }
  }

  # Rule 6: Custom, geo-blocking sanctioned countries
  rule {
    name     = "geo-block-sanctioned"
    priority = 6

    action {
      block {}
    }

    statement {
      geo_match_statement {
        country_codes = ["KP", "IR", "CU", "SY"]
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "GeoBlockSanctioned"
    }
  }

  # Rule 7: Custom, rate limit the login endpoint specifically
  rule {
    name     = "rate-limit-login"
    priority = 7

    action {
      block {}
    }

    statement {
      rate_based_statement {
        limit              = 100
        aggregate_key_type = "IP"

        scope_down_statement {
          byte_match_statement {
            search_string         = "/api/login"
            positional_constraint = "STARTS_WITH"
            field_to_match {
              uri_path {}
            }
            text_transformation {
              priority = 0
              type     = "LOWERCASE"
            }
          }
        }
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "RateLimitLogin"
    }
  }

  visibility_config {
    sampled_requests_enabled   = true
    cloudwatch_metrics_enabled = true
    metric_name                = "EnterpriseWAF"
  }
}

# Attach the web ACL to the CloudFront distribution.
# CLOUDFRONT-scoped web ACLs are attached through the distribution's web_acl_id
# argument (set to the web ACL ARN) and must be created in us-east-1.
# aws_wafv2_web_acl_association applies only to regional resources such as an ALB.
resource "aws_cloudfront_distribution" "main" {
  web_acl_id = aws_wafv2_web_acl.main.arn
  # ... other config
}

# WAF logging to S3 via Kinesis Firehose
# (the delivery stream name must start with "aws-waf-logs-")
resource "aws_wafv2_web_acl_logging_configuration" "main" {
  log_destination_configs = [aws_kinesis_firehose_delivery_stream.waf_logs.arn]
  resource_arn            = aws_wafv2_web_acl.main.arn

  redacted_fields {
    single_header {
      name = "authorization"
    }
  }
}

GCP Cloud Armor — WAF with Adaptive Protection
Cloud Armor provides WAF and DDoS protection attached to Global External HTTP(S) Load Balancers. It evaluates security policy rules against incoming requests at Google’s edge network.
Key differentiator — Adaptive Protection:
Adaptive Protection uses ML to baseline normal traffic patterns for your application. When it detects anomalous L7 traffic (potential DDoS), it automatically generates suggested Cloud Armor rules to mitigate the attack. AWS WAF on its own has no equivalent ML-based detection: you rely on rate-based rules, or on Shield Advanced’s automatic application-layer mitigation and its response team, for L7 DDoS.
Bot Management:
Cloud Armor integrates with reCAPTCHA Enterprise for bot detection. Unlike AWS WAF’s Bot Control (which inspects request patterns), GCP can challenge suspicious traffic with CAPTCHA/reCAPTCHA without modifying your application code. The Cloud Armor rule redirects suspicious requests to a reCAPTCHA challenge page.
Terraform — GCP Cloud Armor Security Policy
resource "google_compute_security_policy" "enterprise_waf" {
  name        = "enterprise-waf-policy"
  description = "Enterprise WAF for production workloads"

  # Rule 1: Block sanctioned countries
  rule {
    action   = "deny(403)"
    priority = 100

    match {
      expr {
        expression = "origin.region_code == 'KP' || origin.region_code == 'IR' || origin.region_code == 'CU'"
      }
    }

    description = "Block sanctioned countries"
  }

  # Rule 2: Rate limit per IP
  rule {
    action   = "rate_based_ban"
    priority = 200

    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }

    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"
      enforce_on_key = "IP" # without this the key defaults to ALL (one shared counter)

      rate_limit_threshold {
        count        = 1000
        interval_sec = 60
      }

      ban_duration_sec = 600 # Ban for 10 minutes after exceeding
    }

    description = "Rate limit: 1000 req/min per IP, ban for 10 min"
  }

  # Rule 3: OWASP SQL Injection protection
  rule {
    action   = "deny(403)"
    priority = 300

    match {
      expr {
        expression = "evaluatePreconfiguredExpr('sqli-v33-stable')"
      }
    }

    description = "OWASP SQL injection protection"
  }

  # Rule 4: OWASP XSS protection
  rule {
    action   = "deny(403)"
    priority = 400

    match {
      expr {
        expression = "evaluatePreconfiguredExpr('xss-v33-stable')"
      }
    }

    description = "OWASP XSS protection"
  }

  # Rule 5: OWASP Remote Code Execution
  rule {
    action   = "deny(403)"
    priority = 500

    match {
      expr {
        expression = "evaluatePreconfiguredExpr('rce-v33-stable')"
      }
    }

    description = "OWASP RCE protection"
  }

  # Rule 6: OWASP Local File Inclusion
  rule {
    action   = "deny(403)"
    priority = 600

    match {
      expr {
        expression = "evaluatePreconfiguredExpr('lfi-v33-stable')"
      }
    }

    description = "OWASP LFI protection"
  }

  # Rule 7: Scanner/bot detection
  rule {
    action   = "deny(403)"
    priority = 700

    match {
      expr {
        expression = "evaluatePreconfiguredExpr('scannerdetection-v33-stable')"
      }
    }

    description = "Scanner and bot detection"
  }

  # Rule 8: Rate limit login endpoint
  rule {
    action   = "rate_based_ban"
    priority = 800

    match {
      expr {
        expression = "request.path.matches('/api/login')"
      }
    }

    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"
      enforce_on_key = "IP" # without this the key defaults to ALL (one shared counter)

      rate_limit_threshold {
        count        = 50
        interval_sec = 60
      }

      ban_duration_sec = 3600 # Ban for 1 hour
    }

    description = "Rate limit login: 50 req/min per IP"
  }

  # Default rule: allow all other traffic
  rule {
    action   = "allow"
    priority = 2147483647

    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }

    description = "Default allow"
  }

  # Enable Adaptive Protection
  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable = true
    }
  }
}

# Attach to backend service (Global External ALB)
resource "google_compute_backend_service" "api" {
  name                  = "api-backend"
  security_policy       = google_compute_security_policy.enterprise_waf.id
  load_balancing_scheme = "EXTERNAL_MANAGED"
  protocol              = "HTTP"
  # ... other config
}

WAF Interview Scenarios
Interview — “Design WAF rules for a banking API exposed to partners”
Answer:
(1) Layer the defenses: CloudFront → WAF → ALB → API Gateway → EKS.
(2) IP allowlisting: Create a custom rule allowing ONLY partner IP ranges (known CIDRs). Block all other source IPs. This is the first rule (highest priority).
(3) Mutual TLS (mTLS): Require client certificates, so partners must present a valid cert. This is handled at ALB or API Gateway, not WAF.
(4) Rate limiting per partner: Different rate limits per partner IP range or API key. High-value partners get higher limits.
(5) OWASP protection: Even trusted partners can send malformed requests (compromised systems, bugs). Enable SQLi, XSS, LFI rules.
(6) Request validation: Custom WAF rules to enforce required headers (API key, content-type), maximum body size, allowed HTTP methods (POST only for sensitive endpoints).
(7) Logging: All WAF decisions logged to S3 + streamed to SIEM. Alert on blocked requests from partner IPs (may indicate their systems are compromised).
(8) Deployment: Start in COUNT mode for 2 weeks to identify false positives, then switch to BLOCK.
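The allowlist in step (2) could be sketched roughly as follows. This is a minimal illustration, not a production policy: the resource names, metric names, and CIDR ranges (taken from the documentation-reserved ranges) are placeholders, and the remaining managed rules from the earlier example are omitted.

```hcl
# Hypothetical partner allowlist; the CIDRs are documentation-reserved ranges.
resource "aws_wafv2_ip_set" "partners" {
  name               = "partner-allowlist"
  scope              = "CLOUDFRONT" # must match the web ACL scope (create in us-east-1)
  ip_address_version = "IPV4"
  addresses          = ["198.51.100.0/24", "203.0.113.0/24"]
}

resource "aws_wafv2_web_acl" "partner_api" {
  name  = "partner-api-waf"
  scope = "CLOUDFRONT"

  default_action {
    allow {}
  }

  # First rule: block every source IP that is NOT in the partner IP set.
  rule {
    name     = "block-non-partner-ips"
    priority = 0

    action {
      block {}
    }

    statement {
      not_statement {
        statement {
          ip_set_reference_statement {
            arn = aws_wafv2_ip_set.partners.arn
          }
        }
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "BlockNonPartnerIPs"
    }
  }

  # ... OWASP managed rule groups and per-partner rate limits go here

  visibility_config {
    sampled_requests_enabled   = true
    cloudwatch_metrics_enabled = true
    metric_name                = "PartnerApiWAF"
  }
}
```

Blocking non-partners first, instead of allowing partners, is deliberate in this sketch: an allow action ends rule evaluation, so partner traffic would skip the OWASP rules that step (5) calls for.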
Interview — “How do you handle WAF false positives without reducing security?”
Answer:
(1) Identify the false positive: Check WAF logs to see which rule blocked which request. Look at the request URI, headers, body.
(2) Surgical exclusion: Do NOT disable the entire rule. Instead, create a rule exclusion for the specific condition. For example, if SizeRestrictions_BODY blocks file uploads on /api/upload, override ONLY that rule to COUNT mode and add a scope-down statement for the /api/upload path.
(3) Custom rule alternative: If a managed rule is too broad, switch it to COUNT and write a custom rule that covers the same attack pattern but with tighter scope.
(4) Label + custom rule pattern: Set the managed rule to COUNT with a label, then create a custom rule that blocks based on the label AND excludes the false-positive path.
(5) Testing pipeline: Always test WAF changes in a staging environment with production-like traffic before deploying to production.
(6) Continuous tuning: Review WAF COUNT metrics weekly. New application features may trigger new false positives.
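The label + custom rule pattern from step (4) might look like the sketch below. The web ACL name and metric names are made up for illustration, and the label key is an assumption based on the Core Rule Set's naming scheme; confirm it against the managed rule group documentation before relying on it.

```hcl
resource "aws_wafv2_web_acl" "tuned" {
  name  = "enterprise-waf-tuned" # hypothetical name
  scope = "REGIONAL"

  default_action {
    allow {}
  }

  # Managed Core Rule Set: SizeRestrictions_BODY only counts, but still emits its label.
  rule {
    name     = "aws-common-rules"
    priority = 1

    override_action {
      none {}
    }

    statement {
      managed_rule_group_statement {
        name        = "AWSManagedRulesCommonRuleSet"
        vendor_name = "AWS"

        rule_action_override {
          name = "SizeRestrictions_BODY"
          action_to_use {
            count {}
          }
        }
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "AWSCommonRules"
    }
  }

  # Custom rule: block when the label is present AND the path is not /api/upload.
  rule {
    name     = "block-oversized-body-except-upload"
    priority = 2

    action {
      block {}
    }

    statement {
      and_statement {
        statement {
          label_match_statement {
            scope = "LABEL"
            # Assumed label key; verify in the rule group documentation.
            key = "awswaf:managed:aws:core-rule-set:SizeRestrictions_Body"
          }
        }
        statement {
          not_statement {
            statement {
              byte_match_statement {
                search_string         = "/api/upload"
                positional_constraint = "STARTS_WITH"
                field_to_match {
                  uri_path {}
                }
                text_transformation {
                  priority = 0
                  type     = "NONE"
                }
              }
            }
          }
        }
      }
    }

    visibility_config {
      sampled_requests_enabled   = true
      cloudwatch_metrics_enabled = true
      metric_name                = "OversizedBodyExceptUpload"
    }
  }

  visibility_config {
    sampled_requests_enabled   = true
    cloudwatch_metrics_enabled = true
    metric_name                = "EnterpriseWAFTuned"
  }
}
```

The custom rule needs a higher priority number than the managed rule group, because labels are only visible to rules evaluated after the rule that added them.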
DDoS Protection Architecture
DDoS protection requires defense at every layer — not just one product. A volumetric L3/L4 flood is handled differently than a slow L7 application-layer attack. This section covers the multi-layer defense strategy.
Multi-Layer Defense Model
Layer 3/4 (Network/Transport):
  AWS: Shield Standard (free, always on for CloudFront/ALB/NLB/Route 53/EIP)
  GCP: Cloud Armor built-in (Global LB absorbs at Google edge)
        ↓
Layer 7 (Application):
  AWS: Shield Advanced ($3K/mo) + WAF rate-based rules + Bot Control
  GCP: Cloud Armor Adaptive Protection (ML-based anomaly detection)
        ↓
Edge (Content Delivery):
  AWS: CloudFront absorbs volumetric attacks at 450+ edge locations
  GCP: Global LB anycast absorbs at Google's edge (one of the largest networks globally)
        ↓
Application Layer:
  API Gateway throttling (per-client rate limits)
  K8s HPA auto-scaling (scale pods to absorb legitimate traffic spikes)
  Circuit breakers (Envoy/Istio, to prevent cascading failures)
  Connection draining (graceful handling of dropped connections)

AWS Shield Standard vs Advanced
Shield Standard (free, automatic):
- Active on ALL AWS accounts by default — zero configuration
- Protects against L3/L4 volumetric attacks: SYN floods, UDP reflection, DNS amplification
- Covers CloudFront, Route 53, Global Accelerator, ALB, NLB, Elastic IPs
- Cannot see attack metrics or get notifications (happens silently)
Shield Advanced ($3,000/month per organization):
- Everything in Standard, plus:
- 24/7 Shield Response Team (SRT, formerly the DDoS Response Team or DRT): AWS experts assist during active attacks. They can modify WAF rules on your behalf.
- Cost protection — AWS credits your account for scaling costs incurred during a DDoS attack (e.g., Auto Scaling adding instances, data transfer spikes)
- Advanced metrics — real-time attack visibility in CloudWatch: attack vectors, volume, duration
- Proactive engagement — DRT proactively contacts you when they detect an attack targeting your resources (requires Route 53 health checks configured)
- Automatic application-layer mitigations — Shield Advanced can automatically create WAF rules based on observed attack patterns
- Health-based detection — uses Route 53 health check status to detect when an attack is impacting your application (not just traffic volume)
- Group protection — protect up to 1,000 resources under one subscription
When Shield Advanced is worth $3K/month:
- Revenue-generating internet-facing applications (if expected DDoS-related downtime and mitigation costs exceed the $36K/year subscription fee, Shield Advanced pays for itself)
- Regulated industries requiring documented DDoS mitigation (banking: PCI DSS, healthcare: HIPAA)
- Applications with SLA commitments to customers
- When you need cost protection — a major DDoS attack can cause thousands in Auto Scaling and data transfer costs that Shield Advanced refunds
Full DDoS Defense Architecture
Terraform for Shield Advanced:
resource "aws_shield_protection" "cloudfront" {
  name         = "cloudfront-shield"
  resource_arn = aws_cloudfront_distribution.main.arn
}

resource "aws_shield_protection" "alb" {
  name         = "alb-shield"
  resource_arn = aws_lb.main.arn
}

# Shield Advanced requires subscription (done once via console or CLI)
# aws shield create-subscription

# Proactive engagement requires Route53 health check
resource "aws_route53_health_check" "primary" {
  fqdn              = "api.example.com"
  port              = 443
  type              = "HTTPS"
  resource_path     = "/health"
  failure_threshold = 3
  request_interval  = 10

  tags = {
    Name = "primary-api-health"
  }
}

resource "aws_shield_proactive_engagement" "main" {
  enabled = true

  emergency_contact {
    email_address = "security@example.com"
    phone_number  = "+971501234567"
    contact_notes = "24/7 Security Operations Center"
  }
}

GCP Cloud Armor Standard vs Enterprise (Plus)
Standard (included with Cloud Armor pricing):
- L3/L4 DDoS protection on Global External LBs — absorbed at Google’s edge
- Manual security policy rules (allow/deny/rate-limit)
- Geo-blocking, IP allowlisting/denylisting
- Pre-configured OWASP WAF rules
Cloud Armor Enterprise (formerly Managed Protection Plus):
- Everything in Standard, plus:
- Adaptive Protection — ML-based anomaly detection for L7 DDoS. Baselines normal traffic, detects deviations, auto-generates mitigation rules
- DDoS billing protection — credits for excess traffic costs during attacks
- Named IP lists — Google-curated threat intelligence lists (Tor exit nodes, known botnets, anonymizing proxies)
- Threat intelligence — Google’s global threat data feeds for IP reputation
- Advanced rate limiting — per-client, per-path, per-header rate limiting with ban-on-exceed
GCP’s inherent advantage: Google’s global network is one of the largest in the world. The 2023 record-breaking 398 million RPS DDoS attack was mitigated by Cloud Armor at Google’s edge before reaching any customer infrastructure. The Global External LB’s anycast design means attack traffic is distributed across hundreds of edge POPs rather than concentrating at one location.
GCP DDoS Architecture
Terraform for Cloud Armor Enterprise:
resource "google_compute_security_policy" "ddos_protection" {
  name = "ddos-protection-policy"

  # Adaptive Protection: ML-based L7 DDoS detection
  adaptive_protection_config {
    layer_7_ddos_defense_config {
      enable          = true
      rule_visibility = "STANDARD" # or "PREMIUM" for transparent rule details; auto-deploy is configured separately
    }
  }

  # Rule 1: Block known-bad IPs (Google threat intelligence)
  rule {
    action   = "deny(403)"
    priority = 100
    match {
      expr {
        expression = "evaluateThreatIntelligence('iplist-known-malicious-ips')"
      }
    }
    description = "Block known malicious IPs"
  }

  # Rule 2: Block Tor exit nodes
  rule {
    action   = "deny(403)"
    priority = 200
    match {
      expr {
        expression = "evaluateThreatIntelligence('iplist-tor-exit-nodes')"
      }
    }
    description = "Block Tor exit nodes"
  }

  # Rule 3: Aggressive per-IP rate limiting during attacks
  rule {
    action   = "rate_based_ban"
    priority = 300
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    rate_limit_options {
      conform_action = "allow"
      exceed_action  = "deny(429)"
      enforce_on_key = "IP" # without this the key defaults to ALL (one shared counter)
      rate_limit_threshold {
        count        = 500
        interval_sec = 60
      }
      ban_duration_sec = 3600
    }
    description = "Rate limit: 500 req/min, ban 1 hour"
  }

  # Default: allow
  rule {
    action   = "allow"
    priority = 2147483647
    match {
      versioned_expr = "SRC_IPS_V1"
      config {
        src_ip_ranges = ["*"]
      }
    }
    description = "Default allow"
  }
}

DDoS Cost Justification
“When is $3K/month for Shield Advanced worth it?”
| Factor | Without Shield Advanced | With Shield Advanced |
|---|---|---|
| Attack response | Your team manually investigates and mitigates. 4-8 hours of senior engineer time at $200/hr = $800-$1,600 per incident | DRT handles mitigation. Your team monitors. |
| Scaling costs | You pay for Auto Scaling during attack. Major attack can cost $5K-$50K+ in compute/data transfer | AWS credits scaling costs (cost protection) |
| Revenue loss | Application degradation during attack. 1 hour downtime for a $10M/yr app = $1,140/hr | Faster mitigation = less downtime |
| Annual cost | Incident costs are unpredictable, potentially > $36K/yr | Fixed $36K/yr with predictable outcomes |
| Compliance | Must demonstrate DDoS mitigation capability (PCI DSS, SOC 2) | Shield Advanced provides compliance documentation |
Rule of thumb: If your internet-facing application generates >$500K/year in revenue OR operates in a regulated industry, Shield Advanced pays for itself.
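As a back-of-the-envelope check on the revenue row, here is the arithmetic using only the table's own figures (a $10M/yr application and the $3K/month fee). It assumes revenue is spread evenly across the year, which is a simplification:

$$
\begin{aligned}
\text{annual fee} &= 12 \times \$3{,}000 = \$36{,}000 \\
\text{revenue per hour} &= \frac{\$10{,}000{,}000}{8{,}760\ \text{h}} \approx \$1{,}140/\text{h} \\
\text{break-even downtime} &= \frac{\$36{,}000}{\$1{,}140/\text{h}} \approx 32\ \text{h per year}
\end{aligned}
$$

On lost revenue alone, the subscription would need to prevent roughly 32 hours of attack-related downtime per year. The scaling-cost credits and incident labor listed in the table lower that threshold, and the compliance row is not captured by this arithmetic at all.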
DDoS Interview Scenario
Interview — “Your banking app is under DDoS attack. Walk through your response plan.”
Answer:
(1) Detection (0-5 min): CloudWatch alarms fire on Shield Advanced attack metrics (or Adaptive Protection alerts on GCP). PagerDuty pages the on-call engineer. Route 53 health checks may start failing, triggering failover routing.
(2) Triage (5-15 min): Check WAF and Shield dashboards: is this L3/L4 (volumetric) or L7 (application layer)? Check attack source IPs, geographic distribution, attack vectors. If L3/L4, Shield Standard/Cloud Armor is already mitigating, so verify that. If L7, proceed to step 3.
(3) L7 mitigation (15-30 min): Review WAF logs for attack patterns (common URI, user-agent, headers). Add targeted WAF rules: block offending user-agents, rate-limit specific paths under attack, geo-block if the attack originates from specific countries. If you have Shield Advanced, engage the DRT (they can push WAF rules on your behalf).
(4) Scale (parallel): Verify HPA is scaling pods. Verify Karpenter/ASG is adding nodes. Check ALB connection counts. If API Gateway is in the path, verify throttling is in place.
(5) Communication: Notify stakeholders (status page update). If customer-facing, activate the incident response communication plan.
(6) Post-incident (24-48 hrs): Review attack forensics. Update WAF rules permanently if the attack pattern is novel. Request the Shield Advanced cost protection credit. Update the runbook with lessons learned. Conduct a blameless post-mortem.
Interview Scenarios
Scenario 1: “Design a centralized network inspection architecture for all internet egress across 50 workload accounts”
Answer:
I would implement a hub-spoke model with a dedicated inspection VPC in the Network Hub Account:
Architecture:
Key design decisions:
- No IGW in workload VPCs: enforced via an SCP that denies ec2:CreateInternetGateway in workload accounts
- Appliance mode ON on the inspection VPC TGW attachment: ensures symmetric routing for stateful inspection
- Multi-AZ deployment — Network Firewall endpoints in all 3 AZs, NAT GWs in all 3 AZs
- TGW route table segmentation — prod and non-prod have separate route tables. Both route 0.0.0.0/0 to inspection, but they cannot reach each other’s VPCs unless explicitly propagated
- Logging — all firewall alerts go to CloudWatch for real-time monitoring + S3 for long-term compliance. SIEM integration (Splunk/Sentinel) consumes from S3
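Two of these decisions translate fairly directly into Terraform. The sketch below is illustrative only: the variables are hypothetical inputs, and denying `ec2:AttachInternetGateway` alongside `ec2:CreateInternetGateway` is an extra assumption of this example, not something stated above.

```hcl
# Hypothetical inputs for this sketch.
variable "workloads_ou_id" { type = string }
variable "transit_gateway_id" { type = string }
variable "inspection_vpc_id" { type = string }
variable "inspection_tgw_subnet_ids" { type = list(string) }

# SCP: workload accounts cannot create (or attach) internet gateways.
resource "aws_organizations_policy" "deny_igw" {
  name = "deny-internet-gateway"
  type = "SERVICE_CONTROL_POLICY"

  content = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Sid      = "DenyInternetGateways"
      Effect   = "Deny"
      Action   = ["ec2:CreateInternetGateway", "ec2:AttachInternetGateway"]
      Resource = "*"
    }]
  })
}

resource "aws_organizations_policy_attachment" "workloads" {
  policy_id = aws_organizations_policy.deny_igw.id
  target_id = var.workloads_ou_id
}

# Inspection VPC attachment with appliance mode, so both directions of a flow
# are sent to the same AZ and therefore the same firewall endpoint.
resource "aws_ec2_transit_gateway_vpc_attachment" "inspection" {
  transit_gateway_id     = var.transit_gateway_id
  vpc_id                 = var.inspection_vpc_id
  subnet_ids             = var.inspection_tgw_subnet_ids
  appliance_mode_support = "enable"
}
```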
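Scaling considerations: AWS Network Firewall scales automatically with traffic. A transit gateway supports up to 5,000 attachments, so 50 VPCs is well within quota. Per-GB data processing is the main cost driver: TGW charges about $0.02/GB, and Network Firewall and NAT Gateway add their own per-GB processing fees, so estimate monthly egress and budget accordingly.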
Scenario 2: “How do you implement IPS/IDS in the cloud?”
Answer:
IPS/IDS in the cloud replaces the traditional on-prem intrusion detection appliances with cloud-native services:
AWS — Network Firewall with Suricata:
- Deploy Network Firewall in the centralized inspection VPC
- Write rules in Suricata syntax (open-source IDS/IPS engine)
- Start in IDS mode (alert action) for 2-4 weeks to baseline traffic and identify false positives
- Switch to IPS mode (drop action) for blocking after tuning
- Use AWS managed threat intelligence rule groups for immediate coverage (malware domains, C2 servers, exploit signatures)
- Custom rules for bank-specific threats: block DNS-over-HTTPS (enforce internal DNS), detect large outbound transfers (exfiltration), enforce TLS-only egress
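A stateful rule group for the IDS-first phase might be sketched as follows. The rule group name, capacity, `HOME_NET` range, signature IDs, and the two Suricata rules are all illustrative assumptions; a real rule set would come out of the baselining period.

```hcl
resource "aws_networkfirewall_rule_group" "egress_ips" {
  name     = "bank-egress-ips" # hypothetical name
  type     = "STATEFUL"
  capacity = 100

  rule_group {
    # In a centralized inspection VPC, HOME_NET should cover the workload
    # ranges, not just the firewall VPC's own CIDR.
    rule_variables {
      ip_sets {
        key = "HOME_NET"
        ip_set {
          definition = ["10.0.0.0/8"]
        }
      }
    }

    rules_source {
      # IDS mode: both rules use "alert". Changing the action to "drop"
      # after tuning turns them into IPS rules.
      rules_string = <<-EOT
        alert tls $HOME_NET any -> $EXTERNAL_NET 443 (msg:"Possible DoH to public resolver"; flow:to_server,established; tls.sni; content:"dns.google"; nocase; sid:1000001; rev:1;)
        alert http $HOME_NET any -> $EXTERNAL_NET any (msg:"Cleartext HTTP egress"; flow:to_server,established; sid:1000002; rev:1;)
      EOT
    }
  }
}
```

Keeping the action in the rule text rather than in a separate setting means the IDS-to-IPS switch is a reviewable one-word diff per rule.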
GCP — Cloud NGFW Enterprise:
- Deploy firewall endpoints in each zone where inspection is needed
- Create security profiles with threat prevention enabled
- Powered by Palo Alto Networks threat intelligence — auto-updated signatures
- Supports TLS inspection (decrypt → inspect → re-encrypt) with your CA certificate
- Configure severity-based actions: CRITICAL/HIGH = DENY, MEDIUM/LOW = ALERT
Alerting and SIEM integration:
- AWS: Network Firewall alert logs → CloudWatch Logs → CloudWatch Alarm → SNS → PagerDuty. Long-term: S3 → Splunk/Sentinel
- GCP: Firewall logs → Cloud Logging → Log-based metrics → alerting policy → PagerDuty. Export to BigQuery or SIEM via Pub/Sub
Key difference from on-prem: Cloud IPS does not require you to manage hardware, update Suricata/Snort versions, or maintain rule feeds. AWS and GCP managed rule groups are continuously updated. You focus on custom rules specific to your environment and tuning false positives.
Scenario 3: “How do you protect a public-facing API from DDoS and bot traffic?”
Answer:
I would implement a multi-layer defense using edge services:
For the banking API specifically:
- Enable Shield Advanced ($3K/month) — provides DDoS Response Team, cost protection during attacks, and proactive engagement
- WAF bot control in Targeted mode — uses browser fingerprinting and behavioral analysis
- Rate-based rules scoped to sensitive endpoints: login (100 req/5min), password reset (10 req/5min), account creation (5 req/5min)
- Enable WAF logging and pipe to SIEM for SOC visibility
- API key + mutual TLS for B2B API consumers — bots cannot obtain valid certificates
Scenario 4: “Explain the difference between NACLs, Security Groups, and Network Firewall — when to use each”
Answer:
These operate at different layers of the network stack:
| Aspect | NACL | Security Group | Network Firewall |
|---|---|---|---|
| Layer | Subnet boundary | ENI (instance) | VPC (inspection point) |
| Statefulness | Stateless (must allow both directions) | Stateful (responses auto-allowed) | Both (stateless fast-path + stateful deep inspect) |
| Rules | Allow + Deny, numbered order | Allow only, all rules evaluated | Allow + Deny + Alert, Suricata syntax |
| Protocol awareness | IP/port only | IP/port only | Deep packet inspection (payload, TLS SNI, HTTP headers) |
| Scope | Applies to all traffic in subnet | Per-instance, can reference other SGs | All traffic routed through firewall |
| Use case | Emergency blocking, broad isolation | Primary app-level access control | IPS/IDS, domain filtering, compliance |
Layered approach at our bank:
- Network Firewall (inspection VPC): catches threats, enforces domain allowlists, IPS/IDS — this is the perimeter
- NACLs (workload subnets): enforce tier isolation, so data subnets only accept traffic from private subnets. Used as an additional barrier in case of SG misconfiguration
- Security Groups (per resource): primary access control. sg-app allows 8080 from sg-alb, sg-rds allows 5432 from sg-app
A common mistake I see is relying solely on security groups. If a developer accidentally opens a security group to 0.0.0.0/0, the NACL and Network Firewall still protect. Defense in depth means no single control failure exposes the workload.
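The stateful versus stateless difference shows up clearly in Terraform. In this sketch the variables and the `10.0.16.0/20` application subnet range are hypothetical; the point is that the security group needs one rule while the NACL needs a matching pair.

```hcl
# Hypothetical inputs for this sketch.
variable "sg_app_id" { type = string }
variable "sg_rds_id" { type = string }
variable "data_nacl_id" { type = string }

# Security group: stateful and allow-only. One rule, referencing sg-app
# instead of a CIDR; return traffic is allowed automatically.
resource "aws_vpc_security_group_ingress_rule" "rds_from_app" {
  security_group_id            = var.sg_rds_id
  referenced_security_group_id = var.sg_app_id
  ip_protocol                  = "tcp"
  from_port                    = 5432
  to_port                      = 5432
}

# NACL: stateless. The inbound request needs its own rule...
resource "aws_network_acl_rule" "data_in_postgres" {
  network_acl_id = var.data_nacl_id
  rule_number    = 100
  egress         = false
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = "10.0.16.0/20" # hypothetical private (app) subnet range
  from_port      = 5432
  to_port        = 5432
}

# ...and the response needs a separate outbound rule for ephemeral ports.
resource "aws_network_acl_rule" "data_out_ephemeral" {
  network_acl_id = var.data_nacl_id
  rule_number    = 100
  egress         = true
  protocol       = "tcp"
  rule_action    = "allow"
  cidr_block     = "10.0.16.0/20"
  from_port      = 1024
  to_port        = 65535
}
```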
Scenario 5: “Design GCP firewall architecture for an org with 100+ projects using Shared VPC”
Answer:
GCP’s hierarchical firewall model is purpose-built for this:
Key design decisions for 100+ projects:
- Organization policy cannot be overridden from below — project owners cannot change or bypass org-level DENY rules
- GOTO_NEXT — delegates to the next level, allowing folder and VPC policies to add rules without duplicating org rules
- Service account targets (not tags) — prevents privilege escalation where a user adds a permissive tag to their VM
- Cloud NGFW Enterprise on the Shared VPC — IPS inspection without needing a separate inspection VPC (unlike AWS)
- Logging — firewall rule logging enabled on all rules. Logs exported to Cloud Logging → SIEM
- Terraform modules — firewall policies defined as Terraform modules in the infra repo. Changes go through PR review. No one manually edits firewall rules in the console
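The evaluation order behind these decisions can be sketched as a small simulation. It is a simplified model of top-down policy evaluation, not the Cloud NGFW implementation: the `Rule` class, the `evaluate` function, the priorities, and the CIDR ranges are hypothetical, and matching is reduced to source range plus destination port for ingress traffic.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    priority: int    # lower value is evaluated earlier within a policy
    src_cidr: str
    port: int
    action: str      # "allow", "deny", or "goto_next"


def evaluate(policies, src_ip, port):
    """Walk policies from the org level downward; allow and deny are terminal."""
    for level, rules in policies:
        for rule in sorted(rules, key=lambda r: r.priority):
            matches = ip_address(src_ip) in ip_network(rule.src_cidr) and port == rule.port
            if not matches:
                continue
            if rule.action == "goto_next":
                break  # delegate the decision to the next level
            return rule.action, level
        # no terminal match at this level: continue with the next policy
    return "deny", "implied"  # simplified implied deny for ingress


# Hypothetical policies: the org denies SSH from anywhere and delegates
# internal HTTPS; the VPC level tries to allow both.
policies = [
    ("organization", [Rule(100, "0.0.0.0/0", 22, "deny"),
                      Rule(200, "10.0.0.0/8", 443, "goto_next")]),
    ("folder", []),
    ("vpc", [Rule(1000, "0.0.0.0/0", 22, "allow"),
             Rule(1100, "10.0.0.0/8", 443, "allow")]),
]

print(evaluate(policies, "198.51.100.4", 22))  # org-level deny wins over the VPC allow
print(evaluate(policies, "10.1.2.3", 443))     # delegated by goto_next, allowed at VPC level
print(evaluate(policies, "10.1.2.3", 8080))    # nothing matches, implied deny
```

The VPC-level allow for port 22 never takes effect because the org-level deny is reached first, which is the property that lets the central team set guardrails across 100+ projects.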
Scenario 6: “A workload account team says they need direct internet access. How do you handle this?”
Answer:
Short answer: deny the request and provide an alternative.
The whole point of centralized network security is that no workload VPC has direct internet access. Allowing it undermines the architecture and creates a security gap that cannot be monitored or controlled.
Step 1 — Understand the requirement: Ask WHY they need internet access. Common reasons:
- “We need to call a third-party API” → route through centralized NAT + inspection
- “We need to pull container images” → use VPC endpoint for ECR / Private Google Access for Artifact Registry
- “We need to install packages” → use internal package mirror in Shared Services or S3 endpoint
- “We need to receive webhooks” → use centralized ingress via ALB in Network Hub DMZ
- “We need to debug connectivity issues” → provide CloudShell or SSM Session Manager (no SSH, no public IP)
Step 2 — Provide the approved solution:
- Egress: all outbound traffic goes through TGW → inspection VPC → Network Firewall (IPS) → NAT GW. If they need to reach a specific external API, add the domain to the firewall allowlist. This is a change request, not a direct internet grant.
- Ingress: internet-facing load balancers live in the Network Hub DMZ VPC. Workload teams provide their target group; the central team configures the ALB listener rule routing traffic to the workload VPC via TGW.
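A domain allowlist change request can be turned into Suricata-compatible rules mechanically. The sketch below is one possible approach, not the bank's actual tooling: the function name, the starting SID, and the message strings are hypothetical, and the rule options follow the general TLS SNI allowlist pattern (a `pass` rule per domain followed by a catch-all `drop`), which should be validated against the firewall's rule evaluation order before use.

```python
import re

# Conservative hostname check so request input cannot inject rule options.
_DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


def build_tls_allowlist_rules(domains, base_sid=1000001):
    """Return rule strings: one pass rule per unique domain, then a final drop."""
    rules = []
    for offset, domain in enumerate(sorted({d.lower() for d in domains})):
        if not _DOMAIN_RE.match(domain):
            raise ValueError(f"not a valid hostname: {domain!r}")
        rules.append(
            "pass tls $HOME_NET any -> $EXTERNAL_NET any "
            f'(tls.sni; content:"{domain}"; startswith; nocase; endswith; '
            f'msg:"allowlisted egress domain {domain}"; '
            f"flow:to_server, established; sid:{base_sid + offset}; rev:1;)"
        )
    # Catch-all for TLS egress whose SNI matched none of the pass rules.
    rules.append(
        "drop tls $HOME_NET any -> $EXTERNAL_NET any "
        '(msg:"TLS egress to non-allowlisted domain"; '
        f"flow:to_server, established; sid:{base_sid + len(rules)}; rev:1;)"
    )
    return rules


for rule in build_tls_allowlist_rules(["api.example.com", "pay.example.net"]):
    print(rule)
```

Generating rules from a reviewed list keeps the change request auditable: the diff in the pull request is the list of domains, not hand-edited rule text.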
Step 3 — Enforce via policy:
- SCP (AWS): deny `ec2:CreateInternetGateway`, `ec2:AttachInternetGateway`, `ec2:CreateNatGateway` in all workload accounts
- Organization Policy (GCP): `constraints/compute.vmExternalIpAccess` to deny external IPs on VMs. Shared VPC firewall policies at org level deny direct internet egress.
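On the AWS side, the SCP could look roughly like the document built below. This is a minimal sketch: the function name and the statement `Sid` are hypothetical, and a production policy would usually add a condition exempting the central network team's role, which is omitted here.

```python
import json


def build_no_direct_internet_scp():
    """Build an SCP document denying the three EC2 actions named in Step 3."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDirectInternetPaths",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": [
                    "ec2:CreateInternetGateway",
                    "ec2:AttachInternetGateway",
                    "ec2:CreateNatGateway",
                ],
                "Resource": "*",
            }
        ],
    }


# Print the JSON that would be attached to the workload OU.
print(json.dumps(build_no_direct_internet_scp(), indent=2))
```

Attaching the policy at the organizational unit that holds the workload accounts, rather than per account, means new accounts inherit the restriction automatically.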
Step 4 — Document and communicate: Publish an internal wiki page: “How to get internet access for your workload” — explains the centralized egress architecture, how to request domain allowlisting, and SLA for request processing (e.g., domain allowlist changes processed within 4 business hours).
References
- AWS Network Firewall Documentation — managed network firewall with Suricata-compatible IPS/IDS
- AWS WAF Documentation — web application firewall for ALB, CloudFront, and API Gateway
- Cloud Next Generation Firewall Documentation — L7 inspection, threat prevention, and hierarchical firewall policies
- Google Cloud Armor Documentation — DDoS protection and WAF for global and regional load balancers
Tools & Frameworks
- Suricata Rules Documentation — rule syntax reference for the IPS/IDS engine used by AWS Network Firewall