AWS Certified Solutions Architect — Associate · SAA-C03

Design Cost-Optimized Architectures

Domain 4 — Comprehensive Study Guide
Task Statements 4.1 · 4.2 · 4.3 · 4.4 · 4.5

📋 20% of Exam Score · ~13 Questions

Welcome to the Domain 4 study guide — the final domain of the AWS Certified Solutions Architect – Associate exam. At 20%, Domain 4 is the smallest domain but carries the same per-question weight as the others. Combined with the three previous domains, these four domains cover 100% of the SAA-C03 exam. Domain 4 focuses on the AWS Cost Optimization pillar of the Well-Architected Framework: choosing the right pricing models for compute, right-sizing resources, selecting cost-effective storage, optimizing data transfer costs, and using the right managed services to eliminate undifferentiated heavy lifting. The mindset shift: Domain 3 asked "how fast?" — Domain 4 asks "how cheap, without sacrificing what's needed?" Cost optimization is always about trade-offs between cost, performance, resilience, and operational overhead.

Domain 4 Overview

What You Need to Know

Task 4.1 — Cost-Effective Storage

S3 storage class selection for cost
S3 Lifecycle policies — automate tiering
EBS right-sizing and snapshot cleanup
EFS Intelligent-Tiering & IA tier
S3 Intelligent-Tiering for unknown patterns

Task 4.2 — Cost-Effective Compute

EC2 pricing models: On-Demand, Reserved, Spot, Savings Plans
Right-sizing with Compute Optimizer
Serverless cost model: Lambda, Fargate
Auto Scaling to eliminate over-provisioning
Graviton (ARM) instances for better price/performance

Task 4.3 — Cost-Effective Databases

RDS Reserved Instances
Aurora Serverless v2 for variable workloads
DynamoDB On-Demand vs. Provisioned cost
ElastiCache to reduce database read costs
S3 + Athena vs. Redshift cost comparison

Task 4.4 & 4.5 — Network & Managed Services

Data transfer pricing patterns
VPC Endpoints to eliminate NAT Gateway costs
CloudFront to reduce origin data transfer
Managed services vs. self-managed cost model
Cost allocation tags & monitoring tools

Domain 4 questions frequently present an over-architected or under-optimized scenario and ask which change would reduce cost while meeting the stated requirements. The core AWS cost optimization levers are: use the right pricing model (Reserved vs. Spot vs. On-Demand), right-size resources, use serverless where appropriate (pay only for what you use), move infrequently accessed data to cheaper storage tiers, and eliminate unnecessary data transfer charges. Task 4.4 and 4.5 are closely related — network cost optimization and managed services both involve reducing the operational overhead that contributes to total cost of ownership (TCO). A key exam pattern: questions often describe a workload with very specific requirements (e.g., "must be available 24/7," "can tolerate interruption," "traffic pattern is predictable") — match those requirements to the cost-optimal pricing model.

4.1

Cost-Effective Storage Solutions

S3 Storage Classes · Lifecycle Policies · EBS Right-Sizing · EFS IA · Intelligent-Tiering

Task 4.1 — S3 Storage Costs

S3 Storage Class Cost Optimization

Tiering by access frequency · Lifecycle automation · Intelligent-Tiering

S3 Standard (most $$$) → Intelligent-Tiering → Standard-IA → One Zone-IA → Glacier Instant → Glacier Flexible → Deep Archive (least $$$)

Lifecycle Policy Automation

Automatically transition objects to cheaper tiers after N days
Typical pattern: Standard → Standard-IA (30d) → Glacier (90d) → Deep Archive (180d)
Set expiration rules to auto-delete objects past retention period
Abort incomplete multipart uploads after N days — prevents hidden cost accumulation
Apply to entire bucket or filtered by prefix/tag

S3 Intelligent-Tiering

AWS monitors access frequency and moves objects automatically
Frequent Access tier ↔ Infrequent Access tier (30-day inactivity threshold)
Optional: Archive tiers (90 and 180 days of inactivity)
Small monthly monitoring fee per object (~$0.0025/1,000 objects)
Best for: data lakes, user uploads, logs — when access pattern is unknown or variable
No retrieval fees between Frequent/Infrequent tiers

When to Use Each Tier

Standard: Accessed daily/weekly. No minimum duration.
Standard-IA: Monthly backups, DR copies. ≥128 KB objects. 30-day min.
One Zone-IA: Re-creatable data only (thumbnails). 20% cheaper than Standard-IA. Single AZ risk.
Glacier Instant: Medical images, media archives. Accessed ~quarterly. 90-day min.
Glacier Flexible: Backups. 3–5 hour retrieval OK. 90-day min.
Deep Archive: 7–10 year compliance archive. 12–48h retrieval. 180-day min.

Standard-IA and Glacier tiers charge per-GB retrieval fees — if data is accessed frequently, the retrieval fees exceed the storage savings. Always calculate before choosing IA tiers for hot data.
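The "calculate before choosing" advice can be sketched as a small break-even check. This is a minimal sketch, not a pricing tool: the three per-GB prices are illustrative assumptions, and request charges and minimum-duration charges are ignored.

```python
# Illustrative per-GB prices (assumptions for this sketch; check current AWS pricing).
STANDARD_PER_GB_MONTH = 0.023
STANDARD_IA_PER_GB_MONTH = 0.0125
STANDARD_IA_RETRIEVAL_PER_GB = 0.01


def monthly_cost_standard(stored_gb: float) -> float:
    """Storage-only monthly cost in S3 Standard (no retrieval fee)."""
    return stored_gb * STANDARD_PER_GB_MONTH


def monthly_cost_standard_ia(stored_gb: float, retrieved_gb: float) -> float:
    """Monthly cost in Standard-IA: cheaper storage plus a per-GB retrieval fee."""
    return (stored_gb * STANDARD_IA_PER_GB_MONTH
            + retrieved_gb * STANDARD_IA_RETRIEVAL_PER_GB)


def ia_is_cheaper(stored_gb: float, retrieved_gb: float) -> bool:
    """True when Standard-IA beats Standard for this storage/retrieval mix."""
    return monthly_cost_standard_ia(stored_gb, retrieved_gb) < monthly_cost_standard(stored_gb)


if __name__ == "__main__":
    # 1,000 GB stored, read lightly vs. read twice over in a month.
    print(ia_is_cheaper(1000, 100))   # cold data: IA wins
    print(ia_is_cheaper(1000, 2000))  # hot data: retrieval fees outweigh the savings
```

With these assumed prices the break-even point is reading a little more than the full stored volume once per month; beyond that, Standard is the cheaper home for the data.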

"Unknown access pattern" → S3 Intelligent-Tiering. "Log files that must be retained 7 years, rarely accessed" → Lifecycle to Deep Archive. "Application backups stored for 30 days, then deleted" → Lifecycle expiration rule. Standard-IA has 30-day minimum storage charge — don't use for short-lived objects.

S3 cost optimization is about avoiding two common mistakes: storing everything in Standard (overpaying for storage) or using IA tiers for frequently accessed data (paying excessive retrieval fees). Lifecycle policies are the primary automation tool. They're just rules: "after 30 days, transition to Standard-IA; after 90 days, transition to Glacier Flexible Retrieval; after 365 days, expire (delete)." The exam tests whether you configure the right policy for the stated retention and access requirements. Intelligent-Tiering is the safe choice for unknown patterns — it incurs a small monitoring fee but automatically optimizes storage costs without any retrieval fees between the Frequent and Infrequent tiers. For large objects accessed unpredictably, it almost always saves money. Important cost trap: Standard-IA has a minimum storage duration charge of 30 days. If you store a 1 GB object for 5 days and delete it, you're charged for 30 days. Don't use IA tiers for data with short or unpredictable lifespans — Standard is cheaper for frequently cycled objects. Multipart upload cleanup: if a multipart upload is started and not completed, AWS charges for the uploaded parts indefinitely. A lifecycle rule to abort incomplete multipart uploads after 7 days prevents silent cost accumulation.
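The lifecycle rule described above can be sketched as plain data. The dictionary below follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument; the helper name `build_lifecycle_rule`, the rule ID, and the `logs/` prefix are hypothetical, and nothing here calls AWS.

```python
def build_lifecycle_rule(rule_id, prefix, transitions, expire_after_days,
                         abort_multipart_after_days=7):
    """Build one lifecycle rule dict.

    transitions: list of (days, storage_class) tuples, e.g. (30, "STANDARD_IA").
    """
    ordered = sorted(transitions)
    # Expiration must come after the last transition, or the rule makes no sense.
    if ordered and expire_after_days <= ordered[-1][0]:
        raise ValueError("expiration must come after the last transition")
    return {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": d, "StorageClass": sc} for d, sc in ordered],
        "Expiration": {"Days": expire_after_days},
        # Clean up parts left behind by uploads that never completed.
        "AbortIncompleteMultipartUpload": {
            "DaysAfterInitiation": abort_multipart_after_days
        },
    }


# "After 30 days, Standard-IA; after 90 days, Glacier Flexible Retrieval; after 365 days, expire."
config = {
    "Rules": [
        build_lifecycle_rule(
            "logs-tiering", "logs/",
            [(30, "STANDARD_IA"), (90, "GLACIER")],
            expire_after_days=365,
        )
    ]
}
```

Keeping the abort-multipart setting in the same rule means a single policy covers tiering, expiration, and the hidden cost of abandoned uploads.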

Task 4.1 — EBS & EFS Costs

Block & File Storage Cost Optimization

EBS right-sizing · Snapshot cleanup · EFS Intelligent-Tiering

EBS Cost Levers

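Right-size volumes: EBS charges for provisioned size, not used size. Volumes cannot be shrunk in place, so migrate data from oversized volumes to smaller new ones.
gp3 over gp2: gp3 is up to 20% cheaper per GB than gp2 and lets you provision IOPS and throughput independently of volume size, so you no longer oversize a gp2 volume just to get more IOPS.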
st1 / sc1 for sequential workloads: HDD tiers cost far less per GB than SSD (sc1 is roughly 80% cheaper than gp3; st1 saves less) for the right access pattern (throughput-heavy, not IOPS-heavy).
Delete unattached volumes: Terminated EC2 instances may leave orphaned volumes — they still accrue charges. Use AWS Config rule to detect.
Snapshot management: Use Data Lifecycle Manager (DLM) to automate snapshot creation and deletion. Delete stale snapshots from older AMIs.

EFS Cost Optimization

EFS Intelligent-Tiering: Automatically moves files not accessed for 30 days to EFS IA tier (~92% cheaper than Standard)
EFS IA tier: Same durability and availability as Standard; retrieval fee per GB accessed
Throughput mode: Elastic Throughput (pay per GB transferred) vs. Provisioned (flat rate) — use Elastic for variable workloads
EFS One Zone: Stores data in a single AZ — 47% cheaper than Multi-AZ. Use for dev/test or data that can be recreated.

Storage Cost Monitoring

AWS Cost Explorer: Visualize storage costs by service and tag
S3 Storage Lens: Organization-wide S3 usage and activity insights
Trusted Advisor: Flags unattached EBS volumes, underutilized snapshots
Cost Allocation Tags: Tag buckets and volumes by team/project for chargeback

"EC2 instances terminated but costs still high" → check for unattached EBS volumes and old EBS snapshots. "Shared file system costs too high" → enable EFS Intelligent-Tiering or switch to EFS One Zone for dev environments. gp3 is cheaper per GB than gp2 and gives more control over IOPS and throughput, so prefer gp3 for general-purpose SSD volumes.

EBS is a significant cost driver in many AWS environments, often due to three issues:
1. Over-provisioned volumes: teams request large volumes "just in case." The fix is right-sizing: monitor CloudWatch EBS metrics such as VolumeReadOps, VolumeWriteOps, VolumeReadBytes, and VolumeWriteBytes to find volumes with little I/O, and use the CloudWatch agent to see how full each disk actually is (CloudWatch does not report filesystem usage on its own).
2. Orphaned volumes: when an EC2 instance is terminated, its root volume is deleted by default, but additional data volumes are not. Set the DeleteOnTermination flag on data volumes to clean up automatically, or use AWS Config with the ec2-volume-inuse-check rule to find unattached volumes.
3. Snapshot accumulation: snapshots are incremental, but if you take daily snapshots and never delete old ones, you accumulate months or years of incremental changes. DLM automates retention-based deletion.
EFS costs can surprise teams: the Standard tier is priced per GB-month, and large file systems can be expensive. EFS Intelligent-Tiering automatically moves cold files to the IA tier, which is up to 92% cheaper. For development or test environments where HA isn't critical, EFS One Zone is nearly half the price.
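The orphaned-volume check can be sketched as a filter over volume records. The record keys (`VolumeId`, `State`, `Size`, `Attachments`) mirror what boto3's `describe_volumes` returns, but the sample data, the helper names, and the per-GB price are assumptions for illustration; the code does not call AWS.

```python
ASSUMED_PRICE_PER_GB_MONTH = 0.08  # illustrative gp3-style price, not authoritative


def find_unattached_volumes(volumes):
    """Return volumes that exist but are attached to no instance."""
    return [
        v for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


def monthly_waste(volumes, price_per_gb_month=ASSUMED_PRICE_PER_GB_MONTH):
    """Estimated monthly spend on unattached volumes (provisioned GiB x price)."""
    return sum(v["Size"] for v in find_unattached_volumes(volumes)) * price_per_gb_month


# Hypothetical inventory: one attached volume and two orphans.
sample_volumes = [
    {"VolumeId": "vol-aaa", "State": "in-use", "Size": 100,
     "Attachments": [{"InstanceId": "i-123"}]},
    {"VolumeId": "vol-bbb", "State": "available", "Size": 500, "Attachments": []},
    {"VolumeId": "vol-ccc", "State": "available", "Size": 250, "Attachments": []},
]

if __name__ == "__main__":
    orphans = find_unattached_volumes(sample_volumes)
    print([v["VolumeId"] for v in orphans], round(monthly_waste(sample_volumes), 2))
```

Because EBS bills on provisioned size, the estimate uses `Size` rather than any measure of data actually written.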

4.2

Cost-Effective Compute

EC2 Pricing Models · Savings Plans · Spot Instances · Right-Sizing · Serverless · Graviton

Task 4.2 — EC2 Pricing

EC2 Pricing Models

On-Demand · Reserved · Savings Plans · Spot · Dedicated

On-Demand

Pay per second with a 60-second minimum (Linux and Windows; some other operating systems bill per hour). No commitment. Most flexible, highest cost. Use for: short-term, unpredictable workloads; development/testing; new applications being sized. This is the baseline price, with no discount.

Reserved Instances (RI)

1- or 3-year commitment to a specific instance type, OS, and region. Up to 72% discount vs On-Demand. Standard RI: locked to instance type. Convertible RI: can change family/OS/tenancy (up to 66% discount). Payment options: All Upfront (max discount), Partial Upfront, No Upfront.

Savings Plans

Flexible commitment to a $/hour spend level. Compute SP: applies to EC2 (any family/region), Lambda, Fargate — up to 66% discount. EC2 Instance SP: specific instance family + region — up to 72% discount. Recommended over RIs for most use cases due to flexibility.

Spot Instances

Run on spare EC2 capacity at the current Spot price; bidding is no longer required, though you can set an optional maximum price. Up to 90% discount vs On-Demand. AWS can reclaim with 2-minute warning. Use for: fault-tolerant, stateless, flexible start/end time workloads such as batch jobs, data processing, rendering, CI/CD, and containerized tasks.

Dedicated Hosts / Instances

Physical server dedicated to you. Dedicated Host: you control socket/core placement, bring your own license (BYOL) for SQL Server, Oracle, Windows. Most expensive option. Use only when licensing or compliance requires physical isolation.

"Steady-state 24/7 workload, 1-year plan" → Reserved Instance or Savings Plan. "Batch jobs that can be interrupted" → Spot Instances. "Need to bring SQL Server license from on-premises" → Dedicated Host. "Mixed fleet flexibility across families and regions" → Compute Savings Plan over Standard RI.

EC2 pricing model selection is one of the most frequently tested topics in Domain 4. The exam gives you a workload description and asks which pricing model is most cost-effective. Reserved vs. Savings Plans: AWS now recommends Savings Plans over Reserved Instances for most workloads. Savings Plans are more flexible — they apply across instance families, sizes, and regions (Compute SP) or across sizes and OS within a family and region (EC2 Instance SP). The discount rates are similar. The key difference: RIs are specific instance commitments; Savings Plans are spending commitments. Spot Instances: the massive 90% discount comes with a catch — AWS can reclaim capacity with 2 minutes notice. This makes Spot appropriate only for workloads that can handle interruption: batch processing, background workers, containerized tasks, CI/CD pipelines, rendering farms. Not appropriate for web servers handling live user traffic unless combined with On-Demand/Reserved capacity for the baseline. Spot Fleet and ASG Spot: use Spot Fleet with multiple instance types and AZs to maintain capacity despite individual Spot interruptions. Auto Scaling Groups can mix On-Demand and Spot instances. Dedicated Hosts vs. Dedicated Instances: Dedicated Instances run on hardware dedicated to you but you don't have visibility into socket/core layout. Dedicated Hosts give you visibility and control over placement — required for some per-socket or per-core license models like Microsoft SQL Server.
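The matching logic in this section can be sketched as a small decision function. The `Workload` fields and the order of the checks are assumptions chosen to mirror the exam patterns above, not an official AWS decision procedure.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    interruptible: bool                  # can tolerate a 2-minute reclaim notice
    steady_state: bool                   # runs 24/7 at a predictable level
    commitment_years: int                # 0 when no commitment is acceptable
    needs_per_core_byol: bool = False    # per-socket/per-core licensing
    may_change_platform: bool = False    # may switch family, region, or move to containers


def recommend_pricing_model(w: Workload) -> str:
    """Map workload traits to the pricing model the guide associates with them."""
    if w.needs_per_core_byol:
        return "Dedicated Host"
    if w.interruptible:
        return "Spot Instances"
    if w.steady_state and w.commitment_years >= 1:
        if w.may_change_platform:
            return "Compute Savings Plan"
        return "EC2 Instance Savings Plan or Standard RI"
    return "On-Demand"


if __name__ == "__main__":
    batch = Workload(interruptible=True, steady_state=False, commitment_years=0)
    print(recommend_pricing_model(batch))
```

Licensing is checked first because it is a hard constraint; interruption tolerance comes next because Spot offers the deepest discount whenever it is viable.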

Task 4.2 — Commitment Discounts

Savings Plans vs. Reserved Instances

Choosing the right commitment model for long-running workloads

Feature	Compute Savings Plan	EC2 Instance Savings Plan	Standard Reserved Instance
Max Discount	Up to 66%	Up to 72%	Up to 72%
Applies to	EC2 (any family/region), Lambda, Fargate	EC2 specific instance family + region	Specific instance type + OS + region
Flexibility	Highest — change instance family, size, region, OS, tenancy	Change size and OS within family	Locked — cannot change instance type or region
Commitment	$/hour spend level, 1 or 3 years	$/hour spend level, 1 or 3 years	Instance quantity, 1 or 3 years
Marketplace	Not resaleable	Not resaleable	Can sell unused capacity on RI Marketplace
Best for	Mixed workloads, microservices, containers + serverless	Steady EC2 workload, family flexibility needed	Exact instance type known long-term; want max discount

Savings Plans are recommended over RIs for most new workloads due to flexibility. Use Standard RI only when the exact instance type is known for years and you want to use the RI Marketplace. Compute Savings Plan covers Lambda and Fargate — RIs do not.

This comparison comes up frequently. The trend in AWS exam questions is toward Savings Plans being the recommended answer over Reserved Instances, because they're more flexible and, in the case of the Compute Savings Plan, cover Lambda and Fargate in addition to EC2. Compute Savings Plan is the most flexible: you commit to spending X dollars per hour on compute, and the discount applies automatically to EC2 (any instance type, any region), Lambda (duration and provisioned concurrency charges, not request charges), and Fargate (per vCPU and memory). If your architecture shifts from EC2 to containers or serverless over the commitment period, the Savings Plan continues to provide discounts. EC2 Instance Savings Plan is slightly less flexible: you commit to a specific instance family in a specific region, but you can change the size (e.g., m5.xlarge → m5.2xlarge) and OS within that family. Its discount is slightly higher than the Compute SP discount. Standard Reserved Instance is the most restrictive (specific instance type, OS, and region) but can be sold on the Reserved Instance Marketplace if your needs change. This is the only commitment type with resale flexibility. For the exam: "company plans to migrate from EC2 to containers in the next year" → Compute Savings Plan (covers both). "Company has stable EC2 fleet of m5.xlarge instances for 3 years" → EC2 Instance Savings Plan or Standard RI for maximum discount.
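The idea that a Savings Plan is a spending commitment rather than an instance commitment can be illustrated with a simplified billing model. This is a sketch under stated assumptions: a single flat discount rate, usage expressed in On-Demand dollars per hour, and function names invented for the example. Real Savings Plan rates vary by instance type, region, and term.

```python
def hourly_bill(on_demand_usage: float, commitment: float, discount: float) -> float:
    """Bill for one hour under a simplified Savings Plan model.

    on_demand_usage: what the hour's usage would cost at On-Demand rates.
    commitment:      committed spend per hour (paid whether used or not).
    discount:        flat discount rate, e.g. 0.5 for 50% (an assumption).
    """
    # The commitment buys usage at discounted rates, so it covers more
    # On-Demand-equivalent usage than its face value.
    covered = commitment / (1.0 - discount)
    overflow = max(0.0, on_demand_usage - covered)  # billed at On-Demand rates
    return commitment + overflow


def total_bill(hourly_usage, commitment: float, discount: float) -> float:
    """Sum the hourly bills over a series of hours."""
    return sum(hourly_bill(u, commitment, discount) for u in hourly_usage)


if __name__ == "__main__":
    # Committing $5/hour at a 50% discount covers $10/hour of On-Demand usage.
    print(hourly_bill(10.0, 5.0, 0.5))  # fully covered
    print(hourly_bill(14.0, 5.0, 0.5))  # $4 overflow billed On-Demand
    print(hourly_bill(4.0, 5.0, 0.5))   # under-used commitment is still paid
```

The third case is the reason to right-size before committing: any hour where usage falls below the covered amount wastes part of the commitment.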

Task 4.2 — Spot Instances

Spot Instances — Maximum Savings for Flexible Workloads

Up to 90% discount · Interruption handling · Spot Fleet strategies

Spot — Good Fit ✅

Batch data processing (ETL, analytics, log processing)
Image / video rendering and transcoding
CI/CD build and test pipelines
Containerized microservices (stateless)
HPC / scientific simulation workloads
Machine learning model training
Background workers behind an SQS queue
Dev / test environments (acceptable downtime)

Spot — Poor Fit ❌

Stateful databases (RDS, production databases)
Production web servers handling live user traffic alone
Applications with no checkpointing or restart capability
Long-running jobs that cannot be safely interrupted mid-way
Any workload with an SLA requiring high availability

Spot Fleet & Interruption Handling

Spot Fleet: Request Spot across multiple instance types + AZs simultaneously — reduces interruption risk
2-minute warning: CloudWatch event + instance metadata → gracefully checkpoint and drain
Spot + On-Demand mix: ASG with base capacity On-Demand/Reserved, overflow on Spot
Capacity Rebalancing: Proactively replace Spot instances at elevated interruption risk

"Batch jobs that can be interrupted" → Spot. "Workers behind SQS processing images" → Spot ASG on queue depth. The pattern for resilient Spot is: diversify across instance types + AZs, handle the 2-minute warning, and use SQS/checkpointing so interrupted work can resume. Never use Spot as the only compute for user-facing production services.

Spot instances are the most powerful cost optimization tool in AWS, but require workload architecture to be interrupt-tolerant. The 2-minute interruption notice: AWS sends a CloudWatch event and updates the EC2 instance metadata when reclaiming a Spot instance. Well-designed workloads listen for this event and use the 2 minutes to checkpoint state, drain connections, or complete the current unit of work. SQS + Spot is the canonical resilient Spot pattern: workers pull jobs from an SQS queue. If a Spot instance is interrupted, the SQS message becomes visible again after the visibility timeout and another worker picks it up. No work is lost. Spot Fleet diversification: the more instance types you allow (e.g., m5.large, m4.large, m5a.large, c5.xlarge), the lower the probability of a simultaneous mass interruption affecting your fleet. AWS recommends at least 5–10 instance type options in a Spot Fleet. Capacity Rebalancing: when enabled on an ASG, AWS proactively launches a replacement Spot instance when an existing one is at elevated interruption risk, before the 2-minute warning. This reduces the window of reduced capacity.
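The SQS + Spot pattern can be sketched with an in-memory queue standing in for SQS. Everything here is a simulation: `run_worker`, `interrupt_after`, and the job names are hypothetical, and the `interrupted` callable stands in for whatever a real worker would use to detect the 2-minute notice (for example, polling the instance metadata `spot/instance-action` path).

```python
from collections import deque


def run_worker(queue, process, interrupted):
    """Process jobs until the queue is empty or an interruption notice is seen.

    A job is removed only after it is processed, mimicking an SQS consumer
    that deletes a message only after the work succeeds.
    """
    done = []
    while queue:
        if interrupted():
            break                # stop taking new work; remaining jobs stay queued
        job = queue[0]           # "receive" without deleting
        process(job)
        queue.popleft()          # "delete" only after success
        done.append(job)
    return done


def interrupt_after(n_checks):
    """Return a checker that reports an interruption after n_checks clean polls."""
    state = {"checks": 0}

    def check():
        state["checks"] += 1
        return state["checks"] > n_checks

    return check


if __name__ == "__main__":
    jobs = deque(["a", "b", "c", "d"])
    first = run_worker(jobs, process=lambda j: None, interrupted=interrupt_after(2))
    second = run_worker(jobs, process=lambda j: None, interrupted=lambda: False)
    print(first, second)  # the replacement worker picks up where the first stopped
```

Checking for the notice between jobs, rather than mid-job, keeps each unit of work atomic; with real SQS the visibility timeout plays the role of the untouched queue entries.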

Task 4.2 — Right-Sizing & Architecture

Right-Sizing, Graviton & Serverless Cost Models

Compute Optimizer · ARM instances · Lambda pay-per-use

AWS Compute Optimizer

Analyzes CloudWatch metrics to identify over-provisioned resources
Recommends optimal EC2 instance type, EBS volume type, Lambda memory, ECS task sizing, and ASG configuration
Free tier: account-level. Enhanced mode: cross-account, 3-month lookback, Graviton recommendations
Also available via Trusted Advisor (low-utilization EC2)
Act on recommendations to reduce cost without impacting performance

AWS Graviton (ARM) Instances

AWS-designed ARM64 processors: Graviton2 (M6g, C6g, R6g), Graviton3 (M7g, C7g, R7g)
Up to 40% better price/performance vs equivalent Intel/AMD x86 instances
Works with: Linux workloads, containerized apps, JVM, Python, Node, Go
Not compatible with: Windows, some legacy binaries requiring x86
Supported for Lambda (arm64 architecture — 20% cheaper than x86)

Serverless Cost Model

Lambda: Pay per invocation ($0.20/1M) + duration ($0.0000166667/GB-sec). Zero cost when idle. Scales to zero automatically.
Fargate: Pay per vCPU-hour + GB-hour per task. No idle cluster cost.
vs. EC2: Serverless eliminates cost of idle compute. Break-even depends on utilization — EC2 RI is cheaper at sustained high utilization; serverless is cheaper at low/variable utilization.

"Identify underutilized EC2 instances" → AWS Compute Optimizer or Trusted Advisor. "20–40% cost reduction without code changes" → migrate to Graviton (ARM) instances. "Workload runs only occasionally, not 24/7" → Lambda or Fargate eliminates idle compute cost vs. always-on EC2.

Right-sizing is often the lowest-effort, highest-impact cost optimization, because many organizations simply use instances that are too large. AWS Compute Optimizer uses machine learning on 14 days of CloudWatch metrics (or 93 days with Enhanced mode) to identify resources that are overprovisioned and recommend smaller or more efficient alternatives. It's free to enable and often surfaces immediate savings. Graviton instances: AWS's own ARM-based processors deliver significantly better price/performance because AWS can optimize the chip design for cloud workloads. The m6g is roughly 20% cheaper than the m5 at On-Demand list price and offers up to 40% better price/performance. For containerized Linux workloads, Graviton should be the default choice. The main blocker is x86-only legacy code or Windows; most modern Linux workloads work fine on ARM64. Lambda at arm64 architecture: 20% cheaper per GB-second than x86, and often faster. Just change the architecture setting in Lambda configuration; most runtimes need no code changes. Serverless break-even: Lambda is not always cheaper than EC2. For a function that runs millions of times per day for 100ms each, the math might favor a small EC2 RI. But for functions that run unpredictably or infrequently, Lambda's zero-idle cost wins.
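The serverless break-even can be worked through with the Lambda rates quoted earlier ($0.20 per million requests, $0.0000166667 per GB-second). This is a rough sketch: it ignores the free tier, and the EC2 monthly figure passed in is whatever the reader assumes for a comparable always-on instance.

```python
LAMBDA_PER_MILLION_REQUESTS = 0.20
LAMBDA_PER_GB_SECOND = 0.0000166667


def lambda_monthly_cost(invocations: int, avg_duration_ms: float, memory_mb: int) -> float:
    """Request charge plus duration charge (free tier ignored)."""
    request_cost = invocations / 1_000_000 * LAMBDA_PER_MILLION_REQUESTS
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * LAMBDA_PER_GB_SECOND


def cheaper_option(invocations: int, avg_duration_ms: float, memory_mb: int,
                   ec2_monthly_cost: float) -> str:
    """Compare Lambda against an assumed always-on EC2 monthly cost."""
    cost = lambda_monthly_cost(invocations, avg_duration_ms, memory_mb)
    return "Lambda" if cost < ec2_monthly_cost else "EC2"


if __name__ == "__main__":
    # Occasional workload: 1M invocations/month, 100 ms, 128 MB.
    print(round(lambda_monthly_cost(1_000_000, 100, 128), 4))
    # Busy workload: 3M invocations/day for 30 days, 100 ms, 512 MB,
    # compared against a hypothetical $30/month instance.
    print(cheaper_option(90_000_000, 100, 512, ec2_monthly_cost=30.0))
```

The comparison only covers compute charges; the operational cost of patching and scaling the EC2 instance is outside this arithmetic.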

4.3

Cost-Effective Database Solutions

RDS Reserved · Aurora Serverless · DynamoDB Cost Modes · Caching to Reduce DB Spend

Task 4.3 — Database Costs

Database Cost Optimization

RDS Reserved · Aurora Serverless · DynamoDB Capacity · ElastiCache ROI

RDS Cost Levers

RDS Reserved Instances: 1- or 3-year commitment on RDS instance class. Up to 69% discount vs. On-Demand. Apply to Multi-AZ deployments too.
Right-size instance class: Use Performance Insights + CloudWatch to find underutilized RDS instances. CPU <40% and low connections → downsize.
Stop dev/test instances: RDS can be stopped for up to 7 days at a time, after which it restarts automatically. Stopping saves compute cost while storage is still billed.
Aurora vs. RDS MySQL: Aurora storage is auto-scaled and billed per GB used; RDS requires pre-allocated storage. For large DBs with variable growth, Aurora storage billing can be more cost-effective.
Single-AZ for dev/test: Multi-AZ doubles instance cost. Use Single-AZ for non-production.

Aurora Serverless v2

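Scales in fine-grained increments between a configured minimum and maximum capacity (0.5 to 128 Aurora Capacity Units, or ACUs, in the original release)
Billed per ACU-second for the capacity currently allocated, so cost follows load instead of a fixed instance size
Best for: development, variable workloads, intermittent traffic (SaaS multi-tenant, seasonal apps)
Can auto-pause to 0 ACUs when idle on configurations that support it (dev/test), giving zero compute cost while paused
Production clusters usually keep a nonzero minimum ACU setting, so compute cost never drops fully to zero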

DynamoDB: On-Demand vs. Provisioned

On-Demand: Pay per request. No capacity planning. Higher per-RCU/WCU cost but zero idle cost. Best for unpredictable traffic.
Provisioned + Auto Scaling: Set min/max RCUs/WCUs; auto scaling adjusts. Cheaper at sustained, predictable load. Best for consistent traffic patterns.
Reserved Capacity: Commit to provisioned throughput for a 1- or 3-year term. Up to 76% discount. Use for stable read/write-heavy tables.

"RDS running 24/7 for production, stable load" → RDS Reserved Instance. "Aurora database for a new SaaS app with unknown initial traffic" → Aurora Serverless v2. "DynamoDB table serving consistent high traffic" → Provisioned capacity + Reserved Capacity for max savings. Add ElastiCache in front of any read-heavy RDS/Aurora to dramatically reduce DB instance sizing.

Database cost optimization follows the same commitment logic as EC2: predictable, steady-state workloads benefit from reserved pricing; variable workloads benefit from on-demand or serverless billing. RDS Reserved Instances: an RDS RI is purchased for a specific DB engine, DB instance class (for example db.m5.xlarge), deployment type, and region, so it only discounts instances that match those attributes. Multi-AZ deployments are supported: reserving a Multi-AZ deployment provides the RI discount for both the primary and standby instances. Aurora Serverless v2 auto-pause: when enabled for dev/test, the cluster pauses after a period of inactivity and incurs zero compute charges. When a connection is made, it resumes after a short delay (on the order of seconds, not milliseconds), so it suits databases that can tolerate a slow first connection. This is a significant cost reduction for development databases that aren't used overnight or on weekends. ElastiCache as a cost optimization tool: adding a Redis cache in front of a heavily-read RDS database can reduce the required RDS instance size. If 80% of reads can be served from cache, the RDS instance handles 80% less read load, and you may be able to downsize from db.r5.2xlarge to db.r5.xlarge, saving potentially hundreds of dollars per month per instance. The cache cost is typically much less than the cost difference between instance sizes.
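The ElastiCache argument is simple arithmetic, sketched below. The function names and the dollar figures in the usage example are hypothetical; the point is only that the cache pays for itself when the smaller database plus the cache costs less than the current database.

```python
def db_reads_after_cache(total_reads_per_sec: float, cache_hit_ratio: float) -> float:
    """Reads per second that still reach the database after caching."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_reads_per_sec * (1.0 - cache_hit_ratio)


def caching_saves_money(current_db_cost: float, smaller_db_cost: float,
                        cache_cost: float) -> bool:
    """True when a downsized database plus a cache is cheaper than today's database."""
    return smaller_db_cost + cache_cost < current_db_cost


if __name__ == "__main__":
    # 10,000 reads/sec with an 80% hit ratio leaves about 2,000 for the database.
    print(round(db_reads_after_cache(10_000, 0.8)))
    # Hypothetical monthly costs: current DB, downsized DB, cache node.
    print(caching_saves_money(1000.0, 500.0, 200.0))
```

Writes are not reduced by a read cache, so the downsizing headroom depends on how read-heavy the workload really is.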

4.4 · 4.5

Network Cost Optimization & Managed Services

Data Transfer Pricing · VPC Endpoints · CloudFront · Managed Service TCO

Task 4.4 — Network Costs

Data Transfer Cost Patterns

What's free · What costs money · How to reduce it

Free Data Transfer $0

Inbound to AWS from the internet (ingress is always free)
Within the same AZ using private IPs
S3 → CloudFront (origin fetch from S3 is free)
EC2 ↔ S3 in same region (via internet endpoint or Gateway Endpoint)
Between services in the same region using Gateway Endpoints (S3, DynamoDB)
Direct Connect data-in from on-premises to AWS

Charged Data Transfer $$

Outbound from AWS to the internet (egress) — charged per GB
Cross-AZ traffic within the same region — charged both ways
Cross-region data transfer — charged per GB
NAT Gateway processing — charged per GB processed + hourly
VPC Peering cross-region — charged per GB
Direct Connect data-out from AWS to on-premises — charged per GB

VPC Endpoint Cost Savings

Gateway Endpoints (S3, DynamoDB): Free. Eliminates NAT Gateway processing charges for traffic to S3/DynamoDB from private subnets — potentially hundreds of dollars/month saved.
Interface Endpoints: Hourly charge per AZ + per GB. Cheaper than routing through NAT Gateway for high-volume service traffic.

CloudFront to Reduce Egress

CloudFront → internet egress is priced lower than direct EC2/S3 → internet
S3 → CloudFront origin fetch is free — only egress from CloudFront to users is charged
Cache hit ratio: every cache hit eliminates both origin compute and egress cost
Price Classes: restrict CloudFront to cheaper edge regions (e.g., North America only) if users are concentrated geographically

"EC2 in private subnet accessing S3 — high NAT Gateway costs" → Add S3 Gateway VPC Endpoint (free) to bypass NAT Gateway. "High internet egress costs for static assets" → Put CloudFront in front — cheaper egress rates + cache hits eliminate repeat egress. Cross-AZ traffic costs money — consolidate into fewer AZs only if HA requirements allow.

Data transfer costs catch many teams off guard. The most expensive surprise is cross-AZ traffic — when a web server in us-east-1a talks to a database in us-east-1b, both the request and response are charged. This can add up to thousands of dollars per month for high-traffic applications. The solution is to use the same AZ for tightly coupled instances, or deploy multiple copies within each AZ. NAT Gateway is a very common cost driver. When private subnet resources access S3 or DynamoDB via NAT Gateway, you pay NAT Gateway processing fees ($0.045/GB). Adding a Gateway VPC Endpoint is free and routes traffic directly to S3/DynamoDB without going through NAT Gateway. This is often worth hundreds of dollars per month in large environments. CloudFront pricing: the per-GB egress rate from CloudFront is lower than the standard EC2 or S3 egress rate. Additionally, S3 to CloudFront origin fetches are free — you only pay for CloudFront distributing to end users. High cache hit ratios mean most content is served from edge without touching the origin at all. CloudFront Price Classes let you restrict distribution to specific geographic regions. If all your users are in North America, there's no reason to pay for edge locations in Asia-Pacific or Australia.
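The NAT Gateway and cross-AZ charges can be estimated with a few lines of arithmetic. The NAT processing rate is the $0.045/GB figure from the text; the cross-AZ rate of $0.01/GB in each direction is an assumed illustrative value, and NAT Gateway hourly charges are left out.

```python
NAT_PROCESSING_PER_GB = 0.045     # per-GB processing fee quoted in the text
CROSS_AZ_PER_GB_EACH_WAY = 0.01   # assumed illustrative rate, charged on both sides


def nat_processing_cost(gb: float) -> float:
    """Per-GB NAT Gateway processing cost (hourly charge not included)."""
    return gb * NAT_PROCESSING_PER_GB


def gateway_endpoint_savings(s3_gb_via_nat: float) -> float:
    """Savings from moving S3/DynamoDB traffic to a Gateway Endpoint.

    Gateway Endpoints carry no charge, so the saving equals the NAT
    processing fee that traffic would otherwise incur.
    """
    return nat_processing_cost(s3_gb_via_nat)


def cross_az_cost(gb_transferred: float) -> float:
    """Cross-AZ transfer cost, billed on both the sending and receiving side."""
    return gb_transferred * CROSS_AZ_PER_GB_EACH_WAY * 2


if __name__ == "__main__":
    print(round(gateway_endpoint_savings(10_000), 2))  # 10 TB/month of S3 traffic
    print(round(cross_az_cost(50_000), 2))             # 50 TB/month between AZs
```

Both numbers scale linearly with traffic, which is why these charges stay invisible in small environments and become prominent in large ones.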

Task 4.5 — Managed Services

Managed Services vs. Self-Managed TCO

Reducing operational overhead as a cost optimization strategy

Amazon RDS vs. EC2 + MySQL: RDS manages patching, backups, Multi-AZ failover, and parameter tuning. Self-managed MySQL on EC2 requires DBAs for these tasks. Even at higher sticker cost, RDS often has lower TCO when engineering time is valued.

AWS Lambda vs. EC2 workers: Lambda eliminates idle compute cost and all server management. No OS patching, capacity planning, or scaling configuration. Engineers focus on business logic. Best for variable, event-driven workloads.

Amazon ECS/Fargate vs. self-managed Kubernetes: Fargate eliminates EC2 cluster management for containers. Amazon EKS runs the Kubernetes control plane for you, and managed node groups reduce worker-node maintenance. Fargate has a higher per-vCPU cost but zero cluster management cost, so compare it against the team time needed to manage clusters.

Amazon OpenSearch vs. self-managed Elasticsearch: OpenSearch Service handles cluster provisioning, patching, snapshots, and scaling. Self-managed Elasticsearch requires dedicated DevOps effort. Managed service pricing is often justified by the elimination of operational overhead.

AWS Cost Management Tools

AWS Cost Explorer: Visualize spend over time; forecast future costs; identify top cost drivers by service, region, tag
AWS Budgets: Set cost/usage/RI/Savings Plan thresholds; SNS alerts when exceeded
Cost Allocation Tags: Tag resources by team, project, environment; enable in billing console for per-tag cost breakdown
AWS Trusted Advisor: Cost optimization checks such as low-utilization EC2 instances, underutilized EBS volumes, idle load balancers, and Reserved Instance optimization
Compute Optimizer: Right-sizing recommendations for EC2, Lambda, ECS, EBS
Billing Alarms: CloudWatch alarm on EstimatedCharges metric

Cost Optimization Mindset

Adopt Cloud Financial Management (CFM) practices
Treat cost as a non-functional requirement
Right-size before reserving capacity
Use spot/serverless before reserving
Monitor continuously — cost drifts over time

"Reduce operational overhead" → managed service answer (RDS over EC2+MySQL, Fargate over self-managed Kubernetes, Lambda over EC2 workers). "Alert when monthly spend exceeds $1,000" → AWS Budgets + SNS. "Identify which team is spending the most" → Cost Allocation Tags + Cost Explorer.

The managed services vs. self-managed comparison often appears as a scenario where the question asks which architecture reduces operational overhead, or which is more cost-effective when factoring in total cost of ownership. The exam doesn't always make you calculate TCO directly; it tests whether you recognize that managed services trade a higher sticker price for lower operational labor cost. When a question says "the team wants to focus on development, not infrastructure management," the answer is always the managed service.
AWS Cost Management tools: these appear in both Domain 4 questions and in general architecture questions. Know which tool to recommend for which purpose:
- Need to visualize and analyze past spending → Cost Explorer
- Need to be alerted when spending exceeds a threshold → AWS Budgets
- Need to assign costs to different business units → Cost Allocation Tags
- Need automated recommendations for right-sizing → Compute Optimizer or Trusted Advisor
Cost Allocation Tags require activation in the Billing and Cost Management console before they appear in Cost Explorer reports. Tags must be applied to resources AND activated in billing.
AWS Organizations + consolidated billing: all accounts in an org contribute to volume discounts and Reserved Instance/Savings Plan sharing. A Reserved Instance purchased in the management account can be applied to workloads in any member account.
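What Cost Allocation Tags make possible can be sketched as a group-by over billing line items. The record layout (`cost`, `tags`) and the sample data are hypothetical and are not the Cost Explorer or Cost and Usage Report format; the sketch only shows the per-tag breakdown and a budget-style threshold check.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key, untagged_label="(untagged)"):
    """Sum cost per value of one tag key; untagged items get their own bucket."""
    totals = defaultdict(float)
    for item in line_items:
        tag_value = item.get("tags", {}).get(tag_key, untagged_label)
        totals[tag_value] += item["cost"]
    return dict(totals)


def over_budget(totals, budget):
    """Return the tag values whose total exceeds the budget, sorted by name."""
    return sorted(name for name, total in totals.items() if total > budget)


# Hypothetical line items for one month.
sample_items = [
    {"service": "EC2", "cost": 120.0, "tags": {"team": "search"}},
    {"service": "S3", "cost": 80.0, "tags": {"team": "search"}},
    {"service": "RDS", "cost": 45.5, "tags": {"team": "billing"}},
    {"service": "EBS", "cost": 10.0, "tags": {}},
]

if __name__ == "__main__":
    totals = cost_by_tag(sample_items, "team")
    print(totals, over_budget(totals, budget=100.0))
```

The untagged bucket is worth watching: spend that lands there cannot be charged back to any team, which is the practical reason to enforce tagging.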

Domain 4 — Decision Guide

Cost Optimization Scenario Decision Tree

Map the requirement to the cost-optimal solution

Steady-state EC2 workload running 24/7, 1–3 year horizon → Savings Plan or Reserved Instance
Batch / background jobs that can be interrupted → Spot Instances (up to 90% savings)
Event-driven, infrequent, or variable compute workload → AWS Lambda (pay only for invocations)
EC2 instances appear oversized / CPU consistently low → AWS Compute Optimizer → right-size to smaller type
Linux workload, want 40% better price/performance → Migrate to Graviton (ARM) instance family
S3 data aging, rarely accessed after 30 days → S3 Lifecycle policy to Standard-IA → Glacier
Unknown S3 access patterns, want automatic tiering → S3 Intelligent-Tiering
Private subnet EC2 frequently accesses S3 via NAT Gateway → S3 Gateway VPC Endpoint (free; eliminates NAT processing)
High internet egress costs for static web content → CloudFront CDN (lower egress rate + cache hits)
Alert team when monthly AWS spend exceeds budget → AWS Budgets + SNS notification

This decision tree covers the most frequently tested cost optimization scenarios. Each row maps a problem description to the exam answer. Additional scenarios worth knowing:

- "Identify which EC2 instances are underutilized" → Trusted Advisor or Compute Optimizer.
- "Break down AWS costs by development team" → Enable Cost Allocation Tags, apply to resources, view in Cost Explorer.
- "Database running 24/7, stable load, relational" → RDS Reserved Instance.
- "Development database only used 9am–5pm weekdays" → Schedule RDS stop/start or use Aurora Serverless with auto-pause.
- "Large Redshift cluster idle on weekends" → Pause the Redshift cluster on a schedule (Serverless auto-suspends; a provisioned cluster can be paused manually or via scheduled actions).
- "Reduce ECS Fargate cost for a containerized workload" → Use Fargate Spot for non-production; use an ARM64 (Graviton) task definition for production.
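The compute rows of the decision tree can be sketched as a small lookup function. This is an illustrative study aid only: the `Workload` fields and the order of the checks are assumptions made for the sketch, and a real exam question may weigh additional requirements.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload traits taken from the decision-tree rows."""
    steady_24x7: bool = False    # runs continuously over a 1- to 3-year horizon
    interruptible: bool = False  # batch/background work that tolerates interruption
    event_driven: bool = False   # infrequent or variable invocations


def recommend_compute_pricing(w: Workload) -> str:
    # Interruptible work gets the deepest discount, so check it first.
    if w.interruptible:
        return "Spot Instances"
    if w.event_driven:
        return "AWS Lambda"
    if w.steady_24x7:
        return "Savings Plan or Reserved Instance"
    # No commitment and no interruption tolerance: stay flexible.
    return "On-Demand"


print(recommend_compute_pricing(Workload(steady_24x7=True)))
print(recommend_compute_pricing(Workload(interruptible=True)))
```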

Quick Review

Exam Checklist — Domain 4

Can you answer these?

Task 4.1 — Storage Costs

S3 storage class selection by access frequency and retrieval SLA
S3 Lifecycle policy: transition and expiration rule configuration
S3 Intelligent-Tiering for unknown or variable access patterns
Standard-IA 30-day minimum charge — don't use for short-lived objects
EBS: gp3 over gp2; delete unattached volumes; DLM for snapshots
EFS Intelligent-Tiering and One Zone for cost reduction
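As a hedged illustration of a lifecycle rule with transition and expiration actions, the dictionary below follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` parameter. The rule ID, prefix, and day counts are hypothetical, and the helper `early_ia_transitions` is not part of any AWS SDK; it only flags rules that would move objects into an IA class before day 30.

```python
IA_CLASSES = {"STANDARD_IA", "ONEZONE_IA"}

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "age-out-logs",           # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},  # hypothetical key prefix
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},    # delete objects after one year
        }
    ]
}


def early_ia_transitions(config: dict) -> list[str]:
    """Return IDs of rules that move objects to an IA class before day 30."""
    flagged = []
    for rule in config["Rules"]:
        for transition in rule.get("Transitions", []):
            too_early = transition.get("Days", 0) < 30
            if transition["StorageClass"] in IA_CLASSES and too_early:
                flagged.append(rule["ID"])
                break
    return flagged


print(early_ia_transitions(lifecycle_configuration))  # prints []
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `LifecycleConfiguration=` alongside a `Bucket=` name; that call is left out here so the sketch runs offline.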

Task 4.2 — Compute Costs

On-Demand (flexible) → RI / Savings Plan (committed) → Spot (interruptible)
Compute Savings Plan covers EC2 + Lambda + Fargate; RIs do not
Spot: up to 90% savings; 2-min warning; SQS for resilient worker pattern
Graviton ARM instances: up to 40% better price/perf on Linux workloads
Compute Optimizer for right-sizing EC2, Lambda memory, EBS volume type
Lambda: zero idle cost; scales to zero automatically

Task 4.3 — Database Costs

RDS Reserved Instances for stable, always-on production databases
Aurora Serverless v2 for variable or intermittent workloads
DynamoDB Provisioned + Reserved Capacity for predictable high-traffic tables
ElastiCache in front of RDS to reduce instance sizing requirements
Stop dev/test RDS instances outside business hours to save compute

Tasks 4.4 & 4.5 — Network & Managed Services

Cross-AZ traffic costs money; same-AZ traffic over private IPs is free
S3 Gateway VPC Endpoint is free — eliminates NAT Gateway processing cost
CloudFront reduces egress costs; S3 → CloudFront origin fetch is free
AWS Budgets for spend alerts; Cost Explorer for analysis; Compute Optimizer for right-sizing
Cost Allocation Tags must be activated in billing console before appearing in Cost Explorer
Managed services (RDS, Lambda, Fargate) reduce operational overhead = lower TCO

This checklist represents the highest-probability topics for Domain 4. For a 20% domain, Domain 4 is very concept-dense, and many small details can be tested. Most commonly missed items:

1. Standard-IA minimum 30-day charge: exam questions test whether you know not to put short-lived objects in IA tiers.
2. S3 Gateway VPC Endpoint is FREE: this is often the answer to "reduce data transfer costs from private subnet to S3."
3. Compute Savings Plans cover Lambda and Fargate; Reserved Instances do not. This distinction appears in questions about organizations adopting serverless.
4. Cost Allocation Tags need activation: simply applying tags to resources is not enough; they must be activated in the Billing and Cost Management console.
5. Cross-AZ traffic is charged: this is a frequently tested "gotcha" that teams discover in real bills.
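The first missed item is easier to remember with numbers. The sketch below applies a minimum billable duration to a short-lived object; the per-GB-month prices are illustrative assumptions (check current pricing for your Region), and retrieval fees and minimum object-size charges are deliberately ignored.

```python
def billed_storage_cost(size_gb: float, days_stored: int,
                        price_per_gb_month: float, min_days: int = 0) -> float:
    """Storage charge for one object, applying a minimum billable duration."""
    billable_days = max(days_stored, min_days)
    return size_gb * price_per_gb_month * billable_days / 30


# Assumed per-GB-month prices, for illustration only.
STANDARD = 0.023
STANDARD_IA = 0.0125

size_gb, days = 100, 10  # object deleted after 10 days
standard = billed_storage_cost(size_gb, days, STANDARD)
ia = billed_storage_cost(size_gb, days, STANDARD_IA, min_days=30)

print(f"Standard:    ${standard:.2f}")
print(f"Standard-IA: ${ia:.2f}")
```

Under these assumptions the object is billed for 30 days in Standard-IA even though it lived for 10, so the nominally cheaper class ends up costing more.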

Quick Reference

Cost Optimization Service Quick Map

Compute Pricing Models

On-Demand → flexible, short-term, unpredictable
Reserved Instance → 1- or 3-yr term, specific type, max discount
Savings Plan → 1- or 3-yr spend commit, flexible types
Spot → interruptible, batch/background, up to 90% savings
Dedicated Host → BYOL, physical isolation compliance
Lambda / Fargate → pay-per-use, zero idle cost

Storage Cost Tools

S3 Lifecycle → automate tier transitions & expiry
S3 Intelligent-Tiering → auto-tier unknown patterns
S3 One Zone-IA → re-creatable data, 20% cheaper than Standard-IA
S3 Glacier Deep Archive → 7–10yr compliance, cheapest
EBS gp3 → decouple IOPS from size; up to 20% cheaper per GB than gp2
EFS One Zone + IA tier → dev/cold file storage

Right-Sizing Tools

Compute Optimizer → EC2, EBS, Lambda, ECS sizing
Trusted Advisor → idle EC2, unattached EBS, low-util RIs
Graviton (M6g, C6g, R6g) → up to 40% better price/perf
Lambda arm64 → ~20% lower duration price than x86
Performance Insights → find undersized/oversized RDS
CloudWatch + ASG → eliminate idle over-provisioning

Network Cost Reduction

S3 Gateway Endpoint → free; no NAT processing fees
CloudFront → lower egress rate + cache hit savings
CloudFront Price Classes → restrict to cheap regions
Same-AZ communication → avoid cross-AZ charges
Direct Connect → predictable, potentially cheaper egress
VPC Interface Endpoint → cheaper than NAT at volume

Cost Monitoring & Governance

AWS Cost Explorer → visualize & forecast spend
AWS Budgets → threshold alerts via SNS/email
Cost Allocation Tags → per-team/project chargeback
S3 Storage Lens → org-wide S3 cost insights
AWS Organizations → consolidated billing + RI sharing
Billing alarms → CloudWatch EstimatedCharges metric

Database Cost Levers

RDS Reserved Instances → up to 69% savings for stable DBs
Aurora Serverless v2 → elastic, pay-per-ACU-second
DynamoDB Reserved Capacity → up to 76% off for stable tables
ElastiCache → reduce DB read load → downsize DB
Stop dev RDS instances → save compute outside hours
Single-AZ → dev/test; Multi-AZ only where HA required

This reference card maps every major Domain 4 topic to its cost optimization service or technique.

One final pattern worth noting: many exam questions in Domain 4 combine cost with another domain's requirements. For example, for "must be highly available AND cost-optimized," the answer might be Spot Instances combined with On-Demand for the baseline (HA via diversification), plus a Savings Plan for the On-Demand portion. Always read the full requirements before selecting a cost optimization: removing Multi-AZ might save money but violates an HA requirement, and using Spot for a stateful database might save money but violates a resilience requirement. Cost optimization must be done within the constraints of the architecture.

With this domain complete, you've covered all four SAA-C03 domains: Security (30%), Resilience (26%), Performance (24%), and Cost (20%). Good luck on the exam!
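The combined pattern mentioned in these notes (a committed baseline plus a Spot burst tier) can be put into rough numbers. The function name, the hourly rate, and both discount percentages below are hypothetical; the sketch only shows how the blend is computed, not what AWS actually charges.

```python
def blended_hourly_cost(total_instances: int, baseline_instances: int,
                        on_demand_rate: float, savings_plan_discount: float,
                        spot_discount: float) -> float:
    """Hourly cost of a fleet with a committed baseline and a Spot burst tier."""
    if not 0 <= baseline_instances <= total_instances:
        raise ValueError("baseline must be between 0 and total_instances")
    spot_instances = total_instances - baseline_instances
    # Baseline runs on On-Demand capacity billed at the Savings Plan rate.
    baseline = baseline_instances * on_demand_rate * (1 - savings_plan_discount)
    # Everything above the baseline runs on interruptible Spot capacity.
    burst = spot_instances * on_demand_rate * (1 - spot_discount)
    return baseline + burst


# Assumed inputs: $0.10/hour On-Demand, 30% Savings Plan discount, 70% Spot discount.
all_on_demand = blended_hourly_cost(10, 10, 0.10, 0.0, 0.0)
mixed = blended_hourly_cost(10, 4, 0.10, 0.30, 0.70)

print(f"all On-Demand: ${all_on_demand:.2f}/hour")
print(f"mixed fleet:   ${mixed:.2f}/hour")
```

The baseline count is the lever to watch: it should be large enough that losing every Spot instance at once still leaves the availability requirement satisfied.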

Domain 4 Complete · All Domains Covered

You're ready for Domain 4

20% of SAA-C03 · Design Cost-Optimized Architectures
Good luck on the exam!

4.1 — Cost-Effective Storage · 4.2 — Cost-Effective Compute · 4.3 — Cost-Effective Databases · 4.4 — Network Cost Optimization · 4.5 — Managed Services TCO

Domain 1: Secure (30%) · Domain 2: Resilient (26%) · Domain 3: Performing (24%) · Domain 4: Cost (20%)

Congratulations: you've completed all four domains of the AWS Certified Solutions Architect – Associate study guide series. Domain 4 at 20% is the smallest domain, but its principles (right pricing model, right-sizing, lifecycle automation, data transfer awareness) are woven into questions across all four domains. Final preparation tips for the full exam:

1. Practice scenario-based questions. The SAA exam is almost entirely scenario-based, not rote memorization. You'll be given a business context and asked which combination of services and configurations best satisfies the stated requirements.
2. Use process of elimination. In most SAA questions, two answers are clearly wrong and two are plausible. Focus on distinguishing the best answer from the second-best.
3. Keywords matter. "Lowest cost," "high availability," "no single point of failure," "millisecond latency," and "must not be interrupted" each point to a specific set of services and patterns.
4. Domains 1 + 2 = 56% of the exam. Prioritize security and resilience if study time is limited.
5. Hands-on experience is invaluable. Labs in IAM, VPC, SQS, RDS Multi-AZ, and S3 Lifecycle cement the conceptual knowledge.

Good luck!