AWS Certified AI Practitioner · AIF-C01
Fundamentals of
Generative AI
Domain 2 — Comprehensive Study Guide
Task Statements 2.1 · 2.2 · 2.3
24% of Exam Score — Largest Domain
Domain 2 OverviewWhat You Need to Know
Task 2.1
- GenAI foundational concepts
- Tokens, embeddings, vectors
- FM lifecycle
- Token-based pricing
- Context engineering
- Agentic AI concepts
Task 2.2
- Advantages of GenAI
- Limitations & risks
- Model selection factors
- Business value & metrics
Task 2.3
- AWS GenAI services
- Benefits of AWS for GenAI
- Security & compliance
- Cost tradeoffs
📋 Exam Weight
Domain 2 is 24% of scored content — the single heaviest domain, approximately 15–16 questions. Heavy on GenAI concepts, AWS services, and tradeoffs.
2.1
Basic Concepts of
Generative AI
Core Vocabulary · FM Lifecycle · Tokens · Context · Agentic AI
Task 2.1 — DefinitionsGenAI Core Vocabulary
Token
The basic unit of text an LLM processes — roughly a word or sub-word. "ChatGPT" ≈ 3 tokens. Models have a context-window token limit.
Chunking
Splitting large documents into smaller, overlapping segments so they fit within a model's context window; key for RAG pipelines.
Embedding
A numerical vector that captures the semantic meaning of text. Similar meanings produce vectors that are close together in vector space.
Vector / Vector Store
A database optimised for storing and querying embeddings by similarity (e.g. Amazon OpenSearch, pgvector). Powers semantic search and RAG.
Prompt Engineering
Crafting model inputs (prompts) to steer output quality — includes system prompts, few-shot examples, chain-of-thought instructions.
Foundation Model (FM)
A large model pre-trained on broad data that can be adapted to many tasks via prompting or fine-tuning (e.g. Claude, Titan, Llama).
Transformer (LLM)
Neural architecture using self-attention to model relationships between all tokens simultaneously; backbone of modern LLMs.
Task 2.1 — DefinitionsModel Types & Techniques
Multi-modal Model
Processes and/or generates multiple data types — text, image, audio, video — in a single model (e.g. Claude 3, GPT-4o, Gemini).
Diffusion Model
Generates images by learning to reverse a noise-addition process. Powers Stable Diffusion, DALL·E, Amazon Titan Image Generator.
RAG (Retrieval-Augmented Generation)
Combines a retrieval step (vector search over a knowledge base) with LLM generation to produce grounded, up-to-date answers.
Fine-tuning
Further training an FM on a smaller domain-specific dataset to specialise its knowledge or style. Less data needed than pre-training.
Context Window
The maximum number of tokens an LLM can "see" at once — includes the system prompt, conversation history, and current query.
Temperature
Controls output randomness. Low (0) = deterministic; high (1+) = creative/variable. Tuned via inference parameters.
Hallucination
When a model generates plausible-sounding but factually incorrect output. A key limitation to know for the exam.
Task 2.1 — TokensHow Tokenisation Works & Why It Matters
Example: "Generative AI on AWS" is tokenised into segments:
Generative
▸
AI
▸
on
▸
AWS
~5 tokens. English averages ~0.75 words per token. Code and non-English languages use more tokens per word.
Token-based Pricing
- Charged per input + output tokens
- Input tokens = prompt + context
- Output tokens typically cost more
- Longer prompts → higher cost per call
Performance Impact
- More tokens → higher latency
- Context window limits max conversation length
- Chunking manages large-doc token overflow
- Concise prompts improve speed & cost
Provisioned Throughput
- Reserve capacity for predictable workloads
- Lower per-token cost at volume
- Guarantees consistent latency
- On-demand = pay-as-you-go alternative
⚡ Exam Note
Know the tradeoff: on-demand pricing = flexible, no commitment; provisioned throughput = lower cost + guaranteed performance for steady high-volume workloads.
Task 2.1 — Use CasesGenAI Application Landscape
Content Generation
- Text & copywriting
- Image generation
- Video synthesis
- Audio / music
Language Tasks
- Summarisation
- Translation
- Classification
- Sentiment analysis
Code & Dev
- Code generation
- Code explanation
- Bug fixing
- Test generation
Business Apps
- AI assistants / chatbots
- Customer service agents
- Semantic search
- Recommendation engines
⚡ Exam Note
Differentiate use cases: summarisation / Q&A / translation → LLMs. Image generation → diffusion models. Multi-modal → models like Claude 3 or Titan. Code generation → Amazon Q Developer / CodeWhisperer.
Task 2.1 — LifecycleFoundation Model Lifecycle
01
Data Selection
Curate massive, diverse, high-quality corpus
02
Model Selection
Choose architecture (transformer size, modality)
03
Pre-training
Train on broad corpus; very high compute cost
04
Fine-tuning
Adapt to domain/task with smaller dataset
05
Evaluation
Benchmarks, human eval, safety testing
06
Deployment
Managed API or self-hosted endpoint
07
Feedback
RLHF, user feedback, continuous improvement
Pre-training vs Fine-tuning
Pre-training: from scratch, massive data & cost. Fine-tuning: starts from pre-trained weights, task-specific, much cheaper.
Bedrock Fine-tuning
Amazon Bedrock supports continued pre-training and fine-tuning for select models using your own labeled data.
RLHF in Feedback Loop
Human raters score outputs; a reward model guides further RL training to align model responses with human preferences.
Task 2.1 — Context EngineeringPrompt & Context Engineering
Prompt Engineering Techniques
Zero-shot
Ask the model directly with no examples. Works for well-known tasks.
Few-shot
Include 2–5 input/output examples in the prompt to guide format and style.
Chain-of-thought (CoT)
Ask the model to "think step by step" — improves reasoning on complex tasks.
Context Management
System Prompt
Sets the model's persona, instructions, and constraints before user interaction begins.
RAG Context Injection
Retrieved document chunks injected into the prompt at inference time — keeps knowledge current without retraining.
Conversation History
Prior turns included in the context window to maintain coherence. Token costs accumulate over long conversations.
⚡ Exam Note
Context engineering = crafting what goes INTO the context window (system prompt, retrieved docs, history). Prompt engineering = crafting the user-facing query. Both affect quality and cost.
Task 2.1 — Agentic AIAgentic AI & Multi-Agent Systems
Core Agentic Concepts
Agent
An LLM that can plan, call tools, observe results, and loop until a goal is reached — not just a single-turn completion.
Tool Usage
Agents invoke external tools (web search, code execution, APIs, databases) to extend beyond their training knowledge.
Memory Management
In-context memory (conversation), external memory (vector store), and procedural memory (learned skills/workflows).
Multi-Agent Patterns & Protocols
Model Context Protocol (MCP)
Open standard connecting agents to external systems (tools, data sources, APIs) via a consistent interface.
Orchestrator–Subagent Pattern
A supervisor agent decomposes a complex task and delegates subtasks to specialised subagents.
Workflow Orchestration
Defining agent pipelines with branching, parallelism, and error handling — e.g. Amazon Bedrock Agents, Strands Agents.
2.2
Capabilities & Limitations
of Generative AI
Advantages · Risks · Model Selection · Business Value
Task 2.2 — CapabilitiesAdvantages & Disadvantages of GenAI
✅ Advantages
Adaptability
One FM can handle many tasks — summarise, translate, classify, generate — without retraining per task.
Conversational Capability
Natural multi-turn dialogue; understands context, follows up, clarifies — enabling rich AI assistants.
Content Generation at Scale
Produces text, images, code, and audio faster than humans; unlocks personalisation at massive scale.
🚫 Disadvantages
Hallucinations
Model generates confident but incorrect or fabricated information. Mitigated by RAG, grounding, output validation.
Nondeterminism
Same prompt can yield different outputs across runs. Not suitable for tasks requiring exact, reproducible results.
Interpretability & Inaccuracy
"Black box" — hard to explain why a response was generated. Regulatory domains may require explainability.
Task 2.2 — SelectionFactors for Selecting a GenAI Model
| Factor | What to consider | Example tradeoff |
| Performance / Accuracy |
Benchmark scores, task-specific eval results |
Larger model = better but slower & pricier |
| Latency |
Time-to-first-token, tokens/second |
Real-time chat needs <1 s; async jobs tolerate more |
| Cost |
Token pricing, provisioned vs. on-demand |
Smaller/distilled models cut cost at acceptable quality |
| Compliance & Data Privacy |
Data residency, model provider agreements |
Self-hosted or VPC-isolated model for regulated data |
| Model Complexity |
Context window size, modality support |
Multimodal needed for image+text inputs |
| Capabilities & Constraints |
Max tokens, supported languages, output formats |
Code generation → model fine-tuned on code |
⚡ Exam Note
Amazon Bedrock Model Evaluation helps compare FM performance on your own data before committing to a model. Available for both automatic and human evaluation.
Task 2.2 — Business ValueMeasuring GenAI Business Value
Technical Metrics
- Accuracy — correct outputs / total evaluations
- Cross-domain performance — consistent across task types
- Latency — response time under load
- Hallucination rate — % of responses requiring correction
- Task completion rate — agent success %
Business Metrics
- ROI — net value generated vs. total cost
- Efficiency — time/cost saved per process
- Conversion rate — % of leads / trials converted
- Average Revenue Per User (ARPU)
- Customer Lifetime Value (CLV)
- Customer feedback / NPS / CSAT
⚡ Exam Note
The exam often asks: "How do you demonstrate business value for a GenAI application?" — point to ROI, efficiency gains, conversion rate, and CLV as key metrics alongside accuracy.
2.3
AWS Infrastructure &
GenAI Technologies
Services · Benefits · Security · Cost Tradeoffs
Task 2.3 — ServicesAWS GenAI Services & Tools
Amazon Bedrock
Fully managed API access to FMs (Anthropic, Meta, Mistral, Amazon Titan). Fine-tuning, knowledge bases, agents, guardrails.
Bedrock AgentCore
Managed runtime for deploying production agentic AI apps — handles memory, tool execution, and session management.
SageMaker AI
Full ML platform for custom model training, hosting, and MLOps. Supports fine-tuning and self-hosted FM deployment.
SageMaker JumpStart
Model hub with 300+ pre-trained models (open source & commercial) deployable to SageMaker endpoints in 1 click.
Amazon Q
GenAI assistant for business (Q Business) and developers (Q Developer). Connects to enterprise data via connectors.
Kiro
AI-powered IDE for agentic software development — spec-driven, automated code generation, and test writing.
Strands Agents
Open-source SDK for building agentic AI apps; integrates with Bedrock models and MCP-compatible tool servers.
Bedrock Guardrails
Policy controls for responsible AI — content filtering, PII redaction, topic deny-lists, and grounding checks.
Task 2.3 — Amazon BedrockAmazon Bedrock Feature Map
Model Access
- On-demand inference (pay-per-token)
- Provisioned throughput (reserved capacity)
- Cross-region inference routing
- Batch inference for large jobs
Knowledge Bases
- Managed RAG pipeline
- Auto-chunking & embedding
- Vector store integration (OpenSearch, Aurora)
- Keeps responses grounded in your data
Agents
- Orchestrates multi-step tasks
- Connects to APIs via Action Groups
- Integrates with Knowledge Bases
- Built-in memory & session management
Fine-tuning
- Continued pre-training with domain data
- Supervised fine-tuning (labelled examples)
- Models stored privately in your account
Guardrails
- Content filters (hate, violence, misconduct)
- PII detection & redaction
- Grounding & hallucination detection
- Applied to any Bedrock model
Model Evaluation
- Automatic benchmarking on your tasks
- Human evaluation workflows
- Compare models before deployment
Task 2.3 — BenefitsWhy Build GenAI Apps on AWS?
Lower Barrier to Entry
- Managed APIs — no GPU infra to run
- Pre-built integrations with AWS data stores
- SageMaker JumpStart 1-click deploy
Speed to Market
- Skip model training — use pre-trained FMs
- Bedrock Agents + Knowledge Bases = RAG in hours
- Serverless inference — no capacity planning
Cost-effectiveness
- Pay-per-token; no idle GPU cost
- Provisioned throughput for high-volume savings
- Model distillation reduces inference cost
Security & Compliance
- Data never leaves your AWS account by default
- VPC support, PrivateLink endpoints
- HIPAA, SOC, PCI, FedRAMP eligible
- AWS Shared Responsibility Model applies
Responsible AI
- Guardrails for safety & content control
- AWS AI Service Cards document model limitations
- Bedrock watermarking for generated content
Business Objectives
- Broad model choice — pick best fit
- Scales from prototype to production
- Integration with existing AWS stack (S3, Lambda, RDS)
Task 2.3 — Cost TradeoffsAWS GenAI Service Cost & Performance Tradeoffs
On-Demand Pricing
Pay per input + output token. No commitment. Best for variable/unpredictable traffic. Higher per-token rate.
Provisioned Throughput
Reserve model units per month. Lower per-token cost, guaranteed latency. Best for steady high-volume workloads.
Custom Models
Fine-tuned model hosted in your account. Training cost + hosting cost. Better accuracy for domain tasks; higher total spend.
Cross-Region Inference
Routes requests to least-loaded region for higher availability & throughput. Slight latency increase vs. single-region.
Batch Inference
Process large jobs asynchronously at up to 50% lower cost vs. real-time. No latency guarantee; results retrieved later.
Serverless / Lambda
Zero idle cost; scales to zero. Cold-start latency. Best for infrequent or bursty inference triggers.
⚡ Exam Note
Classic tradeoff question: "High throughput, predictable load, cost-optimised" → Provisioned Throughput. "Low volume, unpredictable" → On-Demand. "Offline scoring of millions of records" → Batch inference.
✓
Quick Review &
Exam Checklist
Domain 2 · Key Points to Lock In
Exam ChecklistCan You Answer These?
Task 2.1 — Must Know
- Token = unit of text LLMs process; pricing is per input + output token
- Embedding = semantic vector; vector store enables similarity search
- Chunking splits docs to fit context window (RAG pipelines)
- FM lifecycle: data → model select → pre-train → fine-tune → eval → deploy → feedback
- Diffusion models → images; transformers → text; multi-modal → both
- MCP connects agents to external tools & data sources
Task 2.2 — Must Know
- Hallucinations = factually wrong but confident output — biggest GenAI risk
- Nondeterminism = same prompt, different outputs
- Model selection: balance accuracy, latency, cost, compliance
- Business metrics: ROI, ARPU, CLV, conversion rate, efficiency
- Bedrock Model Evaluation = compare FMs on your data
Task 2.3 — Must Know
- Bedrock = managed FM API + knowledge bases + agents + guardrails
- SageMaker JumpStart = 1-click open-source model deployment
- On-demand vs. provisioned throughput vs. batch tradeoffs
- Bedrock data stays in your account — security & compliance boundary
- Guardrails = content filtering, PII, grounding checks
Service → Use Case Quick Map
- Bedrock → FM access, RAG, agents, guardrails
- Bedrock AgentCore → production agentic apps
- SageMaker AI → custom training & hosting
- JumpStart → open-source model hub
- Amazon Q → enterprise AI assistant
- Strands Agents / Kiro → dev-focused agentic tools
Domain 2 Complete
You're ready for
Domain 2
24% of AIF-C01 · Fundamentals of Generative AI
The exam's heaviest domain — now covered.
Task 2.1 — GenAI Concepts
Task 2.2 — Capabilities & Limits
Task 2.3 — AWS Infrastructure