AWS Certified AI Practitioner · AIF-C01

Fundamentals of
Generative AI

Domain 2 — Comprehensive Study Guide
Task Statements 2.1 · 2.2 · 2.3

24% of Exam Score — Largest Domain

Domain 2 OverviewWhat You Need to Know

Task 2.1

GenAI foundational concepts
Tokens, embeddings, vectors
FM lifecycle
Token-based pricing
Context engineering
Agentic AI concepts

Task 2.2

Advantages of GenAI
Limitations & risks
Model selection factors
Business value & metrics

Task 2.3

AWS GenAI services
Benefits of AWS for GenAI
Security & compliance
Cost tradeoffs

📋 Exam Weight

Domain 2 is 24% of scored content — the single heaviest domain, approximately 15–16 questions. Heavy on GenAI concepts, AWS services, and tradeoffs.

2.1

Basic Concepts of
Generative AI

Core Vocabulary · FM Lifecycle · Tokens · Context · Agentic AI

Task 2.1 — DefinitionsGenAI Core Vocabulary

Token The basic unit of text an LLM processes — roughly a word or sub-word. "ChatGPT" ≈ 3 tokens. Models have a context-window token limit.

Chunking Splitting large documents into smaller, overlapping segments so they fit within a model's context window; key for RAG pipelines.

Embedding A numerical vector that captures the semantic meaning of text. Similar meanings produce vectors that are close together in vector space.

Vector / Vector Store A database optimised for storing and querying embeddings by similarity (e.g. Amazon OpenSearch, pgvector). Powers semantic search and RAG.

Prompt Engineering Crafting model inputs (prompts) to steer output quality — includes system prompts, few-shot examples, chain-of-thought instructions.

Foundation Model (FM) A large model pre-trained on broad data that can be adapted to many tasks via prompting or fine-tuning (e.g. Claude, Titan, Llama).

Transformer (LLM) Neural architecture using self-attention to model relationships between all tokens simultaneously; backbone of modern LLMs.

Task 2.1 — DefinitionsModel Types & Techniques

Multi-modal Model Processes and/or generates multiple data types — text, image, audio, video — in a single model (e.g. Claude 3, GPT-4o, Gemini).

Diffusion Model Generates images by learning to reverse a noise-addition process. Powers Stable Diffusion, DALL·E, Amazon Titan Image Generator.

RAG (Retrieval-Augmented Generation) Combines a retrieval step (vector search over a knowledge base) with LLM generation to produce grounded, up-to-date answers.

Fine-tuning Further training an FM on a smaller domain-specific dataset to specialise its knowledge or style. Less data needed than pre-training.

Context Window The maximum number of tokens an LLM can "see" at once — includes the system prompt, conversation history, and current query.

Temperature Controls output randomness. Low (0) = deterministic; high (1+) = creative/variable. Tuned via inference parameters.

Hallucination When a model generates plausible-sounding but factually incorrect output. A key limitation to know for the exam.

Task 2.1 — TokensHow Tokenisation Works & Why It Matters

Example: "Generative AI on AWS" is tokenised into segments:

Generative ▸ AI ▸ on ▸ AWS

~5 tokens. English averages ~0.75 words per token. Code and non-English languages use more tokens per word.

Token-based Pricing

Charged per input + output tokens
Input tokens = prompt + context
Output tokens typically cost more
Longer prompts → higher cost per call

Performance Impact

More tokens → higher latency
Context window limits max conversation length
Chunking manages large-doc token overflow
Concise prompts improve speed & cost

Provisioned Throughput

Reserve capacity for predictable workloads
Lower per-token cost at volume
Guarantees consistent latency
On-demand = pay-as-you-go alternative

⚡ Exam Note

Know the tradeoff: on-demand pricing = flexible, no commitment; provisioned throughput = lower cost + guaranteed performance for steady high-volume workloads.

Task 2.1 — Use CasesGenAI Application Landscape

Content Generation

Text & copywriting
Image generation
Video synthesis
Audio / music

Language Tasks

Summarisation
Translation
Classification
Sentiment analysis

Code & Dev

Code generation
Code explanation
Bug fixing
Test generation

Business Apps

AI assistants / chatbots
Customer service agents
Semantic search
Recommendation engines

⚡ Exam Note

Differentiate use cases: summarisation / Q&A / translation → LLMs. Image generation → diffusion models. Multi-modal → models like Claude 3 or Titan. Code generation → Amazon Q Developer / CodeWhisperer.

Task 2.1 — LifecycleFoundation Model Lifecycle

Data Selection

Curate massive, diverse, high-quality corpus

Model Selection

Choose architecture (transformer size, modality)

Pre-training

Train on broad corpus; very high compute cost

Fine-tuning

Adapt to domain/task with smaller dataset

Evaluation

Benchmarks, human eval, safety testing

Deployment

Managed API or self-hosted endpoint

Feedback

RLHF, user feedback, continuous improvement

Pre-training vs Fine-tuning

Pre-training: from scratch, massive data & cost. Fine-tuning: starts from pre-trained weights, task-specific, much cheaper.

Bedrock Fine-tuning

Amazon Bedrock supports continued pre-training and fine-tuning for select models using your own labeled data.

RLHF in Feedback Loop

Human raters score outputs; a reward model guides further RL training to align model responses with human preferences.

Task 2.1 — Context EngineeringPrompt & Context Engineering

Prompt Engineering Techniques

Zero-shot

Ask the model directly with no examples. Works for well-known tasks.

Few-shot

Include 2–5 input/output examples in the prompt to guide format and style.

Chain-of-thought (CoT)

Ask the model to "think step by step" — improves reasoning on complex tasks.

Context Management

System Prompt

Sets the model's persona, instructions, and constraints before user interaction begins.

RAG Context Injection

Retrieved document chunks injected into the prompt at inference time — keeps knowledge current without retraining.

Conversation History

Prior turns included in the context window to maintain coherence. Token costs accumulate over long conversations.

⚡ Exam Note

Context engineering = crafting what goes INTO the context window (system prompt, retrieved docs, history). Prompt engineering = crafting the user-facing query. Both affect quality and cost.

Task 2.1 — Agentic AIAgentic AI & Multi-Agent Systems

Core Agentic Concepts

Agent

An LLM that can plan, call tools, observe results, and loop until a goal is reached — not just a single-turn completion.

Tool Usage

Agents invoke external tools (web search, code execution, APIs, databases) to extend beyond their training knowledge.

Memory Management

In-context memory (conversation), external memory (vector store), and procedural memory (learned skills/workflows).

Multi-Agent Patterns & Protocols

Model Context Protocol (MCP)

Open standard connecting agents to external systems (tools, data sources, APIs) via a consistent interface.

Orchestrator–Subagent Pattern

A supervisor agent decomposes a complex task and delegates subtasks to specialised subagents.

Workflow Orchestration

Defining agent pipelines with branching, parallelism, and error handling — e.g. Amazon Bedrock Agents, Strands Agents.

2.2

Capabilities & Limitations
of Generative AI

Advantages · Risks · Model Selection · Business Value

Task 2.2 — CapabilitiesAdvantages & Disadvantages of GenAI

✅ Advantages

Adaptability

One FM can handle many tasks — summarise, translate, classify, generate — without retraining per task.

Conversational Capability

Natural multi-turn dialogue; understands context, follows up, clarifies — enabling rich AI assistants.

Content Generation at Scale

Produces text, images, code, and audio faster than humans; unlocks personalisation at massive scale.

🚫 Disadvantages

Hallucinations

Model generates confident but incorrect or fabricated information. Mitigated by RAG, grounding, output validation.

Nondeterminism

Same prompt can yield different outputs across runs. Not suitable for tasks requiring exact, reproducible results.

Interpretability & Inaccuracy

"Black box" — hard to explain why a response was generated. Regulatory domains may require explainability.

Task 2.2 — SelectionFactors for Selecting a GenAI Model

Context window size, modality support

Factor	What to consider	Example tradeoff
Performance / Accuracy	Benchmark scores, task-specific eval results	Larger model = better but slower & pricier
Latency	Time-to-first-token, tokens/second	Real-time chat needs <1 s; async jobs tolerate more
Cost	Token pricing, provisioned vs. on-demand	Smaller/distilled models cut cost at acceptable quality
Compliance & Data Privacy	Data residency, model provider agreements	Self-hosted or VPC-isolated model for regulated data
Model Complexity	Multimodal needed for image+text inputs
Capabilities & Constraints	Max tokens, supported languages, output formats	Code generation → model fine-tuned on code

⚡ Exam Note

Amazon Bedrock Model Evaluation helps compare FM performance on your own data before committing to a model. Available for both automatic and human evaluation.

Task 2.2 — Business ValueMeasuring GenAI Business Value

Technical Metrics

Accuracy — correct outputs / total evaluations
Cross-domain performance — consistent across task types
Latency — response time under load
Hallucination rate — % of responses requiring correction
Task completion rate — agent success %

Business Metrics

ROI — net value generated vs. total cost
Efficiency — time/cost saved per process
Conversion rate — % of leads / trials converted
Average Revenue Per User (ARPU)
Customer Lifetime Value (CLV)
Customer feedback / NPS / CSAT

⚡ Exam Note

The exam often asks: "How do you demonstrate business value for a GenAI application?" — point to ROI, efficiency gains, conversion rate, and CLV as key metrics alongside accuracy.

2.3

AWS Infrastructure &
GenAI Technologies

Services · Benefits · Security · Cost Tradeoffs

Task 2.3 — ServicesAWS GenAI Services & Tools

Amazon Bedrock Fully managed API access to FMs (Anthropic, Meta, Mistral, Amazon Titan). Fine-tuning, knowledge bases, agents, guardrails.

Bedrock AgentCore Managed runtime for deploying production agentic AI apps — handles memory, tool execution, and session management.

SageMaker AI Full ML platform for custom model training, hosting, and MLOps. Supports fine-tuning and self-hosted FM deployment.

SageMaker JumpStart Model hub with 300+ pre-trained models (open source & commercial) deployable to SageMaker endpoints in 1 click.

Amazon Q GenAI assistant for business (Q Business) and developers (Q Developer). Connects to enterprise data via connectors.

Kiro AI-powered IDE for agentic software development — spec-driven, automated code generation, and test writing.

Strands Agents Open-source SDK for building agentic AI apps; integrates with Bedrock models and MCP-compatible tool servers.

Bedrock Guardrails Policy controls for responsible AI — content filtering, PII redaction, topic deny-lists, and grounding checks.

Task 2.3 — Amazon BedrockAmazon Bedrock Feature Map

Model Access

On-demand inference (pay-per-token)
Provisioned throughput (reserved capacity)
Cross-region inference routing
Batch inference for large jobs

Knowledge Bases

Managed RAG pipeline
Auto-chunking & embedding
Vector store integration (OpenSearch, Aurora)
Keeps responses grounded in your data

Agents

Orchestrates multi-step tasks
Connects to APIs via Action Groups
Integrates with Knowledge Bases
Built-in memory & session management

Fine-tuning

Continued pre-training with domain data
Supervised fine-tuning (labelled examples)
Models stored privately in your account

Guardrails

Content filters (hate, violence, misconduct)
PII detection & redaction
Grounding & hallucination detection
Applied to any Bedrock model

Model Evaluation

Automatic benchmarking on your tasks
Human evaluation workflows
Compare models before deployment

Task 2.3 — BenefitsWhy Build GenAI Apps on AWS?

Lower Barrier to Entry

Managed APIs — no GPU infra to run
Pre-built integrations with AWS data stores
SageMaker JumpStart 1-click deploy

Speed to Market

Skip model training — use pre-trained FMs
Bedrock Agents + Knowledge Bases = RAG in hours
Serverless inference — no capacity planning

Cost-effectiveness

Pay-per-token; no idle GPU cost
Provisioned throughput for high-volume savings
Model distillation reduces inference cost

Security & Compliance

Data never leaves your AWS account by default
VPC support, PrivateLink endpoints
HIPAA, SOC, PCI, FedRAMP eligible
AWS Shared Responsibility Model applies

Responsible AI

Guardrails for safety & content control
AWS AI Service Cards document model limitations
Bedrock watermarking for generated content

Business Objectives

Broad model choice — pick best fit
Scales from prototype to production
Integration with existing AWS stack (S3, Lambda, RDS)

Task 2.3 — Cost TradeoffsAWS GenAI Service Cost & Performance Tradeoffs

On-Demand Pricing Pay per input + output token. No commitment. Best for variable/unpredictable traffic. Higher per-token rate.

Provisioned Throughput Reserve model units per month. Lower per-token cost, guaranteed latency. Best for steady high-volume workloads.

Custom Models Fine-tuned model hosted in your account. Training cost + hosting cost. Better accuracy for domain tasks; higher total spend.

Cross-Region Inference Routes requests to least-loaded region for higher availability & throughput. Slight latency increase vs. single-region.

Batch Inference Process large jobs asynchronously at up to 50% lower cost vs. real-time. No latency guarantee; results retrieved later.

Serverless / Lambda Zero idle cost; scales to zero. Cold-start latency. Best for infrequent or bursty inference triggers.

⚡ Exam Note

Classic tradeoff question: "High throughput, predictable load, cost-optimised" → Provisioned Throughput. "Low volume, unpredictable" → On-Demand. "Offline scoring of millions of records" → Batch inference.

✓

Quick Review &
Exam Checklist

Domain 2 · Key Points to Lock In

Exam ChecklistCan You Answer These?

Task 2.1 — Must Know

Token = unit of text LLMs process; pricing is per input + output token
Embedding = semantic vector; vector store enables similarity search
Chunking splits docs to fit context window (RAG pipelines)
FM lifecycle: data → model select → pre-train → fine-tune → eval → deploy → feedback
Diffusion models → images; transformers → text; multi-modal → both
MCP connects agents to external tools & data sources

Task 2.2 — Must Know

Hallucinations = factually wrong but confident output — biggest GenAI risk
Nondeterminism = same prompt, different outputs
Model selection: balance accuracy, latency, cost, compliance
Business metrics: ROI, ARPU, CLV, conversion rate, efficiency
Bedrock Model Evaluation = compare FMs on your data

Task 2.3 — Must Know

Bedrock = managed FM API + knowledge bases + agents + guardrails
SageMaker JumpStart = 1-click open-source model deployment
On-demand vs. provisioned throughput vs. batch tradeoffs
Bedrock data stays in your account — security & compliance boundary
Guardrails = content filtering, PII, grounding checks

Service → Use Case Quick Map

Bedrock → FM access, RAG, agents, guardrails
Bedrock AgentCore → production agentic apps
SageMaker AI → custom training & hosting
JumpStart → open-source model hub
Amazon Q → enterprise AI assistant
Strands Agents / Kiro → dev-focused agentic tools

Domain 2 Complete

You're ready for
Domain 2

24% of AIF-C01 · Fundamentals of Generative AI
The exam's heaviest domain — now covered.

Task 2.1 — GenAI Concepts

Task 2.2 — Capabilities & Limits

Task 2.3 — AWS Infrastructure

Fundamentals ofGenerative AI

Domain 2 OverviewWhat You Need to Know

Basic Concepts ofGenerative AI

Task 2.1 — DefinitionsGenAI Core Vocabulary

Task 2.1 — DefinitionsModel Types & Techniques

Task 2.1 — TokensHow Tokenisation Works & Why It Matters

Task 2.1 — Use CasesGenAI Application Landscape

Task 2.1 — LifecycleFoundation Model Lifecycle

Task 2.1 — Context EngineeringPrompt & Context Engineering

Prompt Engineering Techniques

Context Management

Task 2.1 — Agentic AIAgentic AI & Multi-Agent Systems

Core Agentic Concepts

Multi-Agent Patterns & Protocols

Capabilities & Limitationsof Generative AI

Task 2.2 — CapabilitiesAdvantages & Disadvantages of GenAI

Task 2.2 — SelectionFactors for Selecting a GenAI Model

Task 2.2 — Business ValueMeasuring GenAI Business Value

Technical Metrics

Business Metrics

AWS Infrastructure &GenAI Technologies

Task 2.3 — ServicesAWS GenAI Services & Tools

Task 2.3 — Amazon BedrockAmazon Bedrock Feature Map

Task 2.3 — BenefitsWhy Build GenAI Apps on AWS?

Task 2.3 — Cost TradeoffsAWS GenAI Service Cost & Performance Tradeoffs

Quick Review &Exam Checklist

Exam ChecklistCan You Answer These?

You're ready forDomain 2

Fundamentals of
Generative AI

Basic Concepts of
Generative AI

Capabilities & Limitations
of Generative AI

AWS Infrastructure &
GenAI Technologies

Quick Review &
Exam Checklist

You're ready for
Domain 2