AWS Certified AI Practitioner · AIF-C01

Fundamentals of
Generative AI

Domain 2 — Comprehensive Study Guide
Task Statements 2.1 · 2.2 · 2.3

24% of Exam Score — Largest Domain

Domain 2 OverviewWhat You Need to Know

Task 2.1
  • GenAI foundational concepts
  • Tokens, embeddings, vectors
  • FM lifecycle
  • Token-based pricing
  • Context engineering
  • Agentic AI concepts
Task 2.2
  • Advantages of GenAI
  • Limitations & risks
  • Model selection factors
  • Business value & metrics
Task 2.3
  • AWS GenAI services
  • Benefits of AWS for GenAI
  • Security & compliance
  • Cost tradeoffs
📋 Exam Weight

Domain 2 is 24% of scored content — the single heaviest domain, approximately 15–16 questions. Heavy on GenAI concepts, AWS services, and tradeoffs.

2.1

Basic Concepts of
Generative AI

Core Vocabulary · FM Lifecycle · Tokens · Context · Agentic AI

Task 2.1 — DefinitionsGenAI Core Vocabulary

Token The basic unit of text an LLM processes — roughly a word or sub-word. "ChatGPT" ≈ 3 tokens. Models have a context-window token limit.
Chunking Splitting large documents into smaller, overlapping segments so they fit within a model's context window; key for RAG pipelines.
Embedding A numerical vector that captures the semantic meaning of text. Similar meanings produce vectors that are close together in vector space.
Vector / Vector Store A database optimised for storing and querying embeddings by similarity (e.g. Amazon OpenSearch, pgvector). Powers semantic search and RAG.
Prompt Engineering Crafting model inputs (prompts) to steer output quality — includes system prompts, few-shot examples, chain-of-thought instructions.
Foundation Model (FM) A large model pre-trained on broad data that can be adapted to many tasks via prompting or fine-tuning (e.g. Claude, Titan, Llama).
Transformer (LLM) Neural architecture using self-attention to model relationships between all tokens simultaneously; backbone of modern LLMs.

Task 2.1 — DefinitionsModel Types & Techniques

Multi-modal Model Processes and/or generates multiple data types — text, image, audio, video — in a single model (e.g. Claude 3, GPT-4o, Gemini).
Diffusion Model Generates images by learning to reverse a noise-addition process. Powers Stable Diffusion, DALL·E, Amazon Titan Image Generator.
RAG (Retrieval-Augmented Generation) Combines a retrieval step (vector search over a knowledge base) with LLM generation to produce grounded, up-to-date answers.
Fine-tuning Further training an FM on a smaller domain-specific dataset to specialise its knowledge or style. Less data needed than pre-training.
Context Window The maximum number of tokens an LLM can "see" at once — includes the system prompt, conversation history, and current query.
Temperature Controls output randomness. Low (0) = deterministic; high (1+) = creative/variable. Tuned via inference parameters.
Hallucination When a model generates plausible-sounding but factually incorrect output. A key limitation to know for the exam.

Task 2.1 — TokensHow Tokenisation Works & Why It Matters

Example: "Generative AI on AWS" is tokenised into segments:

Generative AI on AWS

~5 tokens. English averages ~0.75 words per token. Code and non-English languages use more tokens per word.

Token-based Pricing
  • Charged per input + output tokens
  • Input tokens = prompt + context
  • Output tokens typically cost more
  • Longer prompts → higher cost per call
Performance Impact
  • More tokens → higher latency
  • Context window limits max conversation length
  • Chunking manages large-doc token overflow
  • Concise prompts improve speed & cost
Provisioned Throughput
  • Reserve capacity for predictable workloads
  • Lower per-token cost at volume
  • Guarantees consistent latency
  • On-demand = pay-as-you-go alternative
⚡ Exam Note

Know the tradeoff: on-demand pricing = flexible, no commitment; provisioned throughput = lower cost + guaranteed performance for steady high-volume workloads.

Task 2.1 — Use CasesGenAI Application Landscape

Content Generation
  • Text & copywriting
  • Image generation
  • Video synthesis
  • Audio / music
Language Tasks
  • Summarisation
  • Translation
  • Classification
  • Sentiment analysis
Code & Dev
  • Code generation
  • Code explanation
  • Bug fixing
  • Test generation
Business Apps
  • AI assistants / chatbots
  • Customer service agents
  • Semantic search
  • Recommendation engines
⚡ Exam Note

Differentiate use cases: summarisation / Q&A / translation → LLMs. Image generation → diffusion models. Multi-modal → models like Claude 3 or Titan. Code generation → Amazon Q Developer / CodeWhisperer.

Task 2.1 — LifecycleFoundation Model Lifecycle

01
Data Selection
Curate massive, diverse, high-quality corpus
02
Model Selection
Choose architecture (transformer size, modality)
03
Pre-training
Train on broad corpus; very high compute cost
04
Fine-tuning
Adapt to domain/task with smaller dataset
05
Evaluation
Benchmarks, human eval, safety testing
06
Deployment
Managed API or self-hosted endpoint
07
Feedback
RLHF, user feedback, continuous improvement
Pre-training vs Fine-tuning

Pre-training: from scratch, massive data & cost. Fine-tuning: starts from pre-trained weights, task-specific, much cheaper.

Bedrock Fine-tuning

Amazon Bedrock supports continued pre-training and fine-tuning for select models using your own labeled data.

RLHF in Feedback Loop

Human raters score outputs; a reward model guides further RL training to align model responses with human preferences.

Task 2.1 — Context EngineeringPrompt & Context Engineering

Prompt Engineering Techniques

Zero-shot

Ask the model directly with no examples. Works for well-known tasks.

Few-shot

Include 2–5 input/output examples in the prompt to guide format and style.

Chain-of-thought (CoT)

Ask the model to "think step by step" — improves reasoning on complex tasks.

Context Management

System Prompt

Sets the model's persona, instructions, and constraints before user interaction begins.

RAG Context Injection

Retrieved document chunks injected into the prompt at inference time — keeps knowledge current without retraining.

Conversation History

Prior turns included in the context window to maintain coherence. Token costs accumulate over long conversations.

⚡ Exam Note

Context engineering = crafting what goes INTO the context window (system prompt, retrieved docs, history). Prompt engineering = crafting the user-facing query. Both affect quality and cost.

Task 2.1 — Agentic AIAgentic AI & Multi-Agent Systems

Core Agentic Concepts

Agent

An LLM that can plan, call tools, observe results, and loop until a goal is reached — not just a single-turn completion.

Tool Usage

Agents invoke external tools (web search, code execution, APIs, databases) to extend beyond their training knowledge.

Memory Management

In-context memory (conversation), external memory (vector store), and procedural memory (learned skills/workflows).

Multi-Agent Patterns & Protocols

Model Context Protocol (MCP)

Open standard connecting agents to external systems (tools, data sources, APIs) via a consistent interface.

Orchestrator–Subagent Pattern

A supervisor agent decomposes a complex task and delegates subtasks to specialised subagents.

Workflow Orchestration

Defining agent pipelines with branching, parallelism, and error handling — e.g. Amazon Bedrock Agents, Strands Agents.

2.2

Capabilities & Limitations
of Generative AI

Advantages · Risks · Model Selection · Business Value

Task 2.2 — CapabilitiesAdvantages & Disadvantages of GenAI

✅ Advantages
Adaptability

One FM can handle many tasks — summarise, translate, classify, generate — without retraining per task.

Conversational Capability

Natural multi-turn dialogue; understands context, follows up, clarifies — enabling rich AI assistants.

Content Generation at Scale

Produces text, images, code, and audio faster than humans; unlocks personalisation at massive scale.

🚫 Disadvantages
Hallucinations

Model generates confident but incorrect or fabricated information. Mitigated by RAG, grounding, output validation.

Nondeterminism

Same prompt can yield different outputs across runs. Not suitable for tasks requiring exact, reproducible results.

Interpretability & Inaccuracy

"Black box" — hard to explain why a response was generated. Regulatory domains may require explainability.

Task 2.2 — SelectionFactors for Selecting a GenAI Model

Context window size, modality support
FactorWhat to considerExample tradeoff
Performance / Accuracy Benchmark scores, task-specific eval results Larger model = better but slower & pricier
Latency Time-to-first-token, tokens/second Real-time chat needs <1 s; async jobs tolerate more
Cost Token pricing, provisioned vs. on-demand Smaller/distilled models cut cost at acceptable quality
Compliance & Data Privacy Data residency, model provider agreements Self-hosted or VPC-isolated model for regulated data
Model Complexity Multimodal needed for image+text inputs
Capabilities & Constraints Max tokens, supported languages, output formats Code generation → model fine-tuned on code
⚡ Exam Note

Amazon Bedrock Model Evaluation helps compare FM performance on your own data before committing to a model. Available for both automatic and human evaluation.

Task 2.2 — Business ValueMeasuring GenAI Business Value

Technical Metrics

  • Accuracy — correct outputs / total evaluations
  • Cross-domain performance — consistent across task types
  • Latency — response time under load
  • Hallucination rate — % of responses requiring correction
  • Task completion rate — agent success %

Business Metrics

  • ROI — net value generated vs. total cost
  • Efficiency — time/cost saved per process
  • Conversion rate — % of leads / trials converted
  • Average Revenue Per User (ARPU)
  • Customer Lifetime Value (CLV)
  • Customer feedback / NPS / CSAT
⚡ Exam Note

The exam often asks: "How do you demonstrate business value for a GenAI application?" — point to ROI, efficiency gains, conversion rate, and CLV as key metrics alongside accuracy.

2.3

AWS Infrastructure &
GenAI Technologies

Services · Benefits · Security · Cost Tradeoffs

Task 2.3 — ServicesAWS GenAI Services & Tools

Amazon Bedrock Fully managed API access to FMs (Anthropic, Meta, Mistral, Amazon Titan). Fine-tuning, knowledge bases, agents, guardrails.
Bedrock AgentCore Managed runtime for deploying production agentic AI apps — handles memory, tool execution, and session management.
SageMaker AI Full ML platform for custom model training, hosting, and MLOps. Supports fine-tuning and self-hosted FM deployment.
SageMaker JumpStart Model hub with 300+ pre-trained models (open source & commercial) deployable to SageMaker endpoints in 1 click.
Amazon Q GenAI assistant for business (Q Business) and developers (Q Developer). Connects to enterprise data via connectors.
Kiro AI-powered IDE for agentic software development — spec-driven, automated code generation, and test writing.
Strands Agents Open-source SDK for building agentic AI apps; integrates with Bedrock models and MCP-compatible tool servers.
Bedrock Guardrails Policy controls for responsible AI — content filtering, PII redaction, topic deny-lists, and grounding checks.

Task 2.3 — Amazon BedrockAmazon Bedrock Feature Map

Model Access
  • On-demand inference (pay-per-token)
  • Provisioned throughput (reserved capacity)
  • Cross-region inference routing
  • Batch inference for large jobs
Knowledge Bases
  • Managed RAG pipeline
  • Auto-chunking & embedding
  • Vector store integration (OpenSearch, Aurora)
  • Keeps responses grounded in your data
Agents
  • Orchestrates multi-step tasks
  • Connects to APIs via Action Groups
  • Integrates with Knowledge Bases
  • Built-in memory & session management
Fine-tuning
  • Continued pre-training with domain data
  • Supervised fine-tuning (labelled examples)
  • Models stored privately in your account
Guardrails
  • Content filters (hate, violence, misconduct)
  • PII detection & redaction
  • Grounding & hallucination detection
  • Applied to any Bedrock model
Model Evaluation
  • Automatic benchmarking on your tasks
  • Human evaluation workflows
  • Compare models before deployment

Task 2.3 — BenefitsWhy Build GenAI Apps on AWS?

Lower Barrier to Entry
  • Managed APIs — no GPU infra to run
  • Pre-built integrations with AWS data stores
  • SageMaker JumpStart 1-click deploy
Speed to Market
  • Skip model training — use pre-trained FMs
  • Bedrock Agents + Knowledge Bases = RAG in hours
  • Serverless inference — no capacity planning
Cost-effectiveness
  • Pay-per-token; no idle GPU cost
  • Provisioned throughput for high-volume savings
  • Model distillation reduces inference cost
Security & Compliance
  • Data never leaves your AWS account by default
  • VPC support, PrivateLink endpoints
  • HIPAA, SOC, PCI, FedRAMP eligible
  • AWS Shared Responsibility Model applies
Responsible AI
  • Guardrails for safety & content control
  • AWS AI Service Cards document model limitations
  • Bedrock watermarking for generated content
Business Objectives
  • Broad model choice — pick best fit
  • Scales from prototype to production
  • Integration with existing AWS stack (S3, Lambda, RDS)

Task 2.3 — Cost TradeoffsAWS GenAI Service Cost & Performance Tradeoffs

On-Demand Pricing Pay per input + output token. No commitment. Best for variable/unpredictable traffic. Higher per-token rate.
Provisioned Throughput Reserve model units per month. Lower per-token cost, guaranteed latency. Best for steady high-volume workloads.
Custom Models Fine-tuned model hosted in your account. Training cost + hosting cost. Better accuracy for domain tasks; higher total spend.
Cross-Region Inference Routes requests to least-loaded region for higher availability & throughput. Slight latency increase vs. single-region.
Batch Inference Process large jobs asynchronously at up to 50% lower cost vs. real-time. No latency guarantee; results retrieved later.
Serverless / Lambda Zero idle cost; scales to zero. Cold-start latency. Best for infrequent or bursty inference triggers.
⚡ Exam Note

Classic tradeoff question: "High throughput, predictable load, cost-optimised" → Provisioned Throughput. "Low volume, unpredictable" → On-Demand. "Offline scoring of millions of records" → Batch inference.

Quick Review &
Exam Checklist

Domain 2 · Key Points to Lock In

Exam ChecklistCan You Answer These?

Task 2.1 — Must Know
  • Token = unit of text LLMs process; pricing is per input + output token
  • Embedding = semantic vector; vector store enables similarity search
  • Chunking splits docs to fit context window (RAG pipelines)
  • FM lifecycle: data → model select → pre-train → fine-tune → eval → deploy → feedback
  • Diffusion models → images; transformers → text; multi-modal → both
  • MCP connects agents to external tools & data sources
Task 2.2 — Must Know
  • Hallucinations = factually wrong but confident output — biggest GenAI risk
  • Nondeterminism = same prompt, different outputs
  • Model selection: balance accuracy, latency, cost, compliance
  • Business metrics: ROI, ARPU, CLV, conversion rate, efficiency
  • Bedrock Model Evaluation = compare FMs on your data
Task 2.3 — Must Know
  • Bedrock = managed FM API + knowledge bases + agents + guardrails
  • SageMaker JumpStart = 1-click open-source model deployment
  • On-demand vs. provisioned throughput vs. batch tradeoffs
  • Bedrock data stays in your account — security & compliance boundary
  • Guardrails = content filtering, PII, grounding checks
Service → Use Case Quick Map
  • Bedrock → FM access, RAG, agents, guardrails
  • Bedrock AgentCore → production agentic apps
  • SageMaker AI → custom training & hosting
  • JumpStart → open-source model hub
  • Amazon Q → enterprise AI assistant
  • Strands Agents / Kiro → dev-focused agentic tools
Domain 2 Complete

You're ready for
Domain 2

24% of AIF-C01 · Fundamentals of Generative AI
The exam's heaviest domain — now covered.

Task 2.1 — GenAI Concepts
Task 2.2 — Capabilities & Limits
Task 2.3 — AWS Infrastructure