Day 11
Claude Certified Architect Study Guide
Claude Certified Architect (Foundations)
The complete study guide. Every domain. Every concept. Every trap.
60 questions -- 120 minutes -- 720/1000 pass score -- ~85 study hours
“You don't need the certificate to build production-grade applications. You just need the knowledge.”
-- The only thing that matters
5 domains -- weighted by exam percentage
The exam tests 5 domains. Domain 1 (Agentic Architecture) is the heaviest at 27%. Together, these cover everything you need to architect production Claude systems.
1. Agentic Architecture & Orchestration -- 27% of exam, ~23 hours study
2. Tool Design & MCP Integration -- 18% of exam, ~15 hours study
3. Claude Code Configuration & Workflows -- 20% of exam, ~17 hours study
4. Prompt Engineering & Structured Output -- 20% of exam, ~17 hours study
5. Context Management & Reliability -- 15% of exam, ~13 hours study
Exam logistics
Proctored exam. No Claude allowed during the test. You cannot use the tool you are being tested on.
Exclusive to Claude Partner Network members. $99 fee, but free for the first 5,000 partners.
But the knowledge is what matters. Whether you take the exam or not, the concepts in this guide are what separate someone who uses Claude from someone who architects production systems with it.
What you actually need to know
The exam tests 4 core technologies. If you understand these deeply, you pass.
Claude Code
CLAUDE.md hierarchy, slash commands, skills, plan mode, CI/CD with -p flag, built-in tools (Grep, Glob, Edit, Read)
Claude Agent SDK
Agentic loops, stop_reason handling, multi-agent orchestration, hooks (PostToolUse, tool call interception), session management
Claude API
Messages API, tool_use, JSON schemas, tool_choice, Batch API, structured output, validation-retry loops
Model Context Protocol (MCP)
Project-level vs user-level config, .mcp.json, environment variable expansion, community servers, custom server design
6 scenario types tested
The exam presents real-world scenarios. These are the 6 archetypes you will encounter:
1. Customer Support Resolution Agent
Agent SDK + MCP tools + escalation triggers + error propagation
2. Code Generation with Claude Code
CLAUDE.md config + plan mode + slash commands + built-in tools
3. Multi-Agent Research System
Coordinator-subagent orchestration + isolated context + explicit passing
4. Developer Productivity Tools
Built-in tools + MCP servers + tool descriptions + distribution
5. Claude Code for CI/CD
Non-interactive pipelines (-p flag) + structured output + independent review
6. Structured Data Extraction
JSON schemas + tool_use + validation-retry loops + batch processing
Domain 1 -- 27% of Exam
Agentic Architecture & Orchestration
The heaviest domain. Agentic loops, multi-agent coordination, workflow enforcement, task decomposition, and session management.
Core Concept: Agentic Loops
The agentic loop -- how agents actually work
stop_reason is the ONLY reliable termination signal
Every agent follows the same core loop. Understand this and you understand agents.
Step 1: Send a request to Claude via the Messages API.
Step 2: Inspect the stop_reason field in the response.
Step 3: If stop_reason === "tool_use": execute the tool, append the results to the conversation, and send it back.
Step 4: If stop_reason === "end_turn": the agent is finished. Exit the loop.
# The canonical agentic loop
while True:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        messages=conversation,
        tools=tools,
    )
    if response.stop_reason == "end_turn":
        break  # Agent finished
    if response.stop_reason == "tool_use":
        tool_results = execute_tools(response)  # returns tool_result content blocks
        conversation.append({"role": "assistant", "content": response.content})
        conversation.append({"role": "user", "content": tool_results})
3 anti-patterns the exam will test you on
The exam will present these as plausible options. Reject all three.
1. Parsing natural language to determine loop termination. Checking whether Claude says “I'm done” or “task complete.” This is unreliable.
2. Arbitrary iteration caps as the primary stopping mechanism. Setting max_iterations=10 and hoping the agent finishes. Caps are safety nets, not control flow.
3. Checking for assistant text as a completion indicator. Looking for the presence of text content as a signal. The only reliable signal is stop_reason.
Multi-Agent Orchestration
Hub-and-spoke architecture
ALL communication flows through the coordinator
Multi-agent systems use hub-and-spoke architecture. There is one coordinator and multiple subagents.
Rule 1: All communication flows through the coordinator. Subagents NEVER communicate directly with each other.
Rule 2: Subagents do NOT share memory with the coordinator. They operate with isolated context.
Rule 3: Every piece of information a subagent needs must be passed explicitly in its prompt.
Coordinator Agent -- owns the conversation, delegates tasks, aggregates results
Subagent A -- isolated context
Subagent B -- isolated context
Subagent C -- isolated context
“People assume subagents share memory with the coordinator. They do not. Subagents operate with isolated context. Every piece of information must be passed explicitly in the prompt.”
-- The biggest mistake on the exam
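The explicit passing the quote describes can be sketched in plain Python. Everything here (function names, fact keys) is illustrative rather than an SDK API; the point is that the subagent's prompt is its entire world.

```python
# Sketch: a coordinator serializes everything a subagent needs into its prompt.
# The subagent has no memory of the coordinator's conversation.

def build_subagent_prompt(subtopic: str, shared_facts: dict) -> str:
    facts = "\n".join(f"- {k}: {v}" for k, v in shared_facts.items())
    return (
        f"Research the subtopic: {subtopic}\n"
        f"Known facts from the coordinator (your only shared state):\n{facts}\n"
        "Return a structured summary. You cannot query other subagents."
    )

prompt = build_subagent_prompt(
    "geothermal energy",
    {"report_scope": "renewable energy", "deadline": "2025-06-01"},
)
```

If a fact is not in `shared_facts`, the subagent simply does not know it -- which is exactly the failure mode the exam probes.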
Workflow Enforcement
Prompt-based guidance
Probabilistic -- works most of the time
Tell the agent what to do in the system prompt. It will follow instructions ~95% of the time.
Good for: formatting, style, tone, general workflow suggestions, low-stakes decisions.
Bad for: anything where failure has real consequences. Financial transactions, security checks, compliance rules.
Programmatic enforcement (hooks)
Deterministic -- works every time
Use hooks to intercept and enforce. The agent cannot bypass programmatic checks.
Good for: financial thresholds, security policies, compliance rules, data validation, access control.
Rule: If the consequence of failure is financial, security, or compliance related, ALWAYS use programmatic enforcement.
Agent SDK Hooks
PostToolUse and tool call interception
PostToolUse hooks: Run after a tool executes. Use for normalizing data formats, validating outputs, enriching results.
Tool call interception: Run before a tool executes. Use for blocking actions that violate policy. Example: block any refund > $500 without manager approval.
Decision framework: Hooks = deterministic guarantees. Prompts = probabilistic guidance.
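A sketch of the refund interception described above. The (input_data, tool_use_id, context) signature mirrors the PostToolUse example later in this guide, but the "tool_name" field and the shape of the block-decision return value are assumptions -- check the Agent SDK hook documentation for the real contract.

```python
# Tool call interception sketch: deterministically block refunds over $500.
REFUND_LIMIT = 500.00

async def block_large_refunds(input_data, tool_use_id, context):
    if input_data.get("tool_name") != "process_refund":
        return {}  # not our tool; allow it through
    amount = float(input_data.get("tool_input", {}).get("amount", 0))
    if amount > REFUND_LIMIT:
        # Programmatic enforcement: the agent cannot talk its way past this.
        return {
            "decision": "block",
            "reason": f"Refund ${amount:.2f} exceeds ${REFUND_LIMIT:.2f}; manager approval required.",
        }
    return {}
```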
Task Decomposition
Fixed sequential pipelines
Best for: predictable, structured tasks where steps are known upfront.
Example: Extract data from PDF, validate against schema, write to database, generate report.
Dynamic adaptive decomposition
Best for: open-ended investigation where next steps depend on what was found.
Example: Research a topic, follow interesting threads, synthesize findings.
Attention dilution problem
When you pass too many files in a single analysis pass, Claude's attention is diluted. The result: inconsistent depth.
The fix: a per-file local analysis pass plus a separate cross-file integration pass. Two passes beat one.
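The two-pass pattern reduces to a simple shape. In this sketch, analyze_file and integrate_findings are placeholders standing in for real Claude calls; only the structure (one focused pass per file, then one pass over the summaries) is the point.

```python
# Two-pass analysis sketch: per-file summaries first, cross-file integration second.

def analyze_file(path: str, text: str) -> dict:
    # Placeholder for a focused, single-file Claude call.
    return {"file": path, "lines": len(text.splitlines())}

def integrate_findings(summaries: list[dict]) -> dict:
    # Placeholder for a cross-file Claude call that only sees the summaries.
    return {"files_reviewed": len(summaries),
            "total_lines": sum(s["lines"] for s in summaries)}

files = {"a.py": "x = 1\ny = 2", "b.py": "print('hi')"}
summaries = [analyze_file(p, t) for p, t in files.items()]  # pass 1: one file at a time
report = integrate_findings(summaries)                      # pass 2: integration only
```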
Session State Management
Session management options
--resume: Continue a named session. All previous context is preserved.
fork_session: Create an independent branch from a shared baseline.
Fresh start with summary injection: Start a new session but inject a summary of previous work.
Stale context problem: When resuming, files may have changed. You must inform the agent of SPECIFIC file changes.
What to build for this domain
A multi-tool agent with 3-4 MCP tools, proper stop_reason handling, a PostToolUse hook normalizing data formats, and a tool call interception hook blocking policy violations. If you can build this, you understand Domain 1.
Real Code: Agent SDK
# The canonical agentic loop (Agent SDK)
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

async def main():
    async for message in query(
        prompt="Find and fix the bug in auth.py",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Edit", "Bash"]
        ),
    ):
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())
# PostToolUse hook: log every file change
from datetime import datetime

async def log_file_change(input_data, tool_use_id, context):
    file_path = input_data.get("tool_input", {}).get("file_path", "unknown")
    with open("./audit.log", "a") as f:
        f.write(f"{datetime.now()}: modified {file_path}\n")
    return {}
Practice Scenarios -- Domain 1
PRACTICE SCENARIO
A developer's agent sometimes terminates prematurely because they check if response.content[0].type === 'text' to determine completion. What is the bug?
ANSWER
The model can return text alongside tool_use blocks. Checking for text content is not a reliable termination signal. Use stop_reason === 'end_turn' as the ONLY reliable termination signal.
PRACTICE SCENARIO
A multi-agent research system produces a report on 'renewable energy' that only covers solar and wind, missing geothermal, tidal, and biomass. Where is the root cause?
ANSWER
The coordinator's task decomposition is too narrow. The failure is in the coordinator, not the subagents. The coordinator must decompose research scope to cover all major subtopics before delegating.
PRACTICE SCENARIO
Production data shows that in 8% of cases, a customer support agent processes refunds without verifying account ownership. What is the correct fix?
ANSWER
Programmatic prerequisite gate (hook) that blocks the refund tool until account verification is complete. NOT enhanced prompts, NOT few-shot examples, NOT routing classifiers. Financial operations require deterministic enforcement.
Study Prompt -- Domain 1
Paste this into Claude to study Domain 1
I am studying for the Claude Certified Architect exam. Quiz me on Domain 1: Agentic Architecture and Orchestration (27% of exam).
Test me on these specific concepts:
1. The agentic loop: stop_reason as the ONLY reliable termination signal. Why checking for text content or parsing "I'm done" are anti-patterns.
2. Multi-agent orchestration: hub-and-spoke architecture, coordinator-subagent communication, isolated context (subagents do NOT share memory).
3. Workflow enforcement: prompt-based guidance (probabilistic) vs programmatic hooks (deterministic). When to use each. Financial/security = always hooks.
4. Agent SDK hooks: PostToolUse (data normalization, validation) vs tool call interception (policy enforcement, blocking).
5. Task decomposition: fixed sequential pipelines vs dynamic adaptive decomposition. Attention dilution and per-file analysis passes.
6. Session management: --resume, fork_session, fresh start with summary injection, stale context.
Present scenarios and ask me to identify the correct architectural decision. After I answer, tell me if I am right or wrong and explain why. Focus on the traps: things that sound plausible but are wrong.
Domain 2 -- 18% of Exam
Tool Design & MCP Integration
Tool descriptions, error responses, tool distribution, MCP server configuration, and built-in tools.
Tool descriptions -- the #1 selection mechanism
Vague descriptions = misrouting. Fix descriptions first.
Tool descriptions are THE mechanism Claude uses for tool selection. When Claude picks the wrong tool, the fix is almost always a better description.
Structured Error Responses
4 error categories -- know which is retryable
Transient -- timeouts, service unavailability. Strategy: retry with backoff.
Validation -- invalid input, wrong format, missing fields. Strategy: fix the input, retry.
Business -- policy violations, insufficient funds. Strategy: NOT retryable. Needs an alternative workflow.
Permission -- access denied, unauthorized. Strategy: escalation needed.
Critical distinction: access failure vs valid empty result
When a tool returns nothing, the agent must distinguish between “I couldn't access the data” (error) and “the data genuinely doesn't exist” (valid result). The exam tests this distinction explicitly.
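A minimal sketch of result envelopes that keep the two cases distinct. The field names (status, category) are illustrative, not a Claude API contract -- the point is that "empty" carries a success status while "unreachable" carries an error category that tells the agent its retry strategy.

```python
# Sketch: tool results that never confuse "nothing found" with "couldn't look".

def lookup_customer(customer_id: str, db: dict) -> dict:
    try:
        rows = db[customer_id]
    except KeyError:
        # Valid empty result: the source answered, and the answer is "nothing".
        return {"status": "ok", "data": [], "note": "no matching customer"}
    return {"status": "ok", "data": rows}

def lookup_with_outage() -> dict:
    # Access failure: we never reached the source. The category drives
    # the agent's strategy (transient -> retry with backoff).
    return {"status": "error", "category": "transient", "message": "CRM timeout after 5s"}
```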
Tool Distribution & MCP
Too many tools = degraded selection
18 tools on one agent is an anti-pattern
At ~18 tools, Claude's tool selection accuracy degrades. Optimal: 4-5 tools per agent.
Project-level: .mcp.json
Lives in the project root. Version controlled. Shared with the team. Use ${GITHUB_TOKEN} expansion syntax for secrets.
User-level: ~/.claude.json
Lives in your home directory. NOT version controlled. NOT shared. Exam trap: new team member missing tools? Check whether they're in user-level config.
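A minimal .mcp.json sketch. The server entry is illustrative (a hypothetical GitHub server launched via npx); only the ${GITHUB_TOKEN} expansion syntax comes from the text above, so verify key names against the MCP configuration docs.

```json
{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "${GITHUB_TOKEN}"
      }
    }
  }
}
```

Because this file is committed, the token itself never appears in the repo -- only the expansion placeholder does.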
Built-in Tools
Know the difference -- the exam tests this
Grep: Searches file CONTENTS for patterns.
Glob: Matches file PATHS by naming patterns.
Edit: Targeted modifications to specific parts of a file.
Incremental exploration strategy: Grep entry points first, then Read to trace flows. NOT read all files upfront.
# Wrong: read all files upfront
Read src/api/*.ts # wastes context on irrelevant files
# Right: incremental exploration
Grep "processPayment" # find entry points
Read src/payments/handler.ts # trace the flow
Practice Scenarios -- Domain 2
PRACTICE SCENARIO
An agent routes 'check the status of order #12345' to get_customer instead of lookup_order. Both tools have minimal descriptions. What is the first fix?
ANSWER
Better tool descriptions. Not few-shot examples, not routing classifiers. Tool descriptions are the PRIMARY selection mechanism.
PRACTICE SCENARIO
A tool returns an empty array after a customer lookup. The agent retries 3 times then escalates. But the customer account simply does not exist. What went wrong?
ANSWER
The system confuses a valid empty result with an access failure. Valid empty = the tool found nothing (this IS the answer). Access failure = the tool could not reach the source. Different handling required.
Study Prompt -- Domain 2
Paste this into Claude to study Domain 2
I am studying for the Claude Certified Architect exam. Quiz me on Domain 2: Tool Design and MCP Integration (18% of exam).
Test me on these specific concepts:
1. Tool descriptions as the PRIMARY selection mechanism. Why better descriptions fix misrouting before anything else.
2. Structured error responses: 4 error categories (transient, validation, business, permission) and which are retryable.
3. Access failure vs valid empty result -- the critical distinction the exam tests explicitly.
4. Tool distribution: why 18 tools on one agent degrades selection. Optimal is 4-5 per agent.
5. tool_choice parameter: "auto" vs "any" vs forced specific tool.
6. MCP server configuration: project-level (.mcp.json) vs user-level (~/.claude.json). Version control implications.
7. Community vs custom MCP servers: use community first, build custom only when needed.
8. Built-in tools: Grep (content search) vs Glob (path matching) vs Edit vs Read. Incremental exploration strategy.
Present scenarios and ask me to identify the correct tool design decision. After I answer, tell me if I am right or wrong and explain why.
Domain 3 -- 20% of Exam
Claude Code Configuration & Workflows
CLAUDE.md hierarchy, custom commands, skills, path-specific rules, plan mode, and CI/CD integration.
CLAUDE.md Hierarchy
3 levels -- know where each lives and who sees it
The hierarchy determines which instructions apply to whom.
1. User-level: ~/.claude/CLAUDE.md -- only you. NOT shared. NOT version controlled.
2. Project-level: .claude/CLAUDE.md -- everyone on the team. Version controlled. Shared via git.
3. Directory-level: subdirectory/CLAUDE.md -- only when working in that directory. Additive to project-level.
Exam trap: the missing instructions problem
Scenario: A new team member joins and Claude is not following conventions. If instructions are in user-level config, the new member doesn't have them. Fix: move to project-level (.claude/CLAUDE.md).
Path-Specific Rules & Plan Mode
.claude/rules/ with YAML frontmatter
Token-efficient, pattern-matched instructions
Instead of putting everything in CLAUDE.md, use path-specific rules that only load when editing matching files. A rule with paths: ["**/*.test.tsx"] only loads when editing test files.
Plan mode
Think before acting
Use when: monolith restructuring, multi-file migration, architectural decisions, ambiguous requirements.
Direct execution
Just do it
Use when: single-file bug fix, clear scope, repetitive task, small well-defined changes.
CI/CD Integration
Non-interactive mode for pipelines
The -p flag is essential -- without it, CI hangs
-p flag: Runs Claude Code in non-interactive mode.
--output-format json: Produces structured findings for downstream pipeline steps.
Independent review instance: Use a separate Claude instance for code review -- not the same session that wrote the code.
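The flags combine into a single non-interactive pipeline step. A sketch of what that line might look like (the prompt and output filename are illustrative; only the -p and --output-format json flags come from the text above):

```
# -p prints the result and exits instead of waiting for terminal input
claude -p "Review this PR for security issues" --output-format json > findings.json
```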
Practice Scenarios -- Domain 3
PRACTICE SCENARIO
Developer A's Claude Code follows API naming conventions perfectly. Developer B (joined last week) gets inconsistent naming. Both work on the same repo. What is the root cause?
ANSWER
The naming conventions are in user-level config (~/.claude/CLAUDE.md) instead of project-level (.claude/CLAUDE.md). User-level is not version controlled and not shared.
PRACTICE SCENARIO
A CI pipeline script 'claude Analyze this PR' hangs indefinitely. Logs show Claude waiting for input. What is the fix?
ANSWER
Add the -p flag for non-interactive (print) mode. Without -p, Claude Code expects interactive terminal input.
Study Prompt -- Domain 3
Paste this into Claude to study Domain 3
I am studying for the Claude Certified Architect exam. Quiz me on Domain 3: Claude Code Configuration and Workflows (20% of exam).
Test me on these specific concepts:
1. CLAUDE.md hierarchy: user-level (~/.claude/CLAUDE.md) vs project-level (.claude/CLAUDE.md) vs directory-level (subdirectory/CLAUDE.md). Who sees what. Version control implications.
2. Path-specific rules: .claude/rules/ with YAML frontmatter and glob patterns. Token efficiency vs directory-level CLAUDE.md.
3. Custom slash commands: project-scoped (.claude/commands/) vs personal (~/.claude/commands/).
4. Skills: SKILL.md frontmatter (context: fork, allowed-tools, argument-hint).
5. Plan mode vs direct execution: when to use each. Monolith restructuring vs single-file bug fix.
6. CI/CD integration: -p flag for non-interactive mode, --output-format json, --json-schema, independent review instances.
7. The "missing instructions" problem: new team member joins, Claude ignores conventions. Root cause and fix.
Present scenarios and ask me to identify the correct configuration decision. After I answer, tell me if I am right or wrong and explain why.
Domain 4 -- 20% of Exam
Prompt Engineering & Structured Output
Explicit criteria, few-shot prompting, tool_use for structured output, validation-retry loops, batch processing, and multi-instance review.
Explicit criteria -- vague instructions fail
“Be conservative” does NOT work
Vague instructions produce inconsistent results. The fix: Define exactly which issues to report vs skip. Include concrete code examples for each severity level.
Few-shot prompting -- the most effective technique for consistency
2-4 targeted examples for ambiguous scenarios. Each example shows REASONING for why one action was chosen over another. Include boundary cases.
Structured Output & Batch API
tool_use with JSON schemas
Eliminates syntax errors. Does NOT prevent semantic errors.
Schema compliance does not prevent fabrication. Claude can return perfectly structured JSON with made-up data. Design schemas with nullable fields and “unclear” enum values.
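One way to design such a schema, sketched as a tool definition dict. The tool and field names are illustrative; the design choice is that nullable fields and an "unclear" enum value give Claude an honest way out instead of forcing fabrication.

```python
# Sketch: an extraction tool schema with an explicit "I don't know" path.
extract_invoice = {
    "name": "record_invoice",
    "description": "Record fields extracted from an invoice document.",
    "input_schema": {
        "type": "object",
        "properties": {
            "total": {
                "type": ["number", "null"],  # null when the document doesn't state it
                "description": "Invoice total, or null if not stated.",
            },
            "currency": {
                "type": "string",
                "enum": ["USD", "EUR", "GBP", "unclear"],  # "unclear" beats a guess
            },
        },
        "required": ["total", "currency"],
    },
}
```

Requiring the fields while allowing null/"unclear" values means every response is complete AND honest.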
Batch API
50% cost savings vs synchronous. Up to 24 hours. No multi-turn tool calling. Good for bulk extraction where latency doesn't matter.
Synchronous API
Real-time results. Full multi-turn tool calling. Higher cost. Good for pre-merge checks and interactive agents.
“If developers wait for it, use synchronous. If it can run overnight, use batch.”
-- The decision rule
Practice Scenarios -- Domain 4
PRACTICE SCENARIO
A code review tool has a 97% overall accuracy rate but developers still don't trust it. Why?
ANSWER
97% overall can hide 40% error rates on specific document types. High false positive rates in one category destroy trust in ALL categories. Fix: validate accuracy by type and category.
PRACTICE SCENARIO
A manager proposes using the Batch API for all code review tasks to save costs. What is wrong with this?
ANSWER
Batch API has up to 24-hour processing and no latency SLA. Pre-merge code reviews are blocking workflows that developers wait for. Keep blocking workflows synchronous.
Study Prompt -- Domain 4
Paste this into Claude to study Domain 4
I am studying for the Claude Certified Architect exam. Quiz me on Domain 4: Prompt Engineering and Structured Output (20% of exam).
Test me on these specific concepts:
1. Explicit criteria: why vague instructions like "be conservative" fail. How to define concrete thresholds and examples.
2. Few-shot prompting: 2-4 targeted examples with reasoning. Boundary cases matter more than happy-path examples.
3. Structured output with tool_use: JSON schema guarantees structure but NOT semantic correctness. Schema design tips (nullable fields, "unclear" enums, "other" + detail_string).
4. tool_choice parameter: "auto" vs "any" vs forced specific tool. When to use each.
5. Validation-retry loops: send back original + error + failed attempt. Effective for format errors, ineffective for missing data.
6. Batch API vs synchronous: 50% savings but up to 24 hours, no multi-turn tool calling. Decision rule: blocking = sync, overnight = batch.
7. Multi-instance review: why self-review is less effective. Independent instance catches more. Per-file + cross-file passes.
8. False positive rates: 97% overall accuracy hiding 40% error in specific categories. Trust destruction.
Present scenarios and ask me to identify the correct prompt engineering decision. After I answer, tell me if I am right or wrong and explain why.
Domain 5 -- 15% of Exam
Context Management & Reliability
Context preservation, escalation triggers, error propagation, codebase exploration, and information provenance.
Context preservation -- what progressive summarization kills
Never summarize transactional data
Progressive summarization destroys critical data: dollar amounts, case IDs, timestamps, order numbers.
The fix: Maintain a persistent “case facts” block that is never summarized.
“Lost in the middle” effect: Place key summaries at the beginning, not buried in the middle.
Trim verbose tool results: Tool responses often include 90% irrelevant data.
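A sketch of the case-facts pattern: the block is rebuilt verbatim into every prompt, so summarization can never touch it, and it leads the prompt to dodge the "lost in the middle" effect. Field names and values are illustrative.

```python
# Sketch: transactional facts injected verbatim at the top of every prompt.
CASE_FACTS = {
    "order_id": "ORD-88412",
    "refund_amount": "$247.83",
    "opened_at": "2025-03-14T09:02:00Z",
}

def render_prompt(summary: str, facts: dict) -> str:
    facts_block = "\n".join(f"{k}: {v}" for k, v in facts.items())
    # Facts first (beginning of context), summarized history after.
    return (
        f"CASE FACTS (verbatim, never summarized):\n{facts_block}\n\n"
        f"Conversation so far (summarized):\n{summary}"
    )

prompt = render_prompt("Customer requested a refund on their last order.", CASE_FACTS)
```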
Escalation Triggers
Valid escalation triggers
Customer requests human: Honor immediately. No negotiation.
Policy gaps: Agent encounters a situation not covered by instructions.
Inability to progress: After reasonable attempts, cannot resolve the issue.
Unreliable escalation triggers
Sentiment analysis: Sarcasm, cultural differences, and tone misreads create false triggers.
Self-reported confidence scores: Claude's self-reported confidence does not correlate well with actual accuracy.
Error Propagation & Exploration
Structured error context
Propagate: what failed, what was attempted, and any partial results.
Anti-pattern 1: Silent suppression. Catching an error and continuing as if nothing happened.
Anti-pattern 2: Workflow termination on single failure. Partial results are often still valuable.
Information provenance
Every finding must include: claim, source URL, relevant excerpt, and date accessed. When sources conflict, annotate both.
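The four required fields map naturally onto a small record type. This dataclass shape (and the placeholder values) are illustrative, not a prescribed format:

```python
# Sketch: a provenance record carrying the four required fields.
from dataclasses import dataclass

@dataclass
class Finding:
    claim: str
    source_url: str
    excerpt: str     # the relevant quote backing the claim
    accessed: str    # ISO date the source was retrieved

finding = Finding(
    claim="Geothermal capacity grew in 2024.",          # placeholder claim
    source_url="https://example.com/report",            # placeholder source
    excerpt="Installed geothermal capacity rose year over year.",
    accessed="2025-03-01",
)
```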
Practice Scenarios -- Domain 5
PRACTICE SCENARIO
After several conversation turns, a support agent refunds $247.83 to the wrong order because it 'remembers' $247.83 but lost the order number. What architectural fix prevents this?
ANSWER
Persistent 'case facts' block with extracted transactional data (amounts, dates, order numbers) included in every prompt. Never rely on progressive summarization for transactional data.
PRACTICE SCENARIO
A customer says 'I want to speak to a human.' The agent offers to help first and tries to resolve the issue. Is this correct?
ANSWER
No. When a customer explicitly requests a human, honor it immediately. Do NOT attempt to resolve first. This is a hard rule.
Study Prompt -- Domain 5
Paste this into Claude to study Domain 5
I am studying for the Claude Certified Architect exam. Quiz me on Domain 5: Context Management and Reliability (15% of exam).
Test me on these specific concepts:
1. Context preservation: progressive summarization kills transactional data. Persistent "case facts" block that is never summarized.
2. "Lost in the middle" effect: Claude pays more attention to beginning and end of context. Place critical data at the beginning.
3. Escalation triggers: valid (customer request, policy gap, inability to progress) vs unreliable (sentiment analysis, self-reported confidence).
4. When customer requests human: honor immediately. No negotiation.
5. Error propagation: structured context (what failed, what was attempted, partial results). Anti-patterns: silent suppression and full workflow termination.
6. Access failure vs valid empty result: timeout = error, zero rows = valid result. Different handling required.
7. Codebase exploration: scratchpad files, subagent delegation, /compact for context management.
8. Information provenance: every finding needs source URL, relevant excerpt, date accessed. Conflicting sources: annotate both.
Present scenarios and ask me to identify the correct reliability decision. After I answer, tell me if I am right or wrong and explain why.
Preparation
Your Study Plan
Three phases: Foundation, Build, Practice. Each phase is one week.
Phase 1: Foundation (Week 1)
Read and understand
- Read Anthropic's official docs on Agent SDK, MCP, Claude Code
- Understand the agentic loop (stop_reason, tool_use cycle) completely
- Set up a CLAUDE.md hierarchy on a real project
- Build your first MCP server
- Understand Grep, Glob, Edit, Read differences
- Read tool_use documentation and JSON schema validation
Phase 2: Build (Week 2)
Hands-on construction
- Build a multi-agent system with coordinator + 2 subagents
- Implement PostToolUse and tool call interception hooks
- Create a tool_use extraction pipeline with validation-retry
- Set up CI/CD with -p flag and --output-format json
- Create .claude/rules/ with path-specific patterns
- Build a skill with SKILL.md frontmatter
Phase 3: Practice (Week 3)
Test yourself
- Run through all 5 domain concepts
- For each domain: “Can I explain every concept without looking?”
- Build the capstone: a full agent system with all 5 domains
- Review anti-patterns and exam traps
- Practice classifying error types
Anthropic Academy Courses (Free)
Claude 101
anthropic.skilljar.com/claude-101
Building with the Claude API
anthropic.skilljar.com/claude-with-the-anthropic-api
Claude Code in Action
anthropic.skilljar.com/claude-code-in-action
Introduction to Agent Skills
anthropic.skilljar.com/introduction-to-agent-skills
Introduction to MCP
anthropic.skilljar.com/introduction-to-model-context-protocol
MCP Advanced Topics
anthropic.skilljar.com/model-context-protocol-advanced-topics
Official Documentation
Agent SDK Overview
platform.claude.com/docs/en/agent-sdk/overview
Claude Code Skills
code.claude.com/docs/en/skills
MCP Specification
modelcontextprotocol.io
Anthropic Prompt Engineering Guide
docs.anthropic.com/en/docs/build-with-claude/prompt-engineering
Community Study Materials
110-Question Practice Set (SGridworks)
github.com/SGridworks/claude-certified-architect-training
5 things to build before the exam
If you can build all 5, you know enough to pass.
1
Multi-tool agent with stop_reason handling + hooks
3-4 MCP tools, proper agentic loop, PostToolUse hook for data normalization, tool call interception for policy enforcement. Covers Domain 1.
2
MCP tools with intentionally similar descriptions
Create 2-3 tools with overlapping functionality. Practice writing descriptions that eliminate misrouting. Covers Domain 2.
3
Project with full CLAUDE.md hierarchy + rules + skills
User-level, project-level, and directory-level CLAUDE.md files. Path-specific rules. At least one skill with SKILL.md frontmatter. Covers Domain 3.
4
Extraction pipeline with tool_use + validation-retry
JSON schema for structured output, forced tool_choice, validation step, retry loop. Covers Domain 4.
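The validation-retry loop in build #4 can be sketched with a stubbed extractor. The required fields, the stub responses, and the `extract_with_retry` name are invented for illustration; in a real pipeline `extract` would be a model call with forced tool_choice:

```python
import json

# Sketch of a validation-retry pipeline: parse the extraction, validate
# required fields, and retry with the validation error fed back to the model.
REQUIRED = {"name", "amount"}

def validate(payload):
    missing = REQUIRED - payload.keys()
    return f"missing fields: {sorted(missing)}" if missing else None

def extract_with_retry(extract, max_retries=3):
    feedback = None
    for _ in range(max_retries):
        payload = json.loads(extract(feedback))
        feedback = validate(payload)
        if feedback is None:
            return payload
    raise ValueError(f"extraction failed after retries: {feedback}")

# Stub model: the first attempt drops a field; the retry (with feedback) fixes it.
attempts = iter(['{"name": "ACME"}', '{"name": "ACME", "amount": 42}'])
result = extract_with_retry(lambda feedback: next(attempts))
assert result == {"name": "ACME", "amount": 42}
```

The key design point: the validation error is passed back into the next attempt rather than retrying blindly, so the model can correct the specific failure.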
5
Coordinator with subagents + error propagation + provenance
Coordinator delegates to 2+ subagents with isolated context. Error propagation with structured context. Research findings include source + excerpt + date. Covers Domain 5.
Practice Exam
60 multiple-choice questions. One correct answer. Three distractors. Weighted by domain. Just like the real thing.
60
Questions
~120
Minutes
720
/ 1000 to Pass
Questions are shuffled each attempt. You cannot change an answer after selecting it.
Exam Day
Exam Day Cheat Sheet
The 10 facts to remember and the 7 anti-patterns to reject. If you know these cold, you will not be tricked.
10 key facts -- always true
Subagents do NOT share memory. Every piece of information must be passed explicitly in the prompt.
stop_reason is the ONLY reliable termination signal. Not "I'm done." Not iteration caps. Not checking for text content. Only stop_reason === "end_turn".
Financial/security = programmatic enforcement. Never use prompt-based guidance for financial thresholds or security policies. Use hooks.
Tool descriptions are the PRIMARY selection mechanism. When Claude picks the wrong tool, fix the description first.
-p flag for CI/CD. Without the -p flag, Claude Code in CI will hang waiting for interactive input.
Independent instance for code review. A session reviewing its own output retains its reasoning context and pulls punches.
Batch API: no multi-turn tool calling. 50% cheaper, up to 24 hours, but single request/response only.
Progressive summarization kills transactional data. Maintain a persistent "case facts" block that is never summarized.
Access failure != valid empty result. A timeout is an error. Zero rows returned is a valid result.
Few-shot examples beat prose instructions. 2-4 targeted examples with reasoning produce more consistent output.
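The last fact looks like this in practice. The classification task and the three examples below are invented purely to illustrate the pattern of targeted examples with reasoning:

```python
# Invented illustration: a few targeted examples with reasoning, instead of a
# long prose instruction, tend to produce more consistent classifications.
FEW_SHOT = """Classify the ticket as BUG, FEATURE, or QUESTION.

Ticket: "App crashes when I tap Export."
Reasoning: describes broken existing behavior.
Label: BUG

Ticket: "Can you add dark mode?"
Reasoning: requests new behavior that does not exist yet.
Label: FEATURE

Ticket: "Where do I find my invoices?"
Reasoning: asks how to use existing functionality.
Label: QUESTION

Ticket: "{ticket}"
Reasoning:"""

prompt = FEW_SHOT.format(ticket="Search returns no results since the update.")
assert prompt.count("Label:") == 3  # three worked examples, one open slot
```

Each example pairs input, reasoning, and label, so the model sees not just the answer format but the decision procedure.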
7 anti-patterns -- always reject these
Parsing "I'm done" for loop termination. Natural language is unreliable for control flow. Use stop_reason.
Arbitrary iteration caps as primary stopping condition. Caps are safety nets, not control flow.
18 tools on one agent. Selection accuracy degrades. Split into 4-5 tools per agent.
Sentiment-based escalation. Sentiment analysis is unreliable. Use explicit triggers.
Self-reported confidence for escalation. Claude's self-reported confidence does not correlate with actual accuracy.
Silent error suppression. Empty catch blocks, swallowed errors. Always propagate with structured context.
Reading all files upfront. Wastes the context window. Use incremental exploration: Grep first, then Read.
Quick decision matrix
When the exam gives you a scenario, use this to pick the right answer:
Scenario → Correct approach
Financial/security enforcement → Programmatic hooks (NEVER prompts)
Tool misrouting → Better descriptions (FIRST step)
CI pipeline hangs → Add -p flag
Code self-review → Independent instance
Batch vs sync → Blocking = sync, overnight = batch
Subagent needs data → Pass explicitly in prompt
Customer wants human → Honor immediately
Loop termination → stop_reason only
18 tools on one agent → Scope to 4-5 per agent
Progressive summarization → Persistent case facts block
New team member missing config → Move to project-level CLAUDE.md
Ambiguous architecture decision → Plan mode, not direct execution