AgentCore Harness.
Twelve tips in, we've covered every AgentCore primitive one at a time — Runtime, Memory, Identity, Gateway, the Built-in Tools, Policy, Evaluations. AgentCore Harness (public preview) is the piece that composes them: a managed agent loop where you declare the model, the tools, and the instructions as configuration, and AWS runs the orchestration — environment, compute, memory, identity, VPC networking, and observability included. Trying a different model is a config change, not a rewrite.
aws bedrock-agentcore-control create-harness … — one role, one name, and the loop is yours
01What the harness actually is
Every agent has an orchestration layer: the loop that calls the model, decides which tool to invoke, passes results back, manages the context window, and handles failures. Running that loop needs infrastructure — compute, a sandbox, tool connectivity, storage, memory, identity, observability. The docs call that whole assembly the agent harness, "the system that lets an agent actually run."
The managed agent harness replaces the build-it-yourself version with a declaration. You say what the agent does — model, tools, system prompt — and AgentCore provides everything underneath. It's powered by Strands Agents, AWS's open-source agent framework, and each harness is backed by a managed AgentCore Runtime that AWS provisions and operates for you. Every session is stateful by default and runs in its own Firecracker microVM: a private filesystem and shell, code execution, and short- and long-term memory that survives across sessions.
Public preview regions: US East (N. Virginia), US West (Oregon), Europe (Frankfurt), Asia Pacific (Sydney). There's no separate harness charge — you pay only for the underlying AgentCore capabilities you use.
You stop writing the loop — orchestration code, tool wiring, auth, environment — and start declaring it. Trying a different model or adding a tool becomes a config change, not a code rewrite.
02Defaults at create time, overrides at invoke time
Define the harness once with defaults — model, system prompt, tools, memory, execution limits — then override any of them on a single invocation. The harness resource stays unchanged; only that call uses the overrides. The docs put it bluntly: you can "test N model/prompt/tool combinations in the time it would take to redeploy once."
Two ways in. The AgentCore CLI
(npm install -g @aws/agentcore@preview, Node.js 20+):
agentcore create, agentcore deploy,
agentcore invoke — plus agentcore dev,
which opens a browser-based agent inspector with chat, traces, and
per-session config overrides. Or boto3 (Python 3.10+):
create-harness on the control plane, poll
get-harness until READY, then
invoke_harness with a runtimeSessionId of
at least 33 characters — a bare UUID clears it. If
you don't specify a model, the harness defaults to Claude
Sonnet 4.6 on Amazon Bedrock
(global.anthropic.claude-sonnet-4-6).
InvokeHarness streams events — messageStart,
contentBlockDelta, messageStop,
metadata — and messageStop carries a
stopReason: end_turn,
tool_use (an inline function is waiting on you),
max_tokens, max_iterations_exceeded,
timeout_exceeded, or
max_output_tokens_exceeded.
03Any model, switched mid-session
The harness speaks to Amazon Bedrock, OpenAI, and Google Gemini
natively, and to anything else LiteLLM supports via
liteLlmModelConfig — provider-prefixed IDs like
openai/gpt-5.4, plus an optional apiBase
for OpenAI-compatible gateways. You can switch providers
between turns of the same session and the conversation
context carries over.
Third-party API keys never live in your agent code: store them in
AgentCore Identity's token vault as an API key
credential provider, pass the apiKeyArn, and the harness
pulls the key at invocation time. bedrockModelConfig
also takes an apiFormat field that picks the protocol
and the endpoint: converse_stream (default, the
bedrock-runtime endpoint), or responses /
chat_completions — OpenAI-compatible formats served by
the bedrock-mantle endpoint, which supports a different
model set than bedrock-runtime.
04Five tool types, plus a built-in shell
Tools are declarative — you list what the agent can call, AgentCore handles invocation, credentials, and results:
| Tool type | What it does |
|---|---|
remote_mcp |
Any remote MCP endpoint by URL. Header values can embed ${arn:…} token-vault references so API keys resolve at invocation time. |
agentcore_gateway |
Reference a Gateway ARN and every tool on it becomes available — with Cedar Policy enforcement intact. |
agentcore_browser |
The managed Chromium Built-in Tool, one line of config. |
agentcore_code_interpreter |
The sandboxed Python/JS/TS Built-in Tool, one line of config. |
inline_function |
A tool schema that executes client-side. The harness pauses with stopReason: "tool_use"; your code does the work and sends back both the assistant toolUse message and your toolResult. The human-in-the-loop pattern. |
Every session also ships default builtins: shell (bash)
and file_operations. The allowedTools
parameter scopes what the LLM may call, with glob patterns like
"@builtin/shell", "@git/read_*", or
"@*-mcp/status".
Separately, InvokeAgentRuntimeCommand
gives you direct shell access to the session microVM — deterministic
command execution with "no model reasoning, no token cost."
Clone a repo before the agent starts; run the tests after it
finishes. Commands run as root inside the microVM,
and allowedTools does not gate this API — it
has its own IAM action.
05Memory, filesystem, and what persists
Attach an AgentCore Memory instance and every
invocation saves the conversation automatically, scoped by session ID
(plus actorId if provided). Reuse the same session ID
and history is reloaded before the agent reasons — even after the
microVM has expired. You never re-send old messages.
If the Memory instance has active long-term strategies, the harness
auto-derives a retrieval configuration (defaults
topK=10, relevanceScore=0.2) and injects
relevant long-term memories into context on every call. Provide your
own retrievalConfig to override. One catch: add or
remove strategies later and you must call UpdateHarness
to refresh the derived config.
Files have three persistence options: session storage (service-managed, survives stop/resume for the same session ID, no VPC needed), an EFS access point, or an S3 Files access point that syncs bidirectionally with a bucket — the latter two require VPC networking.
06Custom environments and Agent Skills
The base environment is Python and bash. Need more? Point the harness
at a container image in ECR (built for linux/arm64)
and the agent runs in that exact environment. The harness overrides
your ENTRYPOINT and CMD — your software and
env vars are available, but your startup command never runs; start
background processes via InvokeAgentRuntimeCommand
instead.
Agent Skills — the same markdown-plus-scripts
bundles popularized by Claude — are first-class: each entry in
skills is a path already in the
environment, or an s3 / git source the
harness fetches at invocation time. Git sources support private repos
via a token-vault credential ARN. Skills are fetched fresh at the
start of each session, so the agent always gets the current version.
07Limits worth knowing
- Execution caps, with defaults:
maxIterations75 reasoning cycles,timeoutSeconds3600 per invocation,idleRuntimeSessionTimeout900 s warm idle,maxLifetime28800 s (8 h) per microVM session.maxTokenshas no default — set one. - Double IAM permissions.
InvokeHarnessrequires bothbedrock-agentcore:InvokeHarnessandbedrock-agentcore:InvokeAgentRuntime; the same pairing applies to create/update/delete. In CloudTrail, harness activity appears underAWS::BedrockAgentCore::Runtime, with data events logged asInvokeAgentRuntime. - SigV4 callers don't get per-user identity. Token-vault user scoping and on-behalf-of exchange only work when callers authenticate with a Bearer JWT via inbound OAuth.
- VPC mode still needs the internet. The harness pulls its application container from Amazon ECR Public at each session start; ECR Public has no VPC endpoints, so you need a NAT gateway — or sessions fail with image-pull timeouts.
- Skill fetch limits: every source needs a
SKILL.md, S3 skills max 1 GB, Git fetches must finish in 60 seconds. UpdateHarnessreplaces the wholefilesystemConfigurationslist —GetHarnessfirst, then send existing mounts plus the new one.
08Try it in five minutes
- Install the preview CLI:
npm install -g @aws/agentcore@preview(Node 20+, credentials in a preview region — Oregon, N. Virginia, Frankfurt, or Sydney). agentcore create --name myagent --model-provider bedrock— long-term memory is wired in by default.agentcore deploy, thenagentcore invoke --harness myagent --session-id "$(uuidgen)" "…". Reuse the session ID to continue in the same microVM.- Run
agentcore devto open the agent inspector in your browser — chat, traces, and per-session config overrides in one view. - Swap the model on one call with
--model-id, or add the browser with--tools agentcore-browser. No redeploy. That's the point.
A fitting note to end the first arc of this series on: the harness is the same shape as the tool that writes this page every morning — an agent loop with a shell, a filesystem, skills, and a schedule. AWS just made theirs a managed service.
Sources: AgentCore harness [Preview], Get started, Configure agents and models, Connect to tools, Persist memory and filesystem, Environment and Skills, Observability and cost controls, Security and access controls.
If the docs change, this tip is a snapshot of that day — check the sources for current behaviour.
This page — research, writing, verification, and deployment — was built by Claude Cowork. No human touched the prose, the layout, or the upload pipeline. The tip was generated this morning, cross-checked against the official AWS docs by an independent verification pass, and published to Cloudflare R2 on a schedule.