Claude Code Adds /recap and 1-Hour Prompt Caching: Two Updates That Pay Off in Long Sessions
In brief: Claude Code v2.1.108 adds /recap for instant session re-entry and ENABLE_PROMPT_CACHING_1H for sustained-context workflows. Both features quietly fix the two friction points operators hit most: losing the thread on a paused task, and burning tokens on context that hasn't changed.
What Changed
Anthropic shipped a maintenance-grade Claude Code release that’s more interesting than the version number suggests. Two features stand out for anyone who runs Claude Code for more than a quick one-shot edit:
1. The /recap command and session recap feature. Returning to a session (after lunch, after a meeting, after a context overflow) used to mean re-reading your own scrollback to remember where Claude was in a multi-step task. The new recap behavior gives you a structured summary of what was happening when you stepped away, configurable via /config and invocable on demand with /recap. Telemetry-disabled environments can force the away-summary behavior with the CLAUDE_CODE_ENABLE_AWAY_SUMMARY env var.
2. Granular prompt caching controls via env vars. ENABLE_PROMPT_CACHING_1H opts your session into a 1-hour cache TTL on Anthropic's API, Bedrock, Vertex, and Foundry. FORCE_PROMPT_CACHING_5M does the inverse, pinning the session to the 5-minute TTL when that's what you actually want. The defaults haven't changed, but you now get to pick which TTL applies to your workflow.
These join the other items shipped in the same window. The Skill tool can now invoke built-in slash commands like /init, /review, and /security-review; PowerShell support replaces the Git Bash hard requirement on Windows; and MCP tool output limits jumped to 500K characters in a separate upgrade.
Why It Matters
If you treat Claude Code as a fancy autocomplete, none of this changes much. If you run it as a junior teammate on a multi-hour task, which is increasingly the right way to use it, both features earn their keep on the first day.
/recap solves a real coordination problem. Long Claude Code sessions accumulate state: files touched, decisions made, tests run, follow-ups deferred. When you pause and come back, the agent’s “memory” is only as good as your willingness to scroll. A clean recap collapses the re-onboarding tax to a few seconds. We’ve seen this matter most for ops engineers and consultants who context-switch between client codebases. /recap lets you stand up an old session without rereading.
The 1-hour cache is more strategic. Anthropic's default 5-minute TTL is tuned for tight, hot loops. But when you're in a deep refactor, a long debugging session, or a migration that has you repeatedly reloading the same set of files over an hour or more, the 5-minute window expires faster than your work cadence. ENABLE_PROMPT_CACHING_1H keeps the cached prefix warm across those gaps. AWS has documented up to 90% token savings on repeated file reads with prompt caching enabled. You don't get that math without a TTL that survives your actual session length.
How to Use It
/recap
Inside any Claude Code session, type /recap to get a summary of the conversation so far. To configure when recap fires automatically, run /config and look for the recap-related settings under session behavior.
For environments where you’ve disabled telemetry but still want the away-summary feature, set:
export CLAUDE_CODE_ENABLE_AWAY_SUMMARY=1
Then resume a session with claude in the same directory. The recap fires before you type your first prompt.
1-hour prompt caching
Opt in with an environment variable. The simplest setup, before launching claude:
export ENABLE_PROMPT_CACHING_1H=1
claude
This applies to all four backends Claude Code supports: Anthropic API key, Amazon Bedrock, Google Vertex, and Microsoft Foundry. If you’re routing through a proxy like LiteLLM, check your proxy’s caching configuration too; the env var only affects the Claude Code client side.
If you want the opposite, a session that’s locked to the 5-minute TTL regardless of model defaults:
export FORCE_PROMPT_CACHING_5M=1
claude
A practical rule of thumb: turn on the 1-hour cache when your session involves the same large files (lockfiles, schemas, long config files) being read repeatedly over an extended window. Keep the default 5-minute cache for short sessions where you’re moving fast and rotating contexts.
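One low-friction way to apply that rule of thumb is a pair of shell wrappers, one per TTL. The claude binary and the two env var names come from the release notes; the wrapper function names themselves are made up for this sketch:

```shell
# Hypothetical wrappers: pick a cache TTL per session instead of
# exporting the variable shell-wide. The function names are invented;
# the env vars are the ones Claude Code reads.
claude-long() {   # deep refactors, migrations, long debugging sessions
  ENABLE_PROMPT_CACHING_1H=1 claude "$@"
}
claude-short() {  # quick edits, fast context rotation
  FORCE_PROMPT_CACHING_5M=1 claude "$@"
}
```

Because the assignment prefixes the command, each wrapper sets the variable only for that claude invocation; other sessions in the same shell keep the defaults.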
Pricing note
Prompt caching pricing follows Anthropic’s published cache-write and cache-read rates. The 1-hour TTL doesn’t change the per-token math; it changes how often you avoid the read penalty. Cost only goes up if you’re caching prefixes you never read again, which is rare in real engineering work.
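To make the trade-off concrete, here is a back-of-the-envelope comparison for a 200K-token prefix re-sent ten times across a 90-minute session, with more than five minutes between turns. The multipliers are Anthropic's published ones at the time of writing (cache writes at 1.25x base input price for the 5-minute TTL, 2x for the 1-hour TTL, cache reads at 0.1x); the $3-per-million-token base rate is an illustrative assumption, not a quote:

```shell
# Back-of-the-envelope cache cost math (awk, since shell lacks floats).
# Scenario: 0.2 MTok prefix, 10 accesses, gaps longer than 5 minutes.
no_cache=$(awk 'BEGIN { printf "%.2f", 10 * 0.2 * 3.00 }')          # full price every time
ttl_5m=$(awk 'BEGIN { printf "%.2f", 10 * 0.2 * 3.00 * 1.25 }')     # cache expired: 10 writes
ttl_1h=$(awk 'BEGIN { printf "%.2f", 0.2*3.00*2 + 9*0.2*3.00*0.1 }') # 1 write + 9 cheap reads
echo "no cache: \$$no_cache   5m TTL: \$$ttl_5m   1h TTL: \$$ttl_1h"
```

Under these assumptions the 5-minute cache actually costs more than no caching at all (every access pays the write premium and never reads), while the 1-hour TTL pays the premium once and reads cheaply for the rest of the session.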
Related Tools on Pondero
- Claude Code's ultrareview command is the other April 2026 release worth knowing about.
- Claude Code vs Cursor explains how the two stack up for daily coding.
- Cursor 3.2 multitask and canvases covers the agent-runtime pivot from the other side of the market.
- GitHub Copilot review is for teams who want their AI assistant tied to the PR workflow.
This post is part of Pondero’s daily coverage of AI tool updates. See all guides →