Making Claude Code More AI-First
An earlier post laid out three criteria to help determine how AI-first a product is.
We’re extraordinarily early in AI-first development. This post demonstrates how minor tweaks can make AI products like Claude Code (CC) substantially more AI-first. Specifically, it introduces a “reflective memory” that allows CC to reflect on past interactions with the user and update its memory accordingly.
Claude Code obviously satisfies the first criterion because it is driven entirely by the Claude LLM.
Starting in version 2.0.21, CC began to somewhat satisfy the second criterion by presenting multiple-choice questions for unclear requests. Crucially, this process of requesting clarifications falls short of our aspirations because CC doesn’t seem to learn from the user’s responses.
CC somewhat satisfies the third criterion because it affords a rudimentary notion of “memory” via CLAUDE.md files. It’s good that these are vanilla Markdown files, which allows CC to update them easily. It’s also good that the /init command attempts to build a representation in CLAUDE.md of the user’s overall social milieu (i.e. the codebase). However, /init isn’t very sophisticated and lacks many product integrations. CC also doesn’t proactively update CLAUDE.md by reflecting on its own mistakes and misunderstandings in relation to the user’s instrumental goals.
CC’s elegantly simple design makes it straightforward to extend in this direction. CC stores partial logs as JSONL files in ~/.claude/projects for each working directory. These logs can be mined via bulk inference to analyze misunderstandings between Claude and the user, which can subsequently be used to update CLAUDE.md.
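As a rough sketch of the mining step, something like the following could collect the per-session records. The directory layout and record fields here are assumptions for illustration, not a documented schema:

```python
import json
from pathlib import Path

# Assumed location of CC's session logs, one subdirectory per working directory.
LOG_ROOT = Path.home() / ".claude" / "projects"

def load_sessions(log_root: Path) -> dict[str, list[dict]]:
    """Read every JSONL session log under log_root, one JSON record per line."""
    sessions = {}
    for path in sorted(log_root.glob("*/*.jsonl")):
        records = []
        for line in path.read_text().splitlines():
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                continue  # logs are partial, so skip any half-written lines
        sessions[str(path)] = records
    return sessions
```

Each session then becomes one input to the analysis step below.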
The DoubleAscent/claude-logs-analysis repo provides a sample implementation: a collection of Python scripts that read these JSONL files and perform bulk inference via a MapReduce pattern.
A “mapper” runs an analysis prompt over every JSONL file; its context window contains both a sanitized copy of the file and the current CLAUDE.md.
The mapper’s results are then concatenated and fed into a “reducer” prompt, which produces a final report on which aspects of CLAUDE.md should be updated.
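The mapper/reducer flow can be sketched as follows. The prompt wording and the `llm` callable are illustrative placeholders, not the repo’s actual prompts or API:

```python
from collections.abc import Callable

# Placeholder prompts; the real repo's prompts are more elaborate.
MAPPER_PROMPT = (
    "Given this session transcript and the current CLAUDE.md, list any "
    "misunderstandings between Claude and the user.\n\n"
    "CLAUDE.md:\n{memory}\n\nTranscript:\n{transcript}"
)
REDUCER_PROMPT = (
    "Combine these per-session analyses into one report recommending "
    "CLAUDE.md updates:\n\n{analyses}"
)

def analyze_sessions(
    transcripts: list[str],
    claude_md: str,
    llm: Callable[[str], str],
) -> str:
    """Map an analysis prompt over each transcript, then reduce to one report."""
    mapped = [
        llm(MAPPER_PROMPT.format(memory=claude_md, transcript=t))
        for t in transcripts
    ]
    return llm(REDUCER_PROMPT.format(analyses="\n---\n".join(mapped)))
```

Passing the model as a callable keeps the scaffolding independent of any particular inference provider.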
The user then manually prompts CC to read this report and subsequently update CLAUDE.md.
We’re essentially setting up a feedback loop that runs “prompt optimization” on our revealed preferences as we use CC, with CLAUDE.md’s contents acting as the “learnable parameters” of the process. We’d likely run this loop daily or weekly, and we’d expect CLAUDE.md to rapidly reflect changes in our preferences and behaviors. This makes CC far better at tracking misunderstandings (criterion 2) and integrating them into its memory (criterion 3).
Note that this repo is intended as a demonstration of what’s possible. One can imagine many extensions and improvements to make the feedback loop spin faster:
- Create a slash command to make all this more seamless.
- Create a UI to make it easier to quickly arbitrate the various misunderstandings and provide context for each one.
- Make the MapReduce robust to larger inputs via sharding.
- Use a better analysis model than Gemini Flash Lite.
- Create evals for the mapper/reducer prompts and hill-climb on them.
- Hook up the results of AskUserQuestion to a centralized memory.
- Maintain a shared CLAUDE.md for a team, and learn a consolidated set of norms across the whole team.
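The sharding idea could take the form of a recursive reduce: when the concatenated mapper outputs would overflow the model’s context, reduce them in shards and then reduce the shard summaries. A minimal sketch, with an illustrative character budget standing in for a real token count:

```python
from collections.abc import Callable

def reduce_sharded(
    analyses: list[str],
    llm: Callable[[str], str],
    max_chars: int = 100_000,  # stand-in for a token budget
) -> str:
    """Recursively reduce analyses so no single prompt exceeds max_chars."""
    def reduce_once(items: list[str]) -> str:
        return llm(
            "Combine these analyses into one report:\n\n" + "\n---\n".join(items)
        )

    while True:
        # Greedily pack analyses into shards under the character budget.
        shards, current, size = [], [], 0
        for a in analyses:
            if current and size + len(a) > max_chars:
                shards.append(current)
                current, size = [], 0
            current.append(a)
            size += len(a)
        if current:
            shards.append(current)
        if len(shards) == 1:
            return reduce_once(shards[0])
        # More than one shard: reduce each, then loop over the summaries.
        analyses = [reduce_once(s) for s in shards]
```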
If linear chains of causation characterized the old world, then feedback loops and spirals will characterize this new AI-first world.
Please leave a comment or email me at varun [at] doubleascent.com if you try out this workflow or anything like it. I’d love to hear from you!