Overview

What is DevMem?

DevMem is a command-line tool that gives your codebase a persistent memory. It crawls your repository on first run, uses AI to understand and document every module it finds, and then watches your git commits to keep that documentation accurate forever.

The core insight behind DevMem is that documentation written once immediately starts lying. The moment a function is refactored, a module is split, or a dependency is added, any static documentation written before that change becomes a liability rather than an asset. DevMem solves this by making documentation a continuous process tied to the one workflow developers never skip: committing code.

The problem it solves

Modern codebases grow faster than any individual can track. AI-assisted development accelerates this further. Code is written and merged faster than documentation can follow. DevMem acts as the system of record that humans cannot maintain manually.

The mechanism

On devmem init, the entire codebase is crawled and documented in parallel. On every subsequent devmem capture, only the changed modules are updated. Every change is linked to its git commit hash, making documentation history and code history permanently aligned.

How it works

DevMem operates in two distinct modes: the initial documentation pass and the continuous capture loop. Understanding both is essential to understanding why the tool is designed the way it is.

Initial pass - devmem init

When you run devmem init for the first time in a repository, DevMem executes five sequential phases. Each phase is a prerequisite for the next. The tool does not skip phases based on partial state.

Filesystem crawl

The entire directory tree is walked using Go's os.ReadDir recursively. Every file and directory is recorded as a FileNode in a tree structure. Ignore rules from .gitignore and .devmemignore are applied during the walk. The complete tree is written to .devmem/tree.json.

Module detection

The FileNode tree is scored directory by directory. Each directory receives points based on structural signals: its own README, entry file presence, shallow directory depth, Go package files, and source file count. Convention directories like internal/ and pkg/ are recursed into but never chosen as module roots themselves.

User confirmation

Detected modules are printed to the terminal and must be confirmed, edited, or augmented before AI analysis starts. This step is mandatory because wrong module boundaries create low-quality documentation that is difficult to fix later.

Parallel AI analysis

One goroutine is spawned per confirmed module. A semaphore channel limits concurrent AI calls (default 5). Each goroutine builds module context from a compact file tree and entry-file excerpt, calls the model, and streams results back over a channel. Failures are isolated to the affected module.

Master architecture

After module analysis completes, a final architecture call synthesizes module summaries, dependency relationships, technology stack signals, and data flow. It outputs both markdown and mermaid dependency edges.

Continuous loop - devmem capture

After init, every capture run follows a lean, repeatable flow focused on changed modules only.

Read diff

Collect commit and worktree changes from git so documentation reflects both committed and in-progress edits.

Classify change

Use a fast model call to identify change type, summarize intent, and detect structural updates.

Patch docs

Update only affected module sections, append changelog entries, and regenerate master architecture only when required.

Installation

Getting started in 30 seconds

Requirement	Version	Notes
Go	1.21 or higher	Required to build and install
Git	Any modern version	Required for capture and status commands
Anthropic API key	-	Required for all AI operations

Step 1 - Install DevMem

bash

go install github.com/Surya-sourav/devmem@latest

Step 2 - Initialize your repository

bash

cd your-project
devmem init

Info

Your API key is stored in the system credential store: the macOS Keychain, or the OS equivalent on Linux and Windows. It is never written to disk in plain text and never included in any file that could be accidentally committed.

Commands

devmem init

Initialization command. Run it once per repository to crawl the tree, detect modules, run parallel AI analysis, and generate the master architecture. Re-running is safe: module docs and master architecture are regenerated, and existing changelog entries are left untouched.

Flag	Default	Description
--dir	Current working directory	Repository root to operate on
--model	claude-sonnet-4-20250514	Anthropic model for all AI calls

terminal session

$ devmem init

  Crawling codebase...           done  (312 files, 0.1s)
  Detecting modules...           found 5 candidates

  Detected modules:
    auth       src/auth/         (score: 7)
    api        src/api/          (score: 6)
    db         internal/db/      (score: 5)
    worker     internal/worker/  (score: 4)
    config     pkg/config/       (score: 3)

  Confirm? [Y/n]: Y

  Analysing modules (5 total, max 5 parallel)...
    [1/5] auth      ✓ done  (1.2s)
    [3/5] api       ✓ done  (1.4s)
    [2/5] db        ✓ done  (1.8s)
    [5/5] config    ✓ done  (0.9s)
    [4/5] worker    ✓ done  (2.1s)

  Generating master architecture...  ✓ done  (2.3s)

  devmem initialised. 5 modules documented.

What gets created

.devmem structure

.devmem/
├── config.json                      Module map, ignore rules, model config
├── state.json                       Init timestamp, last commit hash, module count
├── tree.json                        Complete FileNode snapshot
└── docs/
    ├── master-architecture.md       Full system overview with module table
    ├── master-architecture.mermaid  Mermaid graph of module dependencies
    └── modules/
        ├── auth.md
        ├── api.md
        └── ...

Idempotency

Running init again regenerates module docs and master architecture from scratch. Existing changelog entries are preserved. Init does not delete or rewrite .devmem/changelog/.

devmem capture

Reads repository changes, identifies affected modules, updates docs, and writes changelog entries. This is the command that keeps documentation alive.

Read diff

Runs git diff HEAD~1 --name-only to list changed files and git diff HEAD~1 to obtain the full patch. The patch is capped at 4k characters before it is sent to the model.
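The two git invocations and the size cap could be sketched roughly as follows. This is an illustration, not DevMem's actual source: the helper names gitDiff, splitLines, and capPatch are hypothetical, and "4k chars" is assumed here to mean 4000 bytes.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
	"unicode/utf8"
)

// maxPatchBytes is an assumption: "4k chars" is read here as 4000 bytes.
const maxPatchBytes = 4000

// gitDiff runs "git diff HEAD~1" with optional extra arguments (hypothetical helper).
func gitDiff(extra ...string) (string, error) {
	args := append([]string{"diff", "HEAD~1"}, extra...)
	out, err := exec.Command("git", args...).Output()
	return string(out), err
}

// splitLines returns the non-empty lines of s, one changed path per line.
func splitLines(s string) []string {
	var lines []string
	for _, l := range strings.Split(s, "\n") {
		if l = strings.TrimSpace(l); l != "" {
			lines = append(lines, l)
		}
	}
	return lines
}

// capPatch truncates s to at most limit bytes without splitting a UTF-8 rune.
func capPatch(s string, limit int) string {
	if limit < 0 {
		limit = 0
	}
	if len(s) <= limit {
		return s
	}
	cut := limit
	for cut > 0 && !utf8.RuneStart(s[cut]) {
		cut-- // back up until the cut lands on a rune boundary
	}
	return s[:cut]
}

func main() {
	names, err := gitDiff("--name-only")
	if err != nil {
		fmt.Println("git diff failed (not a repository, or no parent commit):", err)
		return
	}
	patch, err := gitDiff()
	if err != nil {
		fmt.Println("git diff failed:", err)
		return
	}
	fmt.Printf("%d changed files, %d patch bytes kept for the model\n",
		len(splitLines(names)), len(capPatch(patch, maxPatchBytes)))
}
```

Because git diff HEAD~1 compares the parent commit against the working tree, a single call covers both the last commit and any uncommitted edits.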

Resolve modules

Changed paths map against module path definitions in config.json. Unknown paths produce warnings but do not abort.
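One plausible way to implement this mapping is longest-prefix matching on path boundaries. The sketch below assumes the module map has already been loaded from config.json into a plain map of module name to root paths; resolveModules is a hypothetical name, and the real matching rules may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveModules assigns each changed path to the module whose root is the
// longest matching path prefix. Paths that match no module are returned as
// unknown so the caller can warn instead of aborting.
func resolveModules(modules map[string][]string, changed []string) (map[string][]string, []string) {
	hits := make(map[string][]string)
	var unknown []string
	for _, p := range changed {
		best, bestLen := "", -1
		for name, roots := range modules {
			for _, r := range roots {
				r = strings.TrimSuffix(r, "/")
				// Match on a path boundary so "src/auth" does not claim "src/authz".
				if (p == r || strings.HasPrefix(p, r+"/")) && len(r) > bestLen {
					best, bestLen = name, len(r)
				}
			}
		}
		if bestLen < 0 {
			unknown = append(unknown, p)
			continue
		}
		hits[best] = append(hits[best], p)
	}
	return hits, unknown
}

func main() {
	modules := map[string][]string{"auth": {"src/auth"}, "api": {"src/api"}}
	changed := []string{"src/api/routes.go", "src/auth/tokens.go", "scripts/release.sh"}

	hits, unknown := resolveModules(modules, changed)
	fmt.Println("affected:", hits)
	for _, p := range unknown {
		fmt.Println("warning: no module owns", p) // warn, do not abort
	}
}
```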

AI classify + patch

Classification returns change type and summary. Patch call returns only changed doc sections; unchanged text stays untouched.

Write outputs

Writes .devmem/changelog/<commit>.md, patches the affected module docs, appends the commit to each affected module's changed_in list, and advances the last captured commit in state.json.

Change type taxonomy

Type	Description
feature	New capability added
fix	Bug correction
refactor	Internal restructure, no behavior change
perf	Performance or memory improvement
breaking	Public API or contract changed
docs	Documentation-only change
test	Test additions or changes
config	Build, environment, or dependency change
chore	Maintenance and cleanup
structural	Module added/removed or dependency graph changed

Warning

The structural type is the only classification that triggers master architecture regeneration. All other types patch module docs surgically for faster capture runs.

Post-commit hook

DevMem can install a post-commit hook so capture runs automatically after each commit.

bash

# Install the hook
devmem init   # hook is offered during init

# Or install manually
devmem capture --install-hook

.git/hooks/post-commit

#!/bin/sh
# Installed by devmem
devmem capture

devmem status

Compares current git HEAD against the last captured commit in state.json and reports modules with undocumented changes.

example output

$ devmem status

  Last captured commit: abc123f  (2 commits behind)
  Current commit:       def789g

  Modules with undocumented changes:
    api        src/api/handler.go, src/api/routes.go
    worker     internal/worker/processor.go

  Run: devmem capture  to update docs

Exit code	Meaning
0	All modules are up to date
1	One or more modules have undocumented changes
2	DevMem not initialized in this directory
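The exit codes in the table reduce to a small decision function. This is a sketch with hypothetical names (statusExitCode and the Exit constants); how DevMem actually detects initialization and stale modules is not shown here.

```go
package main

import "fmt"

// Exit codes as listed in the table above.
const (
	ExitUpToDate       = 0
	ExitStale          = 1
	ExitNotInitialized = 2
)

// statusExitCode picks the exit code from two facts: whether DevMem has been
// initialized in this directory, and which modules have undocumented changes.
func statusExitCode(initialized bool, staleModules []string) int {
	switch {
	case !initialized:
		return ExitNotInitialized
	case len(staleModules) > 0:
		return ExitStale
	default:
		return ExitUpToDate
	}
}

func main() {
	fmt.Println(statusExitCode(false, nil))                       // not initialized
	fmt.Println(statusExitCode(true, []string{"api", "worker"})) // stale modules
	fmt.Println(statusExitCode(true, nil))                        // up to date
}
```

Checking initialization first means a missing .devmem directory is never reported as "stale", which keeps the two failure modes distinguishable in scripts.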

CI usage

Because status exits with code 1 when documentation is stale, it works as a CI gate for enforcing documentation freshness in pull requests.

GitHub Actions

- name: Check documentation is current
  run: devmem status

devmem query

Answers natural-language questions about your codebase, grounded in living docs from .devmem/docs/. Responses are streamed to stdout.

usage

devmem query "<question>"

example queries

$ devmem query "how does authentication work?"
$ devmem query "which modules depend on the database layer?"
$ devmem query "what changed in the auth module recently?"
$ devmem query "explain the overall architecture in plain English"

The query command reads all module docs and master architecture docs, concatenates them as model context, and instructs the model to answer from provided documentation only. If the docs do not contain the answer, the model is instructed to say so explicitly.

Context window

For very large codebases, combined docs can approach model context limits. Scope the query with --module when needed: devmem query --module auth "how are tokens validated?".

Under the hood

Filesystem crawler

The crawler is the foundation of DevMem and runs before any AI call or module detection. It is implemented in internal/crawler/walker.go.

Starting at repository root, it recursively walks directories with os.ReadDir, constructing a complete in-memory tree with file metadata used by detection and prompt assembly.

internal/crawler/walker.go

type FileNode struct {
    Name     string      `json:"name"`
    Path     string      `json:"path"`      // relative to repo root
    Type     string      `json:"type"`      // "file" | "dir"
    Ext      string      `json:"ext,omitempty"`
    Size     int64       `json:"size,omitempty"`
    Children []*FileNode `json:"children,omitempty"`
}

Ignore rules

Ignore rules are applied during traversal, not after collection. That avoids expensive reads in massive directories and keeps crawl fast on mixed-language repositories.

default ignores

.git
.devmem
node_modules
vendor
.venv
__pycache__
dist
build
*.lock
*.sum
*.min.js

Tree representation

tree.json (excerpt)

{
  "name": "DevMem",
  "path": ".",
  "type": "dir",
  "children": [
    {"name": "cmd", "path": "cmd", "type": "dir"},
    {"name": "internal", "path": "internal", "type": "dir"}
  ]
}

AI prompt format

.
├── cmd/
│   ├── init.go
│   ├── capture.go
│   └── query.go
└── internal/
    ├── crawler/
    ├── ai/
    └── state/
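A text view like the one above can be produced from the tree with a short recursive renderer. The sketch below uses a trimmed-down FileNode and hypothetical function names (render, compactTree); the exact prompt format DevMem emits may differ in detail.

```go
package main

import (
	"fmt"
	"strings"
)

// FileNode is a trimmed-down version of the crawler's node type.
type FileNode struct {
	Name     string
	Type     string // "file" | "dir"
	Children []*FileNode
}

// render writes n's children depth-first with box-drawing prefixes.
func render(b *strings.Builder, n *FileNode, prefix string) {
	for i, c := range n.Children {
		branch, next := "├── ", prefix+"│   "
		if i == len(n.Children)-1 {
			branch, next = "└── ", prefix+"    " // last child closes the branch
		}
		name := c.Name
		if c.Type == "dir" {
			name += "/"
		}
		b.WriteString(prefix + branch + name + "\n")
		render(b, c, next)
	}
}

// compactTree returns the text view of the tree rooted at root.
func compactTree(root *FileNode) string {
	var b strings.Builder
	b.WriteString(".\n")
	render(&b, root, "")
	return b.String()
}

func main() {
	root := &FileNode{Name: "DevMem", Type: "dir", Children: []*FileNode{
		{Name: "cmd", Type: "dir", Children: []*FileNode{
			{Name: "init.go", Type: "file"},
			{Name: "query.go", Type: "file"},
		}},
		{Name: "internal", Type: "dir"},
	}}
	fmt.Print(compactTree(root))
}
```

The text form carries the same structure as the JSON snapshot in far fewer characters, which matters when the tree is embedded in a prompt.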

Module detection

Module detection finds logical boundaries in any repository without parsing code. It uses structure-based scoring and configurable overrides.

Signal	Points	Condition
Own README	+3	Directory contains README.md directly
Entry file	+2	Contains main.go, index.ts, index.js, app.py, __init__.py, mod.rs, etc.
Shallow depth	+2	Depth 1-4 from repository root
Go package	+2	Directory has .go files directly
File count	+1	Contains more than 5 source files

The detection threshold is score >= 3 by default, lowered to score >= 2 for directories under convention roots such as internal/ and pkg/.

Convention directories

Convention roots are never selected as modules themselves, regardless of score. Their children are scored and considered as module candidates using a lower threshold.

User-defined mappings in .devmem/config.json always win. Config-sourced modules are treated as authoritative and assigned top priority during deduplication.
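The scoring table and thresholds can be read as the following sketch. The Dir type, the sourceExts set (the text does not define "source file"), and the first-path-segment test for convention roots are assumptions made for illustration; config overrides are left out.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Dir is a hypothetical flattened view of one directory in the crawled tree.
type Dir struct {
	Path  string   // slash-separated, relative to the repository root
	Files []string // base names of files directly inside the directory
}

var (
	entryFiles      = map[string]bool{"main.go": true, "index.ts": true, "index.js": true, "app.py": true, "__init__.py": true, "mod.rs": true}
	conventionRoots = map[string]bool{"internal": true, "pkg": true}
	// sourceExts is an assumption about what counts as a source file.
	sourceExts = map[string]bool{".go": true, ".ts": true, ".js": true, ".py": true, ".rs": true}
)

// score applies the five signals from the table above.
func score(d Dir) int {
	var readme, entry, goPkg bool
	sources := 0
	for _, f := range d.Files {
		ext := filepath.Ext(f)
		readme = readme || f == "README.md"
		entry = entry || entryFiles[f]
		goPkg = goPkg || ext == ".go"
		if sourceExts[ext] {
			sources++
		}
	}
	total := 0
	if readme {
		total += 3
	}
	if entry {
		total += 2
	}
	if depth := strings.Count(d.Path, "/") + 1; depth <= 4 {
		total += 2 // shallow depth: 1 to 4 levels below the root
	}
	if goPkg {
		total += 2
	}
	if sources > 5 {
		total++
	}
	return total
}

// isModule applies the thresholds: 3 by default, 2 under a convention root,
// and never the convention root itself.
func isModule(d Dir) bool {
	if conventionRoots[d.Path] {
		return false
	}
	threshold := 3
	if first, _, _ := strings.Cut(d.Path, "/"); conventionRoots[first] {
		threshold = 2
	}
	return score(d) >= threshold
}

func main() {
	auth := Dir{Path: "src/auth", Files: []string{"README.md", "handler.go"}}
	fmt.Println(auth.Path, score(auth), isModule(auth))
}
```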

AI pipeline

DevMem uses four prompt types, each with its own input shape, output schema, and validator, so that model output stays predictable and safe to parse.

Prompt A: Module analysis

Used in init. One call per module. Input includes module name, tree view, entry excerpt, and sibling modules. Output is ModuleAnalysis JSON.

Prompt B: Change classification

Used in capture. Fast model call returns type enum, summary, breaking flag, and per-module summaries.

Prompt C: Module doc patch

Used in capture for each affected module. Input includes current markdown + diff summary. Output is DocPatch JSON keyed by section title.

Prompt D: Master architecture

Used in init and structural captures. Generates overview, tech stack, data flow, module summaries, and mermaid edges.

JSON-only contract

Each prompt requires raw JSON output only. No prose, no markdown fences. If a model still returns fenced JSON, DevMem strips fences and retries parsing safely before skipping invalid data.
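A lenient parser in that spirit might look like the sketch below: try the raw response first, then strip a surrounding code fence and try once more. The Classification struct is a hypothetical subset of the Prompt B output, and stripFences and parseClassification are illustrative names, not DevMem's API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// fence is three backticks, written with escapes so it is easy to read here.
const fence = "\x60\x60\x60"

// Classification is a hypothetical subset of the change-classification output.
type Classification struct {
	Type     string `json:"type"`
	Summary  string `json:"summary"`
	Breaking bool   `json:"breaking"`
}

// stripFences removes one surrounding markdown code fence, if present.
func stripFences(s string) string {
	s = strings.TrimSpace(s)
	if !strings.HasPrefix(s, fence) {
		return s
	}
	// Drop the opening fence line, which may carry a language tag such as "json".
	_, rest, found := strings.Cut(s, "\n")
	if !found {
		return ""
	}
	rest = strings.TrimSpace(rest)
	return strings.TrimSpace(strings.TrimSuffix(rest, fence))
}

// parseClassification parses raw JSON, retrying once with fences stripped.
// A non-nil error tells the caller to skip this response.
func parseClassification(raw string) (Classification, error) {
	var c Classification
	err := json.Unmarshal([]byte(raw), &c)
	if err != nil {
		err = json.Unmarshal([]byte(stripFences(raw)), &c)
	}
	return c, err
}

func main() {
	raw := fence + "json\n{\"type\": \"fix\", \"summary\": \"Fixed expiry edge case\"}\n" + fence
	c, err := parseClassification(raw)
	fmt.Println(c.Type, c.Summary, err)
}
```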

Prompt architecture

Prompt templates are separated by task and deliberately strict. Every prompt includes schema reminders, forbidden output forms, and deterministic field requirements to make parser behavior stable in production.

The architecture keeps prompts short and purpose-built. Classification prompts remain compact and latency-optimized, while synthesis prompts include richer context and stricter dependency mapping instructions.

prompt contract

- First character must be {
- Last character must be }
- Return exactly one JSON object
- Do not include markdown code fences
- Fill all required fields even if value is an empty list

Concurrent analysis

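Module analysis during init is the slowest operation. DevMem runs module calls concurrently, so total runtime approaches that of the slowest module call when the module count is within the concurrency limit, and grows roughly with the module count divided by that limit beyond it, rather than with the sum of all calls.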

cmd/init.go

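// Buffered to len(modules) so no worker ever blocks on send.
results := make(chan analysisResult, len(modules))
// Counting semaphore: at most cfg.MaxConcurrent AI calls in flight.
sem := make(chan struct{}, cfg.MaxConcurrent)
var wg sync.WaitGroup

for _, mod := range modules {
    wg.Add(1)
    go func(m crawler.Module) {
        defer wg.Done()
        sem <- struct{}{}        // acquire a slot
        defer func() { <-sem }() // release it when the call returns

        analysis, err := aiClient.AnalyseModule(ctx, buildContext(tree, m))
        // The error travels with the result, so one failed module does not stop the rest.
        results <- analysisResult{module: m, analysis: analysis, err: err}
    }(mod)
}

// Close results once every worker has finished so the range below terminates.
go func() {
    wg.Wait()
    close(results)
}()

for r := range results {
    // write docs as each result arrives
}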

The semaphore channel caps in-flight requests to avoid API throttling while keeping throughput high.

Closure capture bug

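The goroutine takes mod as a value parameter instead of closing over the loop variable. Before Go 1.22, a range loop reused a single variable across iterations, so goroutines that captured it directly could all observe the same module. Go 1.22 gives each iteration its own variable (for modules that declare go 1.22 or later in go.mod), but passing the value explicitly stays correct under either semantics.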

File structure

.devmem/ directory

All generated artifacts live under .devmem/ at repository root. Files are plain text, diff-friendly, and designed to be committed for historical traceability.

complete tree

.devmem/
├── config.json              Configuration, module map, ignore rules
├── state.json               Runtime state, last commit, module count
├── tree.json                FileNode snapshot from last init
├── docs/
│   ├── master-architecture.md       System overview document
│   ├── master-architecture.mermaid  Module dependency graph
│   └── modules/
│       ├── auth.md                  Per-module documentation
│       ├── api.md
│       └── ...
└── changelog/
    ├── abc123f4d.md                 One entry per captured commit
    ├── def456g7h.md
    └── ...

Module docs schema

Each module doc has machine-readable YAML frontmatter and human-readable markdown sections. Capture and status depend on frontmatter integrity for patching and drift checks.

example: .devmem/docs/modules/auth.md

---
module: auth
root_path: src/auth
key_files:
  - src/auth/handler.go
  - src/auth/tokens.go
  - src/auth/middleware.go
depends_on:
  - db
  - config
changed_in:
  - abc123f4d
  - def456g7h
generated_at: 2025-03-21T14:32:00Z
---

# auth

JWT-based authentication module handling token issuance,
validation, and HTTP middleware for protected routes.

## Purpose

Provides stateless auth for all API endpoints using RS256-signed JWTs.

## Key files

- `src/auth/handler.go` - login/logout/refresh HTTP handlers
- `src/auth/tokens.go` - token generation, validation, expiry logic
- `src/auth/middleware.go` - Gin middleware, injects claims into context

## Public API

- `NewMiddleware(cfg Config) gin.HandlerFunc`
- `IssueToken(userID string, claims Claims) (string, error)`
- `ValidateToken(raw string) (*Claims, error)`

## Dependencies

- **db** - loads user records for credential validation
- **config** - reads JWT signing key and expiry settings

## Tech notes

Uses golang-jwt/jwt v5. Tokens are RS256 signed.
Refresh tokens stored as opaque strings via the cache module.

## Changelog

- [abc123f4d](../changelog/abc123f4d.md) - Added token revocation
- [def456g7h](../changelog/def456g7h.md) - Fixed expiry edge case
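Since capture and status rely on the frontmatter, a reader for it has to separate the YAML block from the markdown body first. The sketch below does only that, plus a lookup for simple top-level "key: value" fields; it uses the standard library alone and does not parse lists such as key_files or changed_in, for which a real YAML parser would be the safer choice. The names splitFrontmatter and scalarField are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFrontmatter separates a leading "---" delimited block from the body.
// ok is false when the document has no well-formed frontmatter.
func splitFrontmatter(doc string) (front, body string, ok bool) {
	const delim = "---\n"
	if !strings.HasPrefix(doc, delim) {
		return "", doc, false
	}
	rest := doc[len(delim):]
	end := strings.Index(rest, "\n"+delim)
	if end < 0 {
		return "", doc, false
	}
	return rest[:end+1], rest[end+1+len(delim):], true
}

// scalarField returns the value of a top-level "key: value" line, or "".
// Indented lines (list items) are skipped.
func scalarField(front, key string) string {
	for _, line := range strings.Split(front, "\n") {
		if strings.HasPrefix(line, " ") {
			continue
		}
		k, v, found := strings.Cut(line, ":")
		if found && strings.TrimSpace(k) == key {
			return strings.TrimSpace(v)
		}
	}
	return ""
}

func main() {
	doc := "---\nmodule: auth\nroot_path: src/auth\ngenerated_at: 2025-03-21T14:32:00Z\n---\n\n# auth\n"
	front, body, ok := splitFrontmatter(doc)
	if !ok {
		fmt.Println("no frontmatter found")
		return
	}
	fmt.Println("module:", scalarField(front, "module"))
	fmt.Println("generated_at:", scalarField(front, "generated_at"))
	fmt.Printf("body: %q\n", body)
}
```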

Changelog schema

Each capture writes one changelog file per commit hash under .devmem/changelog/. The changelog body stores detailed narrative, while module docs only keep short linked summaries.

example: .devmem/changelog/abc123f4d.md

---
commit: abc123f4d
type: feature
captured_at: 2026-03-22T09:14:22Z
modules:
  - auth
  - api
---

# Capture abc123f4d

## Summary
Added token revocation endpoint and middleware checks for revoked JWT IDs.

## Module updates
- auth: Added revocation store lookup before token acceptance.
- api: Added route wiring for revoke endpoint.

## Diff context
- src/auth/tokens.go
- src/auth/middleware.go
- src/api/routes.go

Master architecture schema

Master architecture output includes a narrative markdown document and a mermaid graph file to visualize directional dependencies.

.devmem/docs/master-architecture.md

# DevMem Architecture

## System overview
The CLI orchestrates crawl, AI analysis, state persistence, and git-linked documentation updates.

## Module map
- cmd
- internal/crawler
- internal/ai
- internal/state
- internal/git
- internal/docs

## Data flow
1) Crawl -> 2) Detect modules -> 3) Analyze -> 4) Write docs -> 5) Capture diffs

.devmem/docs/master-architecture.mermaid

graph TD
  CMD[cmd] --> CRAWLER[crawler]
  CMD --> AI[ai]
  CMD --> STATE[state]
  CMD --> GIT[git]
  CMD --> DOCS[docs]
  DOCS --> STATE

Configuration

config.json reference

.devmem/config.json is generated during init and can be edited manually. Changes apply on the next init or capture run.

.devmem/config.json

{
  "version": "1",
  "ai_model": "claude-sonnet-4-20250514",
  "max_concurrent": 5,
  "ignore": [
    "node_modules",
    "vendor",
    "*_test.go"
  ],
  "modules": {
    "auth": {
      "paths": ["src/auth"]
    },
    "api": {
      "paths": ["src/api"]
    }
  }
}

Field	Type	Default	Description
version	string	"1"	Config schema version
ai_model	string	claude-sonnet-4-20250514	Anthropic model for AI calls
max_concurrent	number	5	Max simultaneous AI calls during init
ignore	array	defaults + custom	Extra ignore patterns for crawl
modules	object	auto-detected	Explicit module definitions overriding heuristics
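One way to model this file in Go is a struct with JSON tags whose defaults are set before decoding, so that fields missing from the file keep the documented default values. The type and function names below (Config, ModuleDef, parseConfig) are hypothetical; only the field names and defaults come from the table above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ModuleDef is one explicit module definition.
type ModuleDef struct {
	Paths []string `json:"paths"`
}

// Config mirrors the fields documented for .devmem/config.json.
type Config struct {
	Version       string               `json:"version"`
	AIModel       string               `json:"ai_model"`
	MaxConcurrent int                  `json:"max_concurrent"`
	Ignore        []string             `json:"ignore"`
	Modules       map[string]ModuleDef `json:"modules"`
}

// parseConfig decodes data on top of the documented defaults.
func parseConfig(data []byte) (Config, error) {
	cfg := Config{
		Version:       "1",
		AIModel:       "claude-sonnet-4-20250514",
		MaxConcurrent: 5,
	}
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, err
	}
	if cfg.MaxConcurrent < 1 {
		cfg.MaxConcurrent = 5 // assumption: nonsensical limits fall back to the default
	}
	return cfg, nil
}

func main() {
	cfg, err := parseConfig([]byte(`{"modules": {"auth": {"paths": ["src/auth"]}}}`))
	if err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Printf("model=%s max_concurrent=%d modules=%d\n", cfg.AIModel, cfg.MaxConcurrent, len(cfg.Modules))
}
```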

Ignore rules

Ignore configuration merges three sources in order: built-in defaults, repository .gitignore, and repository .devmemignore. Project-local rules are ideal for generated directories, fixtures, and large snapshots that are irrelevant for documentation quality.

.devmemignore example

# build outputs
coverage/
artifacts/

# generated mocks
**/mocks/

# large snapshots
**/*.snap
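The merge order can be sketched as a simple ordered union. The helpers parseIgnoreFile and mergeIgnores are hypothetical, and the sketch treats patterns as opaque strings: it does not implement .gitignore features such as negation with "!" or anchored paths.

```go
package main

import (
	"fmt"
	"strings"
)

// parseIgnoreFile returns the patterns in an ignore file, skipping blank
// lines and "#" comments.
func parseIgnoreFile(text string) []string {
	var patterns []string
	for _, line := range strings.Split(text, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		patterns = append(patterns, line)
	}
	return patterns
}

// mergeIgnores concatenates sources in order and drops duplicates, keeping
// the first occurrence of each pattern.
func mergeIgnores(sources ...[]string) []string {
	seen := make(map[string]bool)
	var merged []string
	for _, src := range sources {
		for _, p := range src {
			if !seen[p] {
				seen[p] = true
				merged = append(merged, p)
			}
		}
	}
	return merged
}

func main() {
	defaults := []string{".git", ".devmem", "node_modules", "vendor"}
	gitignore := parseIgnoreFile("node_modules\ndist\n")
	devmemignore := parseIgnoreFile("# build outputs\ncoverage/\n\n# large snapshots\n**/*.snap\n")

	// Order follows the text: built-in defaults, .gitignore, then .devmemignore.
	fmt.Println(mergeIgnores(defaults, gitignore, devmemignore))
}
```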

Module overrides

Use module overrides when heuristics cannot capture your intended boundaries. Override entries lock module names and root paths so capture mapping remains stable across refactors.

override example

{
  "modules": {
    "platform-api": {
      "paths": ["services/api", "internal/api"]
    },
    "platform-worker": {
      "paths": ["services/worker"]
    }
  }
}

Git integration

Post-commit hook

When enabled, post-commit runs capture automatically after each commit, keeping docs synchronized with code history without manual steps.

hook script

#!/bin/sh
# Installed by devmem
devmem capture

Capture flow

Capture resolves committed and worktree changes in one run, maps each file to modules, and writes linked changelog outputs. Structural changes trigger architecture regeneration automatically.

high-level flow

collect committed diff + worktree diff
-> map files to modules
-> classify change
-> patch modules
-> write changelog
-> update state.json
-> regenerate master architecture only if structural

Diff resolution

Diff extraction prefers exact commit boundaries but falls back to worktree analysis when needed. Patch payloads are truncated to the 4k-character cap before any AI call, which protects the model context budget while preserving key semantic chunks.

Unknown files are reported as warnings, not errors, so capture still succeeds for known modules. This design prevents one unmapped path from blocking documentation for every changed module.