Self-checking AI agents with organization policy enforcement
Architecture Overview — April 2026
Works with Claude Code, GitHub Copilot, and Cursor
AI coding assistants operate with unrestricted access to your codebase.
A validation server sits between the AI and your code. No client modifications needed.
Unchanged: Claude Code, Copilot, or Cursor communicates over the standard Model Context Protocol (MCP)
You define the rules. Not the AI.
"Agents can read and write code in src/ and test/. They cannot access credentials, run curl, or open network connections. Limit to 30 steps."
{
  "paths": {
    "allowedWritePatterns": ["./src/**", "./test/**"],
    "deniedPatterns": ["**/.env", "**/secrets/**"]
  },
  "bash": { "mode": "capabilities-only" },
  "deniedTools": ["WebFetch", "WebSearch"],
  "limitCaps": { "maxTotalSteps": 30 }
}
JSON file, checked into your repo, version-controlled, auditable.
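To make the path rules concrete, here is a minimal sketch of how the `paths` section of such a policy could be evaluated. The names `PathPolicy`, `globToRegExp`, and `checkPath` are hypothetical and not taken from the tool. The sketch also assumes that deny rules take precedence and that reads are allowed unless denied, since the policy above lists only write patterns.

```typescript
// Hypothetical type mirroring the "paths" section of the policy file.
interface PathPolicy {
  allowedWritePatterns: string[];
  deniedPatterns: string[];
}

// Convert a glob into an anchored RegExp. Supports only "**" and "*".
function globToRegExp(glob: string): RegExp {
  const g = glob.replace(/^\.\//, ""); // treat "./src/**" like "src/**"
  let re = "";
  let i = 0;
  while (i < g.length) {
    if (g.startsWith("**/", i)) {
      re += "(?:.*/)?"; // zero or more leading directories
      i += 3;
    } else if (g.startsWith("**", i)) {
      re += ".*"; // anything, including "/"
      i += 2;
    } else if (g.charAt(i) === "*") {
      re += "[^/]*"; // anything within a single path segment
      i += 1;
    } else {
      // Escape regex metacharacters so they match literally.
      re += g.charAt(i).replace(/[.+?^${}()|[\]\\]/g, "\\$&");
      i += 1;
    }
  }
  return new RegExp("^" + re + "$");
}

function matchesAny(path: string, patterns: string[]): boolean {
  const p = path.replace(/^\.\//, "");
  return patterns.some((pattern) => globToRegExp(pattern).test(p));
}

// Assumption: deny rules win, and writes must also match an allowed write pattern.
function checkPath(
  policy: PathPolicy,
  path: string,
  op: "read" | "write",
): "ALLOWED" | "BLOCKED" {
  if (matchesAny(path, policy.deniedPatterns)) return "BLOCKED";
  if (op === "write" && !matchesAny(path, policy.allowedWritePatterns)) return "BLOCKED";
  return "ALLOWED";
}

const pathPolicy: PathPolicy = {
  allowedWritePatterns: ["./src/**", "./test/**"],
  deniedPatterns: ["**/.env", "**/secrets/**"],
};
```

With this policy, `checkPath(pathPolicy, "src/index.ts", "write")` is allowed, while a read of `.env` or a write to `/etc/crontab` is blocked. A production check would also need to resolve `..` segments and symlinks before matching, which this sketch does not do.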
| Agent attempts | Result | Why |
|---|---|---|
| validated_read(.env) | BLOCKED | Matches denied pattern **/.env |
| validated_bash("curl ...") | BLOCKED | Bash restricted to capabilities-only mode |
| validated_write(/etc/crontab) | BLOCKED | Path not in allowed write paths |
| Calls Read when plan says Edit | ABORTED | Tool doesn't match plan step |
| Plan with 100 steps | REJECTED | Exceeds policy cap of 30 |
| validated_npm("test") | ALLOWED | Structured tool, no shell injection |
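The two plan-level rows in the table (REJECTED and ABORTED) can be sketched as follows. This is an illustration under assumptions, not the tool's implementation: `PlanStep`, `Verdict`, `validatePlan`, and `checkCall` are hypothetical names, and a plan is modeled as nothing more than an ordered list of tool names.

```typescript
// Hypothetical shapes: a plan is an ordered list of steps, each naming one tool.
interface PlanStep {
  tool: string;
}

type Verdict =
  | { status: "OK" }
  | { status: "REJECTED" | "ABORTED"; reason: string };

// Reject a plan up front if it exceeds the policy's step cap.
function validatePlan(steps: PlanStep[], maxTotalSteps: number): Verdict {
  if (steps.length > maxTotalSteps) {
    return {
      status: "REJECTED",
      reason: `Plan has ${steps.length} steps; policy cap is ${maxTotalSteps}`,
    };
  }
  return { status: "OK" };
}

// Abort execution if the tool actually called differs from the planned tool.
function checkCall(steps: PlanStep[], index: number, calledTool: string): Verdict {
  const expected = steps[index];
  if (expected === undefined || expected.tool !== calledTool) {
    return {
      status: "ABORTED",
      reason: `Step ${index}: plan says ${expected?.tool ?? "no step"}, agent called ${calledTool}`,
    };
  }
  return { status: "OK" };
}
```

Checking the cap before execution and the tool match at each call keeps the two failure modes separate: an oversized plan never starts, and a deviating call stops the run at the step where it deviates.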
validated_npm, validated_git, validated_node, validated_tsc
spawnSync with array args: no shell, no injection
Command parsing, allow/deny lists, regex patterns, network detection
Catches accidental violations
Docker with --network=none, read-only filesystem, resource limits
Kernel enforces what string parsing can't
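The capability tools above are described as calling spawnSync with array arguments. The sketch below shows why that avoids shell injection: each array element reaches the child process as a single argument, so shell metacharacters are never interpreted. `runStructured` and `ToolResult` are hypothetical names. The demo spawns the current Node binary only so that the example is self-contained.

```typescript
import { spawnSync } from "node:child_process";

interface ToolResult {
  status: number | null;
  stdout: string;
  stderr: string;
}

// Run a command with an explicit argument array and no shell in between.
function runStructured(command: string, args: string[]): ToolResult {
  const result = spawnSync(command, args, { shell: false, encoding: "utf8" });
  return { status: result.status, stdout: result.stdout, stderr: result.stderr };
}

// The third element contains ";" and a second command, but because no shell
// parses it, the child simply receives it as one literal argument and echoes it.
const echoed = runStructured(process.execPath, [
  "-e",
  "console.log(process.argv.slice(1).join('|'))",
  "test; curl example.com",
]);
```

Had the same string been concatenated into a command line and passed to a shell, `curl example.com` would have run as a second command. With the array form it is just text.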
Key insight: python -c "import socket; ..." bypasses string analysis. Only the kernel (container) can stop it.
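A small sketch makes the gap visible. The deny patterns below are hypothetical and much simpler than real command analysis, but the failure mode is the same: the check only recognizes network tools it knows by name.

```typescript
// Hypothetical deny patterns for commands that open network connections.
const networkPatterns: RegExp[] = [/\bcurl\b/, /\bwget\b/, /\bnc\b/, /\bssh\b/];

// String-level check: flags a command only if it names a known network tool.
function flaggedByStringAnalysis(command: string): boolean {
  return networkPatterns.some((pattern) => pattern.test(command));
}

const direct = "curl https://example.com";
// Same effect (an outbound connection), but no known tool name appears.
const indirect =
  `python -c "import socket; socket.create_connection(('example.com', 80))"`;

const directFlagged = flaggedByStringAnalysis(direct);
const indirectFlagged = flaggedByStringAnalysis(indirect);
```

Here `directFlagged` is true and `indirectFlagged` is false. Adding a pattern for `socket` would catch this one case but not the next encoding or interpreter, which is why the final check has to sit outside the command string.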
--network=none: the container gets only a loopback interface, so outbound connections fail no matter which program opens the socket
--read-only: container root filesystem mounted read-only
--memory=512m: OOM killed if exceeded
--pids-limit=100: fork bomb protection
--gpus all for ML training
--device /dev/video0 for camera
-p 8080:8080
{
  "container": {
    "enabled": true,
    "image": "python:3.12-slim",
    "networkMode": "none",
    "readOnly": true,
    "memoryLimit": "512m",
    "deriveVolumesFromPolicy": true,
    "devices": {
      "gpu": true
    }
  }
}
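One way to picture the link between this config and the Docker flags listed earlier is a function that translates one into the other. This is a sketch under assumptions: `ContainerConfig` and `dockerRunArgs` are hypothetical names, the mapping is inferred from the flag list rather than taken from the tool, and `deriveVolumesFromPolicy` is deliberately not modeled.

```typescript
// Hypothetical type mirroring the "container" section of the policy file.
interface ContainerConfig {
  enabled: boolean;
  image: string;
  networkMode: string;
  readOnly: boolean;
  memoryLimit: string;
  devices?: { gpu?: boolean };
}

// Build the argument array for `docker run`. Volume derivation is omitted.
function dockerRunArgs(config: ContainerConfig, command: string[]): string[] {
  const args: string[] = ["run", `--network=${config.networkMode}`];
  if (config.readOnly) args.push("--read-only");
  args.push(`--memory=${config.memoryLimit}`);
  if (config.devices?.gpu) args.push("--gpus", "all");
  return [...args, config.image, ...command];
}

const containerConfig: ContainerConfig = {
  enabled: true,
  image: "python:3.12-slim",
  networkMode: "none",
  readOnly: true,
  memoryLimit: "512m",
  devices: { gpu: true },
};

// "train.py" is a placeholder command for illustration.
const runArgs = dockerRunArgs(containerConfig, ["python", "train.py"]);
```

Returning an argument array instead of a command string means the result can be passed to spawnSync without a shell, consistent with how the capability tools are described.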
Every action recorded as a hash-chained entry. Tampering breaks the chain.
Step index, tool, input parameters, output (truncated), duration, status, SHA-256 hash
Aggregate metrics: total steps, duration, success/fail counts
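The idea behind a hash-chained trace can be sketched in a few lines. Each entry's SHA-256 hash covers its own fields plus the previous entry's hash, so editing any earlier entry invalidates every hash after it. The entry shape below is a reduced, hypothetical version of the fields listed above, and the all-zero genesis value is an assumption.

```typescript
import { createHash } from "node:crypto";

// Reduced, hypothetical entry shape; the real trace records more fields.
interface TraceEntry {
  stepIndex: number;
  tool: string;
  status: "success" | "fail";
  prevHash: string;
  hash: string;
}

// Assumed starting value for the first entry's prevHash.
const GENESIS = "0".repeat(64);

function entryHash(stepIndex: number, tool: string, status: string, prevHash: string): string {
  return createHash("sha256")
    .update(JSON.stringify([stepIndex, tool, status, prevHash]))
    .digest("hex");
}

// Append a new entry whose hash commits to the previous entry's hash.
function appendEntry(trace: TraceEntry[], tool: string, status: "success" | "fail"): TraceEntry[] {
  const last = trace[trace.length - 1];
  const prevHash = last === undefined ? GENESIS : last.hash;
  const stepIndex = trace.length;
  const hash = entryHash(stepIndex, tool, status, prevHash);
  return [...trace, { stepIndex, tool, status, prevHash, hash }];
}

// Recompute every hash and check each link back to the genesis value.
function verifyTrace(trace: TraceEntry[]): boolean {
  let prevHash = GENESIS;
  for (const entry of trace) {
    if (entry.prevHash !== prevHash) return false;
    if (entry.hash !== entryHash(entry.stepIndex, entry.tool, entry.status, entry.prevHash)) return false;
    prevHash = entry.hash;
  }
  return true;
}
```

A chain like this detects edits to recorded entries but not replacement of the whole trace, so the latest hash would need to be stored or anchored somewhere the agent cannot write.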
npx mcp-plan-validation init --client claude --policy dev
--client claude
--client copilot
--client cursor
--client all configures all three simultaneously. Policy file is shared.
Auto-detects from existing config directories when --client is omitted.
| Template | Bash | Network | Container | Step Limit | Best For |
|---|---|---|---|---|---|
| strict | Capability tools only | Blocked | No | 30 | Code review, docs |
| dev | Policy-checked | Dev ports | Optional | 50 | Daily development |
| ml | Policy-checked | Blocked | GPU, 16GB | 100 | ML training |
| ci | Limited commands | Blocked | Read-only | 20 | CI/CD pipelines |
Each template is a starting point. Customize by editing .plan-validation-policy.json.
Integration tests use the Anthropic Agent SDK to drive Claude Code through the full pipeline.
| Limitation | Impact | Mitigation |
|---|---|---|
| Bash analysis is undecidable | python -c bypasses string checks | Container sandbox (kernel enforcement) |
| No semantic predicates | Can't verify "function exists" | File-content predicates; AST parsing planned |
| Container needs Docker | Not everywhere | Capability tools work without Docker |
| Soft enforcement | Model could use raw tools if exposed | Configure client to only expose MCP tools |
| Trace output truncated | Large outputs cut to 500 chars | Sufficient for audit; full data in bindings |
| Question | Answer |
|---|---|
| Can we restrict agents? | Yes — org policy controls tools, paths, commands, network, limits |
| Prevent credential exposure? | Yes — denied patterns make sensitive files invisible |
| Audit trail? | Yes — hash-chained trace with integrity verification |
| Works with existing tools? | Yes — Claude Code, Copilot, Cursor with one command |
| Who controls policy? | Your team — JSON file in the repo, not the AI |
| What if AI breaks rules? | Blocked in real-time, logged, plan violations abort |
| How hard to adopt? | npx mcp-plan-validation init --policy dev |