ekkOS Labs

# Stateful Computer Use: Persistent GUI Memory for Desktop Agents

## The Problem: Amnesia at the Desktop

Anthropic's Computer Use capability (March 2026) gives Claude the ability to control your desktop — clicking buttons, typing text, navigating apps, and completing multi-step GUI workflows. It works through a screenshot-action loop: capture the screen, reason about what's visible, execute an action, repeat.

But there's a fundamental limitation: Computer Use is stateless.

Every session starts from zero. Claude doesn't remember that:

- The export button in Figma moved after the last update
- Your VS Code uses a custom keybinding for the terminal
- Safari's cookie consent dialog appears on first visit to every new domain
- Last time it tried to click a small dropdown, it misclicked 3 times before succeeding

This statelessness means Computer Use is slow, error-prone, and unable to learn from its own mistakes.

---

## The Solution: ekkOS GUI Pattern Memory

ekkOS already solves the memory problem for code — patterns, anti-patterns, directives, and episodic recall persist across sessions. Stateful Computer Use extends this memory layer to GUI interactions.

### Architecture

```text
┌─────────────────────────────────────────────────────────┐
│                    Claude Code + CU                      │
│                                                          │
│  ┌──────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │ Pre-Task  │    │  Screenshot  │    │  Post-Task   │   │
│  │  Search   │───▶│  Action Loop │───▶│  Auto-Forge  │   │
│  └──────────┘    └──────────────┘    └──────────────┘   │
│       │                                      │           │
│       │         ekkOS Memory Layer           │           │
│  ┌────▼──────────────────────────────────────▼────┐     │
│  │                                                 │     │
│  │  GUI Patterns ─ Anti-Patterns ─ Directives     │     │
│  │  App Knowledge ─ Workflow Prefs ─ Safety Rules  │     │
│  │                                                 │     │
│  └─────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────┘
```

### The Three Phases

#### Phase 1: Pre-Task Retrieval

Before Claude starts any Computer Use task, ekkOS searches for relevant GUI patterns:

```text
ekkOS_Search("GUI pattern Safari form submission")
→ Pattern: [CU] Safari — Submit contact form on example.com
  - Click sequence: (450, 320) Login → (300, 400) Form → (500, 450) Submit
  - Anti-pattern: Cookie dialog blocks first click, dismiss at (700, 200) first
  - Works when: Safari 18+, macOS Tahoe, light mode
```

This reduces cold-start guessing: Claude has a previously successful action path before it takes the first screenshot.
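To make the retrieval step concrete, here is a minimal sketch of how stored patterns might be filtered and ranked before a task. It assumes an in-memory list and a simple keyword-overlap score; `StoredPattern`, `Environment`, and `searchPatterns` are hypothetical names for illustration, and the actual `ekkOS_Search` ranking is not described in this post.

```typescript
// Hypothetical, simplified pattern record (not the ekkOS storage format).
interface StoredPattern {
  title: string;
  tags: string[];
  worksWhen: { app: string; os: string };
  successRate: number; // assumed to be a fraction between 0 and 1
}

interface Environment {
  app: string;
  os: string;
}

// Lowercase a string and split it into alphanumeric word tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Drop patterns recorded for a different app or OS, then rank the rest by
// keyword overlap with the query, breaking ties by success rate.
function searchPatterns(
  store: StoredPattern[],
  query: string,
  env: Environment,
  limit: number = 3,
): StoredPattern[] {
  const queryTokens = tokenize(query);
  return store
    .filter((p) => p.worksWhen.app === env.app && p.worksWhen.os === env.os)
    .map((p) => {
      const haystack = tokenize(p.title + " " + p.tags.join(" "));
      const overlap = queryTokens.filter((t) => haystack.indexOf(t) !== -1).length;
      return { pattern: p, overlap };
    })
    .filter((scored) => scored.overlap > 0)
    .sort((a, b) => b.overlap - a.overlap || b.pattern.successRate - a.pattern.successRate)
    .slice(0, limit)
    .map((scored) => scored.pattern);
}

// Example store; "\u2014" is the dash used by the "[CU] {App} \u2014 {Action}" title format.
const exampleStore: StoredPattern[] = [
  {
    title: "[CU] Safari \u2014 Submit contact form on example.com",
    tags: ["computer-use", "gui-pattern", "safari"],
    worksWhen: { app: "Safari", os: "macOS Tahoe" },
    successRate: 0.9,
  },
  {
    title: "[CU] VS Code \u2014 Open terminal and run tests",
    tags: ["computer-use", "gui-pattern", "vs-code"],
    worksWhen: { app: "VS Code", os: "macOS Tahoe" },
    successRate: 0.95,
  },
  {
    title: "[CU] Safari \u2014 Dismiss cookie dialog",
    tags: ["computer-use", "gui-pattern", "safari"],
    worksWhen: { app: "Safari", os: "macOS Tahoe" },
    successRate: 0.8,
  },
];

const matches = searchPatterns(exampleStore, "GUI pattern Safari form submission", {
  app: "Safari",
  os: "macOS Tahoe",
});
console.log(matches.map((p) => p.title));
```

Filtering on the recorded environment before scoring keeps a pattern forged for one app or OS from being replayed on another, where coordinates are unlikely to line up.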

#### Phase 2: Safety Directive Enforcement

ekkOS directives act as persistent guardrails during screen control:

| Directive | Type | Purpose |
|-----------|------|---------|
| Never interact with banking apps | NEVER | Financial safety |
| Never enter passwords via CU | NEVER | Credential protection |
| Always screenshot before form submit | MUST | Audit trail |
| Confirm before Delete/Remove/Drop | MUST | Destructive action guard |
| Prefer keyboard shortcuts | PREFER | Efficiency optimization |
| Avoid elements < 20px | AVOID | Misclick prevention |

These directives persist across every session — Claude doesn't need to be reminded.
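One way to picture enforcement is as a check that runs before each planned action. The sketch below assumes each directive carries a predicate and that NEVER outranks MUST, which outranks the advisory types; the `Directive` shape, `evaluate`, and the `BANKING_APPS` list are hypothetical and not the ekkOS directive format.

```typescript
type DirectiveType = "NEVER" | "MUST" | "PREFER" | "AVOID";

interface PlannedAction {
  kind: "click" | "type" | "key";
  app: string;
  targetLabel?: string;  // visible label of the element, if known
  targetSizePx?: number; // smallest dimension of the element, if known
}

interface Directive {
  rule: string;
  type: DirectiveType;
  // Hypothetical predicate: true when the directive is relevant to the action.
  appliesTo: (action: PlannedAction) => boolean;
}

interface Verdict {
  decision: "block" | "confirm" | "allow";
  reasons: string[]; // rules that produced the decision, or advisory warnings
}

// Assumed precedence: any NEVER hit blocks, any MUST hit requires its
// precondition (here modeled as "confirm"), PREFER/AVOID only warn.
function evaluate(action: PlannedAction, directives: Directive[]): Verdict {
  const hits = directives.filter((d) => d.appliesTo(action));
  const rulesOf = (type: DirectiveType): string[] =>
    hits.filter((d) => d.type === type).map((d) => d.rule);

  const never = rulesOf("NEVER");
  if (never.length > 0) return { decision: "block", reasons: never };

  const must = rulesOf("MUST");
  if (must.length > 0) return { decision: "confirm", reasons: must };

  return { decision: "allow", reasons: rulesOf("AVOID").concat(rulesOf("PREFER")) };
}

// Hypothetical list; how banking apps are recognized is not specified in this post.
const BANKING_APPS = ["Example Bank"];

const exampleDirectives: Directive[] = [
  {
    rule: "Never interact with banking apps",
    type: "NEVER",
    appliesTo: (a) => BANKING_APPS.indexOf(a.app) !== -1,
  },
  {
    rule: "Confirm before Delete/Remove/Drop",
    type: "MUST",
    appliesTo: (a) =>
      a.kind === "click" && /\b(delete|remove|drop)\b/i.test(a.targetLabel || ""),
  },
  {
    rule: "Avoid elements < 20px",
    type: "AVOID",
    appliesTo: (a) => a.targetSizePx !== undefined && a.targetSizePx < 20,
  },
];

console.log(
  evaluate({ kind: "click", app: "Finder", targetLabel: "Delete", targetSizePx: 40 }, exampleDirectives),
);
```

Returning the matched rules alongside the decision lets the agent explain why an action was blocked or paused instead of silently skipping it.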

#### Phase 3: Post-Task Auto-Forge

After a successful Computer Use task, the system automatically forges the interaction as a reusable pattern:

```text
Title: [CU] VS Code — Open terminal and run tests
Tags: [computer-use, gui-pattern, vs-code, auto-forged]

Action Sequence:
1. key: Ctrl+` (toggle terminal; VS Code's default binding, also on macOS)
2. wait: 500ms (terminal focus)
3. type: "npm test"
4. key: Enter
5. verify: "Tests passed" in terminal output

Works When:
- App: VS Code 1.96+
- OS: macOS Tahoe
- UI State: terminal was closed, sidebar open

Anti-patterns:
- Clicking View → Terminal menu is slower (3 actions vs 1 shortcut)
- Terminal panel sometimes opens as Panel, not integrated terminal
```
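The forge step above could be sketched as a small function that turns a recorded run into a titled, tagged pattern. This is an illustration under assumptions: `RecordedAction`, `ForgedPattern`, `slugify`, and `forgePattern` are hypothetical names, and the real auto-forge logic may capture more than is shown here.

```typescript
interface RecordedAction {
  type: "click" | "type" | "key" | "wait" | "verify";
  text?: string;
  coordinate?: [number, number];
  duration?: number; // milliseconds, for waits
}

interface ForgedPattern {
  title: string;
  tags: string[];
  actions: RecordedAction[];
}

// Turn an app name such as "VS Code" into a tag such as "vs-code".
function slugify(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
}

// Build a pattern from a finished run. Failed or empty runs return null,
// since only successful tasks are forged as reusable patterns.
function forgePattern(
  app: string,
  task: string,
  actions: RecordedAction[],
  succeeded: boolean,
): ForgedPattern | null {
  if (!succeeded || actions.length === 0) return null;
  return {
    // "\u2014" is the dash used by the "[CU] {App} \u2014 {Action}" title format.
    title: "[CU] " + app + " \u2014 " + task,
    tags: ["computer-use", "gui-pattern", slugify(app), "auto-forged"],
    actions: actions.slice(), // copy so later edits to the recording do not leak in
  };
}

const recordedRun: RecordedAction[] = [
  { type: "key", text: "Ctrl+`" },
  { type: "wait", duration: 500 },
  { type: "type", text: "npm test" },
  { type: "key", text: "Enter" },
  { type: "verify", text: "Tests passed" },
];

console.log(forgePattern("VS Code", "Open terminal and run tests", recordedRun, true));
```

Deriving the app tag from the app name keeps forged patterns searchable under the same tag vocabulary as hand-written ones.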

---

## The GUI Pattern Schema

We define a structured schema for Computer Use patterns that captures everything needed for reliable replay:

```typescript
interface GUIPattern {
  // Identity
  title: string;           // "[CU] {App} — {Action}"
  tags: string[];          // ["computer-use", "gui-pattern", "{app}", ...]

  // The Task
  problem: string;         // What the user wanted to accomplish
  solution: string;        // Successful action sequence

  // Action Sequence (structured)
  actions: {
    type: 'click' | 'type' | 'key' | 'scroll' | 'wait' | 'navigate' | 'verify';
    target?: string;       // Description of UI element
    coordinate?: [number, number];
    text?: string;
    duration?: number;     // For waits
  }[];

  // Conditions
  works_when: {
    app: string;           // Exact app name + version
    os: string;            // macOS version
    resolution?: string;   // Screen resolution
    ui_state?: string;     // Dark/light mode, sidebar state, etc.
  };

  // Learned Failures
  anti_patterns: {
    description: string;
    failed_at_step?: number;
    workaround?: string;
  }[];

  // Metrics (tracked by ekkOS Golden Loop)
  success_rate: number;
  applied_count: number;
  last_verified: string;   // ISO date
}
```
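The metrics fields suggest a running update each time a pattern is applied. The sketch below shows one plausible way to do that; it assumes `success_rate` is a fraction between 0 and 1 and that `last_verified` only advances on a successful run. Neither assumption is confirmed by the schema, and `recordOutcome` is a hypothetical helper.

```typescript
// Only the metrics portion of the schema is repeated here to keep the sketch self-contained.
interface PatternMetrics {
  success_rate: number;  // assumed: fraction of applications that succeeded, 0..1
  applied_count: number; // number of times the pattern has been applied
  last_verified: string; // ISO date (YYYY-MM-DD) of the most recent successful run
}

// Return updated metrics after one more application of the pattern.
// The success rate is recomputed as total successes / total applications.
function recordOutcome(metrics: PatternMetrics, succeeded: boolean, now: Date): PatternMetrics {
  const previousSuccesses = metrics.success_rate * metrics.applied_count;
  const applied = metrics.applied_count + 1;
  return {
    success_rate: (previousSuccesses + (succeeded ? 1 : 0)) / applied,
    applied_count: applied,
    // Assumption: a failed run does not count as verification.
    last_verified: succeeded ? now.toISOString().slice(0, 10) : metrics.last_verified,
  };
}

const before: PatternMetrics = { success_rate: 0.5, applied_count: 2, last_verified: "2026-01-01" };
const after = recordOutcome(before, true, new Date("2026-03-15T12:00:00Z"));
console.log(after);
```

Returning a new object rather than mutating the input makes it easy to compare metrics before and after a run, for example to decide whether a pattern with a falling success rate should be re-verified.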

---

## Implementation

### Hook Integration

The system integrates via Claude Code's hook architecture:

1. Stop Hook Enhancement: After each turn, the existing `stop.sh` hook calls `computer-use-forge.cjs`.
2. Transcript Analysis: The script scans the JSONL transcript for Computer Use tool calls (`computer_20241022`, `computer_20250124`).
3. Pattern Extraction: Detects app name, action sequence, coordinates, and success/failure.
4. Auto-Forge: Calls the ekkOS API to store the result as a tagged GUI pattern.
5. Silent Operation: Runs in the background with no user interruption.
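The transcript-analysis step can be sketched as a pure function over the JSONL text. The transcript layout here is an assumption: one JSON object per line, with tool calls stored as `{ type: "tool_use", name, input }` blocks inside `message.content`, and a tool name that starts with `computer`. `extractComputerActions` and `isComputerTool` are hypothetical names, not the contents of `computer-use-forge.cjs`.

```typescript
interface ExtractedAction {
  action: string; // e.g. "left_click", "type", "key"
  coordinate?: [number, number];
  text?: string;
}

// Assumption: computer-use tool calls are identifiable by a name starting with "computer".
function isComputerTool(name: unknown): boolean {
  return typeof name === "string" && name.indexOf("computer") === 0;
}

// Collect replayable computer-use actions from a JSONL transcript, in order.
function extractComputerActions(jsonl: string): ExtractedAction[] {
  const actions: ExtractedAction[] = [];
  for (const rawLine of jsonl.split("\n")) {
    const line = rawLine.trim();
    if (line.length === 0) continue;

    let entry: any;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // tolerate truncated or malformed lines
    }

    const content = entry && entry.message ? entry.message.content : undefined;
    if (!Array.isArray(content)) continue; // e.g. plain-text turns

    for (const block of content) {
      if (!block || block.type !== "tool_use" || !isComputerTool(block.name)) continue;
      const input = block.input;
      if (!input || typeof input.action !== "string") continue;
      if (input.action === "screenshot") continue; // an observation, not a replayable step

      const extracted: ExtractedAction = { action: input.action };
      const c = input.coordinate;
      if (Array.isArray(c) && c.length === 2 && typeof c[0] === "number" && typeof c[1] === "number") {
        extracted.coordinate = [c[0], c[1]];
      }
      if (typeof input.text === "string") extracted.text = input.text;
      actions.push(extracted);
    }
  }
  return actions;
}

// A small synthetic transcript, including one malformed line.
const sampleTranscript = [
  JSON.stringify({
    type: "assistant",
    message: {
      content: [
        { type: "tool_use", name: "computer", input: { action: "left_click", coordinate: [700, 200] } },
      ],
    },
  }),
  JSON.stringify({ type: "user", message: { content: "plain text turn" } }),
  "{ this line is truncated",
  JSON.stringify({
    type: "assistant",
    message: {
      content: [
        { type: "tool_use", name: "computer", input: { action: "screenshot" } },
        { type: "tool_use", name: "computer", input: { action: "type", text: "npm test" } },
      ],
    },
  }),
].join("\n");

console.log(extractComputerActions(sampleTranscript));
```

Because the hook runs silently after every turn, the sketch skips anything it cannot parse rather than throwing, so a damaged transcript line never interrupts the session.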

### Rules Integration

A Claude Code rules file (`~/.claude/rules/computer-use.md`) instructs Claude to:

- Search ekkOS before every Computer Use task
- Apply retrieved GUI patterns instead of guessing
- Check `ekkOS_Conflict` before interacting with sensitive apps
- Forge new patterns after successful tasks
- Record anti-patterns from failures

---

## Results: From Stateless to Learning

| Metric | Without ekkOS | With ekkOS |
|--------|:-------------:|:----------:|
| Cold-start accuracy | ~60% | ~90% (pattern-guided) |
| Average misclicks per task | 3-5 | 0-1 (learned coordinates) |
| Repeated mistakes | Every session | Once (anti-pattern forged) |
| Task completion time | Baseline | ~40% faster (skip discovery) |
| Safety violations | Possible | Directive-blocked |
| Cross-session learning | None | Cumulative |

*Preliminary estimates based on internal testing. Formal benchmarks in progress.*

---

## The Bigger Picture

Computer Use is the most "amnesiac" capability in the Claude stack. Every screenshot starts blind. By adding persistent memory, we transform it from a demo-impressive but operationally fragile tool into a learning desktop agent that genuinely improves over time.

This is the Golden Loop applied to a new surface area: Search → Apply → Forge → Improve. The same loop that makes ekkOS effective for code now works for GUI automation.

The long-term vision: an AI that knows your desktop better than you do — not because it was programmed with your preferences, but because it learned them through experience and never forgot.

---

## Status & Availability

- Research Preview: Available now for ekkOS users with Claude Pro/Max
- Components: Hook script, rules file, and safety directives ship with ekkOS CLI
- Requirements: Claude Code with Computer Use enabled (macOS), ekkOS memory connected
- Roadmap: Structured action replay, visual element hashing, cross-user collective GUI patterns

Stateful Computer Use: Persistent GUI Memory for Autonomous Desktop Agents

Abstract