oc

Signed-off-by: Christoph Schmatzler <christoph@schmatzler.com>
2026-02-04 20:04:32 +00:00
parent 13586f5c44
commit ff8650bedf
19 changed files with 2448 additions and 0 deletions
--- a/profiles/opencode/skill/overseer/SKILL.md
+++ b/profiles/opencode/skill/overseer/SKILL.md
@@ -0,0 +1,191 @@
+---
+name: overseer
+description: Manage tasks via Overseer codemode MCP. Use when tracking multi-session work, breaking down implementation, or persisting context for handoffs.
+license: MIT
+metadata:
+  author: dmmulroy
+  version: "1.0.0"
+---
+
+# Agent Coordination with Overseer
+
+## Core Principle: Tickets, Not Todos
+
+Overseer tasks are **tickets** - structured artifacts with comprehensive context:
+
+- **Description**: One-line summary (issue title)
+- **Context**: Full background, requirements, approach (issue body)
+- **Result**: Implementation details, decisions, outcomes (PR description)
+
+Think: "Would someone understand the what, why, and how, and what success looks like, from this task alone?"
+
+## Task IDs are Ephemeral
+
+**Never reference task IDs in external artifacts** (commits, PRs, docs). Task IDs like `task_01JQAZ...` become meaningless once tasks complete. Describe the work itself, not the task that tracked it.
+
+## Overseer vs OpenCode's TodoWrite
+
+|                 | Overseer                               | TodoWrite    |
+| --------------- | -------------------------------------- | ------------ |
+| **Persistence** | SQLite database                        | Session-only |
+| **Context**     | Rich (description + context + result)  | Basic        |
+| **Hierarchy**   | 3-level (milestone -> task -> subtask) | Flat         |
+
+Use **Overseer** for persistent work. Use **TodoWrite** for ephemeral in-session tracking only.
+
+## When to Use Overseer
+
+**Use Overseer when:**
+- Breaking down complexity into subtasks
+- Work spans multiple sessions
+- Context needs to persist for handoffs
+- Recording decisions for future reference
+
+**Skip Overseer when:**
+- Work is a single atomic action
+- Everything fits in one message exchange
+- Overhead exceeds value
+- TodoWrite is sufficient
+
+## Finding Work
+
+```javascript
+// Get next ready task with full context (recommended for work sessions)
+const task = await tasks.nextReady(milestoneId); // TaskWithContext | null
+if (!task) {
+  console.log("No ready tasks");
+  return;
+}
+
+// Get all ready tasks (for progress overview)
+const readyTasks = await tasks.list({ ready: true }); // Task[]
+```
+
+- **Use `nextReady()`** when starting work. It returns `TaskWithContext | null`: the deepest ready leaf, with its full context chain and inherited learnings.
+- **Use `list({ ready: true })`** for status and progress checks. It returns `Task[]` without the context chain.
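To make "deepest ready leaf" concrete, here is a minimal sketch over plain in-memory objects. It is an illustration of the phrase, not Overseer's actual selection logic: the function name `pickDeepestReadyLeaf`, the sample data, and the tie-breaking by `createdAt` are all assumptions.

```javascript
// Illustration only: one plausible reading of "deepest ready leaf".
// Not Overseer's implementation; tie-breaking by createdAt is an assumption.
function pickDeepestReadyLeaf(allTasks) {
  // IDs that are the parent of at least one incomplete task
  const hasOpenChild = new Set(
    allTasks.filter((t) => !t.completed && t.parentId).map((t) => t.parentId)
  );
  // Ready leaves: not completed, not blocked, no incomplete children
  const candidates = allTasks.filter(
    (t) => !t.completed && !t.effectivelyBlocked && !hasOpenChild.has(t.id)
  );
  // Prefer the deepest task; fall back to creation order for stable output
  candidates.sort(
    (a, b) => b.depth - a.depth || a.createdAt.localeCompare(b.createdAt)
  );
  return candidates[0] ?? null;
}

// Hypothetical sample data using fields from the Task shape
const sample = [
  { id: "m1", parentId: null, depth: 0, completed: false, effectivelyBlocked: false, createdAt: "2026-01-01T00:00:00Z" },
  { id: "t1", parentId: "m1", depth: 1, completed: false, effectivelyBlocked: false, createdAt: "2026-01-02T00:00:00Z" },
  { id: "s1", parentId: "t1", depth: 2, completed: true, effectivelyBlocked: false, createdAt: "2026-01-03T00:00:00Z" },
  { id: "s2", parentId: "t1", depth: 2, completed: false, effectivelyBlocked: false, createdAt: "2026-01-04T00:00:00Z" },
  { id: "t2", parentId: "m1", depth: 1, completed: false, effectivelyBlocked: true, createdAt: "2026-01-05T00:00:00Z" },
];

const next = pickDeepestReadyLeaf(sample);
console.log(next ? next.id : "No ready tasks");
```

In this sample the milestone and `t1` are skipped because they still have open children, and `t2` is skipped because it is blocked, which leaves the subtask `s2`.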
+
+## Basic Workflow
+
+```javascript
+// 1. Get next ready task (returns TaskWithContext | null)
+const task = await tasks.nextReady();
+if (!task) return "No ready tasks";
+
+// 2. Review context (available on TaskWithContext)
+console.log(task.context.own);       // This task's context
+console.log(task.context.parent);    // Parent's context (if depth > 0)
+console.log(task.context.milestone); // Root milestone context (if depth > 1)
+console.log(task.learnings.own);     // Learnings attached to this task (bubbled from children)
+
+// 3. Start work (VCS required - creates bookmark, records start commit)
+await tasks.start(task.id);
+
+// 4. Implement...
+
+// 5. Complete with learnings (VCS required - commits changes, bubbles learnings to parent)
+await tasks.complete(task.id, {
+  result: "Implemented login endpoint with JWT tokens",
+  learnings: ["bcrypt rounds should be 12 for production"]
+});
+```
+
+See @file references/workflow.md for detailed workflow guidance.
+
+## Understanding Task Context
+
+Tasks have **progressive context** - inherited from ancestors:
+
+```javascript
+const task = await tasks.get(taskId); // Returns TaskWithContext
+// task.context.own      - this task's context (always present)
+// task.context.parent   - parent task's context (if depth > 0)
+// task.context.milestone - root milestone's context (if depth > 1)
+
+// Task's own learnings (bubbled from completed children)
+// task.learnings.own - learnings attached to this task
+```
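As a rough sketch of how the progressive context might be consumed, the helper below flattens the chain into a single briefing, outermost scope first. `renderBriefing` is a hypothetical name, not part of Overseer, and the sample object only mimics the fields described above.

```javascript
// Hypothetical helper (not part of Overseer): flatten the context chain of a
// TaskWithContext-shaped object into one readable briefing.
function renderBriefing(task) {
  const sections = [];
  // Outermost scope first, so the reader sees the big picture before details
  if (task.context.milestone) sections.push(`Milestone: ${task.context.milestone}`);
  if (task.context.parent) sections.push(`Parent: ${task.context.parent}`);
  sections.push(`Task: ${task.context.own}`);
  for (const learning of task.learnings.own) {
    sections.push(`Learning: ${learning.content}`);
  }
  return sections.join("\n");
}

// Sample data shaped like the object described above (values are made up)
const subtask = {
  description: "Add token verification function",
  context: {
    own: "Verify JWT signature and expiration",
    parent: "Implement JWT middleware",
    milestone: "Add user authentication system",
  },
  learnings: { own: [{ content: "Use jose for JWT" }] },
};

console.log(renderBriefing(subtask));
```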
+
+## Return Type Summary
+
+| Method | Returns | Notes |
+|--------|---------|-------|
+| `tasks.get(id)` | `TaskWithContext` | Full context chain + inherited learnings |
+| `tasks.nextReady()` | `TaskWithContext \| null` | Deepest ready leaf with full context |
+| `tasks.list()` | `Task[]` | Basic task fields only |
+| `tasks.create()` | `Task` | No context chain |
+| `tasks.start/complete()` | `Task` | No context chain |
+
+## Blockers
+
+Blockers prevent a task from being ready until the blocker completes.
+
+**Constraints:**
+- Blockers cannot be self
+- Blockers cannot be ancestors (parent, grandparent, etc.)
+- Blockers cannot be descendants
+- Creating/reparenting with invalid blockers is rejected
+
+```javascript
+// Add blocker - taskA waits for taskB
+await tasks.block(taskA.id, taskB.id);
+
+// Remove blocker
+await tasks.unblock(taskA.id, taskB.id);
+```
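The constraints above can be sketched as a small check over an in-memory parent map. This is only an illustration of the rules as stated; the names `parentOf`, `isAncestor`, and `blockerProblem` are hypothetical, and Overseer's own validation may differ.

```javascript
// Illustration of the blocker constraints over an in-memory map.
// `parentOf` maps task ID -> parent ID (or null); all names are hypothetical.
function isAncestor(parentOf, ancestorId, taskId) {
  let current = parentOf[taskId];
  while (current) {
    if (current === ancestorId) return true;
    current = parentOf[current];
  }
  return false;
}

// Returns the violated constraint, or null when the blocker is acceptable
function blockerProblem(parentOf, taskId, blockerId) {
  if (taskId === blockerId) return "self";
  if (isAncestor(parentOf, blockerId, taskId)) return "ancestor";
  if (isAncestor(parentOf, taskId, blockerId)) return "descendant";
  return null;
}

const parentOf = {
  milestone: null,
  taskA: "milestone",
  taskB: "milestone",
  subA1: "taskA",
};

console.log(blockerProblem(parentOf, "taskA", "taskA"));     // self
console.log(blockerProblem(parentOf, "subA1", "milestone")); // ancestor
console.log(blockerProblem(parentOf, "taskA", "subA1"));     // descendant
console.log(blockerProblem(parentOf, "taskA", "taskB"));     // null
```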
+
+## Task Hierarchies
+
+Three levels: **Milestone** (depth 0) -> **Task** (depth 1) -> **Subtask** (depth 2).
+
+| Level | Name | Purpose | Example |
+|-------|------|---------|---------|
+| 0 | **Milestone** | Large initiative | "Add user authentication system" |
+| 1 | **Task** | Significant work item | "Implement JWT middleware" |
+| 2 | **Subtask** | Atomic step | "Add token verification function" |
+
+**Choosing the right level:**
+- Small feature (1-2 files) -> Single task
+- Medium feature (3-7 steps) -> Task with subtasks
+- Large initiative (5+ tasks) -> Milestone with tasks
+
+See @file references/hierarchies.md for detailed guidance.
+
+## Recording Results
+
+Complete tasks **immediately after implementing AND verifying**:
+- Capture decisions while fresh
+- Note deviations from plan
+- Document verification performed
+- Create follow-up tasks for tech debt
+
+Your result must include explicit verification evidence. See @file references/verification.md.
+
+## Best Practices
+
+1. **Right-size tasks**: Completable in one focused session
+2. **Clear completion criteria**: Context should define "done"
+3. **Don't over-decompose**: 3-7 children per parent
+4. **Action-oriented descriptions**: Start with verbs ("Add", "Fix", "Update")
+5. **Verify before completing**: Tests passing, manual testing done
+
+---
+
+## Reading Order
+
+| Task | File |
+|------|------|
+| Understanding API | @file references/api.md |
+| Implementation workflow | @file references/workflow.md |
+| Task decomposition | @file references/hierarchies.md |
+| Good/bad examples | @file references/examples.md |
+| Verification checklist | @file references/verification.md |
+
+## In This Reference
+
+| File | Purpose |
+|------|---------|
+| `references/api.md` | Overseer MCP codemode API types/methods |
+| `references/workflow.md` | Start->implement->complete workflow |
+| `references/hierarchies.md` | Milestone/task/subtask organization |
+| `references/examples.md` | Good/bad context and result examples |
+| `references/verification.md` | Verification checklist and process |
--- a/profiles/opencode/skill/overseer/references/api.md
+++ b/profiles/opencode/skill/overseer/references/api.md
@@ -0,0 +1,192 @@
+# Overseer Codemode MCP API
+
+Execute JavaScript code to interact with Overseer task management.
+
+## Task Interface
+
+```typescript
+// Basic task - returned by list(), create(), start(), complete()
+// Note: Does NOT include context or learnings fields
+interface Task {
+  id: string;
+  parentId: string | null;
+  description: string;
+  priority: 1 | 2 | 3 | 4 | 5;
+  completed: boolean;
+  completedAt: string | null;
+  startedAt: string | null;
+  createdAt: string;            // ISO 8601
+  updatedAt: string;
+  result: string | null;        // Completion notes
+  commitSha: string | null;     // Auto-populated on complete
+  depth: 0 | 1 | 2;             // 0=milestone, 1=task, 2=subtask
+  blockedBy?: string[];         // Blocking task IDs (omitted if empty)
+  blocks?: string[];            // Tasks this blocks (omitted if empty)
+  bookmark?: string;            // VCS bookmark name (if started)
+  startCommit?: string;         // Commit SHA at start
+  effectivelyBlocked: boolean;  // True if task OR ancestor has incomplete blockers
+}
+
+// Task with full context - returned by get(), nextReady()
+interface TaskWithContext extends Task {
+  context: {
+    own: string;              // This task's context
+    parent?: string;          // Parent's context (depth > 0)
+    milestone?: string;       // Root milestone's context (depth > 1)
+  };
+  learnings: {
+    own: Learning[];          // This task's learnings (bubbled from completed children)
+    parent: Learning[];       // Parent's learnings (depth > 0)
+    milestone: Learning[];    // Milestone's learnings (depth > 1)
+  };
+}
+
+// Task tree structure - returned by tree()
+interface TaskTree {
+  task: Task;
+  children: TaskTree[];
+}
+
+// Progress summary - returned by progress()
+interface TaskProgress {
+  total: number;
+  completed: number;
+  ready: number;     // !completed && !effectivelyBlocked
+  blocked: number;   // !completed && effectivelyBlocked
+}
+
+// Task type alias for depth filter
+type TaskType = "milestone" | "task" | "subtask";
+```
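To illustrate how `effectivelyBlocked` relates to the `TaskProgress` counts, here is a minimal sketch that derives both from plain objects. Field names follow the interfaces above; the traversal, the helper names, and the treatment of unknown blocker IDs are assumptions rather than Overseer's implementation.

```javascript
// Sketch only: derive effectivelyBlocked and TaskProgress-style counts.
// A task is effectively blocked if it, or any ancestor, has an incomplete blocker.
function isEffectivelyBlocked(byId, task) {
  for (let t = task; t; t = t.parentId ? byId.get(t.parentId) : undefined) {
    // Assumption: a blocker ID that cannot be found counts as incomplete
    const open = (t.blockedBy ?? []).some((id) => !byId.get(id)?.completed);
    if (open) return true;
  }
  return false;
}

function summarize(tasks) {
  const byId = new Map(tasks.map((t) => [t.id, t]));
  const progress = { total: tasks.length, completed: 0, ready: 0, blocked: 0 };
  for (const t of tasks) {
    if (t.completed) progress.completed += 1;
    else if (isEffectivelyBlocked(byId, t)) progress.blocked += 1;
    else progress.ready += 1;
  }
  return progress;
}

// Hypothetical data: "deploy" waits on "test", and "smoke" inherits that block
const sampleTasks = [
  { id: "m", parentId: null, completed: false },
  { id: "build", parentId: "m", completed: true },
  { id: "test", parentId: "m", completed: false },
  { id: "deploy", parentId: "m", completed: false, blockedBy: ["test"] },
  { id: "smoke", parentId: "deploy", completed: false },
];

console.log(summarize(sampleTasks));
```

Here `smoke` has no blockers of its own but is counted as blocked because its parent `deploy` is, giving 1 completed, 2 ready, and 2 blocked out of 5.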
+
+## Learning Interface
+
+```typescript
+interface Learning {
+  id: string;
+  taskId: string;
+  content: string;
+  sourceTaskId: string | null;
+  createdAt: string;
+}
+```
+
+## Tasks API
+
+```typescript
+declare const tasks: {
+  list(filter?: { 
+    parentId?: string; 
+    ready?: boolean; 
+    completed?: boolean;
+    depth?: 0 | 1 | 2;    // 0=milestones, 1=tasks, 2=subtasks
+    type?: TaskType;      // Alias: "milestone"|"task"|"subtask" (mutually exclusive with depth)
+  }): Promise<Task[]>;
+  get(id: string): Promise<TaskWithContext>;
+  create(input: {
+    description: string;
+    context?: string;
+    parentId?: string;
+    priority?: 1 | 2 | 3 | 4 | 5;  // Optional; must be 1-5 when provided
+    blockedBy?: string[];
+  }): Promise<Task>;
+  update(id: string, input: {
+    description?: string;
+    context?: string;
+    priority?: 1 | 2 | 3 | 4 | 5;
+    parentId?: string;
+  }): Promise<Task>;
+  start(id: string): Promise<Task>;
+  complete(id: string, input?: { result?: string; learnings?: string[] }): Promise<Task>;
+  reopen(id: string): Promise<Task>;
+  delete(id: string): Promise<void>;
+  block(taskId: string, blockerId: string): Promise<void>;
+  unblock(taskId: string, blockerId: string): Promise<void>;
+  nextReady(milestoneId?: string): Promise<TaskWithContext | null>;
+  tree(rootId?: string): Promise<TaskTree | TaskTree[]>;
+  search(query: string): Promise<Task[]>;
+  progress(rootId?: string): Promise<TaskProgress>;
+};
+```
+
+| Method | Returns | Description |
+|--------|---------|-------------|
+| `list` | `Task[]` | Filter by `parentId`, `ready`, `completed`, `depth`, `type` |
+| `get` | `TaskWithContext` | Get task with full context chain + inherited learnings |
+| `create` | `Task` | Create task (priority must be 1-5) |
+| `update` | `Task` | Update description, context, priority, parentId |
+| `start` | `Task` | **VCS required** - creates bookmark, records start commit |
+| `complete` | `Task` | **VCS required** - commits changes + bubbles learnings to parent |
+| `reopen` | `Task` | Reopen completed task |
+| `delete` | `void` | Delete task + best-effort VCS bookmark cleanup |
+| `block` | `void` | Add blocker (cannot be self, ancestor, or descendant) |
+| `unblock` | `void` | Remove blocker relationship |
+| `nextReady` | `TaskWithContext \| null` | Get deepest ready leaf with full context |
+| `tree` | `TaskTree \| TaskTree[]` | Get task tree (all milestones if no ID) |
+| `search` | `Task[]` | Search by description/context/result (case-insensitive) |
+| `progress` | `TaskProgress` | Aggregate counts for milestone or all tasks |
+
+## Learnings API
+
+Learnings are added via `tasks.complete(id, { learnings: [...] })` and bubble to immediate parent (preserving `sourceTaskId`).
+
+```typescript
+declare const learnings: {
+  list(taskId: string): Promise<Learning[]>;
+};
+```
+
+| Method | Description |
+|--------|-------------|
+| `list` | List learnings for task |
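A minimal sketch of what "bubble to immediate parent, preserving `sourceTaskId`" could look like on plain records. The record shape follows the `Learning` interface; the function name `bubbleToParent`, the ID scheme, and the handling of a `null` source are assumptions, not Overseer's behavior.

```javascript
// Sketch: copy a child's learnings onto its immediate parent while keeping
// track of where each learning originally came from.
function bubbleToParent(learnings, child) {
  if (!child.parentId) return []; // a milestone has no parent to bubble to
  return learnings
    .filter((l) => l.taskId === child.id)
    .map((l, i) => ({
      id: `${l.id}_bubbled_${i}`,               // hypothetical ID scheme
      taskId: child.parentId,                   // now attached to the parent
      content: l.content,
      sourceTaskId: l.sourceTaskId ?? child.id, // keep the original origin
      createdAt: l.createdAt,
    }));
}

// Hypothetical hierarchy: m1 -> t1 -> s1
const m1 = { id: "m1", parentId: null };
const t1 = { id: "t1", parentId: "m1" };
const s1 = { id: "s1", parentId: "t1" };

const recorded = [
  { id: "l1", taskId: "s1", content: "Use jose for JWT", sourceTaskId: null, createdAt: "2026-02-04T20:04:32Z" },
];

const onTask = bubbleToParent(recorded, s1);    // attached to t1
const onMilestone = bubbleToParent(onTask, t1); // attached to m1
console.log(onTask[0].taskId, onTask[0].sourceTaskId);
console.log(onMilestone[0].taskId, onMilestone[0].sourceTaskId);
```

Each call moves a learning up exactly one level, and `sourceTaskId` keeps pointing at the subtask that produced it.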
+
+## VCS Integration (Required for Workflow)
+
+VCS operations are **automatically handled** by the tasks API:
+
+| Task Operation | VCS Effect |
+|----------------|------------|
+| `tasks.start(id)` | **VCS required** - creates bookmark `task/<id>`, records start commit |
+| `tasks.complete(id)` | **VCS required** - commits changes (NothingToCommit = success) |
+| `tasks.delete(id)` | Best-effort bookmark cleanup (logs warning on failure) |
+
+**VCS (jj or git) is required** for start/complete. Fails with `NotARepository` if none found. CRUD operations work without VCS.
+
+## Quick Examples
+
+```javascript
+// Create a milestone with one child task (a child of a milestone is depth 1)
+const milestone = await tasks.create({
+  description: "Build authentication system",
+  context: "JWT-based auth with refresh tokens",
+  priority: 1
+});
+
+const task = await tasks.create({
+  description: "Implement token refresh logic",
+  parentId: milestone.id,
+  context: "Handle 7-day expiry"
+});
+
+// Start work (auto-creates VCS bookmark)
+await tasks.start(task.id);
+
+// ... do implementation work ...
+
+// Complete task with learnings (VCS required - commits changes, bubbles learnings to parent)
+await tasks.complete(task.id, {
+  result: "Implemented using jose library",
+  learnings: ["Use jose instead of jsonwebtoken"]
+});
+
+// Get progress summary
+const progress = await tasks.progress(milestone.id);
+// -> { total: 2, completed: 1, ready: 1, blocked: 0 }
+
+// Search tasks
+const authTasks = await tasks.search("authentication");
+
+// Get task tree
+const tree = await tasks.tree(milestone.id);
+// -> { task: Task, children: TaskTree[] }
+```
--- a/profiles/opencode/skill/overseer/references/examples.md
+++ b/profiles/opencode/skill/overseer/references/examples.md
@@ -0,0 +1,195 @@
+# Examples
+
+Good and bad examples for writing task context and results.
+
+## Writing Context
+
+Context should include everything needed to do the work without asking questions:
+- **What** needs to be done and why
+- **Implementation approach** (steps, files to modify, technical choices)
+- **Done when** (acceptance criteria)
+
+### Good Context Example
+
+```javascript
+await tasks.create({
+  description: "Migrate storage to one file per task",
+  context: `Change storage format for git-friendliness:
+
+Structure:
+.overseer/
+└── tasks/
+    ├── task_01ABC.json
+    └── task_02DEF.json
+
+NO INDEX - just scan task files. For typical task counts (<100), this is fast.
+
+Implementation:
+1. Update storage.ts:
+   - read(): Scan .overseer/tasks/*.json, parse each, return TaskStore
+   - write(task): Write single task to .overseer/tasks/{id}.json
+   - delete(id): Remove .overseer/tasks/{id}.json
+   - Add readTask(id) for single task lookup
+
+2. Task file format: Same as current Task schema (one task per file)
+
+3. Migration: On read, if old tasks.json exists, migrate to new format
+
+4. Update tests
+
+Benefits:
+- Create = new file (never conflicts)
+- Update = single file change
+- Delete = remove file
+- No index to maintain or conflict
+- git diff shows exactly which tasks changed`
+});
+```
+
+**Why it works:** States the goal, shows the structure, lists specific implementation steps, explains benefits. Someone could pick this up without asking questions.
+
+### Bad Context Example
+
+```javascript
+await tasks.create({
+  description: "Add auth",
+  context: "Need to add authentication"
+});
+```
+
+**What's missing:** How to implement it, what files, what's done when, technical approach.
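As a rough aid, a context string can be checked for the pieces listed under "Writing Context" before the task is created. The sketch below is a heuristic, not an Overseer feature: the function name `missingContextParts`, the patterns, and the 80-character threshold are all assumptions.

```javascript
// Heuristic sketch: flag contexts that lack an approach, done-when criteria,
// or enough detail. Patterns and threshold are assumptions, tune as needed.
function missingContextParts(context) {
  const checks = [
    ["implementation approach", /implementation|approach|steps?:/i],
    ["done-when criteria", /done when|acceptance/i],
    ["enough detail (80+ chars)", /[\s\S]{80,}/],
  ];
  return checks
    .filter(([, pattern]) => !pattern.test(context))
    .map(([label]) => label);
}

const thin = "Need to add authentication";
const detailed = [
  "Implementation:",
  "1. Create src/middleware/verify-token.ts",
  "",
  "Done when:",
  "- Unit tests cover valid/invalid/expired scenarios",
].join("\n");

console.log(missingContextParts(thin));     // all three parts reported missing
console.log(missingContextParts(detailed)); // empty list
```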
+
+## Writing Results
+
+Results should capture what was actually done:
+- **What changed** (implementation summary)
+- **Key decisions** (and why)
+- **Verification** (tests passing, manual testing done)
+
+### Good Result Example
+
+```javascript
+await tasks.complete(taskId, {
+  result: `Migrated storage from single tasks.json to one file per task:
+
+Structure:
+- Each task stored as .overseer/tasks/{id}.json
+- No index file (avoids merge conflicts)
+- Directory scanned on read to build task list
+
+Implementation:
+- Modified Storage.read() to scan .overseer/tasks/ directory
+- Modified Storage.write() to write/delete individual task files
+- Auto-migration from old single-file format on first read
+- Atomic writes using temp file + rename pattern
+
+Trade-offs:
+- Slightly slower reads (must scan directory + parse each file)
+- Acceptable since task count is typically small (<100)
+- Better git history - each task change is isolated
+
+Verification:
+- All 60 tests passing
+- Build successful
+- Manually tested migration: old -> new format works`
+});
+```
+
+**Why it works:** States what changed, lists implementation details, explains trade-offs, confirms verification.
+
+### Bad Result Example
+
+```javascript
+await tasks.complete(taskId, { result: "Fixed the storage issue" });
+```
+
+**What's missing:** What was actually implemented, how, what decisions were made, verification evidence.
+
+## Subtask Context Example
+
+Link subtasks to their parent and explain what this piece does specifically:
+
+```javascript
+await tasks.create({
+  description: "Add token verification function",
+  parentId: jwtTaskId,
+  context: `Part of JWT middleware (parent task). This subtask: token verification.
+
+What it does:
+- Verify JWT signature and expiration on protected routes
+- Extract user ID from token payload
+- Attach user object to request
+- Return 401 for invalid/expired tokens
+
+Implementation:
+- Create src/middleware/verify-token.ts
+- Export verifyToken middleware function
+- Use jose library (preferred over jsonwebtoken)
+- Handle expired vs invalid token cases separately
+
+Done when:
+- Middleware function complete and working
+- Unit tests cover valid/invalid/expired scenarios
+- Integrated into auth routes in server.ts
+- Parent task can use this to protect endpoints`
+});
+```
+
+## Error Handling Examples
+
+### Handling Pending Children
+
+```javascript
+try {
+  await tasks.complete(taskId, { result: "Done" });
+} catch (err) {
+  if (err.message.includes("pending children")) {
+    const pending = await tasks.list({ parentId: taskId, completed: false });
+    console.log(`Cannot complete: ${pending.length} children pending`);
+    for (const child of pending) {
+      console.log(`- ${child.id}: ${child.description}`);
+    }
+    return;
+  }
+  throw err;
+}
+```
+
+### Handling Blocked Tasks
+
+```javascript
+const task = await tasks.get(taskId);
+
+// blockedBy is omitted when empty, so default to an empty list
+const blockedBy = task.blockedBy ?? [];
+if (blockedBy.length > 0) {
+  console.log("Task is blocked by:");
+  for (const blockerId of blockedBy) {
+    const blocker = await tasks.get(blockerId);
+    console.log(`- ${blocker.description} (${blocker.completed ? 'done' : 'pending'})`);
+  }
+  return "Cannot start - blocked by other tasks";
+}
+
+await tasks.start(taskId);
+```
+
+## Creating Task Hierarchies
+
+```javascript
+// Create milestone with tasks
+const milestone = await tasks.create({
+  description: "Implement user authentication",
+  context: "Full auth: JWT, login/logout, password reset, rate limiting",
+  priority: 2
+});
+
+// Children of a milestone are tasks (depth 1)
+const taskDescriptions = [
+  "Add login endpoint",
+  "Add logout endpoint",
+  "Implement JWT token service",
+  "Add password reset flow"
+];
+
+for (const description of taskDescriptions) {
+  await tasks.create({ description, parentId: milestone.id });
+}
+```
+
+See @file references/hierarchies.md for sequential subtasks with blockers.
--- a/profiles/opencode/skill/overseer/references/hierarchies.md
+++ b/profiles/opencode/skill/overseer/references/hierarchies.md
@@ -0,0 +1,170 @@
+# Task Hierarchies
+
+Guidance for organizing work into milestones, tasks, and subtasks.
+
+## Three Levels
+
+| Level | Name | Purpose | Example |
+|-------|------|---------|---------|
+| 0 | **Milestone** | Large initiative (5+ tasks) | "Add user authentication system" |
+| 1 | **Task** | Significant work item | "Implement JWT middleware" |
+| 2 | **Subtask** | Atomic implementation step | "Add token verification function" |
+
+**Maximum depth is 3 levels.** Attempting to create a child of a subtask will fail.
+
+## When to Use Each Level
+
+### Single Task (No Hierarchy)
+- Small feature (1-2 files, ~1 session)
+- Work is atomic, no natural breakdown
+
+### Task with Subtasks
+- Medium feature (3-5 files, 3-7 steps)
+- Work naturally decomposes into discrete steps
+- Subtasks could be worked on independently
+
+### Milestone with Tasks
+- Large initiative (multiple areas, many sessions)
+- Work spans 5+ distinct tasks
+- You want high-level progress tracking
+
+## Creating Hierarchies
+
+```javascript
+// Create the milestone
+const milestone = await tasks.create({
+  description: "Add user authentication system",
+  context: "Full auth system with JWT tokens, password reset...",
+  priority: 2
+});
+
+// Create tasks under it
+const jwtTask = await tasks.create({
+  description: "Implement JWT token generation",
+  context: "Create token service with signing and verification...",
+  parentId: milestone.id
+});
+
+const resetTask = await tasks.create({
+  description: "Add password reset flow",
+  context: "Email-based password reset with secure tokens...",
+  parentId: milestone.id
+});
+
+// For complex tasks, add subtasks
+const verifySubtask = await tasks.create({
+  description: "Add token verification function",
+  context: "Verify JWT signature and expiration...",
+  parentId: jwtTask.id
+});
+```
+
+## Subtask Best Practices
+
+Each subtask should be:
+
+- **Independently understandable**: Clear on its own
+- **Linked to parent**: Reference parent, explain how this piece fits
+- **Specific scope**: What this subtask does vs what parent/siblings do
+- **Clear completion**: Define "done" for this piece specifically
+
+Example subtask context:
+```
+Part of JWT middleware (parent task). This subtask: token verification.
+
+What it does:
+- Verify JWT signature and expiration
+- Extract user ID from payload
+- Return 401 for invalid/expired tokens
+
+Done when:
+- Function complete and tested
+- Unit tests cover valid/invalid/expired cases
+```
+
+## Decomposition Strategy
+
+When faced with large tasks:
+
+1. **Assess scope**: Is this milestone-level (5+ tasks) or task-level (3-7 subtasks)?
+2. Create parent task/milestone with overall goal and context
+3. Analyze and identify 3-7 logical children
+4. Create children with specific contexts and boundaries
+5. Work through systematically, completing with results
+6. Complete parent with summary of overall implementation
+
+### Don't Over-Decompose
+
+- **3-7 children per parent** is usually right
+- If you'd only have 1-2 subtasks, just make separate tasks
+- If you need a fourth level (depth 3+), restructure your breakdown; three levels is the maximum
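The 3-7 guideline can be checked mechanically against a tree of the `{ task, children }` shape. This is a sketch under that assumption; `decompositionWarnings` and the sample tree are hypothetical, and the thresholds simply restate the guideline above.

```javascript
// Sketch: walk a TaskTree-shaped object ({ task, children }) and report
// parents whose child count falls outside the 3-7 guideline.
function decompositionWarnings(tree, warnings = []) {
  const n = tree.children.length;
  // Leaves (n === 0) are fine; only parents are checked
  if (n > 0 && (n < 3 || n > 7)) {
    warnings.push(`${tree.task.description}: ${n} child(ren), guideline is 3-7`);
  }
  for (const child of tree.children) decompositionWarnings(child, warnings);
  return warnings;
}

// Hypothetical tree: the milestone is fine, the JWT task is under-decomposed
const tree = {
  task: { description: "Add user authentication system" },
  children: [
    {
      task: { description: "Implement JWT middleware" },
      children: [
        { task: { description: "Add token verification function" }, children: [] },
      ],
    },
    { task: { description: "Add login endpoint" }, children: [] },
    { task: { description: "Add password reset flow" }, children: [] },
  ],
};

console.log(decompositionWarnings(tree));
```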
+
+## Viewing Hierarchies
+
+```javascript
+// List all tasks under a milestone
+const children = await tasks.list({ parentId: milestoneId });
+
+// Get task with context breadcrumb
+const task = await tasks.get(taskId);
+// task.context.parent - parent's context
+// task.context.milestone - root milestone's context
+
+// Check progress
+const pending = await tasks.list({ parentId: milestoneId, completed: false });
+const done = await tasks.list({ parentId: milestoneId, completed: true });
+console.log(`Progress: ${done.length}/${done.length + pending.length}`);
+```
+
+## Completion Rules
+
+1. **Cannot complete with pending children**
+   ```javascript
+   // This will fail if task has incomplete subtasks
+   await tasks.complete(taskId, { result: "Done" });
+   // Error: "pending children"
+   ```
+
+2. **Complete children first**
+   - Work through subtasks systematically
+   - Complete each with meaningful results
+
+3. **Parent result summarizes overall implementation**
+   ```javascript
+   await tasks.complete(milestoneId, {
+     result: `User authentication system complete:
+
+   Implemented:
+   - JWT token generation and verification
+   - Login/logout endpoints
+   - Password reset flow
+   - Rate limiting
+
+   5 tasks completed, all tests passing.`
+   });
+   ```
+
+## Blocking Dependencies
+
+Use `blockedBy` for cross-task dependencies:
+
+```javascript
+// Create task that depends on another
+const deployTask = await tasks.create({
+  description: "Deploy to production",
+  context: "...",
+  blockedBy: [testTaskId, reviewTaskId]
+});
+
+// Add blocker to existing task
+await tasks.block(deployTask.id, testTaskId);
+
+// Remove blocker
+await tasks.unblock(deployTask.id, testTaskId);
+```
+
+**Use blockers when:**
+- Task B cannot start until Task A completes
+- Multiple tasks depend on a shared prerequisite
+
+**Don't use blockers when:**
+- Tasks can be worked on in parallel
+- The dependency is just logical grouping (use subtasks instead)
--- a/profiles/opencode/skill/overseer/references/verification.md
+++ b/profiles/opencode/skill/overseer/references/verification.md
@@ -0,0 +1,186 @@
+# Verification Guide
+
+Before marking any task complete, you MUST verify your work. Verification separates "I think it's done" from "it's actually done."
+
+## The Verification Process
+
+1. **Re-read the task context**: What did you originally commit to do?
+2. **Check acceptance criteria**: Does your implementation satisfy the "Done when" conditions?
+3. **Run relevant tests**: Execute the test suite and document results
+4. **Test manually**: Actually try the feature/change yourself
+5. **Compare with requirements**: Does what you built match what was asked?
+
+## Strong vs Weak Verification
+
+### Strong Verification Examples
+
+- "All 60 tests passing, build successful"
+- "All 69 tests passing (4 new tests for middleware edge cases)"
+- "Manually tested with valid/invalid/expired tokens - all cases work"
+- "Ran `cargo test` - 142 tests passed, 0 failed"
+
+### Weak Verification (Avoid)
+
+- "Should work now" - "should" means not verified
+- "Made the changes" - no evidence it works
+- "Added tests" - did the tests pass? What's the count?
+- "Fixed the bug" - what bug? Did you verify the fix?
+- "Done" - done how? prove it
+
+## Verification by Task Type
+
+| Task Type | How to Verify |
+|-----------|---------------|
+| Code changes | Run full test suite, document passing count |
+| New features | Run tests + manual testing of functionality |
+| Configuration | Test the config works (run commands, check workflows) |
+| Documentation | Verify examples work, links resolve, formatting renders |
+| Refactoring | Confirm tests still pass, no behavior changes |
+| Bug fixes | Reproduce bug first, verify fix, add regression test |
+
+## Cross-Reference Checklist
+
+Before marking complete, verify all applicable items:
+
+- [ ] Task description requirements met
+- [ ] Context "Done when" criteria satisfied
+- [ ] Tests passing (document count: "All X tests passing")
+- [ ] Build succeeds (if applicable)
+- [ ] Manual testing done (describe what you tested)
+- [ ] No regressions introduced
+- [ ] Edge cases considered (error handling, invalid input)
+- [ ] Follow-up work identified (created new tasks if needed)
+
+**If you can't check all applicable boxes, the task isn't done yet.**
+
+## Result Examples with Verification
+
+### Code Implementation
+
+```javascript
+await tasks.complete(taskId, {
+  result: `Implemented JWT middleware:
+
+Implementation:
+- Created src/middleware/verify-token.ts
+- Separated 'expired' vs 'invalid' error codes
+- Added user extraction from payload
+
+Verification:
+- All 69 tests passing (4 new tests for edge cases)
+- Manually tested with valid token: Access granted
+- Manually tested with expired token: 401 with 'token_expired'
+- Manually tested with invalid signature: 401 with 'invalid_token'`
+});
+```
+
+### Configuration/Infrastructure
+
+```javascript
+await tasks.complete(taskId, {
+  result: `Added GitHub Actions workflow for CI:
+
+Implementation:
+- Created .github/workflows/ci.yml
+- Jobs: lint, test, build with pnpm cache
+
+Verification:
+- Pushed to test branch, opened PR #123
+- Workflow triggered automatically
+- All jobs passed (lint: 0 errors, test: 69/69, build: success)
+- Total run time: 2m 34s`
+});
+```
+
+### Refactoring
+
+```javascript
+await tasks.complete(taskId, {
+  result: `Refactored storage to one file per task:
+
+Implementation:
+- Split tasks.json into .overseer/tasks/{id}.json files
+- Added auto-migration from old format
+- Atomic writes via temp+rename
+
+Verification:
+- All 60 tests passing (including 8 storage tests)
+- Build successful
+- Manually tested migration: old -> new format works
+- Confirmed git diff shows only changed tasks`
+});
+```
+
+### Bug Fix
+
+```javascript
+await tasks.complete(taskId, {
+  result: `Fixed login validation accepting usernames with spaces:
+
+Root cause:
+- Validation regex didn't account for leading/trailing spaces
+
+Fix:
+- Added .trim() before validation in src/auth/validate.ts:42
+- Updated regex to reject internal spaces
+
+Verification:
+- All 45 tests passing (2 new regression tests)
+- Manually tested:
+  - " admin" -> rejected (leading space)
+  - "admin " -> rejected (trailing space)
+  - "ad min" -> rejected (internal space)
+  - "admin" -> accepted`
+});
+```
+
+### Documentation
+
+```javascript
+await tasks.complete(taskId, {
+  result: `Updated API documentation for auth endpoints:
+
+Implementation:
+- Added docs for POST /auth/login
+- Added docs for POST /auth/logout
+- Added docs for POST /auth/refresh
+- Included example requests/responses
+
+Verification:
+- All code examples tested and working
+- Links verified (no 404s)
+- Rendered in local preview - formatting correct
+- Spell-checked content`
+});
+```
+
+## Common Verification Mistakes
+
+| Mistake | Better Approach |
+|---------|-----------------|
+| "Tests pass" | "All 42 tests passing" (include count) |
+| "Manually tested" | "Manually tested X, Y, Z scenarios" (be specific) |
+| "Works" | "Works: [evidence]" (show proof) |
+| "Fixed" | "Fixed: [root cause] -> [solution] -> [verification]" |
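The mistakes in the table can be caught with a quick heuristic before calling `tasks.complete`. The sketch below is not part of Overseer; `verificationIssues` is a hypothetical name and its patterns are assumptions loosely drawn from the table.

```javascript
// Heuristic sketch: flag result text that looks weakly verified.
// The phrases and rules below are assumptions, not an official checklist.
function verificationIssues(result) {
  const issues = [];
  if (!/verification:/i.test(result)) issues.push("no Verification section");
  if (/\bshould work\b/i.test(result)) issues.push('"should" means not verified');
  if (/tests? pass/i.test(result) && !/\d+/.test(result)) {
    issues.push("test claim without a count");
  }
  return issues;
}

const weak = "Fixed the bug. Tests pass, should work now";
const strong = "Verification:\n- All 42 tests passing\n- Manually tested valid/invalid tokens";

console.log(verificationIssues(weak));   // three issues
console.log(verificationIssues(strong)); // empty list
```

A check like this only spots missing evidence; it cannot confirm that the stated evidence is true.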
+
+## When Verification Fails
+
+If verification reveals issues:
+
+1. **Don't complete the task** - it's not done
+2. **Document what failed** in task context
+3. **Fix the issues** before completing
+4. **Re-verify** after fixes
+
+```javascript
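+// Update context with failure notes.
+// On TaskWithContext, context is an object, so append to context.own.
+const task = await tasks.get(taskId);
+await tasks.update(taskId, {
+  context: task.context.own + `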
+
+Verification attempt 1 (failed):
+- Tests: 41/42 passing
+- Failing: test_token_refresh - timeout issue
+- Need to investigate async handling`
+});
+
+// After fixing
+await tasks.complete(taskId, {
+  result: `Implemented token refresh:
+
+Implementation:
+- Added refresh endpoint
+- Fixed async timeout (was missing await)
+
+Verification:
+- All 42 tests passing (fixed timeout issue)
+- Manual testing: refresh works within 30s window`
+});
+```
--- a/profiles/opencode/skill/overseer/references/workflow.md
+++ b/profiles/opencode/skill/overseer/references/workflow.md
@@ -0,0 +1,164 @@
+# Implementation Workflow
+
+Step-by-step guide for working with Overseer tasks during implementation.
+
+## 1. Get Next Ready Task
+
+```javascript
+// Get next task with full context (recommended)
+const task = await tasks.nextReady();
+
+// Or scope to a specific milestone (use this instead of the call above)
+// const task = await tasks.nextReady(milestoneId);
+
+if (!task) {
+  return "No tasks ready - all blocked or completed";
+}
+```
+
+`nextReady()` returns a `TaskWithContext` (task with inherited context and learnings) or `null`.
+
+## 2. Review Context
+
+Before starting, verify you can answer:
+- **What** needs to be done specifically?
+- **Why** is this needed?
+- **How** should it be implemented?
+- **When** is it done (acceptance criteria)?
+
+```javascript
+const task = await tasks.get(taskId);
+
+// Task's own context
+console.log("Task:", task.context.own);
+
+// Parent context (if task has parent)
+if (task.context.parent) {
+  console.log("Parent:", task.context.parent);
+}
+
+// Milestone context (if depth > 1)
+if (task.context.milestone) {
+  console.log("Milestone:", task.context.milestone);
+}
+
+// Task's own learnings (bubbled from completed children)
+console.log("Task learnings:", task.learnings.own);
+```
+
+**If any answer is unclear:**
+1. Check parent task or completed blockers for details
+2. Suggest entering plan mode to flesh out requirements
+
+**Proceed without full context when:**
+- Task is trivial/atomic (e.g., "Add .gitignore entry")
+- Conversation already provides the missing context
+- Description itself is sufficiently detailed
+
+## 3. Start Task
+
+```javascript
+await tasks.start(taskId);
+```
+
+**VCS Required:** Creates bookmark `task/<id>`, records start commit. Fails with `NotARepository` if no jj/git found.
+
+After starting, the task is in progress: `startedAt` is set, and `bookmark` and `startCommit` are recorded on the task.
+
+## 4. Implement
+
+Work on the task implementation. Note any learnings to include when completing.
+
+## 5. Verify Work
+
+Before completing, verify your implementation. See @file references/verification.md for full checklist.
+
+Quick checklist:
+- [ ] Task description requirements met
+- [ ] Context "Done when" criteria satisfied
+- [ ] Tests passing (document count)
+- [ ] Build succeeds
+- [ ] Manual testing done
+
+## 6. Complete Task with Learnings
+
+```javascript
+await tasks.complete(taskId, {
+  result: `Implemented login endpoint:
+
+Implementation:
+- Created src/auth/login.ts
+- Added JWT token generation
+- Integrated with user service
+
+Verification:
+- All 42 tests passing (3 new)
+- Manually tested valid/invalid credentials`,
+  learnings: [
+    "bcrypt rounds should be 12+ for production",
+    "jose library preferred over jsonwebtoken"
+  ]
+});
+```
+
+**VCS Required:** Commits changes (NothingToCommit treated as success), then deletes the task's bookmark (best-effort) and clears the DB bookmark field on success. Fails with `NotARepository` if no jj/git found.
+
+**Learnings Effect:** Learnings bubble to immediate parent only. `sourceTaskId` is preserved through bubbling, so if this task's learnings later bubble further, the origin is tracked.
+
+The `result` becomes part of the task's permanent record.
+
+## VCS Integration (Required for Workflow)
+
+VCS operations are **automatically handled** by the tasks API:
+
+| Task Operation | VCS Effect |
+|----------------|------------|
+| `tasks.start(id)` | **VCS required** - creates bookmark `task/<id>`, records start commit |
+| `tasks.complete(id)` | **VCS required** - commits changes, deletes bookmark (best-effort), clears DB bookmark on success |
+| `tasks.complete(milestoneId)` | Same + deletes ALL descendant bookmarks recursively (depth-1 and depth-2) |
+| `tasks.delete(id)` | Best-effort bookmark cleanup (logs warning on failure) |
+
+**Note:** VCS (jj or git) is required for start/complete. CRUD operations work without VCS.
+
+## Error Handling
+
+### Pending Children
+
+```javascript
+try {
+  await tasks.complete(taskId, { result: "Done" });
+} catch (err) {
+  if (err.message.includes("pending children")) {
+    const pending = await tasks.list({ parentId: taskId, completed: false });
+    return `Cannot complete: ${pending.length} children pending`;
+  }
+  throw err;
+}
+```
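The failure modes named in this guide (`NotARepository` and "pending children") can be routed through one small classifier. This sketch assumes the identifying text appears in `err.message`, as the check above does; the function name `classifyTaskError` and the returned labels are hypothetical.

```javascript
// Assumption: the error name or phrase is present in err.message, mirroring
// the "pending children" check above. Labels are made up for this sketch.
function classifyTaskError(err) {
  const message = String(err?.message ?? err);
  if (message.includes("NotARepository")) return "no-vcs";
  if (message.includes("pending children")) return "pending-children";
  return "unknown";
}

// Map each label to a next step for the agent
const advice = {
  "no-vcs": "Initialize jj or git before calling start/complete",
  "pending-children": "Complete or delete the remaining children first",
  "unknown": "Rethrow and inspect the error",
};

const kind = classifyTaskError(new Error("NotARepository"));
console.log(kind, "->", advice[kind]);
```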
+
+### Task Not Ready
+
+```javascript
+const task = await tasks.get(taskId);
+
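+// Check if blocked (blockedBy is omitted when empty)
+const blockedBy = task.blockedBy ?? [];
+if (blockedBy.length > 0) {
+  console.log("Blocked by:", blockedBy);
+  // Complete the blockers first, or remove a blocker that no longer applies
+  await tasks.unblock(taskId, blockedBy[0]);
+}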
+```
+
+## Complete Workflow Example
+
+```javascript
+const task = await tasks.nextReady();
+if (!task) return "No ready tasks";
+
+await tasks.start(task.id);
+// ... implement ...
+await tasks.complete(task.id, {
+  result: "Implemented: ... Verification: All 58 tests passing",
+  learnings: ["Use jose for JWT"]
+});
+```