diff --git a/profiles/opencode.nix b/profiles/opencode.nix index 6093d49..d37bc49 100644 --- a/profiles/opencode.nix +++ b/profiles/opencode.nix @@ -80,6 +80,11 @@ APPSIGNAL_API_KEY = "{env:APPSIGNAL_API_KEY}"; }; }; + overseer = { + enabled = true; + type = "local"; + command = ["npx" "-y" "@dmmulroy/overseer" "mcp"]; + }; }; }; }; diff --git a/profiles/opencode/command/overseer-plan.md b/profiles/opencode/command/overseer-plan.md new file mode 100644 index 0000000..2e4b127 --- /dev/null +++ b/profiles/opencode/command/overseer-plan.md @@ -0,0 +1,17 @@ +--- +description: Convert a markdown plan/spec to Overseer tasks +--- + +Convert markdown planning documents into trackable Overseer task hierarchies. + +First, invoke the skill tool to load the overseer-plan skill: + +``` +skill({ name: 'overseer-plan' }) +``` + +Then follow the skill instructions to convert the document. + + +$ARGUMENTS + diff --git a/profiles/opencode/command/overseer.md b/profiles/opencode/command/overseer.md new file mode 100644 index 0000000..03fcdc1 --- /dev/null +++ b/profiles/opencode/command/overseer.md @@ -0,0 +1,17 @@ +--- +description: Manage tasks via Overseer - create, list, start, complete, find ready work +--- + +Task orchestration via Overseer codemode MCP. + +First, invoke the skill tool to load the overseer skill: + +``` +skill({ name: 'overseer' }) +``` + +Then follow the skill instructions to manage tasks. + + +$ARGUMENTS + diff --git a/profiles/opencode/command/plan-spec.md b/profiles/opencode/command/plan-spec.md new file mode 100644 index 0000000..30ae7dc --- /dev/null +++ b/profiles/opencode/command/plan-spec.md @@ -0,0 +1,17 @@ +--- +description: Dialogue-driven spec development through skeptical questioning +--- + +Develop implementation-ready specs through iterative dialogue and skeptical questioning. + +First, invoke the skill tool to load the spec-planner skill: + +``` +skill({ name: 'spec-planner' }) +``` + +Then follow the skill instructions to develop the spec. 
+ + +$ARGUMENTS + diff --git a/profiles/opencode/skill/overseer-plan/SKILL.md b/profiles/opencode/skill/overseer-plan/SKILL.md new file mode 100644 index 0000000..fbb5bb2 --- /dev/null +++ b/profiles/opencode/skill/overseer-plan/SKILL.md @@ -0,0 +1,110 @@ +--- +name: overseer-plan +description: Convert markdown planning documents to Overseer tasks via MCP codemode. Use when converting plans, specs, or design docs to trackable task hierarchies. +license: MIT +metadata: + author: dmmulroy + version: "1.0.0" +--- + +# Converting Markdown Documents to Overseer Tasks + +Use `/overseer-plan` to convert any markdown planning document into trackable Overseer tasks. + +## When to Use + +- After completing a plan in plan mode +- Converting specs/design docs to implementation tasks +- Creating tasks from roadmap or milestone documents + +## Usage + +``` +/overseer-plan +/overseer-plan --priority 3 # Set priority (1-5) +/overseer-plan --parent # Create as child of existing task +``` + +## What It Does + +1. Reads markdown file +2. Extracts title from first `#` heading (strips "Plan: " prefix) +3. Creates Overseer milestone (or child task if `--parent` provided) +4. Analyzes structure for child task breakdown +5. Creates child tasks (depth 1) or subtasks (depth 2) when appropriate +6. 
Returns task ID and breakdown summary + +## Hierarchy Levels + +| Depth | Name | Example | +|-------|------|---------| +| 0 | **Milestone** | "Add user authentication system" | +| 1 | **Task** | "Implement JWT middleware" | +| 2 | **Subtask** | "Add token verification function" | + +## Breakdown Decision + +**Create subtasks when:** +- 3-7 clearly separable work items +- Implementation across multiple files/components +- Clear sequential dependencies + +**Keep single milestone when:** +- 1-2 steps only +- Work items tightly coupled +- Plan is exploratory/investigative + +## Task Quality Criteria + +Every task must be: +- **Atomic**: Single committable unit of work +- **Validated**: Has tests OR explicit acceptance criteria in context ("Done when: ...") +- **Clear**: Technical, specific, imperative verb + +Every milestone must: +- **Demoable**: Produces runnable/testable increment +- **Builds on prior**: Can depend on previous milestone's output + +## Review Workflow + +1. Analyze document -> propose breakdown +2. **Invoke Oracle** to review breakdown and suggest improvements +3. Incorporate feedback +4. Create in Overseer (persists to SQLite via MCP) + +## After Creating + +```javascript +await tasks.get(""); // TaskWithContext (full context + learnings) +await tasks.list({ parentId: "" }); // Task[] (children without context chain) +await tasks.start(""); // Task (VCS required - creates bookmark, records start commit) +await tasks.complete("", { result: "...", learnings: [...] }); // Task (VCS required - commits, bubbles learnings) +``` + +**VCS Required**: `start` and `complete` require jj or git (fail with `NotARepository` if none found). CRUD operations work without VCS. + +**Note**: Priority must be 1-5. Blockers cannot be ancestors or descendants. 
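The two constraints in the note above can be checked before calling `tasks.create` or `tasks.block`. The following is a minimal sketch under stated assumptions: `isValidPriority`, `isAncestor`, and `isValidBlocker` are hypothetical helpers, not part of the Overseer API, and `parentOf` is an assumed plain object mapping each task ID to its parent ID (`null` for a milestone). Overseer performs its own validation; this only illustrates the rules.

```javascript
// Hypothetical client-side pre-checks; none of these names exist in Overseer.

// Priority must be an integer from 1 to 5.
function isValidPriority(priority) {
  return Number.isInteger(priority) && priority >= 1 && priority <= 5;
}

// True if candidateId is the parent, grandparent, etc. of taskId.
// parentOf is an assumed { [taskId]: parentId | null } map without cycles.
function isAncestor(parentOf, candidateId, taskId) {
  let current = parentOf[taskId] ?? null;
  while (current !== null) {
    if (current === candidateId) return true;
    current = parentOf[current] ?? null;
  }
  return false;
}

// A blocker may not be the task itself, one of its ancestors,
// or one of its descendants.
function isValidBlocker(parentOf, taskId, blockerId) {
  if (taskId === blockerId) return false;
  const blockerIsAncestor = isAncestor(parentOf, blockerId, taskId);
  const blockerIsDescendant = isAncestor(parentOf, taskId, blockerId);
  return !blockerIsAncestor && !blockerIsDescendant;
}

// Made-up IDs for illustration: one milestone, two tasks, one subtask.
const exampleParents = { m1: null, t1: "m1", t2: "m1", s1: "t1" };
console.log(isValidBlocker(exampleParents, "s1", "t2")); // sibling branch
console.log(isValidBlocker(exampleParents, "s1", "m1")); // ancestor
```

Running a check like this first lets an agent report a rejected blocker in its own words instead of surfacing the raw error from the call.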
+ +## When NOT to Use + +- Document incomplete or exploratory +- Content not actionable +- No meaningful planning content + +--- + +## Reading Order + +| Task | File | +|------|------| +| Understanding API | @file references/api.md | +| Agent implementation | @file references/implementation.md | +| See examples | @file references/examples.md | + +## In This Reference + +| File | Purpose | +|------|---------| +| `references/api.md` | Overseer MCP codemode API types/methods | +| `references/implementation.md` | Step-by-step execution instructions for agent | +| `references/examples.md` | Complete worked examples | diff --git a/profiles/opencode/skill/overseer-plan/references/api.md b/profiles/opencode/skill/overseer-plan/references/api.md new file mode 100644 index 0000000..a61cbeb --- /dev/null +++ b/profiles/opencode/skill/overseer-plan/references/api.md @@ -0,0 +1,192 @@ +# Overseer Codemode MCP API + +Execute JavaScript code to interact with Overseer task management. + +## Task Interfaces + +```typescript +// Basic task - returned by list(), create(), start(), complete() +// Note: Does NOT include context or learnings fields +interface Task { + id: string; + parentId: string | null; + description: string; + priority: 1 | 2 | 3 | 4 | 5; + completed: boolean; + completedAt: string | null; + startedAt: string | null; + createdAt: string; // ISO 8601 + updatedAt: string; + result: string | null; // Completion notes + commitSha: string | null; // Auto-populated on complete + depth: 0 | 1 | 2; // 0=milestone, 1=task, 2=subtask + blockedBy?: string[]; // Blocking task IDs (omitted if empty) + blocks?: string[]; // Tasks this blocks (omitted if empty) + bookmark?: string; // VCS bookmark name (if started) + startCommit?: string; // Commit SHA at start + effectivelyBlocked: boolean; // True if task OR ancestor has incomplete blockers +} + +// Task with full context - returned by get(), nextReady() +interface TaskWithContext extends Task { + context: { + own: string; // 
This task's context + parent?: string; // Parent's context (depth > 0) + milestone?: string; // Root milestone's context (depth > 1) + }; + learnings: { + own: Learning[]; // This task's learnings (bubbled from completed children) + parent: Learning[]; // Parent's learnings (depth > 0) + milestone: Learning[]; // Milestone's learnings (depth > 1) + }; +} + +// Task tree structure - returned by tree() +interface TaskTree { + task: Task; + children: TaskTree[]; +} + +// Progress summary - returned by progress() +interface TaskProgress { + total: number; + completed: number; + ready: number; // !completed && !effectivelyBlocked + blocked: number; // !completed && effectivelyBlocked +} + +// Task type alias for depth filter +type TaskType = "milestone" | "task" | "subtask"; +``` + +## Learning Interface + +```typescript +interface Learning { + id: string; + taskId: string; + content: string; + sourceTaskId: string | null; + createdAt: string; +} +``` + +## Tasks API + +```typescript +declare const tasks: { + list(filter?: { + parentId?: string; + ready?: boolean; + completed?: boolean; + depth?: 0 | 1 | 2; // 0=milestones, 1=tasks, 2=subtasks + type?: TaskType; // Alias: "milestone"|"task"|"subtask" (mutually exclusive with depth) + }): Promise<Task[]>; + get(id: string): Promise<TaskWithContext>; + create(input: { + description: string; + context?: string; + parentId?: string; + priority?: 1 | 2 | 3 | 4 | 5; // Must be 1-5 + blockedBy?: string[]; // Cannot be ancestors/descendants + }): Promise<Task>; + update(id: string, input: { + description?: string; + context?: string; + priority?: 1 | 2 | 3 | 4 | 5; + parentId?: string; + }): Promise<Task>; + start(id: string): Promise<Task>; + complete(id: string, input?: { result?: string; learnings?: string[] }): Promise<Task>; + reopen(id: string): Promise<Task>; + delete(id: string): Promise<void>; + block(taskId: string, blockerId: string): Promise<void>; + unblock(taskId: string, blockerId: string): Promise<void>; + nextReady(milestoneId?: string): Promise<TaskWithContext | null>; + tree(rootId?: string): Promise<TaskTree | TaskTree[]>; + 
search(query: string): Promise<Task[]>; + progress(rootId?: string): Promise<TaskProgress>; +}; +``` + +| Method | Returns | Description | +|--------|---------|-------------| +| `list` | `Task[]` | Filter by `parentId`, `ready`, `completed`, `depth`, `type` | +| `get` | `TaskWithContext` | Get task with full context chain + inherited learnings | +| `create` | `Task` | Create task (priority must be 1-5) | +| `update` | `Task` | Update description, context, priority, parentId | +| `start` | `Task` | **VCS required** - creates bookmark, records start commit | +| `complete` | `Task` | **VCS required** - commits changes + bubbles learnings to parent | +| `reopen` | `Task` | Reopen completed task | +| `delete` | `void` | Delete task + best-effort VCS bookmark cleanup | +| `block` | `void` | Add blocker (cannot be self, ancestor, or descendant) | +| `unblock` | `void` | Remove blocker relationship | +| `nextReady` | `TaskWithContext \| null` | Get deepest ready leaf with full context | +| `tree` | `TaskTree \| TaskTree[]` | Get task tree (all milestones if no ID) | +| `search` | `Task[]` | Search by description/context/result (case-insensitive) | +| `progress` | `TaskProgress` | Aggregate counts for milestone or all tasks | + +## Learnings API + +Learnings are added via `tasks.complete(id, { learnings: [...] })` and bubble to immediate parent (preserving `sourceTaskId`). 
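Because bubbled learnings keep their `sourceTaskId`, a parent's learnings can be grouped by the child that produced them. The sketch below is illustrative only: `groupLearningsBySource` is a hypothetical helper, the sample records are made up, and the input is assumed to be an array of objects shaped like the `Learning` interface (for example, what `learnings.list(taskId)` is expected to resolve to).

```javascript
// Hypothetical helper, not part of the Overseer API.
// Groups Learning-shaped records by the task that originally produced them.
function groupLearningsBySource(learningList) {
  const groups = {};
  for (const learning of learningList) {
    // sourceTaskId may be null; its exact meaning is not specified here,
    // so those records are collected under a placeholder key.
    const key = learning.sourceTaskId === null ? "(no source)" : learning.sourceTaskId;
    if (!groups[key]) groups[key] = [];
    groups[key].push(learning.content);
  }
  return groups;
}

// Made-up sample data for illustration only.
const sampleLearnings = [
  { id: "l1", taskId: "task_parent", content: "Use jose instead of jsonwebtoken", sourceTaskId: "task_child_a", createdAt: "2025-01-01T00:00:00Z" },
  { id: "l2", taskId: "task_parent", content: "bcrypt rounds should be 12 for production", sourceTaskId: "task_child_b", createdAt: "2025-01-02T00:00:00Z" },
  { id: "l3", taskId: "task_parent", content: "Refresh tokens need rotation", sourceTaskId: "task_child_a", createdAt: "2025-01-03T00:00:00Z" }
];
console.log(groupLearningsBySource(sampleLearnings));
```

Grouping this way keeps the provenance visible when a parent task accumulates learnings from several children.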
+ +```typescript +declare const learnings: { + list(taskId: string): Promise<Learning[]>; +}; +``` + +| Method | Description | +|--------|-------------| +| `list` | List learnings for task | + +## VCS Integration (Required for Workflow) + +VCS operations are **automatically handled** by the tasks API: + +| Task Operation | VCS Effect | +|----------------|------------| +| `tasks.start(id)` | **VCS required** - creates bookmark `task/`, records start commit | +| `tasks.complete(id)` | **VCS required** - commits changes (NothingToCommit = success) | +| `tasks.delete(id)` | Best-effort bookmark cleanup (logs warning on failure) | + +**VCS (jj or git) is required** for start/complete. Fails with `NotARepository` if none found. CRUD operations work without VCS. + +## Quick Examples + +```javascript +// Create milestone with subtask +const milestone = await tasks.create({ + description: "Build authentication system", + context: "JWT-based auth with refresh tokens", + priority: 1 +}); + +const subtask = await tasks.create({ + description: "Implement token refresh logic", + parentId: milestone.id, + context: "Handle 7-day expiry" +}); + +// Start work (VCS required - creates bookmark) +await tasks.start(subtask.id); + +// ... do implementation work ... 
+ +// Complete task with learnings (VCS required - commits changes, bubbles learnings to parent) +await tasks.complete(subtask.id, { + result: "Implemented using jose library", + learnings: ["Use jose instead of jsonwebtoken"] +}); + +// Get progress summary +const progress = await tasks.progress(milestone.id); +// -> { total: 2, completed: 1, ready: 1, blocked: 0 } + +// Search tasks +const authTasks = await tasks.search("authentication"); + +// Get task tree +const tree = await tasks.tree(milestone.id); +// -> { task: Task, children: TaskTree[] } +``` diff --git a/profiles/opencode/skill/overseer-plan/references/examples.md b/profiles/opencode/skill/overseer-plan/references/examples.md new file mode 100644 index 0000000..2660680 --- /dev/null +++ b/profiles/opencode/skill/overseer-plan/references/examples.md @@ -0,0 +1,177 @@ +# Examples + +## Example 1: With Breakdown + +### Input (`auth-plan.md`) + +```markdown +# Plan: Add Authentication System + +## Implementation +1. Create database schema for users/tokens +2. Implement auth controller with endpoints +3. Add JWT middleware for route protection +4. Build frontend login/register forms +5. Add integration tests +``` + +### Execution + +```javascript +const milestone = await tasks.create({ + description: "Add Authentication System", + context: `# Add Authentication System\n\n## Implementation\n1. 
Create database schema...`, + priority: 3 +}); + +const subtasks = [ + { desc: "Create database schema for users/tokens", done: "Migration runs, tables exist with FK constraints" }, + { desc: "Implement auth controller with endpoints", done: "POST /register, /login return expected responses" }, + { desc: "Add JWT middleware for route protection", done: "Unauthorized requests return 401, valid tokens pass" }, + { desc: "Build frontend login/register forms", done: "Forms render, submit without errors" }, + { desc: "Add integration tests", done: "`npm test` passes with auth coverage" } +]; + +for (const sub of subtasks) { + await tasks.create({ + description: sub.desc, + context: `Part of 'Add Authentication System'.\n\nDone when: ${sub.done}`, + parentId: milestone.id + }); +} + +return { milestone: milestone.id, subtaskCount: subtasks.length }; +``` + +### Output + +``` +Created milestone task_01ABC from plan + +Analyzed plan structure: Found 5 distinct implementation steps +Created 5 subtasks: +- task_02XYZ: Create database schema for users/tokens +- task_03ABC: Implement auth controller with endpoints +- task_04DEF: Add JWT middleware for route protection +- task_05GHI: Build frontend login/register forms +- task_06JKL: Add integration tests + +View structure: execute `await tasks.list({ parentId: "task_01ABC" })` +``` + +## Example 2: No Breakdown + +### Input (`bugfix-plan.md`) + +```markdown +# Plan: Fix Login Validation Bug + +## Problem +Login fails when username has spaces + +## Solution +Update validation regex in auth.ts line 42 +``` + +### Execution + +```javascript +const milestone = await tasks.create({ + description: "Fix Login Validation Bug", + context: `# Fix Login Validation Bug\n\n## Problem\nLogin fails...`, + priority: 3 +}); + +return { milestone: milestone.id, breakdown: false }; +``` + +### Output + +``` +Created milestone task_01ABC from plan + +Plan describes a cohesive single task. No subtask breakdown needed. 
+ +View task: execute `await tasks.get("task_01ABC")` +``` + +## Example 3: Epic-Level (Two-Level Hierarchy) + +### Input (`full-auth-plan.md`) + +```markdown +# Complete User Authentication System + +## Phase 1: Backend Infrastructure +1. Database schema for users/sessions +2. Password hashing with bcrypt +3. JWT token generation + +## Phase 2: API Endpoints +1. POST /auth/register +2. POST /auth/login +3. POST /auth/logout + +## Phase 3: Frontend +1. Login/register forms +2. Protected routes +3. Session persistence +``` + +### Execution + +```javascript +const milestone = await tasks.create({ + description: "Complete User Authentication System", + context: ``, + priority: 3 +}); + +const phases = [ + { name: "Backend Infrastructure", items: [ + { desc: "Database schema", done: "Migration runs, tables exist" }, + { desc: "Password hashing", done: "bcrypt hashes verified in tests" }, + { desc: "JWT tokens", done: "Token generation/validation works" } + ]}, + { name: "API Endpoints", items: [ + { desc: "POST /auth/register", done: "Creates user, returns 201" }, + { desc: "POST /auth/login", done: "Returns JWT on valid credentials" }, + { desc: "POST /auth/logout", done: "Invalidates session, returns 200" } + ]}, + { name: "Frontend", items: [ + { desc: "Login/register forms", done: "Forms render, submit successfully" }, + { desc: "Protected routes", done: "Redirect to login when unauthenticated" }, + { desc: "Session persistence", done: "Refresh maintains logged-in state" } + ]} +]; + +for (const phase of phases) { + const phaseTask = await tasks.create({ + description: phase.name, + parentId: milestone.id + }); + for (const item of phase.items) { + await tasks.create({ + description: item.desc, + context: `Part of '${phase.name}'.\n\nDone when: ${item.done}`, + parentId: phaseTask.id + }); + } +} + +return milestone; +``` + +### Output + +``` +Created milestone task_01ABC from plan + +Analyzed plan structure: Found 3 major phases +Created as milestone with 3 tasks: 
+- task_02XYZ: Backend Infrastructure (3 subtasks) +- task_03ABC: API Endpoints (3 subtasks) +- task_04DEF: Frontend (3 subtasks) + +View structure: execute `await tasks.list({ parentId: "task_01ABC" })` +``` diff --git a/profiles/opencode/skill/overseer-plan/references/implementation.md b/profiles/opencode/skill/overseer-plan/references/implementation.md new file mode 100644 index 0000000..aff6b48 --- /dev/null +++ b/profiles/opencode/skill/overseer-plan/references/implementation.md @@ -0,0 +1,210 @@ +# Implementation Instructions + +**For the skill agent executing `/overseer-plan`.** Follow this workflow exactly. + +## Step 1: Read Markdown File + +Read the provided file using the Read tool. + +## Step 2: Extract Title + +- Parse first `#` heading as title +- Strip "Plan: " prefix if present (case-insensitive) +- Fallback: use filename without extension + +## Step 3: Create Milestone via MCP + +Basic creation: + +```javascript +const milestone = await tasks.create({ + description: "", + context: ``, + priority: +}); +return milestone; +``` + +With `--parent` option: + +```javascript +const task = await tasks.create({ + description: "", + context: ``, + parentId: "", + priority: +}); +return task; +``` + +Capture returned task ID for subsequent steps. + +## Step 4: Analyze Plan Structure + +### Breakdown Indicators + +1. **Numbered/bulleted implementation lists (3-7 items)** + ```markdown + ## Implementation + 1. Create database schema + 2. Build API endpoints + 3. Add frontend components + ``` + +2. **Clear subsections under implementation/tasks/steps** + ```markdown + ### 1. Backend Changes + - Modify server.ts + + ### 2. Frontend Updates + - Update login form + ``` + +3. **File-specific sections** + ```markdown + ### `src/auth.ts` - Add JWT validation + ### `src/middleware.ts` - Create auth middleware + ``` + +4. 
**Sequential phases** + ```markdown + **Phase 1: Database Layer** + **Phase 2: API Layer** + ``` + +### Do NOT Break Down When + +- Only 1-2 steps/items +- Plan is a single cohesive fix +- Content is exploratory ("investigate", "research") +- Work items inseparable +- Plan very short (<10 lines) + +## Step 5: Validate Atomicity & Acceptance Criteria + +For each proposed task, verify: +- **Atomic**: Can be completed in single commit +- **Validated**: Has clear acceptance criteria + +If task too large -> split further. +If no validation -> add to context: + +``` +Done when: +``` + +Examples of good acceptance criteria: +- "Done when: `npm test` passes, new migration applied" +- "Done when: API returns 200 with expected payload" +- "Done when: Component renders without console errors" +- "Done when: Type check passes (`tsc --noEmit`)" + +## Step 6: Oracle Review + +Before creating tasks, invoke Oracle to review the proposed breakdown. + +**Prompt Oracle with:** + +``` +Review this task breakdown for "": + +1. - Done when: +2. - Done when: +... + +Check: +- Are tasks truly atomic (single commit)? +- Is validation criteria clear and observable? +- Does milestone deliver demoable increment? +- Missing dependencies/blockers? +- Any tasks that should be split or merged? +``` + +Incorporate Oracle's feedback, then proceed to create tasks. + +## Step 7: Create Subtasks (If Breaking Down) + +### Extract for Each Subtask + +1. **Description**: Strip numbering, keep concise (1-10 words), imperative form +2. **Context**: Section content + "Part of [milestone description]" + acceptance criteria + +### Flat Breakdown + +```javascript +const subtasks = [ + { description: "Create database schema", context: "Schema for users/tokens. Part of 'Add Auth'.\n\nDone when: Migration runs, tables exist with FK constraints." }, + { description: "Build API endpoints", context: "POST /auth/register, /auth/login. Part of 'Add Auth'.\n\nDone when: Endpoints return expected responses, tests pass." 
} +]; + +const created = []; +for (const sub of subtasks) { + const task = await tasks.create({ + description: sub.description, + context: sub.context, + parentId: milestone.id + }); + created.push(task); +} +return { milestone: milestone.id, subtasks: created }; +``` + +### Epic-Level Breakdown (phases with sub-items) + +```javascript +// Create phase as task under milestone +const phase = await tasks.create({ + description: "Backend Infrastructure", + context: "Phase 1 context...", + parentId: milestoneId +}); + +// Create subtasks under phase +for (const item of phaseItems) { + await tasks.create({ + description: item.description, + context: item.context, + parentId: phase.id + }); +} +``` + +## Step 8: Report Results + +### Subtasks Created + +``` +Created milestone from plan + +Analyzed plan structure: Found distinct implementation steps +Created subtasks: +- : +- : +... + +View structure: execute `await tasks.list({ parentId: "" })` +``` + +### No Breakdown + +``` +Created milestone from plan + +Plan describes a cohesive single task. No subtask breakdown needed. + +View task: execute `await tasks.get("")` +``` + +### Epic-Level Breakdown + +``` +Created milestone from plan + +Analyzed plan structure: Found major phases +Created as milestone with tasks: +- : ( subtasks) +- : ( subtasks) +... + +View structure: execute `await tasks.list({ parentId: "" })` +``` diff --git a/profiles/opencode/skill/overseer/SKILL.md b/profiles/opencode/skill/overseer/SKILL.md new file mode 100644 index 0000000..d83080a --- /dev/null +++ b/profiles/opencode/skill/overseer/SKILL.md @@ -0,0 +1,191 @@ +--- +name: overseer +description: Manage tasks via Overseer codemode MCP. Use when tracking multi-session work, breaking down implementation, or persisting context for handoffs. 
+license: MIT +metadata: + author: dmmulroy + version: "1.0.0" +--- + +# Agent Coordination with Overseer + +## Core Principle: Tickets, Not Todos + +Overseer tasks are **tickets** - structured artifacts with comprehensive context: + +- **Description**: One-line summary (issue title) +- **Context**: Full background, requirements, approach (issue body) +- **Result**: Implementation details, decisions, outcomes (PR description) + +Think: "Would someone understand the what, why, and how from this task alone AND what success looks like?" + +## Task IDs are Ephemeral + +**Never reference task IDs in external artifacts** (commits, PRs, docs). Task IDs like `task_01JQAZ...` become meaningless once tasks complete. Describe the work itself, not the task that tracked it. + +## Overseer vs OpenCode's TodoWrite + +| | Overseer | TodoWrite | +| --------------- | ------------------------------------- | ---------------------- | +| **Persistence** | SQLite database | Session-only | +| **Context** | Rich (description + context + result) | Basic | +| **Hierarchy** | 3-level (milestone -> task -> subtask)| Flat | + +Use **Overseer** for persistent work. Use **TodoWrite** for ephemeral in-session tracking only. 
+ +## When to Use Overseer + +**Use Overseer when:** +- Breaking down complexity into subtasks +- Work spans multiple sessions +- Context needs to persist for handoffs +- Recording decisions for future reference + +**Skip Overseer when:** +- Work is a single atomic action +- Everything fits in one message exchange +- Overhead exceeds value +- TodoWrite is sufficient + +## Finding Work + +```javascript +// Get next ready task with full context (recommended for work sessions) +const task = await tasks.nextReady(milestoneId); // TaskWithContext | null +if (!task) { + console.log("No ready tasks"); + return; +} + +// Get all ready tasks (for progress overview) +const readyTasks = await tasks.list({ ready: true }); // Task[] +``` + +**Use `nextReady()`** when starting work - returns `TaskWithContext | null` (deepest ready leaf with full context chain + inherited learnings). +**Use `list({ ready: true })`** for status/progress checks - returns `Task[]` without context chain. + +## Basic Workflow + +```javascript +// 1. Get next ready task (returns TaskWithContext | null) +const task = await tasks.nextReady(); +if (!task) return "No ready tasks"; + +// 2. Review context (available on TaskWithContext) +console.log(task.context.own); // This task's context +console.log(task.context.parent); // Parent's context (if depth > 0) +console.log(task.context.milestone); // Root milestone context (if depth > 1) +console.log(task.learnings.own); // Learnings attached to this task (bubbled from children) + +// 3. Start work (VCS required - creates bookmark, records start commit) +await tasks.start(task.id); + +// 4. Implement... + +// 5. Complete with learnings (VCS required - commits changes, bubbles learnings to parent) +await tasks.complete(task.id, { + result: "Implemented login endpoint with JWT tokens", + learnings: ["bcrypt rounds should be 12 for production"] +}); +``` + +See @file references/workflow.md for detailed workflow guidance. 
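Step 2 of the workflow above (reviewing context) can be wrapped in a small formatter. This is a sketch under stated assumptions: `renderContextChain` is a hypothetical helper, not an Overseer function, and the sample object is made up; only the `context` and `learnings.own` fields described for `TaskWithContext` are used.

```javascript
// Hypothetical helper, not part of the Overseer API.
// Renders a TaskWithContext-shaped object as one text block,
// ordered from the broadest context (milestone) to the task itself.
function renderContextChain(task) {
  const sections = [];
  if (task.context.milestone) {
    sections.push(`Milestone context:\n${task.context.milestone}`);
  }
  if (task.context.parent) {
    sections.push(`Parent context:\n${task.context.parent}`);
  }
  sections.push(`Task context:\n${task.context.own}`);

  // learnings.own holds Learning records; only their content is shown.
  const learned = task.learnings.own.map((learning) => `- ${learning.content}`);
  if (learned.length > 0) {
    sections.push(`Learnings:\n${learned.join("\n")}`);
  }
  return sections.join("\n\n");
}

// Made-up sample shaped like a depth-1 task (no milestone entry in context).
const sampleTask = {
  description: "Implement JWT middleware",
  context: {
    own: "Verify tokens on protected routes.\n\nDone when: unauthorized requests return 401",
    parent: "Add user authentication system"
  },
  learnings: {
    own: [{ content: "Use jose instead of jsonwebtoken" }],
    parent: [],
    milestone: []
  }
};
console.log(renderContextChain(sampleTask));
```

Reading the chain from broadest to narrowest keeps the "why" ahead of the "what" when the task is picked up in a fresh session.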
+ +## Understanding Task Context + +Tasks have **progressive context** - inherited from ancestors: + +```javascript +const task = await tasks.get(taskId); // Returns TaskWithContext +// task.context.own - this task's context (always present) +// task.context.parent - parent task's context (if depth > 0) +// task.context.milestone - root milestone's context (if depth > 1) + +// Task's own learnings (bubbled from completed children) +// task.learnings.own - learnings attached to this task +``` + +## Return Type Summary + +| Method | Returns | Notes | +|--------|---------|-------| +| `tasks.get(id)` | `TaskWithContext` | Full context chain + inherited learnings | +| `tasks.nextReady()` | `TaskWithContext \| null` | Deepest ready leaf with full context | +| `tasks.list()` | `Task[]` | Basic task fields only | +| `tasks.create()` | `Task` | No context chain | +| `tasks.start/complete()` | `Task` | No context chain | + +## Blockers + +Blockers prevent a task from being ready until the blocker completes. + +**Constraints:** +- Blockers cannot be self +- Blockers cannot be ancestors (parent, grandparent, etc.) +- Blockers cannot be descendants +- Creating/reparenting with invalid blockers is rejected + +```javascript +// Add blocker - taskA waits for taskB +await tasks.block(taskA.id, taskB.id); + +// Remove blocker +await tasks.unblock(taskA.id, taskB.id); +``` + +## Task Hierarchies + +Three levels: **Milestone** (depth 0) -> **Task** (depth 1) -> **Subtask** (depth 2). 
+ +| Level | Name | Purpose | Example | +|-------|------|---------|---------| +| 0 | **Milestone** | Large initiative | "Add user authentication system" | +| 1 | **Task** | Significant work item | "Implement JWT middleware" | +| 2 | **Subtask** | Atomic step | "Add token verification function" | + +**Choosing the right level:** +- Small feature (1-2 files) -> Single task +- Medium feature (3-7 steps) -> Task with subtasks +- Large initiative (5+ tasks) -> Milestone with tasks + +See @file references/hierarchies.md for detailed guidance. + +## Recording Results + +Complete tasks **immediately after implementing AND verifying**: +- Capture decisions while fresh +- Note deviations from plan +- Document verification performed +- Create follow-up tasks for tech debt + +Your result must include explicit verification evidence. See @file references/verification.md. + +## Best Practices + +1. **Right-size tasks**: Completable in one focused session +2. **Clear completion criteria**: Context should define "done" +3. **Don't over-decompose**: 3-7 children per parent +4. **Action-oriented descriptions**: Start with verbs ("Add", "Fix", "Update") +5. 
**Verify before completing**: Tests passing, manual testing done + +--- + +## Reading Order + +| Task | File | +|------|------| +| Understanding API | @file references/api.md | +| Implementation workflow | @file references/workflow.md | +| Task decomposition | @file references/hierarchies.md | +| Good/bad examples | @file references/examples.md | +| Verification checklist | @file references/verification.md | + +## In This Reference + +| File | Purpose | +|------|---------| +| `references/api.md` | Overseer MCP codemode API types/methods | +| `references/workflow.md` | Start->implement->complete workflow | +| `references/hierarchies.md` | Milestone/task/subtask organization | +| `references/examples.md` | Good/bad context and result examples | +| `references/verification.md` | Verification checklist and process | diff --git a/profiles/opencode/skill/overseer/references/api.md b/profiles/opencode/skill/overseer/references/api.md new file mode 100644 index 0000000..18ca2bf --- /dev/null +++ b/profiles/opencode/skill/overseer/references/api.md @@ -0,0 +1,192 @@ +# Overseer Codemode MCP API + +Execute JavaScript code to interact with Overseer task management. 
+ +## Task Interface + +```typescript +// Basic task - returned by list(), create(), start(), complete() +// Note: Does NOT include context or learnings fields +interface Task { + id: string; + parentId: string | null; + description: string; + priority: 1 | 2 | 3 | 4 | 5; + completed: boolean; + completedAt: string | null; + startedAt: string | null; + createdAt: string; // ISO 8601 + updatedAt: string; + result: string | null; // Completion notes + commitSha: string | null; // Auto-populated on complete + depth: 0 | 1 | 2; // 0=milestone, 1=task, 2=subtask + blockedBy?: string[]; // Blocking task IDs (omitted if empty) + blocks?: string[]; // Tasks this blocks (omitted if empty) + bookmark?: string; // VCS bookmark name (if started) + startCommit?: string; // Commit SHA at start + effectivelyBlocked: boolean; // True if task OR ancestor has incomplete blockers +} + +// Task with full context - returned by get(), nextReady() +interface TaskWithContext extends Task { + context: { + own: string; // This task's context + parent?: string; // Parent's context (depth > 0) + milestone?: string; // Root milestone's context (depth > 1) + }; + learnings: { + own: Learning[]; // This task's learnings (bubbled from completed children) + parent: Learning[]; // Parent's learnings (depth > 0) + milestone: Learning[]; // Milestone's learnings (depth > 1) + }; +} + +// Task tree structure - returned by tree() +interface TaskTree { + task: Task; + children: TaskTree[]; +} + +// Progress summary - returned by progress() +interface TaskProgress { + total: number; + completed: number; + ready: number; // !completed && !effectivelyBlocked + blocked: number; // !completed && effectivelyBlocked +} + +// Task type alias for depth filter +type TaskType = "milestone" | "task" | "subtask"; +``` + +## Learning Interface + +```typescript +interface Learning { + id: string; + taskId: string; + content: string; + sourceTaskId: string | null; + createdAt: string; +} +``` + +## Tasks API + 
+```typescript +declare const tasks: { + list(filter?: { + parentId?: string; + ready?: boolean; + completed?: boolean; + depth?: 0 | 1 | 2; // 0=milestones, 1=tasks, 2=subtasks + type?: TaskType; // Alias: "milestone"|"task"|"subtask" (mutually exclusive with depth) + }): Promise<Task[]>; + get(id: string): Promise<TaskWithContext>; + create(input: { + description: string; + context?: string; + parentId?: string; + priority?: 1 | 2 | 3 | 4 | 5; // Required range: 1-5 + blockedBy?: string[]; + }): Promise<Task>; + update(id: string, input: { + description?: string; + context?: string; + priority?: 1 | 2 | 3 | 4 | 5; + parentId?: string; + }): Promise<Task>; + start(id: string): Promise<Task>; + complete(id: string, input?: { result?: string; learnings?: string[] }): Promise<Task>; + reopen(id: string): Promise<Task>; + delete(id: string): Promise<void>; + block(taskId: string, blockerId: string): Promise<void>; + unblock(taskId: string, blockerId: string): Promise<void>; + nextReady(milestoneId?: string): Promise<TaskWithContext | null>; + tree(rootId?: string): Promise<TaskTree | TaskTree[]>; + search(query: string): Promise<Task[]>; + progress(rootId?: string): Promise<TaskProgress>; +}; +``` + +| Method | Returns | Description | +|--------|---------|-------------| +| `list` | `Task[]` | Filter by `parentId`, `ready`, `completed`, `depth`, `type` | +| `get` | `TaskWithContext` | Get task with full context chain + inherited learnings | +| `create` | `Task` | Create task (priority must be 1-5) | +| `update` | `Task` | Update description, context, priority, parentId | +| `start` | `Task` | **VCS required** - creates bookmark, records start commit | +| `complete` | `Task` | **VCS required** - commits changes + bubbles learnings to parent | +| `reopen` | `Task` | Reopen completed task | +| `delete` | `void` | Delete task + best-effort VCS bookmark cleanup | +| `block` | `void` | Add blocker (cannot be self, ancestor, or descendant) | +| `unblock` | `void` | Remove blocker relationship | +| `nextReady` | `TaskWithContext \| null` | Get deepest ready leaf with full context | +| `tree` | `TaskTree \| 
TaskTree[]` | Get task tree (all milestones if no ID) | +| `search` | `Task[]` | Search by description/context/result (case-insensitive) | +| `progress` | `TaskProgress` | Aggregate counts for milestone or all tasks | + +## Learnings API + +Learnings are added via `tasks.complete(id, { learnings: [...] })` and bubble to the immediate parent (preserving `sourceTaskId`). + +```typescript +declare const learnings: { + list(taskId: string): Promise<Learning[]>; +}; +``` + +| Method | Description | +|--------|-------------| +| `list` | List learnings for task | + +## VCS Integration (Required for Workflow) + +VCS operations are **automatically handled** by the tasks API: + +| Task Operation | VCS Effect | +|----------------|------------| +| `tasks.start(id)` | **VCS required** - creates bookmark `task/`, records start commit | +| `tasks.complete(id)` | **VCS required** - commits changes (NothingToCommit = success) | +| `tasks.delete(id)` | Best-effort bookmark cleanup (logs warning on failure) | + +**VCS (jj or git) is required** for start/complete. Fails with `NotARepository` if none found. CRUD operations work without VCS. + +## Quick Examples + +```javascript +// Create milestone with subtask +const milestone = await tasks.create({ + description: "Build authentication system", + context: "JWT-based auth with refresh tokens", + priority: 1 +}); + +const subtask = await tasks.create({ + description: "Implement token refresh logic", + parentId: milestone.id, + context: "Handle 7-day expiry" +}); + +// Start work (auto-creates VCS bookmark) +await tasks.start(subtask.id); + +// ... do implementation work ... 
+ +// Complete task with learnings (VCS required - commits changes, bubbles learnings to parent) +await tasks.complete(subtask.id, { + result: "Implemented using jose library", + learnings: ["Use jose instead of jsonwebtoken"] +}); + +// Get progress summary +const progress = await tasks.progress(milestone.id); +// -> { total: 2, completed: 1, ready: 1, blocked: 0 } + +// Search tasks +const authTasks = await tasks.search("authentication"); + +// Get task tree +const tree = await tasks.tree(milestone.id); +// -> { task: Task, children: TaskTree[] } +``` diff --git a/profiles/opencode/skill/overseer/references/examples.md b/profiles/opencode/skill/overseer/references/examples.md new file mode 100644 index 0000000..7eda889 --- /dev/null +++ b/profiles/opencode/skill/overseer/references/examples.md @@ -0,0 +1,195 @@ +# Examples + +Good and bad examples for writing task context and results. + +## Writing Context + +Context should include everything needed to do the work without asking questions: +- **What** needs to be done and why +- **Implementation approach** (steps, files to modify, technical choices) +- **Done when** (acceptance criteria) + +### Good Context Example + +```javascript +await tasks.create({ + description: "Migrate storage to one file per task", + context: `Change storage format for git-friendliness: + +Structure: +.overseer/ +└── tasks/ + ├── task_01ABC.json + └── task_02DEF.json + +NO INDEX - just scan task files. For typical task counts (<100), this is fast. + +Implementation: +1. Update storage.ts: + - read(): Scan .overseer/tasks/*.json, parse each, return TaskStore + - write(task): Write single task to .overseer/tasks/{id}.json + - delete(id): Remove .overseer/tasks/{id}.json + - Add readTask(id) for single task lookup + +2. Task file format: Same as current Task schema (one task per file) + +3. Migration: On read, if old tasks.json exists, migrate to new format + +4. 
Update tests + +Benefits: +- Create = new file (never conflicts) +- Update = single file change +- Delete = remove file +- No index to maintain or conflict +- git diff shows exactly which tasks changed` +}); +``` + +**Why it works:** States the goal, shows the structure, lists specific implementation steps, explains benefits. Someone could pick this up without asking questions. + +### Bad Context Example + +```javascript +await tasks.create({ + description: "Add auth", + context: "Need to add authentication" +}); +``` + +**What's missing:** How to implement it, what files, what's done when, technical approach. + +## Writing Results + +Results should capture what was actually done: +- **What changed** (implementation summary) +- **Key decisions** (and why) +- **Verification** (tests passing, manual testing done) + +### Good Result Example + +```javascript +await tasks.complete(taskId, `Migrated storage from single tasks.json to one file per task: + +Structure: +- Each task stored as .overseer/tasks/{id}.json +- No index file (avoids merge conflicts) +- Directory scanned on read to build task list + +Implementation: +- Modified Storage.read() to scan .overseer/tasks/ directory +- Modified Storage.write() to write/delete individual task files +- Auto-migration from old single-file format on first read +- Atomic writes using temp file + rename pattern + +Trade-offs: +- Slightly slower reads (must scan directory + parse each file) +- Acceptable since task count is typically small (<100) +- Better git history - each task change is isolated + +Verification: +- All 60 tests passing +- Build successful +- Manually tested migration: old -> new format works`); +``` + +**Why it works:** States what changed, lists implementation details, explains trade-offs, confirms verification. 
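
The `tasks.complete` declaration in the API reference earlier in this diff takes an options object (`{ result?, learnings? }`) as its second argument. The sketch below assembles a result with the three parts listed above and returns it in that shape. `buildCompletion` and its field names are hypothetical and not part of Overseer; treat this as one possible way to keep results structured.

```javascript
// Hypothetical helper (not part of Overseer): builds the result text from a
// summary plus optional sections, and wraps it in the { result, learnings }
// shape that the tasks.complete() declaration accepts.
function buildCompletion({ summary, changes = [], decisions = [], verification = [], learnings = [] }) {
  // Render one titled bullet list, or nothing when the section is empty.
  const section = (title, items) =>
    items.length > 0 ? `\n\n${title}:\n${items.map((item) => `- ${item}`).join("\n")}` : "";

  const result =
    `${summary}:` +
    section("Implementation", changes) +
    section("Trade-offs", decisions) +
    section("Verification", verification);

  return { result, learnings };
}

const completion = buildCompletion({
  summary: "Migrated storage from single tasks.json to one file per task",
  changes: ["Each task stored as .overseer/tasks/{id}.json"],
  decisions: ["Slightly slower reads, acceptable for <100 tasks"],
  verification: ["All 60 tests passing", "Build successful"],
});

// With the real API this would be passed as: await tasks.complete(taskId, completion);
console.log(completion.result);
```

Empty sections are omitted, so a result without a verification list is easy to spot in review.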
+ +### Bad Result Example + +```javascript +await tasks.complete(taskId, "Fixed the storage issue"); +``` + +**What's missing:** What was actually implemented, how, what decisions were made, verification evidence. + +## Subtask Context Example + +Link subtasks to their parent and explain what this piece does specifically: + +```javascript +await tasks.create({ + description: "Add token verification function", + parentId: jwtTaskId, + context: `Part of JWT middleware (parent task). This subtask: token verification. + +What it does: +- Verify JWT signature and expiration on protected routes +- Extract user ID from token payload +- Attach user object to request +- Return 401 for invalid/expired tokens + +Implementation: +- Create src/middleware/verify-token.ts +- Export verifyToken middleware function +- Use jose library (preferred over jsonwebtoken) +- Handle expired vs invalid token cases separately + +Done when: +- Middleware function complete and working +- Unit tests cover valid/invalid/expired scenarios +- Integrated into auth routes in server.ts +- Parent task can use this to protect endpoints` +}); +``` + +## Error Handling Examples + +### Handling Pending Children + +```javascript +try { + await tasks.complete(taskId, "Done"); +} catch (err) { + if (err.message.includes("pending children")) { + const pending = await tasks.list({ parentId: taskId, completed: false }); + console.log(`Cannot complete: ${pending.length} children pending`); + for (const child of pending) { + console.log(`- ${child.id}: ${child.description}`); + } + return; + } + throw err; +} +``` + +### Handling Blocked Tasks + +```javascript +const task = await tasks.get(taskId); + +if (task.blockedBy?.length > 0) { // blockedBy is omitted when empty, so guard with ?. + console.log("Task is blocked by:"); + for (const blockerId of task.blockedBy) { + const blocker = await tasks.get(blockerId); + console.log(`- ${blocker.description} (${blocker.completed ? 
'done' : 'pending'})`); + } + return "Cannot start - blocked by other tasks"; +} + +await tasks.start(taskId); +``` + +## Creating Task Hierarchies + +```javascript +// Create milestone with tasks +const milestone = await tasks.create({ + description: "Implement user authentication", + context: "Full auth: JWT, login/logout, password reset, rate limiting", + priority: 2 +}); + +const subtasks = [ + "Add login endpoint", + "Add logout endpoint", + "Implement JWT token service", + "Add password reset flow" +]; + +for (const desc of subtasks) { + await tasks.create({ description: desc, parentId: milestone.id }); +} +``` + +See @file references/hierarchies.md for sequential subtasks with blockers. diff --git a/profiles/opencode/skill/overseer/references/hierarchies.md b/profiles/opencode/skill/overseer/references/hierarchies.md new file mode 100644 index 0000000..d6378fd --- /dev/null +++ b/profiles/opencode/skill/overseer/references/hierarchies.md @@ -0,0 +1,170 @@ +# Task Hierarchies + +Guidance for organizing work into milestones, tasks, and subtasks. + +## Three Levels + +| Level | Name | Purpose | Example | +|-------|------|---------|---------| +| 0 | **Milestone** | Large initiative (5+ tasks) | "Add user authentication system" | +| 1 | **Task** | Significant work item | "Implement JWT middleware" | +| 2 | **Subtask** | Atomic implementation step | "Add token verification function" | + +**Maximum depth is 3 levels.** Attempting to create a child of a subtask will fail. 
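
The depth limit in the table above can be checked before calling `tasks.create`. The sketch below maps a parent's `depth` to the level a new child would get and reports when the child would exceed the limit. `describeChild`, `LEVEL_NAMES`, and `MAX_DEPTH` are hypothetical names for illustration; Overseer enforces the limit itself.

```javascript
// Level names from the table above, indexed by depth.
const LEVEL_NAMES = ["milestone", "task", "subtask"];
const MAX_DEPTH = 2; // depths 0, 1, 2 make up the three levels

// Hypothetical helper: describes what creating a child under `parent` would
// produce, or why it would be rejected. `parent` only needs a `depth` field;
// pass null when creating a new top-level milestone.
function describeChild(parent) {
  const childDepth = parent === null ? 0 : parent.depth + 1;
  if (childDepth > MAX_DEPTH) {
    return { allowed: false, reason: `${LEVEL_NAMES[parent.depth]}s cannot have children` };
  }
  return { allowed: true, depth: childDepth, type: LEVEL_NAMES[childDepth] };
}

console.log(describeChild(null));         // new top-level item
console.log(describeChild({ depth: 1 })); // child of a task
console.log(describeChild({ depth: 2 })); // child of a subtask
```

A pre-check like this gives a readable reason up front instead of relying on the failed `create` call.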
+ +## When to Use Each Level + +### Single Task (No Hierarchy) +- Small feature (1-2 files, ~1 session) +- Work is atomic, no natural breakdown + +### Task with Subtasks +- Medium feature (3-5 files, 3-7 steps) +- Work naturally decomposes into discrete steps +- Subtasks could be worked on independently + +### Milestone with Tasks +- Large initiative (multiple areas, many sessions) +- Work spans 5+ distinct tasks +- You want high-level progress tracking + +## Creating Hierarchies + +```javascript +// Create the milestone +const milestone = await tasks.create({ + description: "Add user authentication system", + context: "Full auth system with JWT tokens, password reset...", + priority: 2 +}); + +// Create tasks under it +const jwtTask = await tasks.create({ + description: "Implement JWT token generation", + context: "Create token service with signing and verification...", + parentId: milestone.id +}); + +const resetTask = await tasks.create({ + description: "Add password reset flow", + context: "Email-based password reset with secure tokens...", + parentId: milestone.id +}); + +// For complex tasks, add subtasks +const verifySubtask = await tasks.create({ + description: "Add token verification function", + context: "Verify JWT signature and expiration...", + parentId: jwtTask.id +}); +``` + +## Subtask Best Practices + +Each subtask should be: + +- **Independently understandable**: Clear on its own +- **Linked to parent**: Reference parent, explain how this piece fits +- **Specific scope**: What this subtask does vs what parent/siblings do +- **Clear completion**: Define "done" for this piece specifically + +Example subtask context: +``` +Part of JWT middleware (parent task). This subtask: token verification. 
+ +What it does: +- Verify JWT signature and expiration +- Extract user ID from payload +- Return 401 for invalid/expired tokens + +Done when: +- Function complete and tested +- Unit tests cover valid/invalid/expired cases +``` + +## Decomposition Strategy + +When faced with large tasks: + +1. **Assess scope**: Is this milestone-level (5+ tasks) or task-level (3-7 subtasks)? +2. Create parent task/milestone with overall goal and context +3. Analyze and identify 3-7 logical children +4. Create children with specific contexts and boundaries +5. Work through systematically, completing with results +6. Complete parent with summary of overall implementation + +### Don't Over-Decompose + +- **3-7 children per parent** is usually right +- If you'd only have 1-2 subtasks, just make separate tasks +- If you need depth 3+, restructure your breakdown + +## Viewing Hierarchies + +```javascript +// List all tasks under a milestone +const children = await tasks.list({ parentId: milestoneId }); + +// Get task with context breadcrumb +const task = await tasks.get(taskId); +// task.context.parent - parent's context +// task.context.milestone - root milestone's context + +// Check progress +const pending = await tasks.list({ parentId: milestoneId, completed: false }); +const done = await tasks.list({ parentId: milestoneId, completed: true }); +console.log(`Progress: ${done.length}/${done.length + pending.length}`); +``` + +## Completion Rules + +1. **Cannot complete with pending children** + ```javascript + // This will fail if task has incomplete subtasks + await tasks.complete(taskId, "Done"); + // Error: "pending children" + ``` + +2. **Complete children first** + - Work through subtasks systematically + - Complete each with meaningful results + +3. 
**Parent result summarizes overall implementation** + ```javascript + await tasks.complete(milestoneId, `User authentication system complete: + + Implemented: + - JWT token generation and verification + - Login/logout endpoints + - Password reset flow + - Rate limiting + + 5 tasks completed, all tests passing.`); + ``` + +## Blocking Dependencies + +Use `blockedBy` for cross-task dependencies: + +```javascript +// Create task that depends on another +const deployTask = await tasks.create({ + description: "Deploy to production", + context: "...", + blockedBy: [testTaskId, reviewTaskId] +}); + +// Add blocker to existing task +await tasks.block(deployTaskId, testTaskId); + +// Remove blocker +await tasks.unblock(deployTaskId, testTaskId); +``` + +**Use blockers when:** +- Task B cannot start until Task A completes +- Multiple tasks depend on a shared prerequisite + +**Don't use blockers when:** +- Tasks can be worked on in parallel +- The dependency is just logical grouping (use subtasks instead) diff --git a/profiles/opencode/skill/overseer/references/verification.md b/profiles/opencode/skill/overseer/references/verification.md new file mode 100644 index 0000000..905b737 --- /dev/null +++ b/profiles/opencode/skill/overseer/references/verification.md @@ -0,0 +1,186 @@ +# Verification Guide + +Before marking any task complete, you MUST verify your work. Verification separates "I think it's done" from "it's actually done." + +## The Verification Process + +1. **Re-read the task context**: What did you originally commit to do? +2. **Check acceptance criteria**: Does your implementation satisfy the "Done when" conditions? +3. **Run relevant tests**: Execute the test suite and document results +4. **Test manually**: Actually try the feature/change yourself +5. **Compare with requirements**: Does what you built match what was asked? 
+ +## Strong vs Weak Verification + +### Strong Verification Examples + +- "All 60 tests passing, build successful" +- "All 69 tests passing (4 new tests for middleware edge cases)" +- "Manually tested with valid/invalid/expired tokens - all cases work" +- "Ran `cargo test` - 142 tests passed, 0 failed" + +### Weak Verification (Avoid) + +- "Should work now" - "should" means not verified +- "Made the changes" - no evidence it works +- "Added tests" - did the tests pass? What's the count? +- "Fixed the bug" - what bug? Did you verify the fix? +- "Done" - done how? prove it + +## Verification by Task Type + +| Task Type | How to Verify | +|-----------|---------------| +| Code changes | Run full test suite, document passing count | +| New features | Run tests + manual testing of functionality | +| Configuration | Test the config works (run commands, check workflows) | +| Documentation | Verify examples work, links resolve, formatting renders | +| Refactoring | Confirm tests still pass, no behavior changes | +| Bug fixes | Reproduce bug first, verify fix, add regression test | + +## Cross-Reference Checklist + +Before marking complete, verify all applicable items: + +- [ ] Task description requirements met +- [ ] Context "Done when" criteria satisfied +- [ ] Tests passing (document count: "All X tests passing") +- [ ] Build succeeds (if applicable) +- [ ] Manual testing done (describe what you tested) +- [ ] No regressions introduced +- [ ] Edge cases considered (error handling, invalid input) +- [ ] Follow-up work identified (created new tasks if needed) + +**If you can't check all applicable boxes, the task isn't done yet.** + +## Result Examples with Verification + +### Code Implementation + +```javascript +await tasks.complete(taskId, `Implemented JWT middleware: + +Implementation: +- Created src/middleware/verify-token.ts +- Separated 'expired' vs 'invalid' error codes +- Added user extraction from payload + +Verification: +- All 69 tests passing (4 new tests for 
edge cases) +- Manually tested with valid token: Access granted +- Manually tested with expired token: 401 with 'token_expired' +- Manually tested with invalid signature: 401 with 'invalid_token'`); +``` + +### Configuration/Infrastructure + +```javascript +await tasks.complete(taskId, `Added GitHub Actions workflow for CI: + +Implementation: +- Created .github/workflows/ci.yml +- Jobs: lint, test, build with pnpm cache + +Verification: +- Pushed to test branch, opened PR #123 +- Workflow triggered automatically +- All jobs passed (lint: 0 errors, test: 69/69, build: success) +- Total run time: 2m 34s`); +``` + +### Refactoring + +```javascript +await tasks.complete(taskId, `Refactored storage to one file per task: + +Implementation: +- Split tasks.json into .overseer/tasks/{id}.json files +- Added auto-migration from old format +- Atomic writes via temp+rename + +Verification: +- All 60 tests passing (including 8 storage tests) +- Build successful +- Manually tested migration: old -> new format works +- Confirmed git diff shows only changed tasks`); +``` + +### Bug Fix + +```javascript +await tasks.complete(taskId, `Fixed login validation accepting usernames with spaces: + +Root cause: +- Validation regex didn't account for leading/trailing spaces + +Fix: +- Added .trim() before validation in src/auth/validate.ts:42 +- Updated regex to reject internal spaces + +Verification: +- All 45 tests passing (2 new regression tests) +- Manually tested: + - " admin" -> rejected (leading space) + - "admin " -> rejected (trailing space) + - "ad min" -> rejected (internal space) + - "admin" -> accepted`); +``` + +### Documentation + +```javascript +await tasks.complete(taskId, `Updated API documentation for auth endpoints: + +Implementation: +- Added docs for POST /auth/login +- Added docs for POST /auth/logout +- Added docs for POST /auth/refresh +- Included example requests/responses + +Verification: +- All code examples tested and working +- Links verified (no 404s) +- 
Rendered in local preview - formatting correct +- Spell-checked content`); +``` + +## Common Verification Mistakes + +| Mistake | Better Approach | +|---------|-----------------| +| "Tests pass" | "All 42 tests passing" (include count) | +| "Manually tested" | "Manually tested X, Y, Z scenarios" (be specific) | +| "Works" | "Works: [evidence]" (show proof) | +| "Fixed" | "Fixed: [root cause] -> [solution] -> [verification]" | + +## When Verification Fails + +If verification reveals issues: + +1. **Don't complete the task** - it's not done +2. **Document what failed** in task context +3. **Fix the issues** before completing +4. **Re-verify** after fixes + +```javascript +// Append failure notes to the task's own context (get() returns context as { own, parent?, milestone? }) +await tasks.update(taskId, { + context: (await tasks.get(taskId)).context.own + ` + +Verification attempt 1 (failed): +- Tests: 41/42 passing +- Failing: test_token_refresh - timeout issue +- Need to investigate async handling` +}); + +// After fixing +await tasks.complete(taskId, `Implemented token refresh: + +Implementation: +- Added refresh endpoint +- Fixed async timeout (was missing await) + +Verification: +- All 42 tests passing (fixed timeout issue) +- Manual testing: refresh works within 30s window`); +``` diff --git a/profiles/opencode/skill/overseer/references/workflow.md b/profiles/opencode/skill/overseer/references/workflow.md new file mode 100644 index 0000000..4b3414f --- /dev/null +++ b/profiles/opencode/skill/overseer/references/workflow.md @@ -0,0 +1,164 @@ +# Implementation Workflow + +Step-by-step guide for working with Overseer tasks during implementation. + +## 1. Get Next Ready Task + +```javascript +// Get next task with full context (recommended) +const task = await tasks.nextReady(); + +// Or scope to a specific milestone (use instead of the call above): +// const task = await tasks.nextReady(milestoneId); + +if (!task) { + return "No tasks ready - all blocked or completed"; +} +``` + +`nextReady()` returns a `TaskWithContext` (task with inherited context and learnings) or `null`. + +## 2. 
Review Context + +Before starting, verify you can answer: +- **What** needs to be done specifically? +- **Why** is this needed? +- **How** should it be implemented? +- **When** is it done (acceptance criteria)? + +```javascript +const task = await tasks.get(taskId); + +// Task's own context +console.log("Task:", task.context.own); + +// Parent context (if task has parent) +if (task.context.parent) { + console.log("Parent:", task.context.parent); +} + +// Milestone context (if depth > 1) +if (task.context.milestone) { + console.log("Milestone:", task.context.milestone); +} + +// Task's own learnings (bubbled from completed children) +console.log("Task learnings:", task.learnings.own); +``` + +**If any answer is unclear:** +1. Check parent task or completed blockers for details +2. Suggest entering plan mode to flesh out requirements + +**Proceed without full context when:** +- Task is trivial/atomic (e.g., "Add .gitignore entry") +- Conversation already provides the missing context +- Description itself is sufficiently detailed + +## 3. Start Task + +```javascript +await tasks.start(taskId); +``` + +**VCS Required:** Creates bookmark `task/`, records start commit. Fails with `NotARepository` if no jj/git found. + +After starting, the task status changes to `in_progress`. + +## 4. Implement + +Work on the task implementation. Note any learnings to include when completing. + +## 5. Verify Work + +Before completing, verify your implementation. See @file references/verification.md for full checklist. + +Quick checklist: +- [ ] Task description requirements met +- [ ] Context "Done when" criteria satisfied +- [ ] Tests passing (document count) +- [ ] Build succeeds +- [ ] Manual testing done + +## 6. 
Complete Task with Learnings + +```javascript +await tasks.complete(taskId, { + result: `Implemented login endpoint: + +Implementation: +- Created src/auth/login.ts +- Added JWT token generation +- Integrated with user service + +Verification: +- All 42 tests passing (3 new) +- Manually tested valid/invalid credentials`, + learnings: [ + "bcrypt rounds should be 12+ for production", + "jose library preferred over jsonwebtoken" + ] +}); +``` + +**VCS Required:** Commits changes (NothingToCommit treated as success), then deletes the task's bookmark (best-effort) and clears the DB bookmark field on success. Fails with `NotARepository` if no jj/git found. + +**Learnings Effect:** Learnings bubble to immediate parent only. `sourceTaskId` is preserved through bubbling, so if this task's learnings later bubble further, the origin is tracked. + +The `result` becomes part of the task's permanent record. + +## VCS Integration (Required for Workflow) + +VCS operations are **automatically handled** by the tasks API: + +| Task Operation | VCS Effect | +|----------------|------------| +| `tasks.start(id)` | **VCS required** - creates bookmark `task/`, records start commit | +| `tasks.complete(id)` | **VCS required** - commits changes, deletes bookmark (best-effort), clears DB bookmark on success | +| `tasks.complete(milestoneId)` | Same + deletes ALL descendant bookmarks recursively (depth-1 and depth-2) | +| `tasks.delete(id)` | Best-effort bookmark cleanup (logs warning on failure) | + +**Note:** VCS (jj or git) is required for start/complete. CRUD operations work without VCS. 
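
Because `start` and `complete` fail without a repository, a caller may want to turn that failure into a readable outcome. The sketch below assumes the failure surfaces as an `Error` whose message contains `NotARepository`; the text above names the error but not its shape, so that check is a guess. `tryStart` and `fakeTasks` are hypothetical and exist only to make the example runnable.

```javascript
// Assumption: the failure is an Error whose message contains "NotARepository".
// The docs name the error but do not specify how it is represented.
function isNotARepository(err) {
  return err instanceof Error && err.message.includes("NotARepository");
}

// Hypothetical wrapper: tries to start a task and returns a readable reason
// instead of throwing when no jj/git repository is found.
async function tryStart(tasksApi, taskId) {
  try {
    const task = await tasksApi.start(taskId);
    return { started: true, task };
  } catch (err) {
    if (isNotARepository(err)) {
      return { started: false, reason: "No jj/git repository found; start/complete need VCS" };
    }
    throw err; // anything else is unexpected and should propagate
  }
}

// Stand-in for the real tasks API, used only to exercise the wrapper.
const fakeTasks = {
  start: async () => {
    throw new Error("NotARepository: no VCS detected");
  },
};

tryStart(fakeTasks, "task_01ABC").then((outcome) => console.log(outcome));
```

CRUD calls such as `create`, `list`, and `update` do not need this guard, since they work without VCS.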
+ +## Error Handling + +### Pending Children + +```javascript +try { + await tasks.complete(taskId, "Done"); +} catch (err) { + if (err.message.includes("pending children")) { + const pending = await tasks.list({ parentId: taskId, completed: false }); + return `Cannot complete: ${pending.length} children pending`; + } + throw err; +} +``` + +### Task Not Ready + +```javascript +const task = await tasks.get(taskId); + +// Check if blocked (blockedBy is omitted when empty, so guard with ?.) +if (task.blockedBy?.length > 0) { + console.log("Blocked by:", task.blockedBy); + // Complete blockers first, or remove one explicitly (here: the first blocker) + await tasks.unblock(taskId, task.blockedBy[0]); +} +``` + +## Complete Workflow Example + +```javascript +const task = await tasks.nextReady(); +if (!task) return "No ready tasks"; + +await tasks.start(task.id); +// ... implement ... +await tasks.complete(task.id, { + result: "Implemented: ... Verification: All 58 tests passing", + learnings: ["Use jose for JWT"] +}); +``` diff --git a/profiles/opencode/skill/spec-planner/SKILL.md b/profiles/opencode/skill/spec-planner/SKILL.md new file mode 100644 index 0000000..4a8e772 --- /dev/null +++ b/profiles/opencode/skill/spec-planner/SKILL.md @@ -0,0 +1,206 @@ +--- +name: spec-planner +description: Dialogue-driven spec development through skeptical questioning and iterative refinement. Triggers: "spec this out", feature planning, architecture decisions, "is this worth it?" questions, RFC/design doc creation, work scoping. Invoke Librarian for unfamiliar tech/frameworks/APIs. +--- + +# Spec Planner + +Produce implementation-ready specs through rigorous dialogue and honest trade-off analysis. 
+ +## Core Philosophy + +- **Dialogue over deliverables** - Plans emerge from discussion, not assumption +- **Skeptical by default** - Requirements are incomplete until proven otherwise +- **Second-order thinking** - Consider downstream effects and maintenance burden + +## Workflow Phases + +``` +CLARIFY --[user responds]--> DISCOVER --[done]--> DRAFT --[complete]--> REFINE --[approved]--> DONE + | | | | + +--[still ambiguous]--<------+-------------------+----[gaps found]------+ +``` + +**State phase at end of every response:** +``` +--- +Phase: CLARIFY | Waiting for: answers to questions 1-4 +``` + +--- + +## Phase 1: CLARIFY (Mandatory) + +**Hard rule:** No spec until user has responded to at least one round of questions. + +1. **STOP.** Do not proceed to planning. +2. Identify gaps in: scope, motivation, constraints, edge cases, success criteria +3. Ask 3-5 pointed questions that would change the approach. USE YOUR QUESTION TOOL. +4. **Wait for responses** + +| Category | Example | +|----------|---------| +| Scope | "Share where? Social media? Direct link? Embed?" | +| Motivation | "What user problem are we actually solving?" | +| Constraints | "Does this need to work with existing privacy settings?" | +| Success | "How will we know this worked?" | + +**Escape prevention:** Even if request seems complete, ask 2+ clarifying questions. Skip only for mechanical requests (e.g., "rename X to Y"). + +**Anti-patterns to resist:** +- "Just give me a rough plan" -> Still needs scope questions +- "I'll figure out the details" -> Those details ARE the spec +- Very long initial request -> Longer != clearer; probe assumptions + +**Transition:** User answered AND no new ambiguities -> DISCOVER + +--- + +## Phase 2: DISCOVER + +**After clarification, before planning:** Understand existing system. + +Launch explore subagents in parallel: + +``` +Task( + subagent_type="explore", + description="Explore [area name]", + prompt="Explore [area]. 
Return: key files, abstractions, patterns, integration points." +) +``` + +| Target | What to Find | +|--------|--------------| +| Affected area | Files, modules that will change | +| Existing patterns | How similar features are implemented | +| Integration points | APIs, events, data flows touched | + +**If unfamiliar tech involved**, invoke Librarian: + +``` +Task( + subagent_type="librarian", + description="Research [tech name]", + prompt="Research [tech] for [use case]. Return: recommended approach, gotchas, production patterns." +) +``` + +**Output:** Brief architecture summary before proposing solutions. + +**Transition:** System context understood -> DRAFT + +--- + +## Phase 3: DRAFT + +Apply planning framework from [decision-frameworks.md](./references/decision-frameworks.md): + +1. **Problem Definition** - What are we solving? For whom? Cost of not solving? +2. **Constraints Inventory** - Time, system, knowledge, scope ceiling +3. **Solution Space** - Simplest -> Balanced -> Full engineering solution +4. **Trade-off Analysis** - See table format in references +5. 
**Recommendation** - One clear choice with reasoning + +Use appropriate template from [templates.md](./references/templates.md): +- **Quick Decision** - Scoped technical choices +- **Feature Plan** - New feature development +- **ADR** - Architecture decisions +- **RFC** - Larger proposals + +**Transition:** Spec produced -> REFINE + +--- + +## Phase 4: REFINE + +Run completeness check: + +| Criterion | Check | +|-----------|-------| +| Scope bounded | Every deliverable listed; non-goals explicit | +| Ambiguity resolved | No "TBD" or "to be determined" | +| Acceptance testable | Each criterion pass/fail verifiable | +| Dependencies ordered | Clear what blocks what | +| Types defined | Data shapes specified (not "some object") | +| Effort estimated | Each deliverable has S/M/L/XL | +| Risks identified | At least 2 risks with mitigations | +| Open questions | Resolved OR assigned owner | + +**If any criterion fails:** Return to dialogue. "To finalize, I need clarity on: [failing criteria]." + +**Transition:** All criteria pass + user approval -> DONE + +--- + +## Phase 5: DONE + +### Final Output + +``` +=== Spec Complete === + +Phase: DONE +Type: +Effort: +Status: Ready for task breakdown + +Discovery: +- Explored: +- Key findings: + +Recommendation: + + +Key Trade-offs: +- + +Deliverables (Ordered): +1. [D1] (effort) - depends on: - +2. [D2] (effort) - depends on: D1 + +Open Questions: +- [ ] -> Owner: [who] +``` + +### Write Spec to File (MANDATORY) + +1. Derive filename from feature/decision name (kebab-case) +2. Write spec to `specs/.md` +3. Confirm: `Spec written to: specs/.md` + +--- + +## Effort Estimates + +| Size | Time | Scope | +|------|------|-------| +| **S** | <1 hour | Single file, isolated change | +| **M** | 1-3 hours | Few files, contained feature | +| **L** | 1-2 days | Cross-cutting, multiple components | +| **XL** | >2 days | Major refactor, new system | + +## Scope Control + +When scope creeps: +1. **Name it:** "That's scope expansion. 
Let's finish X first." +2. **Park it:** "Added to Open Questions. Revisit after core spec stable." +3. **Cost it:** "Adding Y changes effort from M to XL. Worth it?" + +**Hard rule:** If scope changes, re-estimate and flag explicitly. + +## References + +| File | When to Read | +|------|--------------| +| [templates.md](./references/templates.md) | Output formats for plans, ADRs, RFCs | +| [decision-frameworks.md](./references/decision-frameworks.md) | Complex multi-factor decisions | +| [estimation.md](./references/estimation.md) | Breaking down work, avoiding underestimation | +| [technical-debt.md](./references/technical-debt.md) | Evaluating refactoring ROI | + +## Integration + +| Agent | When to Invoke | +|-------|----------------| +| **Librarian** | Research unfamiliar tech, APIs, frameworks | +| **Oracle** | Deep architectural analysis, complex debugging | diff --git a/profiles/opencode/skill/spec-planner/references/decision-frameworks.md b/profiles/opencode/skill/spec-planner/references/decision-frameworks.md new file mode 100644 index 0000000..8e881dd --- /dev/null +++ b/profiles/opencode/skill/spec-planner/references/decision-frameworks.md @@ -0,0 +1,75 @@ +# Decision Frameworks + +## Reversibility Matrix + +| Decision Type | Approach | +|---------------|----------| +| **Two-way door** (easily reversed) | Decide fast, learn from outcome | +| **One-way door** (hard to reverse) | Invest time in analysis | + +Most decisions are two-way doors. Don't over-analyze. + +## Cost of Delay + +``` +Daily Cost = (Value Delivered / Time to Deliver) x Risk Factor +``` + +Use when prioritizing: +- High daily cost -> Do first +- Low daily cost -> Can wait + +## RICE Scoring + +| Factor | Question | Scale | +|--------|----------|-------| +| **R**each | How many users affected? | # users/period | +| **I**mpact | How much per user? | 0.25, 0.5, 1, 2, 3 | +| **C**onfidence | How sure are we? 
| 20%, 50%, 80%, 100% | +| **E**ffort | Person-weeks | 0.5, 1, 2, 4, 8+ | + +``` +RICE = (Reach x Impact x Confidence) / Effort +``` + +## Technical Decision Checklist + +Before committing to a technical approach: + +- [ ] Have we talked to someone who's done this before? +- [ ] What's the simplest version that teaches us something? +- [ ] What would make us reverse this decision? +- [ ] Who maintains this in 6 months? +- [ ] What's our rollback plan? + +## When to Build vs Buy vs Adopt + +| Signal | Build | Buy | Adopt (OSS) | +|--------|-------|-----|-------------| +| Core differentiator | Yes | No | Maybe | +| Commodity problem | No | Yes | Yes | +| Tight integration needed | Yes | Maybe | Maybe | +| Team has expertise | Yes | N/A | Yes | +| Time pressure | No | Yes | Maybe | +| Long-term control needed | Yes | No | Maybe | + +## Decomposition Strategies + +### Vertical Slicing +Cut features into thin end-to-end slices that deliver value: +``` +Bad: "Build database layer" -> "Build API" -> "Build UI" +Good: "User can see their profile" -> "User can edit name" -> "User can upload avatar" +``` + +### Risk-First Ordering +1. Identify highest-risk unknowns +2. Build spike/proof-of-concept for those first +3. 
Then build around proven foundation + +### Dependency Mapping +``` +[Feature A] -depends on-> [Feature B] -depends on-> [Feature C] + ^ + Start here +``` diff --git a/profiles/opencode/skill/spec-planner/references/estimation.md b/profiles/opencode/skill/spec-planner/references/estimation.md new file mode 100644 index 0000000..b318b7e --- /dev/null +++ b/profiles/opencode/skill/spec-planner/references/estimation.md @@ -0,0 +1,69 @@ +# Estimation + +## Why Estimates Fail + +| Cause | Mitigation | +|-------|------------| +| Optimism bias | Use historical data, not gut | +| Missing scope | List "obvious" tasks explicitly | +| Integration blindness | Add 20-30% for glue code | +| Unknown unknowns | Add buffer based on novelty | +| Interruptions | Assume 60% focused time | + +## Estimation Techniques + +### Three-Point Estimation +``` +Expected = (Optimistic + 4xMostLikely + Pessimistic) / 6 +``` + +### Relative Sizing +Compare to known references: +- "This is about twice as complex as Feature X" +- Use Fibonacci (1, 2, 3, 5, 8, 13) to reflect uncertainty + +### Task Decomposition +1. Break into tasks <=4 hours +2. If can't decompose, spike first +3. 
Sum tasks + 20% integration buffer + +## Effort Multipliers + +| Factor | Multiplier | +|--------|------------| +| New technology | 1.5-2x | +| Unclear requirements | 1.3-1.5x | +| External dependencies (waiting on others) | 1.2-1.5x | +| Legacy/undocumented code | 1.3-2x | +| Production deployment | 1.2x | +| First time doing X | 2-3x | +| Context switching (other priorities) | 1.3x | +| Yak shaving risk (unknown unknowns) | 1.5x | + +## Hidden Work Checklist + +Always include time for: +- [ ] Code review (20% of dev time) +- [ ] Testing (30-50% of dev time) +- [ ] Documentation (10% of dev time) +- [ ] Deployment/config (varies) +- [ ] Bug fixes from testing (20% buffer) +- [ ] Interruptions / competing priorities + +## When to Re-Estimate + +Re-estimate when: +- Scope changes materially +- Major unknown becomes known +- Actual progress diverges >30% from estimate + +## Communicating Estimates + +**Good:** "1-2 weeks, confidence 70%; main risk is the third-party API integration" + +**Bad:** "About 2 weeks" + +Always include: +1. Range, not point estimate +2. Confidence level +3. Key assumptions/risks diff --git a/profiles/opencode/skill/spec-planner/references/technical-debt.md b/profiles/opencode/skill/spec-planner/references/technical-debt.md new file mode 100644 index 0000000..fb5f966 --- /dev/null +++ b/profiles/opencode/skill/spec-planner/references/technical-debt.md @@ -0,0 +1,94 @@ +# Technical Debt + +## Debt Categories + +| Type | Example | Urgency | +|------|---------|---------| +| **Deliberate-Prudent** | "Ship now, refactor next sprint" | Planned paydown | +| **Deliberate-Reckless** | "We don't have time for tests" | Accumulating risk | +| **Inadvertent-Prudent** | "Now we know a better way" | Normal learning | +| **Inadvertent-Reckless** | "What's layering?" 
| Learning curve | + +## When to Pay Down Debt + +**Pay now when:** +- Debt is in path of upcoming work +- Cognitive load slowing every change +- Bugs recurring in same area +- Onboarding time increasing + +**Defer when:** +- Area is stable, rarely touched +- Bigger refactor coming anyway +- Time constrained on priority work +- Code may be deprecated + +## ROI Framework + +``` +Debt ROI = (Time Saved Per Touch x Touches/Month x Months) / Paydown Cost +``` + +| ROI | Action | +|-----|--------| +| >3x | Prioritize immediately | +| 1-3x | Plan into upcoming work | +| <1x | Accept or isolate | + +## Refactoring Strategies + +### Strangler Fig +1. Build new alongside old +2. Redirect traffic incrementally +3. Remove old when empty + +Best for: Large system replacements + +### Branch by Abstraction +1. Create abstraction over old code +2. Implement new behind abstraction +3. Switch implementations +4. Remove old + +Best for: Library/dependency swaps + +### Parallel Change (Expand-Contract) +1. Add new behavior alongside old +2. Migrate callers incrementally +3. Remove old behavior + +Best for: API changes + +### Mikado Method +1. Try the change +2. When it breaks, note prerequisites +3. Revert +4. Recursively fix prerequisites +5. Apply original change + +Best for: Untangling dependencies + +## Tracking Debt + +Minimum viable debt tracking: +```markdown +## Tech Debt Log + +| ID | Description | Impact | Area | Added | +|----|-------------|--------|------|-------| +| TD-1 | No caching layer | Slow queries | /api | 2024-01 | +``` + +Review monthly. Prune resolved items. 
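
The ROI Framework earlier in this file can be sketched in a few lines. This is an illustration, not part of the skill: the function names `debt_roi` and `recommended_action` are hypothetical, and it assumes time saved and paydown cost are both measured in hours (the formula itself does not fix a unit). A ratio of exactly 1 or exactly 3 is treated as falling in the "1-3x" band, which is one reasonable reading of the table.

```python
def debt_roi(hours_saved_per_touch: float,
             touches_per_month: float,
             months: float,
             paydown_cost_hours: float) -> float:
    """Debt ROI = (Time Saved Per Touch x Touches/Month x Months) / Paydown Cost."""
    if paydown_cost_hours <= 0:
        raise ValueError("paydown cost must be positive")
    return hours_saved_per_touch * touches_per_month * months / paydown_cost_hours


def recommended_action(roi: float) -> str:
    """Map an ROI ratio onto the action table (>3x, 1-3x, <1x)."""
    if roi > 3:
        return "Prioritize immediately"
    if roi >= 1:
        return "Plan into upcoming work"
    return "Accept or isolate"


if __name__ == "__main__":
    # Assumed example: 30 minutes saved per touch, 12 touches a month,
    # over a 6-month horizon, against an 8-hour paydown.
    roi = debt_roi(0.5, 12, 6, 8)
    print(f"{roi:.2f}x: {recommended_action(roi)}")
```

The horizon (`months`) is the input most open to judgment; a short horizon biases toward deferring, so it may be worth stating it explicitly next to any ROI figure recorded in the debt log.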
+ +## Communicating Debt to Stakeholders + +**Frame as investment, not cleanup:** +- "This will reduce bug rate by ~30%" +- "Deployment time goes from 2 hours to 20 minutes" +- "New features in this area take 2x longer than they should" + +**Avoid:** +- "The code is messy" +- "We need to refactor" +- Technical jargon without business impact diff --git a/profiles/opencode/skill/spec-planner/references/templates.md b/profiles/opencode/skill/spec-planner/references/templates.md new file mode 100644 index 0000000..19c594c --- /dev/null +++ b/profiles/opencode/skill/spec-planner/references/templates.md @@ -0,0 +1,161 @@ +# Output Templates + +## Quick Decision + +For scoped technical choices with clear options. + +``` +## Decision: [choice] + +**Why:** [1-2 sentences] +**Trade-off:** [what we're giving up] +**Revisit if:** [trigger conditions] +``` + +## Feature Plan (Implementation-Ready) + +For new feature development. **Complete enough for task decomposition.** + +``` +## Feature: [name] + +### Problem Statement +**Who:** [specific user/persona] +**What:** [the problem they face] +**Why it matters:** [business/user impact] +**Evidence:** [how we know this is real] + +### Proposed Solution +[High-level approach in 2-3 paragraphs] + +### Scope & Deliverables +| Deliverable | Effort | Depends On | +|-------------|--------|------------| +| [D1] | S/M/L | - | +| [D2] | S/M/L | D1 | + +### Non-Goals (Explicit Exclusions) +- [Thing people might assume is in scope but isn't] + +### Data Model +[Types, schemas, state shapes that will exist or change] + +### API/Interface Contract +[Public interfaces between components-input/output/errors] + +### Acceptance Criteria +- [ ] [Testable statement 1] +- [ ] [Testable statement 2] + +### Test Strategy +| Layer | What | How | +|-------|------|-----| +| Unit | [specific logic] | [approach] | +| Integration | [boundaries] | [approach] | + +### Risks & Mitigations +| Risk | Likelihood | Impact | Mitigation | 
+|------|------------|--------|------------| + +### Trade-offs Made +| Chose | Over | Because | +|-------|------|---------| + +### Open Questions +- [ ] [Question] -> Owner: [who decides] + +### Success Metrics +- [Measurable outcome] +``` + +## Architecture Decision Record (ADR) + +For significant architecture decisions that need documentation. + +``` +## ADR: [title] + +**Status:** Proposed | Accepted | Deprecated | Superseded +**Date:** [date] + +### Context +[What forces are at play] + +### Decision +[What we're doing] + +### Consequences +- [+] [Benefit] +- [-] [Drawback] +- [~] [Neutral observation] +``` + +## RFC (Request for Comments) + +For larger proposals needing broader review. + +``` +## RFC: [title] + +**Author:** [name] +**Status:** Draft | In Review | Accepted | Rejected +**Created:** [date] + +### Summary +[1-2 paragraph overview] + +### Motivation +[Why are we doing this?] + +### Detailed Design +[Technical details] + +### Alternatives Considered +| Option | Pros | Cons | Why Not | +|--------|------|------|---------| + +### Migration/Rollout +[How we get from here to there] + +### Open Questions +- [ ] [Question] +``` + +## Handoff Artifact + +When spec is complete, produce final summary for task decomposition: + +``` +# [Feature Name] - Implementation Spec + +**Status:** Ready for task breakdown +**Effort:** [total estimate] +**Approved by:** [human who approved] +**Date:** [date] + +## Deliverables (Ordered) + +1. **[D1]** (S) - [one-line description] + - Depends on: - + - Files likely touched: [paths] + +2. **[D2]** (M) - [one-line description] + - Depends on: D1 + - Files likely touched: [paths] + +## Key Technical Decisions +- [Decision]: [choice] because [reason] + +## Data Model +[Copy from spec] + +## Acceptance Criteria +1. [Criterion 1] +2. [Criterion 2] + +## Open Items (Non-Blocking) +- [Item] -> Owner: [who] + +--- +*Spec approved for task decomposition.* +```