oc

Signed-off-by: Christoph Schmatzler <christoph@schmatzler.com>
2026-02-04 20:04:32 +00:00
parent 13586f5c44
commit ff8650bedf
19 changed files with 2448 additions and 0 deletions
--- a/profiles/opencode/skill/overseer/references/verification.md
+++ b/profiles/opencode/skill/overseer/references/verification.md
@@ -0,0 +1,186 @@
+# Verification Guide
+
+Before marking any task complete, you MUST verify your work. Verification separates "I think it's done" from "it's actually done."
+
+## The Verification Process
+
+1. **Re-read the task context**: What did you originally commit to do?
+2. **Check acceptance criteria**: Does your implementation satisfy the "Done when" conditions?
+3. **Run relevant tests**: Execute the test suite and document results
+4. **Test manually**: Actually try the feature/change yourself
+5. **Compare with requirements**: Does what you built match what was asked?
+
+## Strong vs Weak Verification
+
+### Strong Verification Examples
+
+- "All 60 tests passing, build successful"
+- "All 69 tests passing (4 new tests for middleware edge cases)"
+- "Manually tested with valid/invalid/expired tokens - all cases work"
+- "Ran `cargo test` - 142 tests passed, 0 failed"
+
+### Weak Verification (Avoid)
+
+- "Should work now" - "should" means not verified
+- "Made the changes" - no evidence it works
+- "Added tests" - did the tests pass? What's the count?
+- "Fixed the bug" - what bug? Did you verify the fix?
+- "Done" - done how? prove it
+
+## Verification by Task Type
+
+| Task Type | How to Verify |
+|-----------|---------------|
+| Code changes | Run full test suite, document passing count |
+| New features | Run tests + manual testing of functionality |
+| Configuration | Test the config works (run commands, check workflows) |
+| Documentation | Verify examples work, links resolve, formatting renders |
+| Refactoring | Confirm tests still pass, no behavior changes |
+| Bug fixes | Reproduce bug first, verify fix, add regression test |
+
+## Cross-Reference Checklist
+
+Before marking complete, verify all applicable items:
+
+- [ ] Task description requirements met
+- [ ] Context "Done when" criteria satisfied
+- [ ] Tests passing (document count: "All X tests passing")
+- [ ] Build succeeds (if applicable)
+- [ ] Manual testing done (describe what you tested)
+- [ ] No regressions introduced
+- [ ] Edge cases considered (error handling, invalid input)
+- [ ] Follow-up work identified (created new tasks if needed)
+
+**If you can't check all applicable boxes, the task isn't done yet.**
+
+## Result Examples with Verification
+
+### Code Implementation
+
+```javascript
+await tasks.complete(taskId, `Implemented JWT middleware:
+
+Implementation:
+- Created src/middleware/verify-token.ts
+- Separated 'expired' vs 'invalid' error codes
+- Added user extraction from payload
+
+Verification:
+- All 69 tests passing (4 new tests for edge cases)
+- Manually tested with valid token: Access granted
+- Manually tested with expired token: 401 with 'token_expired'
+- Manually tested with invalid signature: 401 with 'invalid_token'`);
+```
+
+### Configuration/Infrastructure
+
+```javascript
+await tasks.complete(taskId, `Added GitHub Actions workflow for CI:
+
+Implementation:
+- Created .github/workflows/ci.yml
+- Jobs: lint, test, build with pnpm cache
+
+Verification:
+- Pushed to test branch, opened PR #123
+- Workflow triggered automatically
+- All jobs passed (lint: 0 errors, test: 69/69, build: success)
+- Total run time: 2m 34s`);
+```
+
+### Refactoring
+
+```javascript
+await tasks.complete(taskId, `Refactored storage to one file per task:
+
+Implementation:
+- Split tasks.json into .overseer/tasks/{id}.json files
+- Added auto-migration from old format
+- Atomic writes via temp+rename
+
+Verification:
+- All 60 tests passing (including 8 storage tests)
+- Build successful
+- Manually tested migration: old -> new format works
+- Confirmed git diff shows only changed tasks`);
+```
+
+### Bug Fix
+
+```javascript
+await tasks.complete(taskId, `Fixed login validation accepting usernames with spaces:
+
+Root cause:
+- Validation regex didn't account for leading/trailing spaces
+
+Fix:
+- Added .trim() before validation in src/auth/validate.ts:42
+- Updated regex to reject internal spaces
+
+Verification:
+- All 45 tests passing (2 new regression tests)
+- Manually tested:
+  - " admin" -> rejected (leading space)
+  - "admin " -> rejected (trailing space)
+  - "ad min" -> rejected (internal space)
+  - "admin" -> accepted`);
+```
+
+### Documentation
+
+```javascript
+await tasks.complete(taskId, `Updated API documentation for auth endpoints:
+
+Implementation:
+- Added docs for POST /auth/login
+- Added docs for POST /auth/logout
+- Added docs for POST /auth/refresh
+- Included example requests/responses
+
+Verification:
+- All code examples tested and working
+- Links verified (no 404s)
+- Rendered in local preview - formatting correct
+- Spell-checked content`);
+```
+
+## Common Verification Mistakes
+
+| Mistake | Better Approach |
+|---------|-----------------|
+| "Tests pass" | "All 42 tests passing" (include count) |
+| "Manually tested" | "Manually tested X, Y, Z scenarios" (be specific) |
+| "Works" | "Works: [evidence]" (show proof) |
+| "Fixed" | "Fixed: [root cause] -> [solution] -> [verification]" |
+
+## When Verification Fails
+
+If verification reveals issues:
+
+1. **Don't complete the task** - it's not done
+2. **Document what failed** in task context
+3. **Fix the issues** before completing
+4. **Re-verify** after fixes
+
+```javascript
+// Update context with failure notes
+await tasks.update(taskId, {
+  context: task.context + `
+
+Verification attempt 1 (failed):
+- Tests: 41/42 passing
+- Failing: test_token_refresh - timeout issue
+- Need to investigate async handling`
+});
+
+// After fixing
+await tasks.complete(taskId, `Implemented token refresh:
+
+Implementation:
+- Added refresh endpoint
+- Fixed async timeout (was missing await)
+
+Verification:
+- All 42 tests passing (fixed timeout issue)
+- Manual testing: refresh works within 30s window`);
+```