# Verification Guide

Before marking any task complete, you MUST verify your work. Verification separates "I think it's done" from "it's actually done."

## The Verification Process

1. **Re-read the task context**: What did you originally commit to do?
2. **Check acceptance criteria**: Does your implementation satisfy the "Done when" conditions?
3. **Run relevant tests**: Execute the test suite and document results
4. **Test manually**: Actually try the feature/change yourself
5. **Compare with requirements**: Does what you built match what was asked?

## Strong vs Weak Verification

### Strong Verification Examples

- "All 60 tests passing, build successful"
- "All 69 tests passing (4 new tests for middleware edge cases)"
- "Manually tested with valid/invalid/expired tokens - all cases work"
- "Ran `cargo test` - 142 tests passed, 0 failed"

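A strong statement such as the `cargo test` one can be built from the tool's own output rather than typed from memory. The sketch below assumes the usual `test result: ok. N passed; M failed; ...` summary lines that `cargo test` prints; `summarizeCargoTest` is a hypothetical helper name, not part of any existing API.

```javascript
// Hypothetical helper: turns captured `cargo test` output into a verification line.
function summarizeCargoTest(output) {
  // Each test binary prints a summary line such as:
  // "test result: ok. 142 passed; 0 failed; 0 ignored; ..."
  const pattern = /test result: \w+\. (\d+) passed; (\d+) failed/g;
  let passed = 0;
  let failed = 0;
  let matched = false;
  // Add up the counts from every summary line (one per test binary).
  for (const match of output.matchAll(pattern)) {
    matched = true;
    passed += Number(match[1]);
    failed += Number(match[2]);
  }
  // No summary line means there is nothing to report as evidence.
  if (!matched) return null;
  return `Ran \`cargo test\` - ${passed} tests passed, ${failed} failed`;
}
```

Returning `null` when no summary line is found keeps the helper from producing a count that was never observed.
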
### Weak Verification (Avoid)

- "Should work now" - "should" means not verified
- "Made the changes" - no evidence it works
- "Added tests" - did the tests pass? What's the count?
- "Fixed the bug" - what bug? Did you verify the fix?
- "Done" - done how? Prove it.

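The weak patterns above are regular enough that a result message can be screened before it is submitted. This is a minimal sketch under stated assumptions: `reviewResult`, `WEAK_PATTERNS`, and `EVIDENCE_PATTERN` are hypothetical names, and the patterns cover only a few of the phrasings listed.

```javascript
// Hypothetical screen for result messages; the patterns are illustrative, not exhaustive.
const WEAK_PATTERNS = [
  { pattern: /\bshould work\b/i, reason: '"should" means not verified' },
  { pattern: /^\s*(done|fixed|works)\.?\s*$/i, reason: "no evidence given" },
];

// Evidence here means a test count such as "69 tests" or "41/42".
const EVIDENCE_PATTERN = /\b\d+\s+tests?\b|\b\d+\/\d+\b/i;

function reviewResult(message) {
  // Collect a reason for every weak phrase found in the message.
  const problems = WEAK_PATTERNS
    .filter(({ pattern }) => pattern.test(message))
    .map(({ reason }) => reason);
  // Separately require at least one concrete test count.
  if (!EVIDENCE_PATTERN.test(message)) {
    problems.push("no test count found");
  }
  return problems;
}
```

An empty array means no weak phrasing was detected; it does not prove that the verification itself was adequate.
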
## Verification by Task Type

| Task Type | How to Verify |
|-----------|---------------|
| Code changes | Run full test suite, document passing count |
| New features | Run tests + manual testing of functionality |
| Configuration | Test the config works (run commands, check workflows) |
| Documentation | Verify examples work, links resolve, formatting renders |
| Refactoring | Confirm tests still pass, no behavior changes |
| Bug fixes | Reproduce bug first, verify fix, add regression test |

## Cross-Reference Checklist

Before marking complete, verify all applicable items:

- [ ] Task description requirements met
- [ ] Context "Done when" criteria satisfied
- [ ] Tests passing (document count: "All X tests passing")
- [ ] Build succeeds (if applicable)
- [ ] Manual testing done (describe what you tested)
- [ ] No regressions introduced
- [ ] Edge cases considered (error handling, invalid input)
- [ ] Follow-up work identified (created new tasks if needed)

**If you can't check all applicable boxes, the task isn't done yet.**

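The "all applicable boxes" rule can be expressed as a small gate. The sketch below assumes a hypothetical checklist shape (`label`, `applicable`, `checked`) and a hypothetical `unmetItems` helper; neither comes from an existing API.

```javascript
// Hypothetical checklist shape: each item records whether it applies and whether it is checked.
function unmetItems(checklist) {
  // Items that do not apply to this task are ignored.
  return checklist
    .filter((item) => item.applicable && !item.checked)
    .map((item) => item.label);
}

const checklist = [
  { label: "Tests passing", applicable: true, checked: true },
  { label: "Build succeeds", applicable: false, checked: false },
  { label: "Manual testing done", applicable: true, checked: false },
];

// The task may be completed only when nothing applicable is left unchecked.
const remaining = unmetItems(checklist);
const readyToComplete = remaining.length === 0;
```

In this example `remaining` holds the one applicable item that is still open, so the task would not be completed yet.
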
## Result Examples with Verification

### Code Implementation

```javascript
await tasks.complete(taskId, `Implemented JWT middleware:

Implementation:
- Created src/middleware/verify-token.ts
- Separated 'expired' vs 'invalid' error codes
- Added user extraction from payload

Verification:
- All 69 tests passing (4 new tests for edge cases)
- Manually tested with valid token: Access granted
- Manually tested with expired token: 401 with 'token_expired'
- Manually tested with invalid signature: 401 with 'invalid_token'`);
```

### Configuration/Infrastructure

```javascript
await tasks.complete(taskId, `Added GitHub Actions workflow for CI:

Implementation:
- Created .github/workflows/ci.yml
- Jobs: lint, test, build with pnpm cache

Verification:
- Pushed to test branch, opened PR #123
- Workflow triggered automatically
- All jobs passed (lint: 0 errors, test: 69/69, build: success)
- Total run time: 2m 34s`);
```

### Refactoring

```javascript
await tasks.complete(taskId, `Refactored storage to one file per task:

Implementation:
- Split tasks.json into .overseer/tasks/{id}.json files
- Added auto-migration from old format
- Atomic writes via temp+rename

Verification:
- All 60 tests passing (including 8 storage tests)
- Build successful
- Manually tested migration: old -> new format works
- Confirmed git diff shows only changed tasks`);
```

### Bug Fix

```javascript
await tasks.complete(taskId, `Fixed login validation accepting usernames with spaces:

Root cause:
- Validation regex didn't account for leading/trailing spaces

Fix:
- Added .trim() before validation in src/auth/validate.ts:42
- Updated regex to reject internal spaces

Verification:
- All 45 tests passing (2 new regression tests)
- Manually tested:
  - " admin" -> rejected (leading space)
  - "admin " -> rejected (trailing space)
  - "ad min" -> rejected (internal space)
  - "admin" -> accepted`);
```

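The manual cases in the bug-fix result can also be kept as regression checks so they run on every change. The sketch below is only an illustration: `isValidUsername` is a hypothetical stand-in for the validator in `src/auth/validate.ts`, which this guide does not show, and it assumes usernames are limited to letters, digits, and underscores.

```javascript
// Hypothetical validator standing in for the real one (not shown in this guide).
// Assumption: only letters, digits, and underscores are allowed, so any space is rejected.
function isValidUsername(input) {
  return /^[A-Za-z0-9_]+$/.test(input);
}

// The four manually tested cases, as [input, expected result] pairs.
const cases = [
  [" admin", false], // leading space
  ["admin ", false], // trailing space
  ["ad min", false], // internal space
  ["admin", true],
];

// Any entry left in `failures` is a regression.
const failures = cases.filter(
  ([input, expected]) => isValidUsername(input) !== expected
);
```

A test runner would assert that `failures` is empty; the table-style `cases` array makes it cheap to add the next reported input.
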
### Documentation

```javascript
await tasks.complete(taskId, `Updated API documentation for auth endpoints:

Implementation:
- Added docs for POST /auth/login
- Added docs for POST /auth/logout
- Added docs for POST /auth/refresh
- Included example requests/responses

Verification:
- All code examples tested and working
- Links verified (no 404s)
- Rendered in local preview - formatting correct
- Spell-checked content`);
```

## Common Verification Mistakes

| Mistake | Better Approach |
|---------|-----------------|
| "Tests pass" | "All 42 tests passing" (include count) |
| "Manually tested" | "Manually tested X, Y, Z scenarios" (be specific) |
| "Works" | "Works: [evidence]" (show proof) |
| "Fixed" | "Fixed: [root cause] -> [solution] -> [verification]" |

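One way to avoid these mistakes is to build the result message from structured fields, so a message cannot be produced without a verification section. This is a sketch only: `formatResult` and its field names are hypothetical, and the layout simply mirrors the Implementation/Verification examples in this guide.

```javascript
// Hypothetical helper: builds a result message in the Implementation/Verification layout.
function formatResult({ summary, implementation, verification }) {
  // Refuse to produce a message that carries no evidence.
  if (!verification || verification.length === 0) {
    throw new Error("Refusing to format a result without verification evidence");
  }
  const bullets = (items) => items.map((item) => `- ${item}`).join("\n");
  return [
    `${summary}:`,
    "",
    "Implementation:",
    bullets(implementation),
    "",
    "Verification:",
    bullets(verification),
  ].join("\n");
}
```

The returned string could then be passed as the second argument to `tasks.complete`, as in the examples above.
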
## When Verification Fails

If verification reveals issues:

1. **Don't complete the task** - it's not done
2. **Document what failed** in task context
3. **Fix the issues** before completing
4. **Re-verify** after fixes

```javascript
// Update context with failure notes
// (`task` is assumed to be the task object already loaded for `taskId`)
await tasks.update(taskId, {
  context: task.context + `

Verification attempt 1 (failed):
- Tests: 41/42 passing
- Failing: test_token_refresh - timeout issue
- Need to investigate async handling`
});

// After fixing
await tasks.complete(taskId, `Implemented token refresh:

Implementation:
- Added refresh endpoint
- Fixed async timeout (was missing await)

Verification:
- All 42 tests passing (fixed timeout issue)
- Manual testing: refresh works within 30s window`);
```