# Verification Guide

Before marking any task complete, you MUST verify your work. Verification separates "I think it's done" from "it's actually done."

## The Verification Process

1. **Re-read the task context**: What did you originally commit to do?
2. **Check acceptance criteria**: Does your implementation satisfy the "Done when" conditions?
3. **Run relevant tests**: Execute the test suite and document results
4. **Test manually**: Actually try the feature/change yourself
5. **Compare with requirements**: Does what you built match what was asked?

## Strong vs Weak Verification

### Strong Verification Examples

- "All 60 tests passing, build successful"
- "All 69 tests passing (4 new tests for middleware edge cases)"
- "Manually tested with valid/invalid/expired tokens - all cases work"
- "Ran `cargo test` - 142 tests passed, 0 failed"

### Weak Verification (Avoid)

- "Should work now" - "should" means not verified
- "Made the changes" - no evidence it works
- "Added tests" - did the tests pass? What's the count?
- "Fixed the bug" - what bug? Did you verify the fix?
- "Done" - done how? Prove it

## Verification by Task Type

| Task Type | How to Verify |
|-----------|---------------|
| Code changes | Run full test suite, document passing count |
| New features | Run tests + manual testing of functionality |
| Configuration | Test the config works (run commands, check workflows) |
| Documentation | Verify examples work, links resolve, formatting renders |
| Refactoring | Confirm tests still pass, no behavior changes |
| Bug fixes | Reproduce bug first, verify fix, add regression test |

## Cross-Reference Checklist

Before marking complete, verify all applicable items:

- [ ] Task description requirements met
- [ ] Context "Done when" criteria satisfied
- [ ] Tests passing (document count: "All X tests passing")
- [ ] Build succeeds (if applicable)
- [ ] Manual testing done (describe what you tested)
- [ ] No regressions introduced
- [ ] Edge cases considered (error handling, invalid input)
- [ ] Follow-up work identified (created new tasks if needed)

**If you can't check all applicable boxes, the task isn't done yet.**

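
Most of this evidence comes from actually running the test suite and quoting its output. Below is a minimal sketch of one way to capture that output programmatically, assuming a Node.js environment with an `npm test` script; `runTests` is a hypothetical helper for illustration, not part of the `tasks` API used in this guide.

```javascript
// Hypothetical helper (not part of the tasks API): run the test command and
// capture its output so the completion message can quote exact counts
// instead of "should work now".
import { execSync } from "node:child_process";

function runTests(command = "npm test") {
  try {
    // execSync throws when the command exits non-zero, so reaching the
    // return below means the suite passed.
    const output = execSync(command, { encoding: "utf8", stdio: "pipe" });
    return { passed: true, output };
  } catch (error) {
    // On failure, keep stdout/stderr so failing test names can be recorded
    // in the task context.
    return { passed: false, output: `${error.stdout ?? ""}${error.stderr ?? ""}` };
  }
}
```

A `passed: false` result feeds into the "When Verification Fails" flow at the end of this guide.
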

## Result Examples with Verification

### Code Implementation

```javascript
await tasks.complete(taskId, `Implemented JWT middleware:

Implementation:
- Created src/middleware/verify-token.ts
- Separated 'expired' vs 'invalid' error codes
- Added user extraction from payload

Verification:
- All 69 tests passing (4 new tests for edge cases)
- Manually tested with valid token: Access granted
- Manually tested with expired token: 401 with 'token_expired'
- Manually tested with invalid signature: 401 with 'invalid_token'`);
```

### Configuration/Infrastructure

```javascript
await tasks.complete(taskId, `Added GitHub Actions workflow for CI:

Implementation:
- Created .github/workflows/ci.yml
- Jobs: lint, test, build with pnpm cache

Verification:
- Pushed to test branch, opened PR #123
- Workflow triggered automatically
- All jobs passed (lint: 0 errors, test: 69/69, build: success)
- Total run time: 2m 34s`);
```

### Refactoring

```javascript
await tasks.complete(taskId, `Refactored storage to one file per task:

Implementation:
- Split tasks.json into .overseer/tasks/{id}.json files
- Added auto-migration from old format
- Atomic writes via temp+rename

Verification:
- All 60 tests passing (including 8 storage tests)
- Build successful
- Manually tested migration: old -> new format works
- Confirmed git diff shows only changed tasks`);
```

### Bug Fix

```javascript
await tasks.complete(taskId, `Fixed login validation accepting usernames with spaces:

Root cause:
- Validation regex didn't account for leading/trailing spaces

Fix:
- Added .trim() before validation in src/auth/validate.ts:42
- Updated regex to reject internal spaces

Verification:
- All 45 tests passing (2 new regression tests)
- Manually tested:
  - " admin" -> rejected (leading space)
  - "admin " -> rejected (trailing space)
  - "ad min" -> rejected (internal space)
  - "admin" -> accepted`);
```

### Documentation

```javascript
await tasks.complete(taskId, `Updated API documentation for auth endpoints:

Implementation:
- Added docs for POST /auth/login
- Added docs for POST /auth/logout
- Added docs for POST /auth/refresh
- Included example requests/responses

Verification:
- All code examples tested and working
- Links verified (no 404s)
- Rendered in local preview - formatting correct
- Spell-checked content`);
```

## Common Verification Mistakes

| Mistake | Better Approach |
|---------|-----------------|
| "Tests pass" | "All 42 tests passing" (include count) |
| "Manually tested" | "Manually tested X, Y, Z scenarios" (be specific) |
| "Works" | "Works: [evidence]" (show proof) |
| "Fixed" | "Fixed: [root cause] -> [solution] -> [verification]" |

## When Verification Fails

If verification reveals issues:

1. **Don't complete the task** - it's not done
2. **Document what failed** in task context
3. **Fix the issues** before completing
4. **Re-verify** after fixes

```javascript
// Update context with failure notes
await tasks.update(taskId, {
  context: task.context + `

Verification attempt 1 (failed):
- Tests: 41/42 passing
- Failing: test_token_refresh - timeout issue
- Need to investigate async handling`
});

// After fixing
await tasks.complete(taskId, `Implemented token refresh:

Implementation:
- Added refresh endpoint
- Fixed async timeout (was missing await)

Verification:
- All 42 tests passing (fixed timeout issue)
- Manual testing: refresh works within 30s window`);
```
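
As a final illustration, here is a minimal sketch of the complete-or-record gate described above, assuming the `tasks` client used throughout this guide and the hypothetical `runTests` helper from the earlier sketch; the completion summary text is a placeholder.

```javascript
// Sketch of the verification gate: only complete when verification passes,
// otherwise record the failure in the task context and keep working.
const result = runTests();

if (!result.passed) {
  // Not done yet: document what failed, fix it, then re-verify.
  await tasks.update(taskId, {
    context: task.context + `

Verification attempt (failed):
${result.output}`
  });
} else {
  // Verified: complete with the evidence, as in the examples above.
  await tasks.complete(taskId, `Implemented the change:

Verification:
${result.output}`);
}
```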