Commit Graph

2 Commits

Author SHA1 Message Date
steve 45fa7c4e8f fix(archive): harden archive.ph submit/poll flow
CI / build (pull_request) Successful in 1m5s
CI / vet (pull_request) Successful in 1m26s
CI / test (pull_request) Successful in 1m27s
The archive.ph submission flow had several defects that caused Mort's
summary fallback to return placeholder "Working..." pages instead of
real archived content, or hang for the full timeout:

- Context cancellation in the poll loop fell through to a final
  WaitForNetworkIdle and returned the doc as success. The function now
  returns a typed error (ErrArchiveIncomplete on deadline, wrapped
  ctx.Err() on caller cancel).
- The poll only checked doc.URL(), so if archive.ph's JS got wedged on
  /wip/<id>, the loop spun until timeout. Completion now also requires
  a DOM marker (#HEADER, [id^="SHARE"], .TEXT-BLOCK) so URL-only
  transitions don't satisfy the check.
- The final URL is now validated against an alphanumeric ID pattern,
  rejecting /wip/, /submit, /newest/ and the front page.
- The 5-second blind sleep before polling is replaced with a bounded
  WaitForNetworkIdle that short-circuits when already archived.
- Form selectors now use a cascade (input[name='url'] →
  input[type='url'] → input.input-url → input[name='anyway'], and
  similar for the submit button) so a single archive.ph markup change
  doesn't kill the flow. Errors name which selectors were tried.
- Default timeout lowered from 1 hour to 5 minutes (still overridable
  via context deadline). Exposed as DefaultTimeout.
- Poll progress is now logged at slog.Info every 30s so production logs
  surface stuck flows.
- Front-page 5xx responses are now retried twice with 1s/4s backoff
  before failing.
- New exported sentinels: ErrArchiveIncomplete, ErrArchiveSelectorMissing.
- Tests cover URL validator (incl. /wip/, /newest/, short IDs, o-prefix),
  selector cascade, DOM completion detector, transient status
  classification, and ctx cancellation paths via a thread-safe mutating
  mock document. Full integration with a live browser remains hand-tested.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 17:23:24 -04:00
steve cb2ed10cfd refactor: restructure API, deduplicate code, expand test coverage
CI / build (push) Failing after 2m4s
CI / test (push) Failing after 2m6s
CI / vet (push) Failing after 2m19s
- Extract shared DeferClose helper, removing 14 duplicate copies
- Rename PlayWright-prefixed types to cleaner names (BrowserOptions,
  BrowserSelection, NewBrowser, etc.)
- Rename fields: ServerAddress, RequireServer (was DontLaunchOnConnectFailure)
- Extract shared initBrowser/mergeOptions into browser_init.go,
  deduplicating ~120 lines between NewBrowser and NewInteractiveBrowser
- Remove unused locator field from document struct
- Add tests for all previously untested packages (archive, aislegopher,
  wegmans, useragents, powerball) and expand existing test suites
- Add MIGRATION.md documenting all breaking API changes
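
A shared close helper of the kind described in the first bullet might
look roughly like this. The commit does not show the real signature, so
the io.Closer parameter, the logging behaviour, and the fakeResource and
use names are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"io"
	"log/slog"
)

// DeferClose closes c and logs any error instead of returning it, so call
// sites can write `defer DeferClose(f)`. ASSUMED signature, not taken from
// the repository.
func DeferClose(c io.Closer) {
	if c == nil {
		return
	}
	if err := c.Close(); err != nil {
		slog.Warn("close failed", "error", err)
	}
}

// fakeResource is a hypothetical stand-in for a browser, page, or file.
type fakeResource struct{ closed bool }

func (r *fakeResource) Close() error {
	r.closed = true
	return nil
}

// use shows the call-site pattern that replaces a hand-written
// `defer func() { _ = r.Close() }()` block.
func use(r *fakeResource) {
	defer DeferClose(r)
	fmt.Println("working with resource")
}

func main() {
	r := &fakeResource{}
	use(r)
	fmt.Println("closed:", r.closed)
}
```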

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-09 13:59:47 -05:00