fix(archive): harden archive.ph submit/poll flow #87
Summary
The `archive.Archive` flow used by Mort's summary system was returning placeholder "Working…" pages instead of actual archived content, and hanging for up to 1 hour on stuck submissions. This PR fixes 8 defects.

Defects fixed
- `ctx.Done()` exit returning placeholder doc: the poll loop now classifies the ctx error (deadline → `ErrArchiveIncomplete`, caller-cancel → wrapped `context.Canceled`), closes the doc, and returns the error.
- `isArchiveComplete` now requires both a snapshot-shaped URL AND a DOM completion marker (`#HEADER`, `[id^='SHARE']`, `.TEXT-BLOCK`).
- `isFinalSnapshotURL` rejects `/wip/`, `/submit`, `/newest/`, and the front page; the path must match `^/(o/)?[A-Za-z0-9]{5,}(/|$)`.
- `WaitForNetworkIdle(8s)`, then a 1s poll interval (was 5s blind + 5s poll).
- `DefaultTimeout = 5 * time.Minute` (still ctx-overridable).
- `slog.Info` "still waiting for archive.ph" fires every 30s with the current URL.

New exports (additive, no breaking changes)
- `ErrArchiveIncomplete`: wraps `context.DeadlineExceeded` so `errors.Is` works for either
- `ErrArchiveSelectorMissing`: names which selectors were tried
- `DefaultTimeout`: exposed constant (5 min)

`archive.Archive`, `archive.IsArchived`, and their `Config` methods keep their existing signatures.

Test coverage
- `TestIsFinalSnapshotURL` (12 subcases): front page, `/wip/`, `/submit`, `/newest/`, short IDs, `o/` prefix
- `TestHasCompletionMarker`: empty doc + each `completionSelectors` entry
- `TestFindURLInput_Cascade` + `TestFindSubmitButton_Cascade`: first-wins, last-fallback-works, all-miss-returns-nil
- `TestIsTransientStatus` (7 subcases): 5xx → retry, 4xx → no retry
- `TestPollUntilArchived_ContextCancelled_NeverCompletes`: deadline → `ErrArchiveIncomplete`
- `TestPollUntilArchived_CallerCancelled`: cancel → wrapped `context.Canceled`, NOT `ErrArchiveIncomplete`
- `TestPollUntilArchived_SuccessRequiresBothURLAndMarker`: regression contract for defect #2
- `TestPollUntilArchived_URLOnly_NotEnough`: final-looking URL with no marker hits the deadline
- `TestArchive_SelectorMissing`: full Archive call returns `ErrArchiveSelectorMissing`

`go build ./...`, `go vet ./...`, and `go test -race -count=1 ./sites/archive/...` are all clean.

LOC delta
+832 / -70 (most of the additions are the new test file and docstrings).