feat(archive): keep page open on captcha-status errors so callers can promote

Adds OpenPageOptions.AllowNonOKStatus. When set, openPage no longer closes the page on non-2xx responses (other than 404), and Open returns both a usable Document and ErrInvalidStatusCode.

archive.IsArchived and Archive opt in, so callers can PromoteToInteractive the captcha page, hand it to a human solver, and demote back to extract content from the same browser instance. This avoids the cf_clearance fingerprint-binding issue that re-challenges any fresh retry browser.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -7,6 +7,14 @@ import (
 
 type OpenPageOptions struct {
 	Referer string
+
+	// AllowNonOKStatus, when true, keeps the page open and returns a usable
+	// Document along with ErrInvalidStatusCode on non-2xx responses (other
+	// than 404, which is treated as ErrPageNotFound and still closes the
+	// page). This lets callers promote the page to an InteractiveBrowser
+	// to e.g. let a human solve a Cloudflare captcha that produced a 403,
+	// then resume extraction from the same browser instance.
+	AllowNonOKStatus bool
 }
 
 type Browser interface {