Add API to promote a headless Browser page to InteractiveBrowser mid-session #76

Closed
opened 2026-02-24 02:03:49 +00:00 by Claude · 2 comments
Collaborator

Problem

When a headless Browser session hits a captcha or anti-bot challenge, there's no way to hand the current page to a human for interactive solving. The current workaround in mort is:

  1. Detect captcha on the headless page
  2. Close the browser entirely (losing all page state, JS context, DOM)
  3. Spin up a new InteractiveBrowser in the captcha proxy for the human
  4. Human solves the captcha; cookies are saved to shared storage
  5. Spin up another new Browser with those cookies and retry from scratch

This works but is wasteful and fragile: the retry navigates from scratch, which may trigger different anti-bot behavior, and any page-local state (JS variables, session storage, service workers) is lost.

Proposed Solution

Add a way to promote an existing headless page to an InteractiveBrowser so the same page (with its full state) can be handed to a human:

// Option A: Method on Browser interface
func (b playWrightBrowser) PromoteToInteractive(doc Document) (InteractiveBrowser, error)

// Option B: Constructor from existing Playwright objects
func NewInteractiveBrowserFromPage(pw *playwright.Playwright, browser playwright.Browser, ctx playwright.BrowserContext, page playwright.Page) InteractiveBrowser

The promoted InteractiveBrowser would reuse the exact same Playwright process, browser instance, context (cookies, storage), and page. The human interacts with the page (solving a captcha, clicking through a consent wall, etc.), then the caller can either:

  • Continue using the InteractiveBrowser to extract content
  • Or demote back to scraping mode on the same page

Key design considerations

  • Ownership transfer: The Browser should not close the page/context after promotion. The lifecycle needs careful handling: either the caller explicitly transfers ownership, or the Document is detached from the original Browser.
  • Page access from Document: Currently Document wraps a playwright.Page but doesn't expose it. The promotion API would need access to the underlying page, either through Document or by having the caller pass it.
  • Screenshot streaming: The InteractiveBrowser.Screenshot() method is already implemented for the existing interactive browser and would work on the promoted page without changes.
  • Cookie sync: The promoted browser shares the same BrowserContext, so cookies set during interactive use are immediately available if the session is demoted back to scraping.
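
The cookie-sync point can be illustrated with a minimal sketch. It assumes the headless and interactive wrappers hold a reference to the same context rather than a copy. All names here (`fakeContext`, `scraper`, `interactive`, `promote`) are hypothetical stand-ins, not the go-extractor or Playwright types.

```go
package main

// fakeContext is a hypothetical stand-in for playwright.BrowserContext.
// Only the cookie jar is modeled.
type fakeContext struct {
	cookies map[string]string
}

// scraper and interactive are hypothetical stand-ins for the headless and
// interactive wrappers. Both hold a pointer to the same context.
type scraper struct{ ctx *fakeContext }
type interactive struct{ ctx *fakeContext }

// promote hands the scraper's context to an interactive wrapper without
// copying it, so both sides see the same cookie jar.
func promote(s *scraper) *interactive {
	return &interactive{ctx: s.ctx}
}

// setCookie models a cookie written while the human is driving the page.
func (i *interactive) setCookie(name, value string) {
	i.ctx.cookies[name] = value
}

// cookie reads from the shared jar, as the scraper would after demotion.
func (s *scraper) cookie(name string) (string, bool) {
	v, ok := s.ctx.cookies[name]
	return v, ok
}
```

Because nothing is copied, no explicit sync step is needed in this model; whether that holds in practice depends on the promoted browser really reusing the original context object.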

Use Cases

  1. Captcha solving: Headless scraper detects captcha → promotes to interactive → human solves it → continues scraping on the same page
  2. Consent walls: Cookie banners or GDPR walls that require clicking through
  3. Login flows: Sites that require interactive login before content is accessible
  4. Debugging: Developer wants to inspect what the headless browser is seeing

Current Architecture Blockers

All Playwright objects (pw, browser, ctx, page) are unexported fields in both playWrightBrowser and interactiveBrowser. There's no shared interface or inheritance between Browser and InteractiveBrowser. Implementing this requires either:

  • Exposing the underlying objects (via getter methods or a promotion method)
  • Or adding the promotion logic within the go-extractor package where it has access to private fields
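
The second option (keeping the promotion logic inside the package) might look roughly like the sketch below. It assumes the public `Document` interface is backed by a package-private concrete type; the simplified `page`, `document`, and `interactiveBrowser` types and the `promoteToInteractive` name are illustrative stand-ins, not the actual go-extractor definitions.

```go
package main

import "errors"

// page is a hypothetical stand-in for playwright.Page.
type page struct{ url string }

// Document is a simplified stand-in for the public interface.
// It deliberately does not expose the underlying page.
type Document interface {
	URL() string
}

// document is the package-private implementation holding the unexported page.
type document struct {
	page *page
}

func (d *document) URL() string { return d.page.url }

// interactiveBrowser is a simplified stand-in for the interactive wrapper.
type interactiveBrowser struct {
	page *page
}

var errNotPromotable = errors.New("document is not backed by a promotable page")

// promoteToInteractive lives in the same package as document, so it can
// type-assert to the concrete type and read the unexported page field
// without adding any getter to the public API.
func promoteToInteractive(d Document) (*interactiveBrowser, error) {
	doc, ok := d.(*document)
	if !ok || doc == nil || doc.page == nil {
		return nil, errNotPromotable
	}
	return &interactiveBrowser{page: doc.page}, nil
}
```

The trade-off in this sketch is that promotion only works for documents created by this package; any other `Document` implementation gets an error instead of a promoted browser.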
Claude added the enhancement and priority/medium labels 2026-02-24 02:03:54 +00:00
Author
Collaborator

Starting work on this. Plan:

  1. Add detached field to document struct — Close() becomes no-op when detached
  2. Add ownsInfrastructure and detached fields to interactiveBrowser — promoted browsers only close the page, not the entire pw/browser/ctx stack
  3. Create promote.go with package-level PromoteToInteractive(Document) and DemoteToDocument(InteractiveBrowser) functions
  4. Add unit tests in promote_test.go

Branch: feature/76-promote-to-interactive
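
A minimal sketch of the close semantics described in steps 1 and 2 of the plan above. The field names `detached` and `ownsInfrastructure` come from the plan; everything else is an assumption: the `resource` type is a hypothetical stand-in for the Playwright handles, and treating a detached interactive browser as "page handed back, close nothing" is a guess at the intent.

```go
package main

// resource is a hypothetical stand-in for the pw, browser, ctx and page
// handles. It only records whether Close was called.
type resource struct{ closed bool }

func (r *resource) Close() { r.closed = true }

// document mirrors step 1: once detached, Close is a no-op so the page
// survives for the promoted browser.
type document struct {
	page     *resource
	detached bool
}

func (d *document) Close() {
	if d.detached {
		return
	}
	d.page.Close()
}

// interactiveBrowser mirrors step 2.
type interactiveBrowser struct {
	pw, browser, ctx, page *resource
	ownsInfrastructure     bool // false for promoted browsers
	detached               bool // assumed: set when the page is demoted back
}

func (b *interactiveBrowser) Close() {
	if b.detached {
		return // assumed: the page now belongs to a Document again
	}
	b.page.Close()
	if !b.ownsInfrastructure {
		return // promoted: leave pw/browser/ctx to their original owner
	}
	b.ctx.Close()
	b.browser.Close()
	b.pw.Close()
}
```

In this model exactly one owner closes each handle at any time, which is the property the ownership-transfer discussion in the issue is after.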

Author
Collaborator

Implementation complete. PR: #78

Changes:

  • document.go — added detached field; Close() is a no-op when detached
  • interactive.go — added ownsInfrastructure and detached fields; promoted browsers only close the page, not the pw/browser/ctx stack
  • promote.go (new) — PromoteToInteractive() and DemoteToDocument() package-level functions
  • promote_test.go (new) — 4 unit tests covering error paths

All builds, tests, and vet pass.

steve closed this issue 2026-02-24 02:29:07 +00:00