feat: add RemoveHidden option for display:none element stripping #63
Summary
- Add a `RemoveHidden bool` field to `ReadabilityOptions` that evaluates JavaScript on the live page to remove all elements with computed `display: none` before readability extraction
- Add a `pageEvaluator` interface so the concrete Playwright-backed document supports JS evaluation without changing the `Document` interface
- Add a `PageEvaluate` method to the `document` struct

Closes #62
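For reviewers unfamiliar with the optional-interface pattern mentioned in the summary, here is a minimal sketch of how a `pageEvaluator` capability can be discovered with a type assertion while `Document` stays unchanged. The method signatures, the `HTML` method, `stripHidden`, `staticDoc`, and `fakePage` are assumptions made for illustration and are not taken from this PR's diff.

```go
package main

import "fmt"

// Document stands in for the existing interface. Its real method set is
// not shown in this PR description, so HTML() is a hypothetical placeholder.
type Document interface {
	HTML() string
}

// pageEvaluator is the optional capability. The signature is assumed.
type pageEvaluator interface {
	PageEvaluate(script string) (any, error)
}

// ReadabilityOptions carries only the field relevant to this sketch.
type ReadabilityOptions struct {
	RemoveHidden bool
}

// stripHidden (hypothetical helper) runs the script only when the option is
// set and the document supports evaluation. It reports whether it ran.
func stripHidden(doc Document, opts ReadabilityOptions, script string) (bool, error) {
	if !opts.RemoveHidden {
		return false, nil
	}
	ev, ok := doc.(pageEvaluator)
	if !ok {
		// Documents without a live page are left untouched.
		return false, nil
	}
	if _, err := ev.PageEvaluate(script); err != nil {
		return false, fmt.Errorf("remove hidden elements: %w", err)
	}
	return true, nil
}

// staticDoc implements Document only.
type staticDoc struct{}

func (staticDoc) HTML() string { return "<p>static</p>" }

// fakePage implements Document and pageEvaluator, counting evaluations.
type fakePage struct{ calls int }

func (*fakePage) HTML() string { return "<p>live</p>" }

func (f *fakePage) PageEvaluate(script string) (any, error) {
	f.calls++
	return nil, nil
}

func main() {
	opts := ReadabilityOptions{RemoveHidden: true}
	ran, err := stripHidden(&fakePage{}, opts, "() => 0")
	fmt.Println("live page:", ran, err)
	ran, err = stripHidden(staticDoc{}, opts, "() => 0")
	fmt.Println("static doc:", ran, err)
}
```

The benefit of this shape is that other `Document` implementations and existing test doubles compile unchanged, since only the Playwright-backed type needs the extra method.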
Context
Some websites embed hidden `display: none` elements as anti-AI-scraping honeypots containing prompt injection attacks. These are invisible to users but get picked up by readability extraction. CSS selectors can't target computed styles, so JavaScript evaluation with `getComputedStyle` is needed.
Test plan
- `go test ./...` passes
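To make the approach described under Context concrete, the following is a rough sketch of the kind of script that `getComputedStyle`-based stripping might use, wrapped in Go. The script text, `removeHiddenJS`, `evaluateFunc`, and `removeHidden` are assumptions and may differ from the code in this PR. The sketch deliberately walks only `document.body`, because browsers compute `display: none` for `<head>` and its children by default, and removing those would discard the title and metadata.

```go
package main

import "fmt"

// removeHiddenJS is an illustrative script, not the one from this PR.
// It removes every element under <body> whose computed display is "none"
// and returns how many elements it removed.
const removeHiddenJS = `() => {
  let removed = 0;
  for (const el of Array.from(document.body.querySelectorAll('*'))) {
    if (!el.isConnected) continue; // already removed with a hidden ancestor
    if (window.getComputedStyle(el).display === 'none') {
      el.remove();
      removed++;
    }
  }
  return removed;
}`

// evaluateFunc is a hypothetical stand-in for whatever runs JavaScript on
// the live page (for example, a method backed by Playwright's evaluate).
type evaluateFunc func(script string) (any, error)

// removeHidden runs the script and normalises the numeric result, since
// values coming back from a browser may be decoded as int or float64.
func removeHidden(eval evaluateFunc) (int, error) {
	res, err := eval(removeHiddenJS)
	if err != nil {
		return 0, fmt.Errorf("evaluate remove-hidden script: %w", err)
	}
	switch n := res.(type) {
	case int:
		return n, nil
	case float64:
		return int(n), nil
	default:
		return 0, nil
	}
}

func main() {
	// A fake evaluator that pretends the browser removed three elements.
	fake := func(script string) (any, error) { return 3, nil }
	n, err := removeHidden(fake)
	fmt.Println("removed:", n, "err:", err)
}
```

A test along these lines exercises only the Go plumbing. Confirming that the script strips honeypot elements would need a real browser page.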