Create exported extractortest package with MockBrowser, MockDocument,
and MockNode that support selector-based responses for testing site
extractors without a real browser.
Add extraction tests for DuckDuckGo (result parsing, empty results, no
links, full search flow) and Powerball (drawing parsing, next drawing
parsing with billion/million, error cases, full GetCurrent flow).
Closes #21
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Wrap staticCookieJar in struct with sync.RWMutex for thread safety
- Add SameSite field to Cookie struct with Strict/Lax/None constants
- Update Playwright cookie conversion functions for SameSite
- Replace hardcoded 4-country switch with dynamic country code generation
Closes #20, #22, #23
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
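A minimal sketch of the mutex-wrapping pattern described in this commit. The `Cookie` fields and the `safeJar` type with its `Set` and `All` methods are hypothetical stand-ins, not the project's actual `staticCookieJar` API:

```go
package main

import "sync"

// Cookie is a simplified, hypothetical stand-in for the project's Cookie struct.
type Cookie struct {
	Name  string
	Value string
}

// safeJar guards a cookie slice with a sync.RWMutex so that concurrent
// readers do not race with writers.
type safeJar struct {
	mu      sync.RWMutex
	cookies []Cookie
}

// Set replaces a cookie with the same name or appends a new one.
// It takes the write lock because it mutates the slice.
func (j *safeJar) Set(c Cookie) {
	j.mu.Lock()
	defer j.mu.Unlock()
	for i := range j.cookies {
		if j.cookies[i].Name == c.Name {
			j.cookies[i] = c
			return
		}
	}
	j.cookies = append(j.cookies, c)
}

// All returns a copy of the stored cookies under the read lock, so callers
// cannot modify the jar's internal slice.
func (j *safeJar) All() []Cookie {
	j.mu.RLock()
	defer j.mu.RUnlock()
	out := make([]Cookie, len(j.cookies))
	copy(out, j.cookies)
	return out
}
```

Returning a copy from `All` keeps the lock's protection meaningful: callers never hold a reference to the guarded slice.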
Return errors for required fields (ID, price) and log warnings for
optional fields (title, description, unit price) across all site
extractors instead of silently discarding them with _ =.
Closes #24
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Extract identical numericOnly inline functions from powerball and
megamillions into shared sites/internal/parse.NumericOnly with tests
- Extract duplicated DuckDuckGo result parsing from Search() and
GetResults() into shared extractResults() helper
Closes #13, #14
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
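A rough sketch of what a shared helper like `NumericOnly` might look like. The exact behaviour of the project's function is not shown in this log, so this assumes it keeps only ASCII digits and drops everything else:

```go
package main

import "strings"

// NumericOnly returns s with every character except the ASCII digits 0-9
// removed. This is an assumed behaviour, not the project's confirmed one.
func NumericOnly(s string) string {
	var b strings.Builder
	for _, r := range s {
		if r >= '0' && r <= '9' {
			b.WriteRune(r)
		}
	}
	return b.String()
}
```

Under that assumption, an input such as `"$1,234"` would reduce to `"1234"`.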
- playwright.go: check error from page.Context().Cookies() before
iterating over results, preventing silent failures
- archive.go: replace time.Sleep(5s) with context-aware select using
time.After, allowing the operation to be cancelled promptly
Closes #7, #18
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix archive cmd passing only archive-specific Flags instead of the
merged flags variable that includes browser flags (#8)
- Move defer DeferClose() after error checks in 6 locations to prevent
calling Close on nil values (#19):
  - sites/duckduckgo/cmd/duckduckgo/main.go
  - sites/duckduckgo/duckduckgo.go
  - sites/google/cmd/google/main.go
  - sites/wegmans/cmd/wegmans/main.go
  - sites/wegmans/wegmans.go
  - sites/aislegopher/aislegopher.go
Closes #8, #19
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
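The defer-ordering fix follows a common Go pattern: register the cleanup only once the constructor has succeeded. A minimal sketch, using a hypothetical `resource` type and `open` function rather than the project's browser or `DeferClose` API:

```go
package main

import "errors"

// resource is a hypothetical stand-in for a browser or document handle.
type resource struct{ closed bool }

// Close marks the resource closed; it would panic on a nil receiver.
func (r *resource) Close() error {
	r.closed = true
	return nil
}

// open returns a nil resource together with an error when fail is true.
func open(fail bool) (*resource, error) {
	if fail {
		return nil, errors.New("open failed")
	}
	return &resource{}, nil
}

// use checks the error first and defers Close only on success, so Close
// is never invoked on a nil *resource.
func use(fail bool) error {
	r, err := open(fail)
	if err != nil {
		return err
	}
	defer r.Close()
	return nil
}
```

Placing the `defer` before the error check would schedule `Close` on a nil pointer whenever `open` fails.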
- Change SearchPage.GetResults() to return ([]Result, error) so ForEach
errors are no longer silently discarded
- Fix Search() to return the ForEach error instead of nil
- Update cmd caller to check GetResults() errors
Closes #5, #6
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add length check before slicing article.Content[:32], matching the
safe truncation pattern already used in cmd/browser/main.go.
Closes #9
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
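A small sketch of the safe truncation idea. The helper name `truncate` is hypothetical; the project may inline the check instead:

```go
package main

// truncate returns at most the first n bytes of s. Slicing s[:n] directly
// panics when len(s) < n, so the length is checked first.
// Note: this slices by bytes, like the original expression, and may split
// a multi-byte UTF-8 character.
func truncate(s string, n int) string {
	if len(s) <= n {
		return s
	}
	return s[:n]
}
```

With this guard, content shorter than 32 bytes is returned unchanged instead of causing a slice-bounds panic.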
- document.go: check if resp is nil before calling resp.Status() in
Refresh(), since Playwright's Reload() can return a nil response
- archive.go: check SelectFirst() results for nil before calling
Type() and Click(), preventing panics when DOM elements are missing
Closes #10, #11
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Fix Nodes.First() panic on empty slice (return nil)
- Fix ticker leak in archive.go (create once, defer Stop)
- Fix cookie path matching for empty and root paths
- Fix lost query params in google.go (u.Query().Set was discarded)
- Fix type assertion panic in useragents.go
- Fix dropped date parse error in powerball.go
- Remove unreachable dead code in megamillions.go and powerball.go
- Simplify document.go WaitForNetworkIdle, remove unused root field
- Remove debug fmt.Println calls across codebase
- Replace panic(err) with stderr+exit in all cmd/ programs
- Fix duckduckgo cmd: remove useless defer, return error on bad safesearch
- Fix archive cmd: ToConfig returns error instead of panicking
- Add 39+ unit tests across 6 new test files
- Add Gitea Actions CI workflow (build, test, vet in parallel)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
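The lost-query-params bug listed above comes from the fact that `url.URL.Query()` returns a parsed copy, so calling `u.Query().Set(...)` changes nothing on `u`. A sketch of the usual fix, with a hypothetical helper name `withQuery`:

```go
package main

import "net/url"

// withQuery parses raw, sets key=value in its query string, and returns
// the resulting URL.
func withQuery(raw, key, value string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	q := u.Query()          // a copy of the parsed query values
	q.Set(key, value)       // modifies only the copy
	u.RawQuery = q.Encode() // write the copy back so the change is kept
	return u.String(), nil
}
```

`Encode` sorts parameters by key, so the order in the output may differ from the input.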
Refined price parsing logic to strip trailing periods from units (e.g., "lb." -> "lb"). Added logging of the extracted response data for debugging.
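A sketch of the unit normalization, assuming it amounts to trimming whitespace and a single trailing period. The function name `normalizeUnit` is hypothetical:

```go
package main

import "strings"

// normalizeUnit trims surrounding whitespace and removes one trailing
// period, so "lb." becomes "lb". Units without a period are unchanged.
func normalizeUnit(unit string) string {
	return strings.TrimSuffix(strings.TrimSpace(unit), ".")
}
```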
Adjusted HTML selectors for improved compatibility and updated price parsing logic to handle additional formats. Added logging to provide better debugging insights during price extraction.
Replaced `currency.Amount` with `int` for jackpot values to simplify representation. Adjusted parsing logic accordingly. Updated Go version to 1.24.0 and refreshed dependencies in go.mod for compatibility.
Introduced the `OpenSearch` method and `SearchPage` interface to streamline search operations and allow for loading more results dynamically. Updated dependencies and modified the DuckDuckGo CLI to utilize these enhancements.
Introduce `ErrCommandNoReactions` to allow commands to opt out of success reactions. Adjust bot behavior to respect this error and prevent reactions when applicable, ensuring cleaner and more controlled responses. Add error handling and safeguard workers against panics.
This update enhances the `Item` structure to include `UnitPrice` and `Unit` fields. Additional logic is implemented to extract and parse unit pricing details from the HTML, improving data accuracy and granularity.
Added price field to Item struct in AisleGopher and implemented logic to extract price data. Updated Wegmans parser to validate URL structure by ensuring the second segment is "product". These changes improve data accuracy and error handling.
Introduce functionality to retrieve item details, including name and price, from Wegmans using a browser-based scraper. This includes a CLI tool to execute searches and robust error handling for URL validation and browser interactions.
Introduced a new package and command for extracting data from aislegopher.com, including URL parsing and item retrieval. Updated dependencies in go.mod to support the new functionality. Additionally, refined import structure in the DuckDuckGo integration.
Replaced the overly complex CSS selector with a simplified "h2" selector for extracting titles. This change improves maintainability and ensures accurate title extraction from the updated DOM structure.
Implemented a DuckDuckGo search module with configurable SafeSearch and regional settings. Added a CLI tool to perform searches via DuckDuckGo using browser automation, supporting flags for customization.
Moved the local package import to align with standard Go import grouping conventions. This improves code readability and maintains a consistent structure.
Introduce a Google search integration, including a Go package for performing searches with configurable parameters (e.g., language, region) and a CLI tool for executing search queries. Refactor archive CLI import ordering for consistency.