Weather extractor CSS selectors don't match DuckDuckGo's actual DOM #53
The weather extractor in sites/duckduckgo/weather.go uses CSS selectors that don't match DuckDuckGo's actual weather widget DOM structure, resulting in empty data being returned.

Current selectors (not working):

- div.module--weather span.module__title__link
- div.module--weather .module__current-temp
- div.module--weather .module__weather-summary
- div.module--weather .module__high-temp / .module__low-temp
- div.module--weather .module__forecast-day with .forecast-day__name, .forecast-day__high, .forecast-day__low
- div.module--weather .module__hourly-item with .hourly-item__time, .hourly-item__temp

DuckDuckGo's actual DOM structure uses React components with class names like:

- div.react-module article for the weather widget container
- div and span elements without the BEM-style class names above

The data structures and conversion logic are correct; only the CSS selectors need to be updated to match the real DOM. The old mort implementation (which worked) used text extraction + regex parsing as a workaround for the unpredictable class names, along with aria-label/title/alt attribute reading for icon hints.
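For reference, the text extraction + regex idea could look roughly like the sketch below. This is not the old mort code: parseTemps, tempRe, and the sample widget text are hypothetical, and it assumes temperatures appear in the flattened text as an integer followed by a degree sign with an ASCII minus for negatives.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// tempRe matches an optionally signed integer followed by a degree sign,
// e.g. "72°" or "-3°". Assumption: the widget text uses an ASCII minus.
var tempRe = regexp.MustCompile(`(-?\d+)\s*°`)

// parseTemps (hypothetical helper) returns every temperature found in the
// flattened widget text, in order of appearance.
func parseTemps(text string) []int {
	var temps []int
	for _, m := range tempRe.FindAllStringSubmatch(text, -1) {
		n, err := strconv.Atoi(m[1])
		if err != nil {
			continue // skip values that do not fit in an int
		}
		temps = append(temps, n)
	}
	return temps
}

func main() {
	// Made-up sample text, not captured from the live site.
	fmt.Println(parseTemps("Mon Cloudy 45° 31° Tue Snow 30° -2°"))
}
```

Because this works on text instead of class names, it keeps working when class names change, but it depends on the order in which values appear in the text.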
Tested in production: all fields come back empty/zero when using GetWeather() against the live DuckDuckGo site.

Starting work on this. Plan of approach:

- extractWeather() in sites/duckduckgo/weather.go with correct selectors

Will use the old mort approach (text extraction + regex, aria-label/title/alt for icons) as a fallback strategy if class names are unpredictable.
Work finished. PR #54 replaces all class-based CSS selectors with structural/attribute-based selectors that are resilient to DDG's randomized CSS module class names.
Key changes:
- article:has(img[src*='weatherkit']) instead of div.module--weather
- Structural selectors (div:first-child, p:first-of-type, etc.) instead of BEM class names
- img[alt] attributes (e.g., "Cloudy", "Snow", "PartlyCloudy")
- span > span structure
- CurrentTemp derived from first hourly entry (DDG no longer shows standalone current temp)
- HighTemp/LowTemp derived from first daily forecast entry

All unit tests updated and passing.
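As an illustration of the last two changes, the derivation of the summary fields might look like the sketch below. The struct layouts and the fillDerived helper are hypothetical; only the names CurrentTemp, HighTemp, LowTemp, DayForecast, and HourlyForecast come from this thread, and the real types in weather.go may differ.

```go
package main

import "fmt"

// Hypothetical shapes for illustration only.
type HourlyForecast struct {
	Time string
	Temp int
}

type DayForecast struct {
	Name      string
	High, Low int
}

type Weather struct {
	CurrentTemp, HighTemp, LowTemp int
	Hourly                         []HourlyForecast
	Daily                          []DayForecast
}

// fillDerived (hypothetical helper) copies the summary fields from the first
// hourly and first daily entries. Empty slices leave the fields at zero.
func fillDerived(w *Weather) {
	if len(w.Hourly) > 0 {
		w.CurrentTemp = w.Hourly[0].Temp
	}
	if len(w.Daily) > 0 {
		w.HighTemp = w.Daily[0].High
		w.LowTemp = w.Daily[0].Low
	}
}

func main() {
	w := Weather{
		Hourly: []HourlyForecast{{Time: "1 PM", Temp: 41}, {Time: "2 PM", Temp: 43}},
		Daily:  []DayForecast{{Name: "Mon", High: 45, Low: 31}},
	}
	fillDerived(&w)
	fmt.Println(w.CurrentTemp, w.HighTemp, w.LowTemp)
}
```

The length checks matter here: if the widget is missing or the selectors stop matching, the slices are empty and indexing them would panic.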
PR #54 looks good. The structural/attribute-based selectors should be resilient to DDG's CSS module class name randomization.
One thing to note for downstream consumers: Condition on DayForecast and HourlyForecast is now set to the raw IconHint value (e.g. "PartlyCloudy") rather than human-readable text. Mort will need to normalize these (split camelCase → title case) when converting to its local types.
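A minimal sketch of that normalization, assuming the hints are ASCII PascalCase words like the examples above; humanizeCondition is a hypothetical name, not an existing function in mort. Each word in the hint is already capitalized, so inserting spaces at lowercase-to-uppercase boundaries is enough to get title case.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// humanizeCondition (hypothetical helper) inserts a space before every
// uppercase letter that directly follows a lowercase letter.
func humanizeCondition(hint string) string {
	var b strings.Builder
	var prev rune
	for i, r := range hint {
		if i > 0 && unicode.IsUpper(r) && unicode.IsLower(prev) {
			b.WriteByte(' ')
		}
		b.WriteRune(r)
		prev = r
	}
	return b.String()
}

func main() {
	fmt.Println(humanizeCondition("PartlyCloudy"))
}
```

Single-word hints such as "Cloudy" and "Snow" pass through unchanged.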