Skip to content

fix: ensure the region gets populated while site creation#2329

Open
khushboo5723 wants to merge 2 commits intomainfrom
ensure-region-gets-populated
Open

fix: ensure the region gets populated while site creation#2329
khushboo5723 wants to merge 2 commits intomainfrom
ensure-region-gets-populated

Conversation

@khushboo5723
Copy link
Copy Markdown
Contributor

Problem

The Tokowaka dashboard recently started rendering a Region column for each
monitored site (project-elmo-ui-data commit cb98b3a, Apr 29). The column is
populated from Site.region returned by GET /sites, but ~84% of rows are
empty: of the 7 code paths in spacecat-api-service that call Site.create(),
only 2 (plg-onboarding.js and the Slack onboard command) currently invoke
locale detection. The remaining 5 paths create sites with region: "" and
language: "" — and have done so since the region attribute was added in
spacecat-shared-data-access PR #995 (Oct 2025).

This PR plugs those 5 holes so new sites going through any creation flow get
their locale detected at creation time.

Changes

New helper: src/support/locale.js

ensureSiteLocale(site, baseUrl, log) is the shared glue around
detectLocale from @adobe/spacecat-shared-utils:

  1. Skips if site.getLanguage() and site.getRegion() are both already set
    (idempotent — won't overwrite caller-supplied values).
  2. Calls detectLocale({ baseUrl }).
  3. Sets site.setLanguage(...) / site.setRegion(...) only for fields that
    are currently empty AND for which detection returned a non-empty value.
  4. Persists with site.save() if anything changed.
  5. Swallows all errors (network, invalid URL, save failure) — locale
    enrichment is best-effort and must not block site creation.

Difference from the existing PLG/Slack-onboard pattern: the catch block
does NOT fall back to writing 'US'/'en'. When detection genuinely fails
(unreachable host, bot-blocked, timeout) the site is left with empty region
rather than fake-classified as US. Empty values are honest and can be filled
by a future backfill; fake values silently corrupt regional metrics.

5 call-sites wired

File Location
src/controllers/sites.js POST /sites
src/support/slack/commands/add-site.js Slack add site
src/support/slack/actions/approve-site-candidate.js Site candidate approval
src/controllers/llmo/llmo-onboarding.js createOrFindSite (used by LLMO onboarding)
src/controllers/llmo/llmo.js createOrUpdateStageEdgeConfig (stage-domain creation)

Each is a single await ensureSiteLocale(site, baseUrl, log); line right
after Site.create(...).

Tests

  • New test/support/locale.test.js: 19 cases covering every branch of the
    helper at 100% coverage (verified via c8 --include='src/support/locale.js').
  • 5 affected call-site test files updated to esmock the helper (matching the
    repo convention used in cdn-detection.test.js, plg-onboarding.test.js,
    etc.). Each updated test asserts the helper is invoked with the freshly
    created site and the correct base URL.

Out of scope (follow-ups)

  1. detectLocale's internal 'US'/'en' fallback (in
    spacecat-shared-utils/src/locale-detect/locale-detect.js:61-62,
    pre-existing since PR fix(deps): update external fixes #1006). When fetch succeeds but no indicator
    (TLD/path/hreflang/htmlLang/headers/metaTag/content) fires, detectLocale
    returns {region:'US', language:'en'}. Our helper trusts that result, so
    plain .com sites with no localization markup still get classified as US.
    Proper fix is to add a { allowFallback: false } option upstream so callers
    can opt out — separate PR to spacecat-shared.
  2. Backfill of historical empty-region sites. This PR fixes only sites
    created from now on. A one-off script that iterates Site.all(), calls
    ensureSiteLocale on each empty-region site, and saves will be a separate
    PR once we confirm the new write path works in stage.
  3. Refactor plg-onboarding.js and support/utils.js to use the new
    helper instead of their inline duplicates of the same pattern. Deferred
    to keep this PR scoped; both already have the (problematic) 'US'/'en'
    catch fallback that we'd want to remove in lockstep.

Test plan

  • Unit tests pass:
    npx mocha --spec='test/support/locale.test.js' --spec='test/controllers/sites.test.js' --spec='test/support/slack/commands/add-site.test.js' --spec='test/support/slack/actions/approve-site-candidate.test.js' --spec='test/controllers/llmo/llmo-onboarding.test.js' --spec='test/controllers/llmo/llmo.test.js'
    (the 14 pre-existing failures in the full repo suite — 9× detectedCdn
    Joi, 1× LlmoController before-all hook, 4× PLG timeouts — are
    unrelated to this change and reproduce on main)
  • Stage smoke (curl POST /sites against stage):
    - https://www.bmw.deregion: "DE" (TLD signal)
    - https://www.toyota.co.jpregion: "JP"
    - https://www.lemonde.frregion: "FR"
    - https://this-host-does-not-exist-xyz.invalidregion: "" (no fallback to US — this is the key behavior change)
  • Stage Slack: @spacecat add site https://www.lemonde.fr, then
    @spacecat get site https://www.lemonde.fr → confirm region: "FR".
  • Idempotency: POSTing the same site twice doesn't re-run detection or
    modify the saved record.
  • User-supplied region honored: POST {baseURL: ..., region: "GB"} is
    not overwritten by detection.
  • Day-after verification: tomorrow's llmo-usage-data-YYYY-MM-DD.json
    in project-elmo-ui-data/usage-tracking/ shows reduced empty-region
    rate for sites created in the last 24h.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 5, 2026

This PR will trigger a patch release when merged.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@khushboo5723 khushboo5723 changed the title fix: ensure the region gets populated fix: ensure the region gets populated while site creation May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants