
docs(sitemap-author): schema v1.1 — 12 patches from twitter+hackernews PoC (#1822)

* docs(sitemap-author): schema v1.1 — 12 patches from twitter+hackernews PoC

Cross-validated against two PoCs (twitter 12 files / hackernews 10 files).
v1.1 changelog at top of file. 12 patches in 3 groups:

Group 1 — Scope/boundary (6 clarifications):
- §1.1 CJK text costs 30-50% more tokens per character than English; split
  into sub-files rather than relaxing the 800-token limit (a relaxed limit
  would drift).
- §2.1 auth_strategy = primary strategy, not union; per-page contract_strength
  expresses exceptions.
- §2.5 pitfalls.md is task-executor-level only; adapter-internal pitfalls
  (queryId parsing, envelope unwrap) move to ~/.opencli/sites/<site>/notes.md.
- §2.5 pitfall id / trigger / workaround are written from the task executor's
  first-person view ("when agent does X, ..."), not the adapter implementer's.
- §2.4 apis.md entry adds optional `notes:` field for GraphQL queryId path and
  other meta info (still no URL / method / params / response — those stay in
  endpoints.json).
- §2.2 page Linked APIs may be empty when endpoints.json is still being
  collected; do not insert fake placeholder ids.

Group 2 — Reuse/compactness (3 structural):
- §2.2 + §4 partial pages: `page_id` with `_` prefix and `url_patterns: []`
  for cross-page UI (e.g. _tweet_card.md). Referenced by other pages via the
  existing `action:<id> in pages/_<name>.md` form. Eliminates duplication and
  arbitrary "which page owns the like button" calls.
- §3 introduces Form B compact YAML for actions (~80 token each vs Form A
  markdown ~250). Both forms remain valid; Form B is recommended when page
  density would otherwise blow the 800-token budget.
- §3 drops action-level `verified_at` and `source` — file-level frontmatter
  already covers both, repeated copies just drift.
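
A rough illustration of the Form A / Form B trade-off, using only the
approximate per-action costs quoted above (~250 vs ~80 tokens) and the
800-token budget. The function name, the `other_tokens` parameter, and the
rule "prefer Form A whenever it fits" are assumptions made for this sketch,
not part of the schema:

```python
TOKEN_BUDGET = 800    # per-file budget cited in this changelog
FORM_A_TOKENS = 250   # approximate cost of one Form A markdown action
FORM_B_TOKENS = 80    # approximate cost of one Form B compact YAML action


def recommended_form(n_actions: int, other_tokens: int = 0) -> str:
    """Suggest 'A', 'B', or 'split' for a page with n_actions actions.

    other_tokens is a hypothetical allowance for everything on the page
    that is not an action (frontmatter, anchors, linked APIs).
    """
    if other_tokens + n_actions * FORM_A_TOKENS <= TOKEN_BUDGET:
        return "A"
    if other_tokens + n_actions * FORM_B_TOKENS <= TOKEN_BUDGET:
        return "B"
    # Neither form fits: split into sub-files instead of raising the limit.
    return "split"


if __name__ == "__main__":
    for n in (3, 4, 10, 11):
        print(n, recommended_form(n))
```

Under these numbers a page holding nothing but actions fits 3 actions in
Form A and 10 in Form B; beyond that the §1.1 guidance (split, do not relax
the limit) applies.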

Group 3 — Execution health/anchors (3 action-level):
- §3.3 cross-page UI primitive actions (the kind that live in partials)
  may write Best/Fallback inline as adapter-first + DOM fallback within a
  single action, rather than being forced up into a workflow Best/Fallback
  pair. Decouples UI-primitive routing from task-level routing.
- §3.4 Recovery may include `adapter_health_update: <adapter> -> suspect`
  directive. Consumption skill (opencli-browser-sitemap) writes the matching
  workflow's adapter_health on the local overlay so the next agent skips the
  broken Best path instead of re-running it. This closes the write side of
  the failure → next-agent-avoidance loop.
- §2.2 testid marked optional; selector_pattern promoted to first-class
  anchor with 5 acceptable shapes (id-anchored / sibling traversal / attribute
  boundary / form name / ARIA) and explicit discouraged-anchor list
  (nth-child, single-class grabs, text-content selectors). Old sites without
  testid (HN, forums) are no longer second-class.
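
A minimal sketch of how a consumer might handle the §3.4
`adapter_health_update: <adapter> -> suspect` directive. Only the directive
syntax comes from the schema; the function names, the in-memory dict standing
in for the local overlay, keying by adapter rather than by workflow, and the
adapter name `twitter/like` are all hypothetical:

```python
import re

# Matches the directive form given in §3.4: "adapter_health_update: <adapter> -> <state>"
DIRECTIVE = re.compile(
    r"^adapter_health_update:\s*(?P<adapter>\S+)\s*->\s*(?P<state>\w+)$"
)


def apply_health_update(directive: str, overlay: dict[str, str]) -> dict[str, str]:
    """Return a copy of the overlay with the adapter's health state recorded."""
    match = DIRECTIVE.match(directive.strip())
    if match is None:
        raise ValueError(f"not an adapter_health_update directive: {directive!r}")
    updated = dict(overlay)
    updated[match["adapter"]] = match["state"]
    return updated


def pick_path(adapter: str, overlay: dict[str, str]) -> str:
    """Skip the Best path when the adapter was previously marked suspect."""
    return "fallback" if overlay.get(adapter) == "suspect" else "best"


if __name__ == "__main__":
    overlay = apply_health_update(
        "adapter_health_update: twitter/like -> suspect", {}
    )
    print(overlay, pick_path("twitter/like", overlay))
```

The sketch covers the write side only; how a suspect adapter returns to
healthy is not modelled here.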

No code changes — pure schema reference. Both PoCs remain local; promotion to
references/site-memory/{twitter,hackernews}/sitemap/ comes once this lands.

* docs(sitemap-author): apply opencli-user review nits

- Form B delimiter table (`|` enum / `||` fallback / `;` sequential) to
  disambiguate `do:` and `recover:` parsing.
- §3.3 like_tweet example updated to `||` fallback form.
- §3.4 explicit note: adapter_health recovery (suspect → healthy) is a
  read-side concern, deferred to the opencli-browser-sitemap skill spec.
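
One possible reading of the Form B delimiter table, sketched as a parser for
a `do:` or `recover:` value. The precedence (`;` binds loosest, then `||`,
then `|`), the nested-list result shape, and the example strings are
assumptions for illustration; the schema text itself is authoritative:

```python
def parse_form_b(expr: str) -> list[list[list[str]]]:
    """Parse a Form B expression into steps -> fallbacks -> enum options.

    ';'  separates sequential steps,
    '||' separates fallback alternatives within a step,
    '|'  separates enum options within one alternative.
    """
    steps = []
    for step in expr.split(";"):
        if not step.strip():
            continue
        alternatives = []
        # Split on '||' first so that only single '|' characters remain.
        for alternative in step.split("||"):
            options = [opt.strip() for opt in alternative.split("|") if opt.strip()]
            alternatives.append(options)
        steps.append(alternatives)
    return steps


if __name__ == "__main__":
    # Hypothetical expression: adapter first, DOM click as fallback, then a wait.
    print(parse_form_b("adapter:like || click:like_button; wait:200"))
```

Splitting on `||` before `|` is what keeps the two delimiters from being
confused, which appears to be the ambiguity the table was added to resolve.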

* docs(sitemap): align skills with schema v1.1
jakevin committed
dc67023be8ddf7082cbce2a2f83214df1b403277
Parent: 65cab71
Committed by GitHub <noreply@github.com> on 6/1/2026, 6:29:58 PM