v0.3.0 — 2026-05-07

MINOR bump combining two co-landing changes: the tag-iterative pipeline (each /gm-gdd → /gm-finalize round ships ONE SemVer tag) and the verify_report.json feedback channel (closes the retry loop where verify failures had no machine-readable signal to drive the next iteration).

Existing projects upgrade via migrations/20260507120000_introduce_tag_based_pipeline.py, which moves the pre-existing GDD/PLAN/STRUCTURE/SCENES/ASSETS into docs/tags/v0.1.0/ and writes a stub ROADMAP.md. The migration is idempotent and runs automatically on the first publish after upgrade. The verify-report side is fully additive on the state side and ships no migration of its own — see "Upgrade note" below for the SKILL redeploy step.

Added

Tag-iterative pipeline

  • Tag-iterative pipeline (ROADMAP.md + docs/tags/<Tag>/ archives + git tag <Tag> per round). The earliest entry in ROADMAP.md without a corresponding git tag is the current tag; per-tag root docs (PLAN.md, STRUCTURE.md, SCENES.md) are scoped to it, while GDD.md, ROADMAP.md, MEMORY.md, and ASSETS.md accumulate across tags (ASSETS rows carry a Tag column marking the introducing tag).
  • Playable-closed-loop hard gate in /gm-evaluate: every tag must boot godot --headless --quit, run at least one core mechanic E2E, and have at least one of {death, win, exit} reachable. The single e2e/ suite runs every still-supported mechanic on every evaluation; any failing inherited mechanic blocks approval.
  • /gm-rescue diagnostic skill (skills/core/gm-rescue/) — outside the main pipeline. Reads runtime artifacts, walks godotmaker layers (hooks → SKILL.md → config → templates → shared refs → tools), determines whether a framework defect is the cause; outputs to chat ONLY (no file writes, no code changes), drafts a GitHub issue text the user reviews and posts upstream. Privacy default: drafts exclude absolute project paths, project source code, and GDD content.
  • Migration script (migrations/20260507120000_introduce_tag_based_pipeline.py) for upgrading existing projects to the new layout. Idempotent; does NOT auto-git tag.
  • templates/ROADMAP.md template with SemVer convention header. PLAN.md / STRUCTURE.md / SCENES.md templates gain **Tag:** vX.Y.Z headers (ASSETS.md and MEMORY.md stay cross-tag — ASSETS gains a Tag column, MEMORY is snapshotted). PLAN.md adds Tag Mechanics and Inherited Mechanics sections; /gm-evaluate reads them to maintain the single e2e/ suite.
  • hooks/stage_reminder.py check_tag_archived programmatic check; hooks/metrics/get_current_tag() helper; SessionStart banner now surfaces the current tag (or "no current tag — run /gm-gdd to start one").
  • 38 new tests covering migration / hooks / metrics / gm-rescue structural contract.
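
The "current tag" rule above (the earliest ROADMAP.md entry that has no corresponding git tag) can be sketched as follows. This is an illustrative sketch only, not the shipped get_current_tag() helper: the function name and the assumption that roadmap entries arrive as an earliest-first list of tag strings are hypothetical.

```python
from typing import Iterable, Optional


def pick_current_tag(roadmap_tags: Iterable[str], git_tags: Iterable[str]) -> Optional[str]:
    """Return the earliest roadmap tag with no matching git tag, or None.

    Hypothetical helper: assumes roadmap_tags is already ordered earliest-first
    and that git_tags holds the tag names that already exist in the repository.
    """
    existing = set(git_tags)
    for tag in roadmap_tags:
        if tag not in existing:
            return tag
    return None  # every roadmap entry is already tagged: no current tag


# v0.1.0 is already tagged, so the current tag is the next roadmap entry.
print(pick_current_tag(["v0.1.0", "v0.2.0", "v0.3.0"], ["v0.1.0"]))  # v0.2.0
```

A None result corresponds to the "no current tag" banner case described above.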

Verify-report feedback channel

  • .godotmaker/verify_report.json: a structured feedback channel from /gm-verify to /gm-build and /gm-fixgap. /gm-verify now writes this file on every run (PASS or FAIL) with per-check results: checks.build.errors[], checks.unit_tests.failures[], checks.lint.{issues, format_drift}, checks.static_check.issues[], plus tooling_notes[] for verification-tool crashes (gdlint, gdformat, godot, etc.). Each tooling note carries a suggested_fallback discriminator (exclude_file / scope_narrow / add_gdlintrc_rule / skip_check / escalate) and a matching structured operand (crashed_on / narrowed_command / rule_name / check_name), so consumers can apply the fix deterministically without parsing free-text error strings. The per-check result is a 4-value enum, pass | warn | fail | error: error means "tool crashed, project status unknown" (the consumer applies a config fallback), whereas fail means "project has problems" (the consumer dispatches a code fix). On their next invocation, the /gm-build and /gm-fixgap Resume Checks read this report; when its ts is newer than the last build/fixgap event in stage.jsonl and the overall result == "fail", they translate each failure into pending PLAN.md / GAP.md tasks before resuming. Producers that cannot fill the required operand for a non-escalate fallback MUST emit escalate instead; consumers that see a non-escalate fallback with a missing operand MUST degrade to escalate. This closes the retry loop where verify failures had no machine-readable channel to drive the next iteration.
  • gm-verify SKILL.md "Output Format" split into A. chat-readable report and B. machine-readable JSON, with the full schema documented inline. Permission section adds verify_report.json as a third write exception alongside current_role and stage.jsonl.
  • gm-build SKILL.md "Step 0 — Process Verify Feedback" — runs only when Resume Check flags a fresh verify_report.json; per-check translation rules cover compile errors, test failures, lint issues, format drift, static-check issues, and tooling-note fallbacks (config-only fixes, never code deletions).
  • gm-fixgap SKILL.md "Step 1b — Pull failures from verify_report.json" — same translation rules as gm-build. Verify-source tasks share the existing C / J severity prefixes with evaluation-source tasks but are listed first within each letter so the mechanical layer is fixed before product-layer fixes are dispatched. Per-task Source: verify_report.json | evaluation.json line records origin.
  • templates/GAP.md — adds optional Source Verify header section, per-task Source: line, and a Source column in the Task Status table.
  • Wiki — common-problems.md (EN + zh) adds "/gm-verify keeps failing on the same issues and /gm-build retries forever" diagnostic with the three failure modes (missing report, stale report, old SKILL.md deployed) and step-by-step fixes.
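
The Resume Check freshness rule described above (act on the report only when its ts is newer than the last build/fixgap event and the overall result is "fail") might look roughly like this. It is a sketch under stated assumptions, not the SKILL's actual logic: the function names are hypothetical, ts is assumed to be an ISO 8601 string, build/fixgap events are assumed to carry role values "build" and "fixgap", and a project with no prior build/fixgap event is assumed to count as fresh.

```python
import json
from datetime import datetime
from typing import Iterable, Optional


def last_event_ts(stage_lines: Iterable[str], roles=("build", "fixgap")) -> Optional[datetime]:
    """Return the newest timestamp among stage.jsonl events with the given roles."""
    latest = None
    for line in stage_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        event = json.loads(line)
        if event.get("role") in roles:
            ts = datetime.fromisoformat(event["ts"])  # assumed ISO 8601 format
            if latest is None or ts > latest:
                latest = ts
    return latest


def needs_verify_feedback(report: dict, stage_lines: Iterable[str]) -> bool:
    """True when the report is a FAIL and newer than the last build/fixgap event."""
    if report.get("result") != "fail":
        return False
    last = last_event_ts(stage_lines)
    # Assumption: with no earlier build/fixgap event, a failing report is fresh.
    return last is None or datetime.fromisoformat(report["ts"]) > last


stage = ['{"role": "build", "ts": "2026-05-07T10:00:00+00:00"}']
report = {"result": "fail", "ts": "2026-05-07T11:00:00+00:00"}
print(needs_verify_feedback(report, stage))  # True
```

A stale report (older than the last build/fixgap event) returns False, which matches the "stale report" failure mode listed in the wiki entry.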

Changed

Tag-iterative pipeline

  • /gm-gdd rewritten with initial vs subsequent mode (auto-detected by ROADMAP.md presence). Initial: full Socratic interview → derives ROADMAP → mandatory user confirmation gate → writes v0.1.0 docs. Subsequent: focuses on the earliest un-git-tagged ROADMAP entry, optionally updates GDD.md (old features marked (superseded by …) instead of deleted) and ROADMAP.md, and generates this tag's working docs with explicit refactor tasks for cross-tag changes.
  • /gm-finalize drops release packaging entirely (deferred to a future release skill). New responsibilities: archive working docs to docs/tags/<Tag>/, generate per-tag CHANGELOG, run git tag <Tag> locally (does NOT push), reset per-tag runtime state.
  • /gm-evaluate maintains a single e2e/ suite that always reflects the current game (adds tests for new Tag Mechanics, prunes tests for mechanics removed by an explicit Main Build refactor task); evaluation.json schema gains tag, tag_mechanics, inherited_mechanics, e2e_tests, and orphan_tests.
  • Per-tag scope discipline enforced across /gm-build, /gm-fixgap, /gm-asset, /gm-accept. Workers may touch files from previous tags only when PLAN.md has an explicit refactor task naming those files; "cleanup detours" are forbidden.
  • Decomposer agent rewritten to consume GDD + ROADMAP + prior tag archives + cross-tag refactor hints; overwrites per-tag root artifacts with **Tag:** headers; never modifies GDD/ROADMAP/archives; supports initial / subsequent modes. SCENES.md and STRUCTURE.md are end-of-tag snapshots (carry forward unchanged prior-tag entries, mark redesigned ones).
  • templates/game-claude.md rewritten with tag-iterative flow framing and the new doc-scope rules; notes gm-rescue's position outside the main flow.
  • config/stage_schemas.json: ROADMAP.md added to gdd outputs; finalize gains tag_archived programmatic check; new no-op rescue stage schema.
  • hooks/check_file_permissions.py PLANNING_DOCS includes roadmap.md so subagents (other than the decomposer) cannot mutate it. New RESCUE_ALLOWED_GM_FILES allow-list pins rescue's two carve-outs (stage.jsonl, current_role); rescue is otherwise read-only at the hook layer, mirroring its SKILL hard rule.
  • Wiki the-9-roles.md and glossary.md (EN + zh) updated for the tag model.

Verify-report feedback channel

  • config/stage_schemas.json verify entry now declares files: [".godotmaker/verify_report.json"]. The existing stage_reminder.py path validator automatically blocks the verify completion event from being appended to stage.jsonl when the report file is missing — same gate mechanism as evaluate already uses for evaluation.json.
  • hooks/check_file_permissions.py VERIFY_ALLOWED_GM_FILES adds .godotmaker/verify_report.json. The block message lists all three allowed paths.
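
The file-presence gate described above can be illustrated with a small check that reports which declared output files are missing. This is only a sketch of the idea, not the stage_reminder.py validator: the function name is hypothetical, and the declared paths are passed in as a plain list because the layout of stage_schemas.json is not shown here.

```python
from pathlib import Path
from typing import Iterable, List


def missing_declared_files(project_dir: str, declared: Iterable[str]) -> List[str]:
    """Return the declared relative paths that do not exist as files under project_dir."""
    root = Path(project_dir)
    return [rel for rel in declared if not (root / rel).is_file()]


# A gate in this style would refuse to append the verify completion event
# whenever the returned list is non-empty.
missing = missing_declared_files(".", [".godotmaker/verify_report.json"])
print("blocked" if missing else "ok", missing)
```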

Upgrade note (v0.2.x → v0.3.0)

The verify-report feedback channel only takes effect once the deployed SKILL.md files (under each project's .claude/skills/) are the v0.3.0 versions. The tools/publish.py upgrade flow normally handles this on a MINOR bump, but a project that was last published before v0.3.0 will keep running the old gm-build / gm-fixgap SKILLs (and the retry-loop bug the channel fixes) until you redeploy. To force a clean SKILL refresh:

python tools/publish.py --force <project_dir>

--force overwrites .claude/skills/, hooks, stage_schemas.json, and templates; it leaves your project state (.godotmaker/stage.jsonl, evaluation.json, PLAN.md, GAP.md, etc.) untouched.

For the tag-pipeline side, no --force is needed: the migration runs automatically on the first publish and is idempotent. It relocates pre-existing root docs into docs/tags/v0.1.0/ and injects **Tag:** v0.1.0 into root PLAN.md, so that post-migration /gm-evaluate and /gm-finalize can read the current tag without first re-running /gm-gdd. The verify-report side is fully additive on the state side and needs no migration. gm-fixgap's Resume Check has a per-row backward-compat clause for GAP.md files written under the v0.2.x format.
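
As an illustration of why the injected header matters, a reader of PLAN.md could recover the current tag from the **Tag:** vX.Y.Z line with a few lines of Python. This is a minimal sketch: the function name and the exact regular expression are assumptions, and the real skills may locate the header differently.

```python
import re
from typing import Optional

# Matches a line of the form "**Tag:** v0.1.0" (SemVer-style vX.Y.Z only).
TAG_HEADER = re.compile(r"^\*\*Tag:\*\*\s*(v\d+\.\d+\.\d+)\s*$", re.MULTILINE)


def read_tag_header(plan_text: str) -> Optional[str]:
    """Return the tag named in the first **Tag:** header, or None if absent."""
    match = TAG_HEADER.search(plan_text)
    return match.group(1) if match else None


print(read_tag_header("# Plan\n\n**Tag:** v0.1.0\n\n## Tasks\n"))  # v0.1.0
```

A PLAN.md written under v0.2.x has no such header, so this reader would return None until the migration has run.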

Protocol guarantee (downstream-facing)

  • /gm-verify MUST produce .godotmaker/verify_report.json every run. Schema is documented in gm-verify/SKILL.md Output Format Section B; downstream consumers may rely on top-level keys (result, ts, checks, tooling_notes) and per-check shapes.
  • Open string discriminators (may gain new values in future releases; consumers MUST tolerate unknown values and never crash):
      • checks.static_check.issues[].check: fall back to using the raw value verbatim and treating the issue as a generic project-code fix.
      • tooling_notes[].suggested_fallback: fall back to treating it as "escalate" (do NOT auto-fix; surface the tool + error + crashed_on fields to the user and halt the current build/fixgap cycle) and recording the raw value verbatim. The "escalate" value is itself a shipped discriminator for non-lint tool crashes (tool == "godot", full-run dumps) where no in-place config edit can route around it.
      • Each non-escalate fallback ships a required operand field (exclude_file → crashed_on, scope_narrow → narrowed_command, add_gdlintrc_rule → rule_name, skip_check → check_name). A note where the operand is missing/null/empty MUST be treated as escalate by the consumer; producers MUST emit escalate rather than emitting a routable fallback they cannot operationalize.
  • Closed enums (changing requires a coordinated SKILL.md update across producer and consumers):
      • top-level result: pass | fail.
      • per-check result: pass | warn | fail | error (warn is lint-only).
  • Producer invariants (always true in any report /gm-verify writes; consumers may rely on these without further validation):
      • result: "pass" ⇔ every checks.*.result ∈ {pass, warn} AND tooling_notes == []. A non-empty tooling_notes array implies at least one checks.*.result == "error", which forces result: "fail", so neither condition can coexist with PASS.
      • Every checks.<name> carries its own required arrays even when empty: build.errors, unit_tests.failures, lint.issues, static_check.issues. unit_tests additionally carries integer passed and failed counts. Consumers may iterate these arrays without presence checks.
      • Every non-escalate tooling_notes[*].suggested_fallback carries a non-empty operand field per the operand mapping above; producers must emit escalate instead of leaving an operand unfilled.
  • stage.jsonl's existing contract is preserved — PASS still appends {"role": "verify", "ts": ...}. Existing harnesses that judge verify outcome by line-count delta and last-event role need no change.
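
The consumer-side degradation rule and the PASS invariant above can be sketched in Python. This is an illustrative sketch rather than the shipped consumer logic: the function names are hypothetical, the operand mapping is copied from the list above, and any falsy operand (None, empty string, empty list) is assumed to count as "missing/null/empty".

```python
from typing import Any, Dict

# Fallback discriminator -> required operand field, per the list above.
OPERAND_FIELD = {
    "exclude_file": "crashed_on",
    "scope_narrow": "narrowed_command",
    "add_gdlintrc_rule": "rule_name",
    "skip_check": "check_name",
}


def effective_fallback(note: Dict[str, Any]) -> str:
    """Return the fallback a consumer should act on, degrading to "escalate"."""
    fallback = note.get("suggested_fallback")
    if not isinstance(fallback, str) or fallback not in OPERAND_FIELD:
        return "escalate"  # "escalate" itself, or an unknown future value
    if not note.get(OPERAND_FIELD[fallback]):
        return "escalate"  # operand missing, null, or empty
    return fallback


def pass_invariant_holds(report: Dict[str, Any]) -> bool:
    """Check: result == "pass" iff all checks are pass/warn and tooling_notes is empty."""
    all_ok = all(c.get("result") in ("pass", "warn") for c in report["checks"].values())
    should_pass = all_ok and report["tooling_notes"] == []
    return (report["result"] == "pass") == should_pass


print(effective_fallback({"suggested_fallback": "scope_narrow", "narrowed_command": ""}))  # escalate
```

Because consumers may rely on the producer invariants without validation, a check like pass_invariant_holds is more likely to be useful in producer-side tests than in consumer code.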