v0.3.0 - 2026-05-07
MINOR bump combining two co-landing changes: the tag-iterative pipeline (each /gm-gdd → /gm-finalize round ships ONE SemVer tag) and the verify_report.json feedback channel (closes the retry loop where verify failures had no machine-readable signal to drive the next iteration).
Existing projects upgrade via migrations/20260507120000_introduce_tag_based_pipeline.py, which moves the pre-existing GDD/PLAN/STRUCTURE/SCENES/ASSETS into docs/tags/v0.1.0/ and writes a stub ROADMAP.md. The migration is idempotent and runs automatically on the first publish after upgrade. The verify-report side is fully additive on the state side and ships no migration of its own — see "Upgrade note" below for the SKILL redeploy step.
Added
Tag-iterative pipeline
- Tag-iterative pipeline (ROADMAP.md + docs/tags/<Tag>/ archives + git tag <Tag> per round). The earliest entry in ROADMAP.md without a corresponding git tag is the current tag; per-tag root docs (PLAN.md, STRUCTURE.md, SCENES.md) are scoped to it, while GDD.md, ROADMAP.md, MEMORY.md, and ASSETS.md accumulate across tags (ASSETS rows carry a Tag column marking the introducing tag).
- Playable-closed-loop hard gate in /gm-evaluate: every tag must boot godot --headless --quit, run at least one core mechanic E2E, and have at least one of {death, win, exit} reachable. The single e2e/ suite runs every still-supported mechanic on every evaluation; any failing inherited mechanic blocks approval.
- /gm-rescue diagnostic skill (skills/core/gm-rescue/), outside the main pipeline. Reads runtime artifacts, walks godotmaker layers (hooks → SKILL.md → config → templates → shared refs → tools), and determines whether a framework defect is the cause. Outputs to chat ONLY (no file writes, no code changes) and drafts a GitHub issue text that the user reviews and posts upstream. Privacy default: drafts exclude absolute project paths, project source code, and GDD content.
- Migration script (migrations/20260507120000_introduce_tag_based_pipeline.py) for upgrading existing projects to the new layout. Idempotent; does NOT auto-git tag.
- templates/ROADMAP.md template with SemVer convention header. PLAN.md / STRUCTURE.md / SCENES.md templates gain **Tag:** vX.Y.Z headers (ASSETS.md and MEMORY.md stay cross-tag: ASSETS gains a Tag column, MEMORY is snapshotted). PLAN.md adds Tag Mechanics and Inherited Mechanics sections; /gm-evaluate reads them to maintain the single e2e/ suite.
- hooks/stage_reminder.py check_tag_archived programmatic check; hooks/metrics/ get_current_tag() helper; SessionStart banner now surfaces the current tag (or "no current tag — run /gm-gdd to start one").
- 38 new tests covering migration / hooks / metrics / gm-rescue structural contract.
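The "current tag" rule above (earliest ROADMAP.md entry with no corresponding git tag) can be sketched as follows. This is a minimal illustration, not the shipped get_current_tag() helper: the function name resolve_current_tag and its inputs (an already-parsed, earliest-first list of roadmap tags and a collection of existing git tag names) are assumptions made for the example.

```python
from typing import Iterable, Optional, Sequence


def resolve_current_tag(roadmap_tags: Sequence[str],
                        git_tags: Iterable[str]) -> Optional[str]:
    """Return the earliest roadmap entry that has no matching git tag.

    Hypothetical helper: roadmap_tags is assumed to be in roadmap order
    (earliest first). Returns None when every entry is already tagged.
    """
    tagged = set(git_tags)
    for tag in roadmap_tags:
        if tag not in tagged:
            return tag
    return None


roadmap = ["v0.1.0", "v0.2.0", "v0.3.0"]
print(resolve_current_tag(roadmap, ["v0.1.0"]))  # v0.2.0
print(resolve_current_tag(roadmap, roadmap))     # None
```

A None result corresponds to the "no current tag" banner case, where the next step is to run /gm-gdd.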
Verify-report feedback channel
- .godotmaker/verify_report.json: structured feedback channel from /gm-verify to /gm-build and /gm-fixgap. /gm-verify now writes this file on every run (PASS or FAIL) with per-check results: checks.build.errors[], checks.unit_tests.failures[], checks.lint.{issues, format_drift}, checks.static_check.issues[], plus tooling_notes[] for verification-tool crashes (gdlint / gdformat / godot etc.). Each tooling note carries a suggested_fallback discriminator (exclude_file / scope_narrow / add_gdlintrc_rule / skip_check / escalate) and a matching structured operand (crashed_on / narrowed_command / rule_name / check_name) so consumers can apply the fix deterministically without parsing free-text error strings. Per-check result is a 4-value enum pass | warn | fail | error, where error means "tool crashed, project status unknown" (consumer applies a config fallback) and fail means "project has problems" (consumer dispatches a code fix). On their next invocation, the /gm-build and /gm-fixgap Resume Checks read this report; when its ts is newer than the last build/fixgap event in stage.jsonl and the overall result == "fail", they translate each failure into pending PLAN.md / GAP.md tasks before resuming. Producers that cannot fill the required operand for a non-escalate fallback MUST emit escalate instead; consumers that see a non-escalate fallback with a missing operand MUST degrade to escalate. This closes the retry loop where verify failures had no machine-readable channel to drive the next iteration.
- gm-verify SKILL.md "Output Format" split into A. chat-readable report and B. machine-readable JSON, with the full schema documented inline. Permission section adds verify_report.json as a third write exception alongside current_role and stage.jsonl.
- gm-build SKILL.md "Step 0 — Process Verify Feedback": runs only when Resume Check flags a fresh verify_report.json; per-check translation rules cover compile errors, test failures, lint issues, format drift, static-check issues, and tooling-note fallbacks (config-only fixes, never code deletions).
- gm-fixgap SKILL.md "Step 1b — Pull failures from verify_report.json": same translation rules as gm-build. Verify-source tasks share the existing C/J severity prefixes with evaluation-source tasks but are listed first within each letter so the mechanical layer is fixed before product-layer fixes are dispatched. Per-task Source: verify_report.json | evaluation.json line records origin.
- templates/GAP.md: adds optional Source Verify header section, per-task Source: line, and a Source column in the Task Status table.
- Wiki: common-problems.md (EN + zh) adds "/gm-verify keeps failing on the same issues and /gm-build retries forever" diagnostic with the three failure modes (missing report, stale report, old SKILL.md deployed) and step-by-step fixes.
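The Resume Check freshness rule described above (process the report only when its ts is newer than the last build/fixgap event and the overall result is "fail") might look like the sketch below. The function names are hypothetical, ts is assumed to be a numeric timestamp (the actual format lives in the gm-verify schema), and treating "no prior build/fixgap event" as fresh is an assumption of this example.

```python
import json
from typing import Any, Dict, Iterable, List


def parse_stage_events(jsonl_text: str) -> List[Dict[str, Any]]:
    """Parse stage.jsonl content: one JSON object per non-empty line."""
    return [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]


def report_needs_processing(report: Dict[str, Any],
                            events: Iterable[Dict[str, Any]],
                            roles: Iterable[str] = ("build", "fixgap")) -> bool:
    """True when the report failed and postdates the last matching event.

    Assumption: `ts` values are numeric and directly comparable.
    """
    if report.get("result") != "fail":
        return False
    wanted = set(roles)
    stamps = [e["ts"] for e in events if e.get("role") in wanted]
    # Assumption: with no earlier build/fixgap event, the report is fresh.
    return not stamps or report["ts"] > max(stamps)


stage_text = '{"role": "build", "ts": 100}\n{"role": "verify", "ts": 150}\n'
events = parse_stage_events(stage_text)

fresh_fail = {"result": "fail", "ts": 200, "checks": {}, "tooling_notes": []}
stale_fail = {"result": "fail", "ts": 50, "checks": {}, "tooling_notes": []}

print(report_needs_processing(fresh_fail, events))  # True
print(report_needs_processing(stale_fail, events))  # False
```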
Changed
Tag-iterative pipeline
- /gm-gdd rewritten with initial vs subsequent mode (auto-detected by ROADMAP.md presence). Initial: full Socratic interview → derives ROADMAP → mandatory user confirmation gate → writes v0.1.0 docs. Subsequent: focuses on the earliest un-git-tagged ROADMAP entry, optionally updates GDD.md (old features marked (superseded by …) instead of deleted) and ROADMAP.md, and generates this tag's working docs with explicit refactor tasks for cross-tag changes.
- /gm-finalize drops release packaging entirely (deferred to a future release skill). New responsibilities: archive working docs to docs/tags/<Tag>/, generate per-tag CHANGELOG, run git tag <Tag> locally (does NOT push), reset per-tag runtime state.
- /gm-evaluate maintains a single e2e/ suite that always reflects the current game (adds tests for new Tag Mechanics, prunes tests for mechanics removed by an explicit Main Build refactor task); evaluation.json schema gains tag, tag_mechanics, inherited_mechanics, e2e_tests, and orphan_tests.
- Per-tag scope discipline enforced across /gm-build, /gm-fixgap, /gm-asset, /gm-accept. Workers may touch files from previous tags only when PLAN.md has an explicit refactor task naming those files; "cleanup detours" are forbidden.
- Decomposer agent rewritten to consume GDD + ROADMAP + prior tag archives + cross-tag refactor hints; overwrites per-tag root artifacts with **Tag:** headers; never modifies GDD/ROADMAP/archives; supports initial / subsequent modes. SCENES.md and STRUCTURE.md are end-of-tag snapshots (carry forward unchanged prior-tag entries, mark redesigned ones).
- templates/game-claude.md rewritten with tag-iterative flow framing and the new doc-scope rules; notes gm-rescue's position outside the main flow.
- config/stage_schemas.json: ROADMAP.md added to gdd outputs; finalize gains tag_archived programmatic check; new no-op rescue stage schema.
- hooks/check_file_permissions.py PLANNING_DOCS includes roadmap.md so subagents (other than the decomposer) cannot mutate it. New RESCUE_ALLOWED_GM_FILES allow-list pins rescue's two carve-outs (stage.jsonl, current_role); rescue is otherwise read-only at the hook layer, mirroring its SKILL hard rule.
- Wiki the-9-roles.md and glossary.md (EN + zh) updated for the tag model.
Verify-report feedback channel
- config/stage_schemas.json verify entry now declares files: [".godotmaker/verify_report.json"]. The existing stage_reminder.py path validator automatically blocks the verify completion event from being appended to stage.jsonl when the report file is missing, the same gate mechanism that evaluate already uses for evaluation.json.
- hooks/check_file_permissions.py: VERIFY_ALLOWED_GM_FILES adds .godotmaker/verify_report.json. The block message lists all three allowed paths.
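The file gate described above can be pictured as a check that every path a stage declares under files exists before its completion event is accepted. The sketch below is illustrative only: missing_stage_files is a hypothetical name, and only the files key of the verify entry is taken from these notes; the rest of the stage_schemas.json layout and the real stage_reminder.py logic are not shown here.

```python
import tempfile
from pathlib import Path
from typing import Any, Dict, List


def missing_stage_files(schema_entry: Dict[str, Any], project_root: Path) -> List[str]:
    """Return declared files that do not exist under project_root.

    Hypothetical gate: a non-empty result would block the stage's
    completion event from being appended to stage.jsonl.
    """
    declared = schema_entry.get("files", [])
    return [rel for rel in declared if not (project_root / rel).is_file()]


verify_entry = {"files": [".godotmaker/verify_report.json"]}

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = missing_stage_files(verify_entry, root)

    # Simulate /gm-verify writing its report, then re-check the gate.
    report = root / ".godotmaker" / "verify_report.json"
    report.parent.mkdir(parents=True)
    report.write_text("{}", encoding="utf-8")
    after = missing_stage_files(verify_entry, root)

print(before)  # ['.godotmaker/verify_report.json']
print(after)   # []
```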
Upgrade note (v0.2.x → v0.3.0)
The verify-report feedback channel only takes effect once the deployed SKILL.md files (under each project's .claude/skills/) are the v0.3.0 versions. The tools/publish.py upgrade flow normally handles this on a MINOR bump, but a project that was last published before v0.3.0 will keep running the old gm-build / gm-fixgap SKILLs (and the retry-loop bug the channel fixes) until you redeploy. To force a clean SKILL refresh, re-run the tools/publish.py upgrade flow with --force.
--force overwrites .claude/skills/, hooks, stage_schemas.json, and templates; it leaves your project state (.godotmaker/stage.jsonl, evaluation.json, PLAN.md, GAP.md, etc.) untouched.
For the tag-pipeline side, no --force is needed: the migration runs automatically on the first publish and is idempotent. It relocates pre-existing root docs into docs/tags/v0.1.0/ and injects **Tag:** v0.1.0 into root PLAN.md so that post-migration /gm-evaluate and /gm-finalize can read the current tag without first re-running /gm-gdd. On the state side, the verify-report channel is fully additive and needs no migration. gm-fixgap's Resume Check has a per-row backward-compat clause for GAP.md files written under the v0.2.x format.
Protocol guarantee (downstream-facing)
- /gm-verify MUST produce .godotmaker/verify_report.json on every run. The schema is documented in gm-verify/SKILL.md Output Format Section B; downstream consumers may rely on top-level keys (result, ts, checks, tooling_notes) and per-check shapes.
- Open string discriminators (may gain new values in future releases; consumers MUST tolerate unknown values and never crash):
  - checks.static_check.issues[].check: on an unknown value, fall back to using the raw value verbatim and treating the issue as a generic project-code fix.
  - tooling_notes[].suggested_fallback: on an unknown value, fall back to treating it as "escalate" (do NOT auto-fix; surface the tool + error + crashed_on fields to the user and halt the current build/fixgap cycle) and record the raw value verbatim. The "escalate" value is itself a shipped discriminator for non-lint tool crashes (tool == "godot", full-run dumps) where no in-place config edit can route around the crash.
  - Each non-escalate fallback ships a required operand field (exclude_file → crashed_on, scope_narrow → narrowed_command, add_gdlintrc_rule → rule_name, skip_check → check_name). A note whose operand is missing/null/empty MUST be treated as escalate by the consumer; producers MUST emit escalate rather than a routable fallback they cannot operationalize.
- Closed enums (changing them requires a coordinated SKILL.md update across producer and consumers):
  - top-level result: pass | fail.
  - per-check result: pass | warn | fail | error (warn is lint-only).
- Producer invariants (always true in any report /gm-verify writes; consumers may rely on these without further validation):
  - result: "pass" ⇔ every checks.*.result ∈ {pass, warn} AND tooling_notes == []. A non-empty tooling_notes array implies at least one checks.*.result == "error", which forces result: "fail"; neither condition can coexist with PASS.
  - Every checks.<name> carries its own required arrays even when empty: build.errors, unit_tests.failures, lint.issues, static_check.issues. unit_tests additionally carries integer passed and failed counts. Consumers may iterate these arrays without presence checks.
  - Every non-escalate tooling_notes[*].suggested_fallback carries a non-empty operand field per the fallback-to-operand mapping above; producers must emit escalate instead of leaving an operand unfilled.
- stage.jsonl's existing contract is preserved: PASS still appends {"role": "verify", "ts": ...}. Existing harnesses that judge verify outcome by line-count delta and last-event role need no change.
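The consumer-side degrade rule and the producer invariants above can be sketched together. This is a minimal illustration under assumptions: the names effective_fallback and invariant_violations are hypothetical, the sample report carries only the fields named in these notes, and the invariant checker is intended as a test aid for producers (consumers are told they may rely on the invariants without validating).

```python
from typing import Any, Dict, List

# Fallback-to-operand mapping as listed in the protocol guarantee.
REQUIRED_OPERAND = {
    "exclude_file": "crashed_on",
    "scope_narrow": "narrowed_command",
    "add_gdlintrc_rule": "rule_name",
    "skip_check": "check_name",
}
# Required array per check, as listed in the producer invariants.
REQUIRED_ARRAY = {"build": "errors", "unit_tests": "failures",
                  "lint": "issues", "static_check": "issues"}


def effective_fallback(note: Dict[str, Any]) -> str:
    """Resolve the fallback a consumer should act on for one tooling note."""
    fallback = note.get("suggested_fallback")
    if not isinstance(fallback, str):
        return "escalate"
    operand_field = REQUIRED_OPERAND.get(fallback)
    if operand_field is None:
        return "escalate"  # "escalate" itself, or an unknown future value
    if not note.get(operand_field):
        return "escalate"  # operand missing, null, or empty
    return fallback


def invariant_violations(report: Dict[str, Any]) -> List[str]:
    """List producer-invariant violations found in a report (empty if none)."""
    problems: List[str] = []
    checks = report.get("checks", {})
    notes = report.get("tooling_notes", [])
    results = [check.get("result") for check in checks.values()]

    all_clean = all(r in ("pass", "warn") for r in results) and notes == []
    if (report.get("result") == "pass") != all_clean:
        problems.append("top-level result disagrees with checks/tooling_notes")
    if notes and "error" not in results:
        problems.append("tooling_notes present but no check has result 'error'")
    for name, check in checks.items():
        array = REQUIRED_ARRAY.get(name)
        if array and not isinstance(check.get(array), list):
            problems.append(f"checks.{name}.{array} must be a list")
    for note in notes:
        raw = note.get("suggested_fallback")
        if raw in REQUIRED_OPERAND and effective_fallback(note) == "escalate":
            problems.append(f"{raw} note is missing its operand")
    return problems


note = {"tool": "gdlint", "suggested_fallback": "exclude_file", "crashed_on": ""}
print(effective_fallback(note))  # escalate
```

Routing unknown values and unfilled operands through the same escalate path keeps the consumer to a single halt-and-surface branch, which matches the "tolerate unknown values, never crash" requirement.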