Migrations
This page shows how to migrate an existing signac workspace when an action's state point schema changes.
In grubicy, downstream jobs store the upstream job id in their own state point (via
deps.sp_key, default parent_action). When you change an upstream action's state
point, job ids can change, so those downstream pointers must be rewritten.
grubicy migrations are two-phase:
migrate-plan: compute old -> new ids/state points and detect collisions.migrate-apply: mutate state points (and copy workspaces if needed), then cascade pointer rewrites downstream, with progress logging for resume.
Worked example: add a defaulted key to an upstream action
Scenario: you want to add a new key b to the s1 action, defaulting to 0 for all
existing s1 jobs.
1) Before: pipeline spec
Your pipeline.toml might look like:
[[actions]]
name = "s1"
sp_keys = ["p1"]
[[experiment]]
[experiment.s1]
p1 = 1
2) Create a plan
From the signac project directory:
grubicy migrate-plan pipeline.toml s1 --setdefault b=0
This writes a plan under .pipeline_migrations/ (for example
.pipeline_migrations/plan_s1_YYYYmmddTHHMMSS.json).
If your config is TOML, migrate-plan also updates pipeline.toml to include the
new key in:
- the action's
sp_keys - each experiment's
[experiment.s1]block (only if it already exists) when the key is missing
If your config is not TOML, you must update it manually.
3) Inspect the plan (optional)
You can view the plan by running migrate-apply in --dry-run mode:
grubicy migrate-apply pipeline.toml s1 --dry-run
--dry-run prints the plan JSON (it does not change the workspace).
To use a specific plan file (instead of the latest one), pass --plan:
grubicy migrate-apply pipeline.toml s1 --plan .pipeline_migrations/plan_s1_YYYYmmddTHHMMSS.json --dry-run
4) Apply the migration
grubicy migrate-apply pipeline.toml s1
What happens during apply:
s1jobs get updated state points (here:bis added if missing).- If an
s1job id changes, its workspace directory is copied to the new job id. - All downstream actions are scanned in topological order; any parent pointer state point keys that reference an old id are rewritten to the new id.
- Progress is written under
.pipeline_migrations/run_s1_YYYYmmddTHHMMSS/progress.jsonso you can resume after interruptions.
After applying, it is usually a good idea to regenerate your row workflow:
grubicy render-row pipeline.toml --output workflow.toml
Collisions
With the currently supported CLI migration transform (--setdefault, which only adds
new keys and does not overwrite existing values), collisions are not expected: adding
a key preserves the existing distinctions between jobs.
grubicy still performs collision detection as a safety check, but if you are only
using migrate-plan ... --setdefault ..., you should treat collisions as an
unexpected sign that the workspace already contains inconsistent/duplicate state
points for that action.
Resume and locking
migrate-apply is restartable by default. If interrupted, rerun the same command and
it will resume from the progress log. Use --no-resume to force a fresh run.
To prevent concurrent migrations, apply uses a lock file .pipeline_lock in the
project directory. If you get a lock error, ensure no other migration is running. If
the lock is stale (e.g. after a crash), remove .pipeline_lock and rerun.
Tips
- Prefer migrating upstream actions first (roots), then downstream, to keep pointer rewrites simple.
- Run
grubicy status pipeline.tomlafter a migration to see which jobs are missing declared products. - Keep
.pipeline_migrations/in your project directory; it contains the plan and the apply progress logs used for audit and resume.