shore balance
Plan splits to bring a multi-block mesh's MPI load imbalance below tolerance. Reads a .grd (and optionally an .adjacency.json), iteratively schedules cuts on the heaviest block, and emits a split.toml in the shore split schema.
shore balance MESH.grd -o splits.toml --np N \
[--tolerance 0.05] [--max-iterations 200] \
[--adjacency MESH.adjacency.json]
Why this exists
shore proc-input's greedy weight-balanced assignment is optimal for the block list it is given. When the imbalance still exceeds the user's tolerance, the only way to do better is to subdivide the heaviest block(s). shore balance plans that subdivision: it iteratively cuts the heaviest block at its cell-count midpoint until the predicted post-split imbalance drops below the target.
The result is a plan, not an executed split. The user reviews the plan, runs shore split to apply it, and re-runs shore proc-input on the now-larger block list.
Workflow
# 1. Plan the splits
shore balance wall.grd -o splits.toml --np 16 --adjacency wall.adjacency.json
# 2. Apply the plan
shore split -c splits.toml wall_*.geo
# 3. Repack the now-larger block list
shore grd wall_split_*.geo -o wall.grd
# 4. Final balance
shore proc-input wall.grd wall.proc.input --np 16
The plan-then-execute split is preferred over a single "do everything" command because each step is reviewable on its own. You can cat splits.toml before mutating the mesh, run shore info or shore check between steps, or edit the plan by hand.
Arguments and options
| Argument / Option | Default | Description |
|---|---|---|
| GRD | required | Path to the input .grd (block ordering and weights). |
| -o / --output | required | Output split.toml path. |
| --np INTEGER | required | Number of MPI ranks the plan targets (>= 1). |
| --tolerance FLOAT | 0.05 | Target fractional imbalance, (max - min) / mean of the per-rank load. Default 5%. |
| --max-iterations INTEGER | 200 | Cap on planning iterations. Each iteration adds one cut, possibly with paired co-splits across SHARED / DIRICHLET seams. Raise for high --np runs on small block counts. |
| --adjacency PATH | none | Optional path to a .adjacency.json sidecar. Strongly recommended: without it the planner uses longest-axis-first with no seam awareness, and the resulting plan may be rejected by shore split on a topology with k-coupled SHARED seams. |
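The --tolerance value is compared against the (max - min) / mean metric. A minimal sketch of that metric follows, assuming it is evaluated over per-rank loads. The function name and the sample loads are illustrative only.

```python
def imbalance(rank_loads):
    """Fractional imbalance (max - min) / mean over per-rank loads."""
    mean = sum(rank_loads) / len(rank_loads)
    return (max(rank_loads) - min(rank_loads)) / mean


# Hypothetical per-rank cell counts for a 4-rank run.
loads = [120, 100, 95, 85]
value = imbalance(loads)
print(f"imbalance = {value:.1%}, within 5% tolerance: {value <= 0.05}")
# prints: imbalance = 35.0%, within 5% tolerance: False
```

Under this definition a rank that receives no block has load 0, so the metric is at least 100% whenever there are fewer blocks than ranks.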
Algorithm
1. Read .grd metadata: per-block (ni, nj, nk) cell counts. Initial weight = ni * nj * nk.
2. Greedy weight-balanced rank assignment across --np ranks.
3. While imbalance > tolerance and iterations < --max-iterations:
   a. Pick the heaviest block b.
   b. Pick a topology-safe split axis for b. Two-level priority: first, axes free of perpendicular SHARED / DIRICHLET partners (no co-splits needed); second, longest extent. This means k is preferred on body-fitted topologies because k_lo / k_hi are typically WALL / FREE.
   c. One axis per original block. shore split v1 supports only one axis per block per call. Once an axis is chosen for a block, later iterations can add more cuts on that same axis but cannot switch.
   d. Cut at the cell-count midpoint along the chosen axis. If the axis has perpendicular SHARED / DIRICHLET partners, schedule paired co-splits on each partner at the matching index.
   e. Re-balance, recompute imbalance.
4. Group cuts on the same (label, axis) into one [[splits]] TOML entry with a sorted at = [...] list.
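The steps above can be sketched in a few dozen lines. This is a simplified illustration, not the shore.balance implementation. It assumes no adjacency information (so the axis is simply the longest extent), a heaviest-first greedy assignment, and that each at value is the cell index where the upper piece starts. All names (Piece, plan_cuts, render) are hypothetical.

```python
from dataclasses import dataclass

AXES = ("i", "j", "k")


@dataclass
class Piece:
    label: str   # label of the original block
    lo: int      # first cell index along the block's split axis
    hi: int      # one past the last cell index
    cross: int   # cells in one layer perpendicular to the split axis

    @property
    def weight(self) -> int:
        return (self.hi - self.lo) * self.cross


def greedy_loads(weights, np_ranks):
    """Assign heaviest-first, each weight to the currently lightest rank."""
    loads = [0] * np_ranks
    for w in sorted(weights, reverse=True):
        loads[loads.index(min(loads))] += w
    return loads


def imbalance(loads):
    return (max(loads) - min(loads)) / (sum(loads) / len(loads))


def plan_cuts(blocks, np_ranks, tolerance=0.05, max_iterations=200):
    """blocks maps label -> (ni, nj, nk). Returns (grouped cuts, final imbalance)."""
    axis_of, pieces = {}, []
    for label, dims in blocks.items():
        a = max(range(3), key=lambda d: dims[d])  # longest extent, no seam checks
        axis_of[label] = AXES[a]  # fixed for the block: one axis per original block
        pieces.append(Piece(label, 0, dims[a], dims[0] * dims[1] * dims[2] // dims[a]))

    cuts = {}
    for _ in range(max_iterations):
        if imbalance(greedy_loads([p.weight for p in pieces], np_ranks)) <= tolerance:
            break
        heaviest = max(pieces, key=lambda p: p.weight)
        if heaviest.hi - heaviest.lo < 2:
            break  # nothing left to cut on this axis; keep the best plan so far
        mid = (heaviest.lo + heaviest.hi) // 2  # cell-count midpoint
        pieces.remove(heaviest)
        pieces.append(Piece(heaviest.label, heaviest.lo, mid, heaviest.cross))
        pieces.append(Piece(heaviest.label, mid, heaviest.hi, heaviest.cross))
        cuts.setdefault((heaviest.label, axis_of[heaviest.label]), []).append(mid)

    final = imbalance(greedy_loads([p.weight for p in pieces], np_ranks))
    return {key: sorted(at) for key, at in cuts.items()}, final


def render(cuts):
    """Emit one [[splits]] entry per (label, axis) group."""
    lines = ["version = 1", 'kind = "split"']
    for (label, axis), at in sorted(cuts.items()):
        lines += ["", "[[splits]]", f'label = "{label}"', f'axis = "{axis}"', f"at = {at}"]
    return "\n".join(lines)


cuts, final = plan_cuts({"big": (4, 4, 32)}, np_ranks=4)
print(render(cuts))
```

For the single 4 x 4 x 32 block on 4 ranks, the loop schedules cuts at 16, 8, and 24 in that order. Grouping sorts them into one entry with at = [8, 16, 24], and the final imbalance is 0.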
Output schema
The TOML follows shore split's kind = "split" schema. A typical 4-rank cubed-sphere plan:
version = 1
kind = "split"
# Generated by shore balance: target tolerance 5.0%, final imbalance 4.8% after 6 iteration(s).
[[splits]]
label = "sub0"
axis = "k"
at = [9]
[[splits]]
label = "sub1"
axis = "k"
at = [9]
[[splits]]
label = "sub2"
axis = "k"
at = [9]
[[splits]]
label = "sub3"
axis = "k"
at = [9]
[[splits]]
label = "cap_north"
axis = "k"
at = [9]
[[splits]]
label = "cap_south"
axis = "k"
at = [9]
Topology awareness
Without --adjacency, the planner trusts the user — every axis with extent ≥ 2 is a valid candidate. This is fine on single-block meshes and on multi-block meshes where the user confirmed by inspection that the chosen axis is safe. It is not safe by default on the cubed sphere: cutting along i without a paired cut on the cap-equator partner produces a plan shore split will reject.
With --adjacency, the planner walks the seam graph: for every axis it considers, it checks the perpendicular faces and (a) refuses PERIODIC seams (the j-ring on cubed-sphere sub-blocks), (b) emits paired co-splits for SHARED / DIRICHLET partners. The output is then guaranteed to pass shore split's validation.
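The axis choice described above can be sketched as follows. This is an illustration under stated assumptions, not the planner's code. It reads the "perpendicular faces" of an axis as that axis's lo / hi faces, which matches the remark that k is preferred because k_lo / k_hi are typically WALL / FREE. The faces dictionary is a hypothetical stand-in for whatever the .adjacency.json sidecar provides, and faces missing from it are assumed FREE.

```python
AXES = ("i", "j", "k")
COUPLED = {"SHARED", "DIRICHLET"}  # seam kinds that require paired co-splits


def choose_axis(dims, faces):
    """Pick a split axis for one block.

    dims:  (ni, nj, nk) cell counts.
    faces: hypothetical mapping such as {"i_lo": "SHARED", "k_hi": "FREE"}.
    Returns (axis, needs_cosplit), or None when no axis is usable.
    """
    candidates = []
    for idx, axis in enumerate(AXES):
        if dims[idx] < 2:
            continue  # a single cell cannot be cut
        kinds = {faces.get(f"{axis}_lo", "FREE"), faces.get(f"{axis}_hi", "FREE")}
        if "PERIODIC" in kinds:
            continue  # PERIODIC seams are refused outright
        needs_cosplit = bool(kinds & COUPLED)
        # Sort key: co-split-free axes first, then longest extent.
        candidates.append((needs_cosplit, -dims[idx], axis))
    if not candidates:
        return None
    needs_cosplit, _, axis = min(candidates)
    return axis, needs_cosplit


# Hypothetical cubed-sphere sub-block: SHARED i faces, PERIODIC j-ring, WALL / FREE k faces.
sub_block = {
    "i_lo": "SHARED", "i_hi": "SHARED",
    "j_lo": "PERIODIC", "j_hi": "PERIODIC",
    "k_lo": "WALL", "k_hi": "FREE",
}
print(choose_axis((64, 64, 20), sub_block))
```

In this example k is selected even though it is the shortest axis, because it is the only candidate that needs no co-split. If nk were 1, the function would fall back to i and flag that co-splits are required.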
Convergence and capping
When the planner can't reach the tolerance — typically because the heaviest block has too few cells along the only safe axis — it returns the best-found plan and emits a warning:
shore balance: initial imbalance 671.0% → final 8.2% after 200 iteration(s) (200 cut(s) across 6 block(s))
Warning: planner did not reach the tolerance (8.2% > 5.0%). Increase --max-iterations or accept the best-found plan.
Common reasons:
- Too few cells on the safe axis. For body-fitted meshes the planner prefers k. If nk is small (say 10 cells) the plan caps at 9 cuts per block. Increase the original mesh's nk.
- Too many ranks for the block budget. Distributing 6 blocks across 256 ranks needs each block subdivided into ~43 chunks, which is feasible only with very large blocks.
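The two cases above reduce to simple counting. The sketch below assumes equally sized blocks that are all cut along a single axis. The helper names are illustrative. Having at least one piece per rank is only a necessary condition: an empty rank has load 0, which forces the imbalance to 100% or more.

```python
import math


def cuts_available(n_cells: int) -> int:
    """An axis with n cells admits at most n - 1 cuts (n one-cell pieces)."""
    return max(n_cells - 1, 0)


def pieces_needed_per_block(n_blocks: int, np_ranks: int) -> int:
    """Pieces each block must supply so that every rank gets at least one."""
    return math.ceil(np_ranks / n_blocks)


def can_fill_all_ranks(n_blocks: int, np_ranks: int, n_cells: int) -> bool:
    return cuts_available(n_cells) + 1 >= pieces_needed_per_block(n_blocks, np_ranks)


print(cuts_available(10))                  # 9
print(pieces_needed_per_block(6, 256))     # 43
print(can_fill_all_ranks(6, 256, 10))      # False
print(can_fill_all_ranks(6, 256, 64))      # True
```

Passing this check does not guarantee that the tolerance is reached. It only rules out plans that cannot succeed.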
Examples
# Cubed-sphere wall mesh, 16 MPI ranks, default 5% tolerance
shore balance wall.grd -o wall.splits.toml --np 16 \
--adjacency wall.adjacency.json
# Looser tolerance, fewer splits
shore balance wall.grd -o wall.splits.toml --np 8 \
--adjacency wall.adjacency.json --tolerance 0.10
# More aggressive — high rank count, raise the iteration cap
shore balance wall.grd -o wall.splits.toml --np 64 \
--adjacency wall.adjacency.json --max-iterations 500
Python API
See shore.balance for the plan_balance(...) function, the BalancePlan dataclass, and the BalancePlan.to_toml() helper.
Limitations
- One axis per original block. Inherited from shore split v1. A block subdivided along k cannot also be subdivided along j in the same plan. Compose by running shore balance → shore split → shore balance if multi-axis is needed.
- Cell-count midpoint cuts only. The planner has no access to per-cell volumes (it reads only the .grd header). Stretched grids therefore see midpoint-by-cell-count, not midpoint-by-volume. For uniform spacing the two coincide; for highly stretched grids the result is approximate.
- Greedy, not optimal. The planner is a hill-climber; it doesn't search for globally-minimal splits. For most assemblies the greedy result is within a few percent of optimal.
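To illustrate the midpoint limitation, the sketch below compares the cell-count midpoint with the volume midpoint along one axis. The geometric stretching ratio of 1.2 and the midpoints helper are assumptions chosen for the example. The planner itself never sees cell thicknesses.

```python
from itertools import accumulate


def midpoints(thickness):
    """Compare cut positions along one axis (needs at least 2 cells).

    A cut index c separates cells [0, c) from cells [c, n).
    Returns (count_mid, volume_mid, fraction of volume below count_mid).
    """
    n = len(thickness)
    count_mid = n // 2
    cum = list(accumulate(thickness))
    half = cum[-1] / 2
    # Cut index whose lower part is closest to half of the total thickness.
    volume_mid = min(range(1, n), key=lambda c: abs(cum[c - 1] - half))
    return count_mid, volume_mid, cum[count_mid - 1] / cum[-1]


uniform = [1.0] * 20
stretched = [1.2 ** m for m in range(20)]  # each cell 20% thicker than the last

print(midpoints(uniform))
print(midpoints(stretched))
```

For the uniform axis both midpoints fall at index 10. For the stretched axis the cell-count midpoint stays at 10 while the volume midpoint moves to 16, and the lower ten cells hold only about 14% of the total thickness.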
See also
shore split— applies the plan.shore proc-input— final rank assignment after splits.shore.balance— Python API.- Algorithm — overset-exploded — the Fortran tool that inspired this design.