test_seam_interpolation_fnl_kernels
Device kernels for the seam-interpolation device test. They are MODULE-LEVEL subroutines (the `adam_fnl_field_kernels` production shape), NOT contained kernels, whose private arrays are untrustworthy on nvfortran (see CLAUDE-gpu / #22 F1-bis). Each kernel gathers its case data SCALAR-WISE from shared device arrays into CONSTANT-BOUND private arrays, then calls the single-source `!$acc routine seq` evaluators with whole-array actuals. No array sections cross a device call boundary: nvfortran materializes such sections into copy-in temporaries (the F1-bis race shape), and a host-side `contiguous` repack of a device pointer segfaults outright. This is EXACTLY the pattern the F3 fill kernels must use.
Source: src/tests/amr/test_seam_interpolation_fnl.F90
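
The gather-then-call pattern described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the module name `seam_gather_sketch`, the routines `eval_footprint` and `eval_mean_dev`, the kind aliases taken from `iso_fortran_env`, and the placeholder averaging evaluator are all assumptions made for the example. The data clauses are included only so the sketch stands alone; without OpenACC the directives are plain comments and the code runs on the host.

```fortran
module seam_gather_sketch
   ! Hypothetical sketch: every name and the evaluator body are illustrative only.
   use, intrinsic :: iso_fortran_env, only: R8P => real64, I4P => int32
   implicit none
   private
   public :: eval_mean_dev

contains

   pure function eval_footprint(fp) result(val)
      ! Single-source evaluator: callable from host code and from device loops.
      !$acc routine seq
      real(R8P), intent(in) :: fp(3,3,3) ! whole array with constant bounds
      real(R8P)             :: val
      val = sum(fp) / 27._R8P            ! placeholder for the real interpolation weights
   end function eval_footprint

   subroutine eval_mean_dev(ncase, fp_dev, got_dev)
      ! Module-level kernel: one thread per case.
      integer(I4P), intent(in)    :: ncase
      real(R8P),    intent(in)    :: fp_dev(3,3,3,ncase)
      real(R8P),    intent(inout) :: got_dev(ncase)
      real(R8P)    :: fp(3,3,3)          ! constant-bound private array
      integer(I4P) :: n, i, j, k

      !$acc parallel loop gang vector private(fp) copyin(fp_dev) copy(got_dev)
      do n = 1, ncase
         ! Scalar-wise gather: no array section such as fp_dev(:,:,:,n) is formed.
         do k = 1, 3
            do j = 1, 3
               do i = 1, 3
                  fp(i,j,k) = fp_dev(i,j,k,n)
               end do
            end do
         end do
         ! Whole-array actual argument crosses the device call boundary.
         got_dev(n) = eval_footprint(fp)
      end do
   end subroutine eval_mean_dev
end module seam_gather_sketch
```

Passing `fp_dev(:,:,:,n)` directly to the evaluator would be shorter, but that is the array-section shape the description above warns against.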
Contents
- dump_tables_dev
- eval_tricubic_dev
- eval_compatible_dev
- eval_quadratic_dev
- meta_roundtrip_dev
- shift_clamp_dev
Subroutines
dump_tables_dev
Copy the module parameter weight tables into device arrays, element-wise from device code: pins the GA3 parameter-residency risk in isolation.
```fortran
subroutine dump_tables_dev(w4_dev, w3_dev, wq_dev)
```
Arguments
| Name | Type | Intent | Attributes | Description |
|---|---|---|---|---|
| `w4_dev` | `real(kind=R8P)` | inout | | Device copy of SEAM_W_TRICUBIC. |
| `w3_dev` | `real(kind=R8P)` | inout | | Device copy of SEAM_W_COMPATIBLE. |
| `wq_dev` | `real(kind=R8P)` | inout | | Device copy of SEAM_W_QUADRATIC. |
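
An element-wise copy of a module parameter table from device code might look like the sketch below. The table `W_SKETCH`, its values, the module name and the routine `dump_table_dev` are hypothetical stand-ins; the real tables and their shapes are not shown in this page. The point is only that the parameter is read inside the compute region, so a wrong result would isolate a parameter-residency problem.

```fortran
module seam_tables_sketch
   ! Hypothetical sketch: the table name, shape and values are illustrative only.
   use, intrinsic :: iso_fortran_env, only: R8P => real64, I4P => int32
   implicit none
   private
   public :: W_SKETCH, dump_table_dev

   ! Stand-in for a module parameter weight table.
   real(R8P), parameter :: W_SKETCH(3) = [0.25_R8P, 0.5_R8P, 0.25_R8P]

contains

   subroutine dump_table_dev(w_dev)
      real(R8P), intent(inout) :: w_dev(3)
      integer(I4P) :: i

      ! The parameter is referenced inside the device loop, one element at a time.
      !$acc parallel loop copy(w_dev)
      do i = 1, 3
         w_dev(i) = W_SKETCH(i)
      end do
   end subroutine dump_table_dev
end module seam_tables_sketch
```

A host-side check would then compare the returned array against the same parameter table element by element.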
eval_tricubic_dev
Evaluate the tricubic interpolant on device, one thread per case: scalar gather into constant-bound private arrays, whole-array call.
```fortran
subroutine eval_tricubic_dev(ncase, fp_dev, sub_dev, pos_dev, got_dev)
```
Arguments
| Name | Type | Intent | Attributes | Description |
|---|---|---|---|---|
| `ncase` | `integer(kind=I4P)` | in | | Number of cases. |
| `fp_dev` | `real(kind=R8P)` | in | | Footprints (4,4,4,ncase). |
| `sub_dev` | `integer(kind=I4P)` | in | | Octant sub-positions (3,ncase). |
| `pos_dev` | `integer(kind=I4P)` | in | | Anchor positions (3,ncase). |
| `got_dev` | `real(kind=R8P)` | inout | | Interpolated values (ncase). |
eval_compatible_dev
Evaluate the restriction-compatible interpolant on device, one thread per case.
```fortran
subroutine eval_compatible_dev(ncase, fp_dev, sub_dev, pos_dev, got_dev)
```
Arguments
| Name | Type | Intent | Attributes | Description |
|---|---|---|---|---|
| `ncase` | `integer(kind=I4P)` | in | | Number of cases. |
| `fp_dev` | `real(kind=R8P)` | in | | Footprints (3,3,3,ncase). |
| `sub_dev` | `integer(kind=I4P)` | in | | Octant sub-positions (3,ncase). |
| `pos_dev` | `integer(kind=I4P)` | in | | Anchor positions (3,ncase). |
| `got_dev` | `real(kind=R8P)` | inout | | Interpolated values (ncase). |
eval_quadratic_dev
Evaluate the centered quadratic interpolant on device, one thread per case.
```fortran
subroutine eval_quadratic_dev(ncase, fp_dev, sub_dev, got_dev)
```
Arguments
| Name | Type | Intent | Attributes | Description |
|---|---|---|---|---|
| `ncase` | `integer(kind=I4P)` | in | | Number of cases. |
| `fp_dev` | `real(kind=R8P)` | in | | Footprints (3,3,3,ncase). |
| `sub_dev` | `integer(kind=I4P)` | in | | Octant sub-positions (3,ncase). |
| `got_dev` | `real(kind=R8P)` | inout | | Interpolated values (ncase). |
meta_roundtrip_dev
Pack then unpack the per-ghost metadata on device (ishft/ibits in device code): each thread round-trips its own case through the packed integer.
```fortran
subroutine meta_roundtrip_dev(ncase, sub_dev, p4_dev, p3_dev, meta_dev, osub_dev, op4_dev, op3_dev)
```
Arguments
| Name | Type | Intent | Attributes | Description |
|---|---|---|---|---|
| `ncase` | `integer(kind=I4P)` | in | | Number of cases. |
| `sub_dev` | `integer(kind=I4P)` | in | | Octant sub-positions in (3,ncase). |
| `p4_dev` | `integer(kind=I4P)` | in | | Tricubic anchor positions in (3,ncase). |
| `p3_dev` | `integer(kind=I4P)` | in | | Compatible anchor positions in (3,ncase). |
| `meta_dev` | `integer(kind=I4P)` | inout | | Packed metadata out (ncase). |
| `osub_dev` | `integer(kind=I4P)` | inout | | Unpacked sub-positions out (3,ncase). |
| `op4_dev` | `integer(kind=I4P)` | inout | | Unpacked tricubic positions out (3,ncase). |
| `op3_dev` | `integer(kind=I4P)` | inout | | Unpacked compatible positions out (3,ncase). |
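
A pack/unpack round trip built on `ishft` and `ibits` could be sketched as below. The bit layout is an assumption made purely for illustration (sub-positions in 0..1 on bits 0 to 2, tricubic positions in 1..4 on bits 3 to 8, compatible positions in 1..3 on bits 9 to 14); the actual layout used by the library is not documented on this page, and the names `seam_meta_sketch`, `pack_meta` and `unpack_meta` are hypothetical.

```fortran
module seam_meta_sketch
   ! Hypothetical sketch: the bit layout and all names are assumptions.
   use, intrinsic :: iso_fortran_env, only: I4P => int32
   implicit none
   private
   public :: pack_meta, unpack_meta

contains

   pure function pack_meta(sub, p4, p3) result(meta)
      !$acc routine seq
      integer(I4P), intent(in) :: sub(3) ! each in 0..1 (assumed)
      integer(I4P), intent(in) :: p4(3)  ! each in 1..4 (assumed)
      integer(I4P), intent(in) :: p3(3)  ! each in 1..3 (assumed)
      integer(I4P) :: meta
      integer(I4P) :: d

      meta = 0_I4P
      do d = 1, 3
         meta = ior(meta, ishft(sub(d),        d - 1))           ! 1 bit per direction
         meta = ior(meta, ishft(p4(d) - 1_I4P, 3 + 2*(d - 1)))   ! 2 bits per direction
         meta = ior(meta, ishft(p3(d) - 1_I4P, 9 + 2*(d - 1)))   ! 2 bits per direction
      end do
   end function pack_meta

   pure subroutine unpack_meta(meta, sub, p4, p3)
      !$acc routine seq
      integer(I4P), intent(in)  :: meta
      integer(I4P), intent(out) :: sub(3), p4(3), p3(3)
      integer(I4P) :: d

      do d = 1, 3
         sub(d) = ibits(meta, d - 1, 1)
         p4(d)  = ibits(meta, 3 + 2*(d - 1), 2) + 1_I4P
         p3(d)  = ibits(meta, 9 + 2*(d - 1), 2) + 1_I4P
      end do
   end subroutine unpack_meta
end module seam_meta_sketch
```

In a kernel, each thread would gather its three-component inputs scalar-wise into private arrays, call `pack_meta`, then `unpack_meta`, and scatter the results back, so the host can compare inputs and outputs case by case.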
shift_clamp_dev
Compute the centered anchor position and the shift-inward clamp on device, one thread per case (scalar arguments only: no gather needed).
```fortran
subroutine shift_clamp_dev(ncase, anchor_dev, ncell_dev, subv_dev, fpn_dev, pc_dev, p_dev)
```
Arguments
| Name | Type | Intent | Attributes | Description |
|---|---|---|---|---|
| `ncase` | `integer(kind=I4P)` | in | | Number of cases. |
| `anchor_dev` | `integer(kind=I4P)` | in | | Donor anchor cell indexes (ncase). |
| `ncell_dev` | `integer(kind=I4P)` | in | | Donor block interior cell counts (ncase). |
| `subv_dev` | `integer(kind=I4P)` | in | | Octant sub-positions (ncase). |
| `fpn_dev` | `integer(kind=I4P)` | in | | Footprint widths: 4 for tricubic, 3 for compatible (ncase). |
| `pc_dev` | `integer(kind=I4P)` | inout | | Centered anchor positions out (ncase). |
| `p_dev` | `integer(kind=I4P)` | inout | | Clamped anchor positions out (ncase). |
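
One plausible reading of the centered position and the shift-inward clamp is sketched below. The formulas are assumptions, not taken from the library: the footprint is assumed to start at `anchor - fpn/2` (plus the sub-position for the 4-wide footprint), the anchor position is taken as the anchor's 1-based index inside the footprint, and the clamp shifts the footprint start so that it stays within `1..ncell`. The names `seam_clamp_sketch` and `shift_clamp` are hypothetical.

```fortran
module seam_clamp_sketch
   ! Hypothetical sketch: the centering and clamping formulas are assumptions.
   use, intrinsic :: iso_fortran_env, only: I4P => int32
   implicit none
   private
   public :: shift_clamp

contains

   pure subroutine shift_clamp(anchor, ncell, subv, fpn, pc, p)
      !$acc routine seq
      integer(I4P), intent(in)  :: anchor ! donor anchor cell index
      integer(I4P), intent(in)  :: ncell  ! donor interior cell count
      integer(I4P), intent(in)  :: subv   ! octant sub-position, 0 or 1 (assumed)
      integer(I4P), intent(in)  :: fpn    ! footprint width, 3 or 4
      integer(I4P), intent(out) :: pc     ! anchor position in the centered footprint
      integer(I4P), intent(out) :: p      ! anchor position after the inward shift
      integer(I4P) :: start

      ! Centered footprint start; the 4-wide footprint leans toward the sub-position.
      start = anchor - fpn/2_I4P + merge(subv, 0_I4P, fpn == 4_I4P)
      pc = anchor - start + 1_I4P

      ! Shift inward so that start..start+fpn-1 lies inside 1..ncell.
      start = max(1_I4P, min(start, ncell - fpn + 1_I4P))
      p = anchor - start + 1_I4P
   end subroutine shift_clamp
end module seam_clamp_sketch
```

Because every argument is a scalar, a kernel can call such a routine directly on `anchor_dev(n)`, `ncell_dev(n)` and so on, with no gather into private arrays.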