
Claude/disc config struct nd xr3 #1

Merged
Tuesdaythe13th merged 175 commits into main from
claude/disc-config-struct-ndXr3
Apr 6, 2026
Conversation

@Tuesdaythe13th
Owner

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

Test plan

Bug fix

import warp as wp
# Code that demonstrates the bug

New feature / enhancement

import warp as wp
# Code that demonstrates the new capability

claude and others added 30 commits March 10, 2026 17:58
Adds notebooks/disc_cooling_sim.ipynb with an Open-in-Colab badge,
covering 2-D axisymmetric heat diffusion, Avrami crystallinity
kinetics, warp-risk scoring, 2-D field visualisations, radial
profile plots, and a mould-temperature parameter sweep.

https://claude.ai/code/session_016zF8WWzQUxkQpC2hmiRkuB
Signed-off-by: Claude <noreply@anthropic.com>
When a launcher runs a module via runpy.run_module(mod, run_name="__main__"),
the module may already be imported under its qualified name. The previous
approach used inspect.getmodule() first, which matched by filename and
returned the pre-imported module's qualified name instead of "__main__".
This caused set_module_options() to target a different module than
@wp.kernel (which uses f.__module__ == "__main__"), silently ignoring
the user's options.

Use frame.f_globals["__name__"] as the primary source for module name
resolution, ensuring consistency with @wp.kernel's use of f.__module__.
Fall back to inspect.getmodule() and filename matching only when
__name__ is unavailable.

Also:
- Use sys._getframe() instead of inspect.stack() to avoid building
  FrameInfo objects for the entire call stack
- Use try/finally to clean up frame references promptly
- Use os.path.realpath() instead of os.path.abspath() to handle symlinks
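The resolution order above can be sketched with stdlib calls only; the helper name and frame depth are illustrative, not Warp's actual implementation:

```python
import os
import sys


def caller_module_name():
    """Sketch of the fixed resolution order: prefer the caller frame's
    __name__ (matching what decorators see via f.__module__, e.g.
    "__main__" under runpy), and fall back to filename matching only
    when __name__ is unavailable."""
    frame = sys._getframe(1)  # direct caller; avoids building inspect.stack()
    try:
        name = frame.f_globals.get("__name__")
        if name is not None:
            return name
        # Fallback: match loaded modules by real path (handles symlinks).
        path = os.path.realpath(frame.f_code.co_filename)
        for mod_name, mod in list(sys.modules.items()):
            mod_file = getattr(mod, "__file__", None)
            if mod_file and os.path.realpath(mod_file) == path:
                return mod_name
        return None
    finally:
        del frame  # release the frame reference promptly
```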

Signed-off-by: Eric Shi <ershi@nvidia.com>
Fix caller module detection for runpy-based execution [NVIDIAGH-1274]

See merge request omniverse/warp!2101
When multiple processes compile CUDA kernels concurrently with a shared kernel cache, NVRTC's precompiled header files (.pch) are written to the shared `--pch-dir` directory without any synchronization. One process can read a partially-written `.pch` while another is still writing it, causing a segfault inside NVRTC's `nvrtcCompileProgram`.

This was observed as intermittent CI failures on Newton's parallel test runner (8 processes, shared kernel cache, Blackwell sm_120 GPU). The crash always occurs in `build_cuda` during the first CUDA kernel compilation in whichever test process loses the race.

The fix directs `--pch-dir` to the per-process build directory (already unique via `_p<pid>_t<tid>` suffix) instead of the shared kernel cache. PCH files are cleaned up together with the build directory after `safe_rename` moves the final outputs to the cache.

`pch_dir` is a required keyword argument to `build_cuda()` so that future callers cannot silently revert to the racy shared-directory behavior.
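A minimal sketch of the per-process directory scheme described above; the helper name is hypothetical, but the `_p<pid>_t<tid>` suffix mirrors the fix:

```python
import os
import threading


def per_process_pch_dir(build_dir):
    """Hypothetical helper: place --pch-dir inside the per-process build
    directory (unique via a _p<pid>_t<tid> suffix) rather than the
    shared kernel cache, so concurrent compilers never share PCH files."""
    suffix = f"_p{os.getpid()}_t{threading.get_ident()}"
    pch_dir = os.path.join(build_dir, f"pch{suffix}")
    os.makedirs(pch_dir, exist_ok=True)
    return pch_dir
```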
Fix PCH race condition in concurrent CUDA compilation [NVIDIAGH-1284]

See merge request omniverse/warp!2109
Include cuBQL headers and build files under warp/native/cuBQL/,
add Apache 2.0 license to licenses/, and exclude cuBQL from typos
pre-commit checks.

Signed-off-by: Eric Shi <ershi@nvidia.com>
Add cppcheck suppressions for warp/native/cuBQL/ in both GitLab CI
and GitHub Actions, mark the directory as linguist-vendored, and
update the contribution guide to note cuBQL as third-party code.

Signed-off-by: Eric Shi <ershi@nvidia.com>
Add cuBQL as vendored third-party dependency

See merge request omniverse/warp!2113
Signed-off-by: Eric Shi <ershi@nvidia.com>
Fix HashGrid truncation for negative coordinates (NVIDIAGH-1256)

See merge request omniverse/warp!2058
Exclude vendored cuBQL and NanoVDB from CodeRabbit reviews

See merge request omniverse/warp!2114
)

Add an NDim TypeVar with a PEP 696 default (under TYPE_CHECKING) so
that Array and its subclasses are parameterized by both DType and NDim.
This lets static type checkers accept both array[dtype] and
array[dtype, Literal[ndim]] without requiring a new runtime dependency.
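The typing pattern can be sketched as follows; the class, the `default` value, and the TYPE_CHECKING guard placement are assumptions for illustration, not Warp's actual source:

```python
from typing import TYPE_CHECKING, Generic, Literal, TypeVar

DType = TypeVar("DType")

if TYPE_CHECKING:
    # Under a type checker, a PEP 696 default lets plain Array[dtype]
    # type-check without an explicit NDim argument.
    from typing_extensions import TypeVar as _TypeVar

    NDim = _TypeVar("NDim", default=int)
else:
    # At runtime a plain TypeVar suffices, so no new dependency is added.
    NDim = TypeVar("NDim")


class Array(Generic[DType, NDim]):
    """Illustrative stand-in for an array parameterized by dtype and ndim."""
```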

Signed-off-by: Eric Shi <ershi@nvidia.com>
Fix mypy not recognizing wp.array[dtype] subscript syntax [NVIDIAGH-1278]

See merge request omniverse/warp!2119
Add external texture support, refactor texture interop runtime (NVIDIAGH-1238)

See merge request omniverse/warp!2112
Integrate the cuBQL library as an optional BVH backend for wp.Mesh,
selectable via bvh_constructor="cubql". This backend supports ray queries
(closest-hit, any-hit, count-all) on both CPU and GPU but does not
support point queries, AABB queries, grouped meshes, or winding numbers.

Key changes:
- Add CuBQLBVH struct and cuBQL build/refit/rebuild/destroy for host
  and device in bvh.h, bvh.cpp, bvh.cu
- Add templated cubql_ray_traversal in mesh.h with ClosestHit, AnyHit,
  and CountAll modes
- Replace bvh_constructor_values dict with BvhConstructor IntEnum
- Block CUBQL on wp.Bvh (standalone BVH, no traversal support)
- Unsupported mesh queries silently return no results when cuBQL is
  active (documented in Mesh docstring)
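The dict-to-enum change might look like this; the member names and values here are assumptions, not the actual Warp definitions:

```python
from enum import IntEnum


class BvhConstructor(IntEnum):
    """Illustrative sketch of replacing a string-to-value dict with an
    IntEnum, so constructor choices are typed and enumerable."""
    SAH = 0
    MEDIAN = 1
    LBVH = 2
    CUBQL = 3
```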
Add cuBQL BVH backend for wp.Bvh and wp.Mesh [NVIDIAGH-1286]

See merge request omniverse/warp!2111
Fixing clang compile issue in cuBQL

See merge request omniverse/warp!2121
Mark the 2x multi-GPU runner jobs as allow_failure since they are
frequently crashing for non-actionable reasons. Remove allow_failure
from the clang build-and-test pipeline now that it has stabilized.

Signed-off-by: Eric Shi <ershi@nvidia.com>
Update CI allow_failure for multi-GPU and clang jobs

See merge request omniverse/warp!2123
Fix inaccurate "GPU-based" docstring on BvhConstructor.CUBQL since
cuBQL also has a CPU path. Add braces to cubql if/else branches in
mesh.cpp and mesh.cu for consistent style. Add test_mesh_refit that
verifies BVH refit correctness by moving the mesh and checking ray
queries. Add ValueError test for invalid bvh_constructor strings.
Clarify CuBQLNode comment about child pair storage. Rewrite changelog
entry to state supported/unsupported query types.

Signed-off-by: Eric Shi <ershi@nvidia.com>
Fix cuBQL docstrings, brace style, and test coverage

See merge request omniverse/warp!2122
Increase _SUITE_TIMEOUT from 2400s to 3600s to avoid premature
timeouts on slower runners (e.g. Jetson Orin). Bump Windows test
job timeouts to 75m to provide buffer over the new suite timeout.

Signed-off-by: Eric Shi <ershi@nvidia.com>
Bump test suite timeout and CI job timeouts

See merge request omniverse/warp!2127
Disable FEM example tests and remove multi-GPU allow_failure

See merge request omniverse/warp!2126
The struct field setter extracted the raw Python value from Warp scalars
for the ctypes backing store but then stored that unwrapped value as the
Python attribute, causing e.g. wp.uint8 to decay to int after assignment.

Re-wrap the value in the declared Warp type when the caller passed a Warp
scalar. Plain Python values (int, float, bool) are stored as-is to avoid
breaking downstream isinstance checks (e.g. wp.launch dim arguments).
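The setter logic can be sketched with a plain-Python stand-in for a Warp scalar wrapper; the function and class names are illustrative:

```python
class uint8(int):
    """Stand-in for a Warp scalar wrapper type (illustrative only)."""


def store_struct_field(declared_type, value):
    """Sketch of the fixed setter: re-wrap Warp scalars in the declared
    type so attributes do not decay to plain int after assignment, but
    store plain Python values as-is."""
    if type(value) in (bool, int, float):
        return value  # plain Python value: keep for isinstance checks
    return declared_type(value)  # scalar wrapper: re-wrap in declared type
```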

Signed-off-by: Eric Shi <ershi@nvidia.com>
Fix struct field assignment unwrapping Warp scalar types [NVIDIAGH-1288]

See merge request omniverse/warp!2120
shi-eric and others added 25 commits March 31, 2026 11:05
Signed-off-by: Eric Shi <ershi@nvidia.com>
Update uv.lock (Pygments 2.20.0, requests 2.33.1)

See merge request omniverse/warp!2182
Move the CUB include before cuBQL to avoid a CCCL bug where
<stdexcept> (from cuBQL's math/common.h) makes __throw_out_of_range
non-constexpr, breaking a static_assert in typeid.h.

Signed-off-by: Eric Shi <ershi@nvidia.com>
Fix bvh.cu compilation with CUDA 13.2 and GCC < 12

See merge request omniverse/warp!2183
…AGH-1270)

Rename from wp.get_optimal_block_dim to wp.get_suggested_block_size to
better reflect that the result is a suggestion based on per-SM occupancy,
not a universally optimal choice. The function now returns both
block_size and min_grid_size from cuOccupancyMaxPotentialBlockSize,
letting callers check whether their launch is large enough to benefit
from the suggested block size.
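A hypothetical check built on the returned pair (the helper name is an assumption; the occupancy values would come from `wp.get_suggested_block_size`):

```python
def launch_fills_device(launch_dim, block_size, min_grid_size):
    """Does the launch spawn at least min_grid_size blocks, i.e. enough
    to benefit from the suggested block size?"""
    num_blocks = (launch_dim + block_size - 1) // block_size  # ceil division
    return num_blocks >= min_grid_size
```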

Signed-off-by: Eric Shi <ershi@nvidia.com>
Add wp.get_suggested_block_size for CUDA occupancy queries [NVIDIAGH-1270]

See merge request omniverse/warp!2147
Document how to run ASV benchmarks and explain why
--launch-method spawn should be used on Linux to avoid
leaking NVRTC precompiled-header directories in /tmp.

Signed-off-by: Eric Shi <ershi@nvidia.com>
Add ASV benchmarking section to contribution guide

See merge request omniverse/warp!2179
* upgrade cu13 libmathdx to latest

Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2186
Upgrade cu13 libmathdx dependency to version 0.3.2

See merge request omniverse/warp!2186
… options

* Address MR feedback for module_options validation

Check isinstance before module="unique" so the most specific error
fires first. Remove redundant mark_modified() on freshly constructed
modules.

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Add `module_options` dict parameter to `@wp.kernel` for inline module options

Allow per-kernel module compilation options (e.g. `fast_math`, `mode`)
via a new `module_options` dict on the `@wp.kernel` decorator. Requires
`module="unique"` for any non-None value; raises `ValueError` otherwise.
Unknown keys are validated against the module's known options. Empty
dicts are accepted as a no-op with unique modules.
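The validation order the MR feedback settled on can be sketched as a standalone function; the name and signature are illustrative, not Warp's internals:

```python
def validate_module_options(module, module_options, known_options):
    """Sketch of the validation order: isinstance first (most specific
    error), then the module="unique" requirement, then unknown-key
    checks. Empty dicts pass through as a no-op."""
    if module_options is None:
        return None
    if not isinstance(module_options, dict):
        raise TypeError("module_options must be a dict")
    if module != "unique":
        raise ValueError('module_options requires module="unique"')
    for key in module_options:
        if key not in known_options:
            raise ValueError(f"unknown module option: {key!r}")
    return module_options
```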

Signed-off-by: Eric Shi <ershi@nvidia.com>

Approved-by: Alain Denzler <adenzler@nvidia.com>
Approved-by: Lukasz Wawrzyniak <lwawrzyniak@nvidia.com>

See merge request omniverse/warp!2067
Add `module_options` dict parameter to `@wp.kernel` for inline module options

See merge request omniverse/warp!2067
…IAGH-1310]

* Introduce is_cpu local for readability in _compile()

Replace bare output_arch checks with a named boolean so the intent
(CPU vs CUDA target) is immediately obvious.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Nicolas Capens <ncapens@nvidia.com>

* Default CPU optimization level to -O2, keep -O3 for CUDA

When optimization_level is None (the default), CPU kernels now compile
with -O2 while CUDA kernels use -O3. The LLVM backend barely
distinguishes O2 from O3 and the O3-only frontend passes have low
relevance to Warp's generated code patterns. Users can still set
optimization_level=3 explicitly for both targets.

Add hash-consistency test and changelog entry.
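The default-selection rule reduces to a small function; this is a sketch of the described behavior, not the actual compiler plumbing:

```python
def effective_opt_level(optimization_level, is_cpu):
    """None picks the per-target default (-O2 for CPU, -O3 for CUDA);
    an explicit level applies to both targets."""
    if optimization_level is None:
        return 2 if is_cpu else 3
    return optimization_level
```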

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Nicolas Capens <ncapens@nvidia.com>

* Make CPU optimization level configurable

Thread config.optimization_level through the Clang frontend (-O flag)
and the LLVM backend (CodeGenOptLevel passed to createTargetMachine),
so the setting now controls the full CPU compilation pipeline.
Previously the frontend was hardcoded to -O2 and the backend always
used CodeGenOptLevel::Default regardless of the config value.

Add ctypes argtypes for wp_compile_cpp and wp_compile_cuda.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Nicolas Capens <ncapens@nvidia.com>

Approved-by: Lukasz Wawrzyniak <lwawrzyniak@nvidia.com>
Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2137
Make CPU kernel optimization level configurable, default to -O2 [NVIDIAGH-1310]

See merge request omniverse/warp!2137
* Last Greptile comment

* Add geometry-driven fp64 precision support to warp.fem (Layers 0-3)

Introduce infrastructure for full fp64 FEM pipelines, propagating scalar
precision from the geometry through the entire stack.

Approved-by: Eric Shi <ershi@nvidia.com>
Approved-by: Gilles Daviet <gdaviet@nvidia.com>

See merge request omniverse/warp!2172
* More Greptile comments

Approved-by: Gilles Daviet <gdaviet@nvidia.com>

See merge request omniverse/warp!2192
* Extend tile_fft/tile_ifft to support N-D tiles (NVIDIAGH-1317)

Generalize wp.tile_fft() and wp.tile_ifft() from strictly 2-D tiles to
arbitrary N-D tiles (N >= 2).  The FFT is computed along the last
dimension; all leading dimensions are treated as independent batches.
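The batching rule can be sketched as a shape computation (the helper is illustrative; it only derives the layout, it does not perform an FFT):

```python
def fft_batch_layout(tile_shape):
    """For an N-D tile (N >= 2), the FFT runs along the last dimension
    and all leading dimensions multiply into independent batches."""
    if len(tile_shape) < 2:
        raise ValueError("tile_fft requires an N-D tile with N >= 2")
    batch = 1
    for dim in tile_shape[:-1]:
        batch *= dim
    return batch, tile_shape[-1]
```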

Separate FFT tests into test_tile_fft.py.

Signed-off-by: snidhan <snidhan@nvidia.com>

Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2178
* Remove Python 3.9 support

- Bump requires-python to >=3.10 and remove 3.9 classifier
- Remove deprecation warnings from build_lib.py and context.py
- Remove inspect.get_annotations() backport and ast.Index/ast.ExtSlice
  compat code from codegen.py
- Remove 3.9 from GitLab CI test matrix
- Update docs (README, installation, compatibility, C++ examples)
- Regenerate uv.lock
- Apply ruff pyupgrade fixes for Python 3.10+ target

Signed-off-by: Eric Shi <ershi@nvidia.com>

Approved-by: Nicolas Capens <ncapens@nvidia.com>
Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2187
* Fix array annotation repr and matrix type_repr (NVIDIAGH-1341)

Fix _ArrayAnnotationBase.__repr__() interpolating raw class objects
into the format string, producing unreadable output like
`wp.array(dtype=<class 'warp._src.types.uint32'>, ndim=4)`.

The dtype is now resolved to a human-readable name: `wp.X` for types
available in the warp namespace, struct keys for structs, and the
descriptive type_repr form for exotic vector/matrix types.

Also fix type_repr for small matrix types emitting a spurious pair of
parentheses (e.g. `mat44f(f)` instead of `mat44ff` -> `mat44f`).
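The repr fix amounts to resolving the dtype to a readable name before formatting; a simplified sketch (the function and the stand-in class are illustrative):

```python
def annotation_repr(dtype, ndim):
    """Sketch of the fix: resolve the dtype to a readable name instead
    of interpolating the raw class object into the format string."""
    name = getattr(dtype, "__name__", None) or repr(dtype)
    return f"wp.array(dtype=wp.{name}, ndim={ndim})"


class uint32:  # stand-in for warp._src.types.uint32
    pass
```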

Signed-off-by: Eric Shi <ershi@nvidia.com>

Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2196
* Fix PCH code review feedback: handle None pch_dir in CUDA path, clean up partial PCH files

- Make build_cuda() handle None pch_dir like build_cpu() already does,
  removing the misleading `or build_dir` fallback for CUDA >= 13.0
- Remove partial .pch files on failed generation to avoid a wasted
  fallback-retry on the next compilation

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Address code review feedback for PCH diagnostic ownership

- Fix use-after-free: scope setClient ownership transfer to LLVM >= 21
  only; LLVM < 21 path correctly passes nullptr to createDiagnostics
  which creates its own internal printer
- Guard get_clang_pch_dir() call with use_precompiled_headers check to
  avoid unnecessary temp directory allocation when PCH is disabled

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Add CPU precompiled header support to reduce kernel compile times

Extend precompiled header (PCH) support to the CPU compilation path
(Clang/LLVM), matching the existing CUDA PCH support via NVRTC.

On the first CPU kernel compilation, Clang generates a PCH from
builtin.h. Subsequent compilations in the same process reuse the
serialized AST, skipping redundant header parsing. For multi-module
workloads like warp.fem, this reduces total CPU compile time by ~65%
(e.g., FEM diffusion: 45s -> 16s, Stokes transfer: 84s -> 30s).

Key details:
- Controlled by warp.config.use_precompiled_headers (same as CUDA)
- PCH files are per-thread temp directories to avoid races
- Fallback: if a PCH is corrupt, Clang retries without it and deletes
  the stale file
- PCH filename encodes block_dim and preprocessor flags so different
  configurations get separate files
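A hypothetical filename scheme matching the last bullet (the exact encoding Warp uses is not shown here; this only illustrates the idea of keying PCH files on configuration):

```python
import hashlib


def pch_filename(block_dim, preprocessor_flags):
    """Encode block_dim and a digest of the preprocessor flags into the
    PCH filename so different configurations get separate files."""
    digest = hashlib.sha256(
        " ".join(sorted(preprocessor_flags)).encode()
    ).hexdigest()[:8]
    return f"builtin_bd{block_dim}_{digest}.pch"
```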

Signed-off-by: Eric Shi <ershi@nvidia.com>

Approved-by: Nicolas Capens <ncapens@nvidia.com>

See merge request omniverse/warp!2170
* Reduce memory usage in array shape int-promotion tests

Replace tests that allocated ~3.4 GB to verify numpy integer shape
elements are promoted to Python int. The new test uses a small array
and asserts the type of shape elements directly.

Signed-off-by: Eric Shi <ershi@nvidia.com>

Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2199
* Add three recent publications to PUBLICATIONS.md

Signed-off-by: Eric Shi <ershi@nvidia.com>

Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2200
* Add Quick Start example to README

Show a complete 20-line N-body gravity simulation that demonstrates
kernel definition, vec3 math, array creation, constant capture, and
launch with one million particles.

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Fix stale notebook commit hash in basics.rst

Update accelerated-computing-hub notebook links to match the newer
commit hash already used in README.md.

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Clean up README examples section

Remove unit test instructions (developer-facing, covered in
contribution guide), consolidate USD viewing note with example
descriptions, and update example descriptions to match docs.

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Streamline docs landing page and align with product messaging

- Slim down index.rst to intro, quickstart, and example gallery
- Move tutorial notebooks to basics.rst
- Move Omniverse section to installation.rst
- Remove sections duplicated in sidebar pages (Learn More, Support,
  License, Contributing, Publications)
- Replace "spatial computing" and "graphics code" with product-aligned
  language in both index.rst and README.md

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Fix conda installation docs to match available variants

The previous example referenced cuda126 builds which no longer exist.
conda-forge now publishes cuda129 and cuda130 variants. Show the
default install command and build string filters for specific variants.

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Add announcement banner linking to latest release notes

Signed-off-by: Eric Shi <ershi@nvidia.com>

* Reduce TOC depth for changelog, publications, and API reference

Prevent per-version changelog entries, per-year publication entries,
and full API class/method hierarchies from cluttering the landing page
TOC and sidebar navigation.

Signed-off-by: Eric Shi <ershi@nvidia.com>

Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2201
* Update docs, changelog, and benchmarks for v1.12.1 release

Bump version references in docs announcement banner and installation
URLs, add v1.12.1 to ASV benchmark tags, and clean up Unreleased
changelog entries for clarity and consistency.

Signed-off-by: Eric Shi <ershi@nvidia.com>

Approved-by: Eric Shi <ershi@nvidia.com>

See merge request omniverse/warp!2203

@Tuesdaythe13th Tuesdaythe13th merged commit 845f9a9 into main Apr 6, 2026
Tuesdaythe13th added a commit that referenced this pull request Apr 6, 2026
Merge pull request #1 from Tuesdaythe13th/claude/disc-config-struct-n…

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request includes significant updates to the Warp documentation, including a new quick-start guide, updated installation instructions, and a new example notebook for disc cooling simulation. It also includes maintenance updates such as removing Kit extensions, updating the minimum Python version to 3.10, and adding support for cuBQL. The review feedback highlights a physical inaccuracy in the crystallisation kinetics model within the new notebook and suggests updating the kernel array type hints to the new subscript syntax for consistency.

Comment on lines +266 to +291
@wp.kernel
def update_crystallinity(
    T: wp.array(dtype=wp.float32),
    chi_in: wp.array(dtype=wp.float32),
    chi_out: wp.array(dtype=wp.float32),
    config: DiscConfig,
    params: CoolingParams,
):
    """Simple Avrami-style crystallisation kinetics.

    Crystal growth is fastest mid-way between T_g and T_m and saturates
    at chi_max. Replace with a Nakamura model for production use.
    """
    tid = wp.tid()
    temp = T[tid]
    chi = chi_in[tid]

    if temp > config.T_g and temp < config.T_m:
        x = (temp - config.T_g) / (config.T_m - config.T_g)
        x = wp.max(0.0, wp.min(1.0, x))
        rate = params.avrami_k0 * x * (1.0 - chi / params.chi_max)
        chi = chi + params.dt * params.avrami_n * rate
        chi = wp.max(0.0, wp.min(params.chi_max, chi))

    chi_out[tid] = wp.float32(chi)


medium

The crystallisation kinetics implementation in update_crystallinity appears to be physically incorrect and contradicts the docstring.

  1. Temperature dependence: The docstring states that growth is fastest mid-way between $T_g$ and $T_m$. However, the code uses x = (temp - T_g) / (T_m - T_g), which makes the rate peak at $T_m$. In reality, the driving force for crystallisation is undercooling ($T_m - T$), so the rate should be zero at $T_m$. Consider using a term like x * (1.0 - x) or a more realistic Nakamura/Hoffman-Lauritzen model.
  2. Avrami exponent: params.avrami_n is used as a linear multiplier for the rate. In the Avrami model, $n$ is an exponent that characterizes the dimensionality of growth. A linear scaling does not capture the sigmoidal nature of the transformation for $n > 1$.
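The reviewer's first suggestion can be sketched as a scalar rate function (plain Python here, kernel-free, with illustrative parameter names):

```python
def crystallisation_rate(temp, chi, T_g, T_m, k0, chi_max):
    """An x * (1 - x) factor makes the rate vanish at both T_g and T_m
    and peak mid-way between them, reflecting that undercooling drives
    crystallisation and the rate should be zero at T_m."""
    if not (T_g < temp < T_m):
        return 0.0
    x = (temp - T_g) / (T_m - T_g)
    return k0 * x * (1.0 - x) * (1.0 - chi / chi_max)
```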

Comment on lines +193 to +327
@wp.kernel
def init_temperature(
    T: wp.array(dtype=wp.float32),
    params: CoolingParams,
):
    tid = wp.tid()
    T[tid] = wp.float32(params.T_init)


@wp.kernel
def init_scalar(
    a: wp.array(dtype=wp.float32),
    value: float,
):
    tid = wp.tid()
    a[tid] = wp.float32(value)


@wp.kernel
def step_temperature(
    T_in: wp.array(dtype=wp.float32),
    T_out: wp.array(dtype=wp.float32),
    config: DiscConfig,
    params: CoolingParams,
):
    """Explicit finite-difference heat diffusion in cylindrical coordinates.

    Solves ∂T/∂t = α (∂²T/∂r² + (1/r)∂T/∂r + ∂²T/∂z²)
    with Dirichlet mould-wall BCs on the top and bottom (z) faces and
    Neumann (zero-flux) BCs on the axis (r=0) and outer radius.
    """
    tid = wp.tid()
    i = tid // config.nz
    j = tid - i * config.nz

    alpha = config.k / (config.rho * config.cp)
    dr = config.dr
    dz = config.dz
    r = (float(i) + 0.5) * dr

    # Dirichlet BC on top/bottom mould walls.
    if j == 0 or j == config.nz - 1:
        T_out[tid] = wp.float32(params.T_mold)
        return

    # Neumann BC: mirror stencil at axis and outer edge.
    im = clamp_i(i - 1, 0, config.nx - 1)
    ip = clamp_i(i + 1, 0, config.nx - 1)
    if i == 0:
        im = 1
    if i == config.nx - 1:
        ip = config.nx - 2

    jm = j - 1
    jp = j + 1

    Tc = T_in[idx(i, j, config.nz)]
    Trm = T_in[idx(im, j, config.nz)]
    Trp = T_in[idx(ip, j, config.nz)]
    Tzm = T_in[idx(i, jm, config.nz)]
    Tzp = T_in[idx(i, jp, config.nz)]

    d2Tdr2 = (Trp - 2.0 * Tc + Trm) / (dr * dr)
    dTdr_over_r = 0.0
    if i > 0:
        dTdr_over_r = (Trp - Trm) / (2.0 * dr * r)
    d2Tdz2 = (Tzp - 2.0 * Tc + Tzm) / (dz * dz)

    lap = d2Tdr2 + dTdr_over_r + d2Tdz2
    Tnew = Tc + params.dt * alpha * lap
    T_out[tid] = wp.float32(Tnew)


@wp.kernel
def update_crystallinity(
    T: wp.array(dtype=wp.float32),
    chi_in: wp.array(dtype=wp.float32),
    chi_out: wp.array(dtype=wp.float32),
    config: DiscConfig,
    params: CoolingParams,
):
    """Simple Avrami-style crystallisation kinetics.

    Crystal growth is fastest mid-way between T_g and T_m and saturates
    at chi_max. Replace with a Nakamura model for production use.
    """
    tid = wp.tid()
    temp = T[tid]
    chi = chi_in[tid]

    if temp > config.T_g and temp < config.T_m:
        x = (temp - config.T_g) / (config.T_m - config.T_g)
        x = wp.max(0.0, wp.min(1.0, x))
        rate = params.avrami_k0 * x * (1.0 - chi / params.chi_max)
        chi = chi + params.dt * params.avrami_n * rate
        chi = wp.max(0.0, wp.min(params.chi_max, chi))

    chi_out[tid] = wp.float32(chi)


@wp.kernel
def compute_warp_risk(
    T: wp.array(dtype=wp.float32),
    chi: wp.array(dtype=wp.float32),
    warp_risk: wp.array(dtype=wp.float32),
    config: DiscConfig,
    params: CoolingParams,
):
    """Score each radial position by thermal gradient and crystallinity asymmetry.

    Only threads at the mid-plane (j == nz/2) write to warp_risk[i].
    """
    tid = wp.tid()
    i = tid // config.nz
    j = tid - i * config.nz

    if j != config.nz // 2:
        return

    top = T[idx(i, 0, config.nz)]
    bot = T[idx(i, config.nz - 1, config.nz)]
    mid = T[idx(i, j, config.nz)]

    chi_top = chi[idx(i, 1, config.nz)]
    chi_bot = chi[idx(i, config.nz - 2, config.nz)]

    dT_thickness = wp.abs(top - bot)
    dT_mid = wp.abs(mid - 0.5 * (top + bot))
    dchi = wp.abs(chi_top - chi_bot)

    risk = (
        params.warp_temp_coeff * (dT_thickness + dT_mid)
        + params.warp_chi_coeff * dchi
    )
    warp_risk[i] = wp.float32(risk)

medium

The kernels in this notebook use the older wp.array(dtype=wp.float32) syntax for array type hints. To maintain consistency with the extensive documentation updates in this PR (which transition to the subscript syntax), these should be updated to use wp.array[float] or wp.array[wp.float32].

