Add Linux support for ProcessStartInfo.KillOnParentExit using prctl(PR_SET_PDEATHSIG)#127112
Add Linux support for ProcessStartInfo.KillOnParentExit using prctl(PR_SET_PDEATHSIG)#127112
Conversation
…R_SET_PDEATHSIG)
Implements PR_SET_PDEATHSIG-based KillOnParentExit on Linux using a dedicated
long-lived thread to ensure the death signal fires on process exit, not thread exit.
Changes:
- Add HAVE_PR_SET_PDEATHSIG cmake check and config
- Add killOnParentExit parameter to SystemNative_ForkAndExecProcess
- Implement dedicated pdeathsig thread with mutex/condvar synchronization
- Call prctl(PR_SET_PDEATHSIG, SIGKILL) in child after fork
- Pass KillOnParentExit through managed interop layer
- Add [SupportedOSPlatform("linux")] to ref and impl
- Enable KillOnParentExitTests on Linux
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/85b53068-8457-4ee3-bdbf-4339fc5a007e
Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
…necessary prctl in thread, improve getppid comment Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/85b53068-8457-4ee3-bdbf-4339fc5a007e Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
|
@copilot fix following build failure: |
Add return NULL after infinite while(1) loop to satisfy GCC's -Werror=return-type, with clang pragma to suppress -Wunreachable-code-return. Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/2659e9ab-8d2c-48d8-a7af-d5356d056e46 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
There was a problem hiding this comment.
@copilot fix the WASM build
FAILED: libs-native/System.Native/CMakeFiles/System.Native-Static.dir/pal_process.c.o
/__w/1/s/src/mono/browser/emsdk/emscripten/emcc -DCOMPILER_SUPPORTS_W_RESERVED_IDENTIFIER -DDISABLE_CONTRACTS -DGEN_PINVOKE=1 -DHOST_32BIT=1 -DHOST_BROWSER -DHOST_UNIX -DHOST_WASM -DNDEBUG -DTARGET_32BIT -DTARGET_BROWSER -DTARGET_UNIX -DTARGET_WASM -DTARGET_WASM32 -DURTBLDENV_FRIENDLY=Retail -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 -I/__w/1/s/artifacts/obj/coreclr/browser.wasm.Release/libs-native/System.Native -I/__w/1/s/src/native/libs/System.Native -I/__w/1/s/src/native -I/__w/1/s/src/native/inc -std=gnu99 -Wno-declaration-after-statement -Wno-pre-c11-compat -O3 -DNDEBUG -std=gnu11 -fwasm-exceptions -mbulk-memory -msimd128 -O3 -Wall -Wno-null-conversion -glldb -Wno-unused-parameter -Wno-alloca -Wno-implicit-int-float-conversion -fno-omit-frame-pointer -fno-strict-overflow -fno-strict-aliasing -Werror -Wno-unused-variable -Wno-unused-value -Wno-unused-function -Wno-tautological-compare -Wno-unknown-pragmas -Wimplicit-fallthrough -Wno-unused-but-set-variable -ffp-contract=off -Wno-unknown-warning-option -ferror-limit=4096 -Wno-unused-private-field -Wno-constant-logical-operand -Wno-pragma-pack -Wno-incompatible-ms-struct -Wno-reserved-identifier -Wno-unsafe-buffer-usage -Wno-single-bit-bitfield-constant-conversion -Wno-cast-function-type-strict -Wno-switch-default -fsigned-char -fvisibility=hidden -ffunction-sections -I/__w/1/s/src/native/libs/Common -I/__w/1/s/artifacts/obj/coreclr/browser.wasm.Release/libs-native/Common -Oz -Weverything -Wno-format-nonliteral -Wno-disabled-macro-expansion -Wno-padded -Wno-empty-translation-unit -Wno-cast-align -Wno-typedef-redefinition -Wno-thread-safety-analysis -Wno-c11-extensions -Wno-pre-c11-compat -MD -MT libs-native/System.Native/CMakeFiles/System.Native-Static.dir/pal_process.c.o -MF libs-native/System.Native/CMakeFiles/System.Native-Static.dir/pal_process.c.o.d -o libs-native/System.Native/CMakeFiles/System.Native-Static.dir/pal_process.c.o -c /__w/1/s/src/native/libs/System.Native/pal_process.c
/__w/1/s/src/native/libs/System.Native/pal_process.c:537:12: error: call to undeclared function 'ForkAndExecProcessInternal'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
537 | return ForkAndExecProcessInternal(
| ^
/__w/1/s/src/native/libs/System.Native/pal_process.c:544:16: error: static declaration of 'ForkAndExecProcessInternal' follows non-static declaration
544 | static int32_t ForkAndExecProcessInternal(
| ^
/__w/1/s/src/native/libs/System.Native/pal_process.c:537:12: note: previous implicit declaration is here
537 | return ForkAndExecProcessInternal(
| ^
2 errors generated.
Move ForkAndExecProcessInternal forward declaration outside #if HAVE_PR_SET_PDEATHSIG so it's visible on all platforms (fixes WASM build). Simplify return NULL in PDeathSigThreadFunc by removing pragma guards. Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/0261d155-fd8a-43c3-80bc-9085dec73b97 Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
🤖 Copilot Code Review — PR #127112Note This review was AI/Copilot-generated using the code-review skill, with sub-agent analysis from Gemini 2.5 Pro and GPT-5.3-Codex. Holistic AssessmentMotivation: The PR extends Approach: The dedicated long-lived thread design is the correct approach. Since Summary: ✅ LGTM. All previous review findings have been addressed, including the concurrent-submit race (fixed in 0f92552). The code is correct and well-structured. API shape matches the approved proposal. Both sub-agent models verified the core correctness. Detailed Findings✅ API approval — VerifiedThe ✅ Thread safety of double-checked locking — Verified correct (both models agree)
✅ Mutex/condvar pattern — Verified correct (both models agree)The producer-consumer pattern between
✅ prctl placement — Verified correctOne sub-agent flagged that ✅ vfork safety — Verified correct (both models agree)
✅ Orphan check (
|
|
@jkotas the PR is ready for the first round of review. Please let me know what you think about this approach. |
The approach looks fine to me. |
Agent-Logs-Url: https://github.com/dotnet/runtime/sessions/4ca5bf12-28f7-4dc4-8f03-a60786fa9bfe Co-authored-by: adamsitnik <6011991+adamsitnik@users.noreply.github.com>
🤖 Copilot Code Review — PR #127112Note This review was AI/Copilot-generated using the code-review skill, with sub-agent analysis from Gemini 2.5 Pro and GPT-5.3-Codex. Holistic AssessmentMotivation: The PR extends Approach: The dedicated long-lived thread design is the correct approach. Since Summary: ✅ LGTM. All previous review findings have been addressed. The code is correct and well-structured. API shape matches the approved proposal. Both sub-agent models verified the core correctness. Detailed Findings✅ API approval — VerifiedThe ✅ Thread safety of double-checked locking — Verified correct (both models agree)
✅ Mutex/condvar pattern — Verified correct (both models agree)
✅ vfork safety — Verified correct
✅ Orphan detection — Verified correctCaptures ✅ Error handling in
|
adamsitnik
left a comment
There was a problem hiding this comment.
LGTM and is ready for review (cc @jkotas @tmds @davidfowl )
| setCredentials, userId, groupId, groups, | ||
| out childPid, stdinHandle, stdoutHandle, stderrHandle, | ||
| startInfo.StartDetached, inheritedHandles); | ||
| startInfo.StartDetached, OperatingSystem.IsLinux() ? startInfo.KillOnParentExit : false, inheritedHandles); |
There was a problem hiding this comment.
Is this needed? Would it be better to pass startInfo.KillOnParentExit to the native shim unconditionally so that we do not need to worry about keeping the native shim logic in sync with what's here?
| check_symbol_exists( | ||
| PR_SET_PDEATHSIG | ||
| "sys/prctl.h" | ||
| HAVE_PR_SET_PDEATHSIG) |
There was a problem hiding this comment.
Just wondering - are we going to detect the presence of this flag on Android?
Contributes to #101985
Approved API: #125838 (comment)
Description
Extends
ProcessStartInfo.KillOnParentExitwith Linux support usingprctl(PR_SET_PDEATHSIG, SIGKILL).On Linux,
PR_SET_PDEATHSIGsends the death signal when the thread that calledprctlexits, not when the process exits. To ensure the signal fires only on process exit, this implementation creates a dedicated long-lived thread that performs all fork+exec operations whenKillOnParentExitis set. The child process callsprctl(PR_SET_PDEATHSIG, SIGKILL)after fork, before exec. Afterprctl, the child verifies it hasn't been reparented (e.g., to a subreaper in a container) by comparinggetppid()against the expected parent PID captured before fork.Changes
Native layer (
pal_process.c,pal_process.h,pal_process_wasi.c)killOnParentExitparameter toSystemNative_ForkAndExecProcessprctl(PR_SET_PDEATHSIG, SIGKILL)after fork, before exec_Atomic intwithatomic_load_explicit/atomic_store_explicitfor double-checked lockinggetppid()matches after prctl, correctly detecting reparenting to subreapersForkAndExecProcessInternalforward declaration outside#if HAVE_PR_SET_PDEATHSIGso it is visible on all platforms (including WASM)pthread_cond_broadcast(notpthread_cond_signal) for done condition to prevent lost-wakeup deadlock when multiple callers wait concurrentlypthread_attr_initandpthread_attr_setdetachstatereturn values, with proper cleanup on failure (following the established pattern frompal_signal.c)expectedParentPiddeclaration in the variable declarations block before the firstgototo fix C++-Wjump-misses-initbuild errors_pdeathsig_request == NULL) before submitting, and each request has a per-requestdoneflag so callers wait on their own completion rather than a shared pointer becoming NULLassert(s_pdeathsig_request == NULL)before submitting request to the dedicated threadCMake / Config
HAVE_PR_SET_PDEATHSIGcheck inconfigure.cmake(checkssys/prctl.h)#cmakedefine01 HAVE_PR_SET_PDEATHSIGinpal_config.h.inManaged layer
killOnParentExitparameter toInterop.ForkAndExecProcess.csP/InvokestartInfo.KillOnParentExitthroughSafeProcessHandle.Unix.cs, gated withOperatingSystem.IsLinux()to passfalseon non-Linux Unix platforms[SupportedOSPlatform("linux")]to ref and implementationTests
KillOnParentExitTests.csto run onTestPlatforms.Windows | TestPlatforms.Linux