[NativeAOT] Add cDAC data descriptor infrastructure#126972

Merged
max-charlamb merged 29 commits into main from dev/max-charlamb/managed-type-descriptors on May 1, 2026

@max-charlamb
Member

@max-charlamb max-charlamb commented Apr 15, 2026

Note

This PR was created with assistance from GitHub Copilot.

Summary

Adds the cDAC data descriptor infrastructure for NativeAOT, enabling diagnostic tools (cDAC reader, SOS) to inspect NativeAOT runtime state through the same contract-based mechanism used by CoreCLR.

Changes

Native data descriptor (datadescriptor.inc)

  • Thread/ThreadStore: Thread state, OS ID, exception tracker, stack bounds, alloc context, transition frame, thread link
  • EEAllocContext/GCAllocContext: Allocation pointer, limit, bytes allocated
  • MethodTable (EEType): Flags, base size, related type, vtable slots, interfaces, hash code — with flag constants exposed via cdac_data<> friend pattern
  • ExInfo: Exception linked list traversal
  • StressLog/ThreadStressLog: Stress log infrastructure (guarded by STRESS_LOG)
  • Globals: ThreadStore static pointer, free object MethodTable, GC bounds, thread state flags, object unmask, stress log
  • Contracts: Thread (n1), Exception (c1), RuntimeTypeSystem (n1), StressLog (c2)
  • Sub-descriptors: GC (workstation + server) and managed type descriptors
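
The `cdac_data<>` friend pattern mentioned above can be sketched roughly as follows. This is a minimal illustration only: `Widget`, its fields, and the flag constant are hypothetical and are not taken from the NativeAOT sources. What it shows is the general shape, where a class befriends a template specialization so that the specialization can publish private offsets and constants.

```cpp
#include <cstddef>
#include <cstdint>

// Primary template: specialized once per type that exposes data.
template <typename T>
struct cdac_data;

// Hypothetical runtime type with private state (not a real NativeAOT class).
class Widget
{
    uint32_t m_flags = 0;
    uint32_t m_baseSize = 0;
    void*    m_relatedType = nullptr;

    static constexpr uint32_t kHasFinalizerFlag = 0x1; // hypothetical flag value

    // Grant the descriptor specialization access to the private members.
    friend struct cdac_data<Widget>;
};

// The specialization publishes offsets and constants without making
// the members themselves public.
template <>
struct cdac_data<Widget>
{
    static constexpr size_t Flags       = offsetof(Widget, m_flags);
    static constexpr size_t BaseSize    = offsetof(Widget, m_baseSize);
    static constexpr size_t RelatedType = offsetof(Widget, m_relatedType);
    static constexpr uint32_t HasFinalizerFlag = Widget::kHasFinalizerFlag;
};
```

A descriptor file would then refer to `cdac_data<Widget>::Flags` rather than to the private member directly, which keeps the class's public surface unchanged.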

ILC managed type descriptor (ManagedDataDescriptorNode)

  • Computes managed type field offsets at compile time in ILC
  • Emits a ContractDescriptor (DotNetManagedContractDescriptor) with JSON-encoded type layouts using Utf8JsonWriter
  • Types and fields discovered via [DataContract] attribute on types in MetadataManager.GetTypesWithEETypes()
  • Type name mangling: System.Threading.Thread -> System_Threading_Thread
  • Referenced by the native descriptor as a sub-descriptor via CDAC_GLOBAL_SUB_DESCRIPTOR
  • Currently registers System.Threading.Thread fields (ManagedThreadId, Name)
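
The name mangling in the list above (`System.Threading.Thread` -> `System_Threading_Thread`) can be approximated with a single character replacement. This is a sketch under the assumption that only `.` separators are rewritten; `MangleTypeName` is a hypothetical helper, and the actual ILC logic may treat nested or generic types differently.

```cpp
#include <algorithm>
#include <string>

// Replace '.' namespace separators with '_' so the name can serve as a
// descriptor type key. Hypothetical helper, not the ILC implementation.
std::string MangleTypeName(std::string name)
{
    std::replace(name.begin(), name.end(), '.', '_');
    return name;
}
```

Calling `MangleTypeName("System.Threading.Thread")` yields `System_Threading_Thread`, matching the example given in the description.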

GC sub-descriptor

  • Enabled GC sub-descriptor for NativeAOT by setting GC_INTERFACE_*_VERSION before GC_Initialize
  • Added GC_DESCRIPTOR compile definition (guarded on non-WASM)
  • Linked both WKS and SVR GC descriptor objects into Runtime.ServerGC (ServerGC compiles both paths)
  • Added #ifdef HEAP_ANALYZE guards in shared GC datadescriptor files (NativeAOT disables HEAP_ANALYZE)

Attribute-based type discovery

  • [DataContract] attribute in System.Diagnostics namespace (internal, targets Class/Struct/Field)
  • Applied to System.Threading.Thread fields in Thread.NativeAot.cs
  • ILC scans for annotated types in GetTypesWithEETypes() ensuring only types with MethodTables are included

Build integration

  • CMake integration using shared clrdatadescriptors.cmake infrastructure
  • nativeaot_runtime_includes interface library captures all Runtime include paths for cross-target compilation
  • Separate GC descriptor targets for workstation and server GC
  • cdac-build-tool enabled for NativeAOT via ClrNativeAotSubset in runtime.proj
  • Symbol export via --export-dynamic-symbol in Microsoft.NETCore.Native.targets (WASM excluded)
  • Local copy of cdacdata.h template in Runtime/inc/ (matching GC pattern for self-contained builds)

Key design decisions

  • Contract versions: n1 for NativeAOT-specific contracts, c1/c2 for contracts shared with CoreCLR (same version)
  • ThreadStore: Uses SPTR_DECL/SPTR_IMPL for s_pThreadStore static member, matching CoreCLR pattern
  • Singleton node: ManagedDataDescriptorNode does not override CompareToImpl — follows the ILC singleton pattern (base class throws on duplicates)
  • SList: Unified slist.h shared between CoreCLR VM and NativeAOT Runtime

Validation

  • Build: build.cmd clr.aot+libs -rc release — 0 errors, 0 warnings
  • Symbol verified in Runtime.WorkstationGC.lib via dumpbin
  • cDAC reader tests: 1586/1586 passed
  • tools.cdac tests: All passed
  • Dump inspection: All 3 sub-descriptors verified (main: 4 contracts/11 types/20 globals, managed: System_Threading_Thread with fields, GC: 1 contract/10 types/41 globals)

@dotnet-policy-service
Contributor

Tagging subscribers to this area: @agocke, @dotnet/ilc-contrib
See info in area-owners.md if you want to be subscribed.

Contributor

Copilot AI left a comment

Pull request overview

Adds cDAC contract descriptor generation to the NativeAOT runtime, plus an ILC-emitted managed sub-descriptor so diagnostic tools can inspect NativeAOT runtime/managed state via the shared contract mechanism.

Changes:

  • Integrates NativeAOT cDAC contract descriptor (and GC sub-descriptors) into the NativeAOT CMake build and runtime libraries.
  • Introduces a managed type layout sub-descriptor emitted by ILC (DotNetManagedContractDescriptor) and wires it into the NativeAOT descriptor as a sub-descriptor.
  • Exposes select private NativeAOT runtime offsets/constants to the descriptor via the cdac_data<T> friend pattern and exports the main contract descriptor symbol for diagnostics.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/coreclr/tools/aot/ILCompiler/Program.cs | Adds the managed descriptor root provider to ILC compilation roots. |
| src/coreclr/tools/aot/ILCompiler.Compiler/ILCompiler.Compiler.csproj | Includes new managed descriptor provider/node sources in the build. |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/ManagedDataDescriptorProvider.cs | Registers managed types to be described and roots the descriptor + JSON blob. |
| src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/ManagedDataDescriptorNode.cs | Emits a ContractDescriptor-shaped symbol containing JSON type layout data. |
| src/coreclr/nativeaot/Runtime/threadstore.h | Exposes ThreadStore private offsets for descriptor generation via cdac_data<>. |
| src/coreclr/nativeaot/Runtime/inc/MethodTable.h | Exposes MethodTable offsets and flag constants for descriptor consumption via cdac_data<>. |
| src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc | Defines the NativeAOT data descriptor types/globals/contracts and sub-descriptors. |
| src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.h | Provides includes and declares the managed sub-descriptor symbol address for inclusion. |
| src/coreclr/nativeaot/Runtime/datadescriptor/CMakeLists.txt | Adds descriptor generation targets for NativeAOT runtime + GC (wks/svr). |
| src/coreclr/nativeaot/Runtime/RuntimeInstance.h | Exposes RuntimeInstance private offsets via cdac_data<>. |
| src/coreclr/nativeaot/Runtime/Full/CMakeLists.txt | Links the generated descriptor libraries into WorkstationGC/ServerGC runtime libs. |
| src/coreclr/nativeaot/Runtime/CMakeLists.txt | Adds the datadescriptor subdirectory to the NativeAOT runtime build (non-WASM). |
| src/coreclr/nativeaot/BuildIntegration/Microsoft.NETCore.Native.targets | Exports DotNetRuntimeContractDescriptor symbol for diagnostics on all OSes. |
Comments suppressed due to low confidence (1)

src/coreclr/nativeaot/Runtime/datadescriptor/CMakeLists.txt:73

  • target_compile_definitions entries should be raw preprocessor symbols (e.g., SERVER_GC), not compiler flags. Passing -DSERVER_GC here will typically result in an invalid definition being forwarded to the compiler. Use SERVER_GC (or SERVER_GC=1) instead.

Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc Outdated
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.h Outdated
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc Outdated
Comment thread src/coreclr/nativeaot/BuildIntegration/Microsoft.NETCore.Native.targets Outdated
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc
Copilot AI review requested due to automatic review settings April 16, 2026 20:49
@max-charlamb max-charlamb force-pushed the dev/max-charlamb/managed-type-descriptors branch from 9462d5c to f226bc3 April 16, 2026 20:49
@max-charlamb max-charlamb deleted the dev/max-charlamb/managed-type-descriptors branch April 16, 2026 20:53
@max-charlamb max-charlamb restored the dev/max-charlamb/managed-type-descriptors branch April 16, 2026 20:54
@max-charlamb max-charlamb reopened this Apr 16, 2026
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.

Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/CMakeLists.txt Outdated
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/CMakeLists.txt
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/CMakeLists.txt
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc
Copilot AI review requested due to automatic review settings April 17, 2026 16:57
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.

Comment thread src/coreclr/nativeaot/Runtime/RuntimeInstance.h
Comment thread src/coreclr/nativeaot/Runtime/RuntimeInstance.cpp Outdated
Copilot AI review requested due to automatic review settings April 17, 2026 19:44
Walk ContainingType chain to produce fully-qualified names
with + separator (e.g., System.Foo.Outer+Inner).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.h Outdated
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 28, 2026 18:35
Member

@jkotas jkotas left a comment

@MichalStrehovsky Could you please sign off as well?

Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc Outdated
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc Outdated
Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 1 comment.

…ed descriptor

- Remove MethodTable flag constant globals from datadescriptor.inc
  and cdac_data<MethodTable> in MethodTable.h — these are already
  defined as part of the contract in MethodTableFlags_1.cs
- Add baseline and contracts properties to managed sub-descriptor
  JSON for self-describing format consistency

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

🤖 Copilot Code Review — PR #126972

Note

This review was generated by GitHub Copilot.

Holistic Assessment

Motivation: This PR adds cDAC (data access component) data descriptor infrastructure to NativeAOT, enabling diagnostic tools (debuggers, crash dump analyzers) to inspect NativeAOT runtime state without symbols. This is well-motivated — it's a prerequisite for cDAC support in NativeAOT, analogous to what already exists for CoreCLR.

Approach: The approach is sound — it reuses the existing generate_data_descriptors() CMake infrastructure and shared datadescriptor.cpp machinery. Moving ThreadStore from RuntimeInstance::m_pThreadStore to a static ThreadStore::s_pThreadStore matches the CoreCLR pattern. The managed type descriptor emitted by ILC as a sub-descriptor integrates cleanly with the existing ContractDescriptorParser. The HEAP_ANALYZE guards fix real compilation errors for NativeAOT GC builds.

Summary: ⚠️ Needs Human Review. The implementation is largely correct and well-structured, but there are design questions around contract versioning (n1 vs c1) and its interaction with the cDAC reader that a domain expert should verify. A human reviewer should confirm whether n1 contracts are intentionally non-functional placeholders or need corresponding reader support.


Detailed Findings

⚠️ Contract Versions — n1 not registered in cDAC reader (advisory, not merge-blocking)

The NativeAOT descriptor declares:

CDAC_GLOBAL_CONTRACT(Thread, n1)
CDAC_GLOBAL_CONTRACT(RuntimeTypeSystem, n1)

However, the managed cDAC reader (CoreCLRContracts.cs:38) only registers c1 versions:

registry.Register<IThread>("c1", static t => new Thread_1(t));

There is no n1 handler anywhere in src/native/managed/cdac/. This means these contracts will not be resolved when diagnosing a NativeAOT process. If this is intentional (placeholder for future NativeAOT-specific contract implementations), consider adding a comment. If it's expected to work now, corresponding contract factories are needed.

Files: src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc (lines ~155-158)
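
The resolution failure described in this finding can be illustrated with a small version-keyed registry. This is a sketch under stated assumptions: `ContractRegistry`, `Contract`, and `MakeReaderRegistry` are hypothetical stand-ins written for this example, not the reader's actual types. It only shows why a descriptor that advertises `n1` finds no handler when only `c1` has been registered.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a contract implementation.
struct Contract
{
    std::string description;
};

class ContractRegistry
{
public:
    using Factory = std::function<Contract()>;
    using Key = std::pair<std::string, std::string>; // (contract name, version)

    void Register(const std::string& name, const std::string& version, Factory factory)
    {
        m_factories[Key{name, version}] = std::move(factory);
    }

    // Returns false when no factory matches the advertised version.
    bool TryResolve(const std::string& name, const std::string& version, Contract& result) const
    {
        auto it = m_factories.find(Key{name, version});
        if (it == m_factories.end())
            return false;
        result = it->second();
        return true;
    }

private:
    std::map<Key, Factory> m_factories;
};

// Mirrors the situation in the finding: only "c1" is registered for Thread.
inline ContractRegistry MakeReaderRegistry()
{
    ContractRegistry registry;
    registry.Register("Thread", "c1", [] { return Contract{"Thread_1"}; });
    return registry;
}
```

With this registry, `TryResolve("Thread", "c1", c)` succeeds while `TryResolve("Thread", "n1", c)` returns false, which is the gap the review asks a human to confirm as intentional.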

✅ HEAP_ANALYZE guards — Correct fix

HEAP_ANALYZE is only defined when FEATURE_NATIVEAOT is NOT set (gcpriv.h:200-203). Without these guards, the GC data descriptor would fail to compile for NativeAOT. The guards are correctly placed in both datadescriptor.h and datadescriptor.inc, with proper #endif comments.

✅ ThreadStore refactoring — Correct and well-versioned

Moving m_pThreadStore from RuntimeInstance to ThreadStore::s_pThreadStore is consistent with CoreCLR's cDAC pattern. The DebugHeader major version is correctly bumped from 5→6 with appropriate documentation. The SPTR_DECL/SPTR_IMPL pattern matches existing DAC infrastructure. The initialization order in RuntimeInstance::Initialize() correctly assigns the static after g_pTheRuntimeInstance is set.

✅ ManagedDataDescriptorNode — JSON format matches reader expectations

The emitted JSON uses:

  • "!" sigil for value type sizes (matches TypeDescriptorSizeSigil in ContractDescriptorParser)
  • Plain numbers for field offsets (matches FieldDescriptorConverter compact format)
  • "version": 0, "baseline": "empty" top-level properties (match ContractDescriptor schema)
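
To make the compact format concrete, the following sketch emits JSON in the shape the bullets describe: a `"!"` entry for a type's size, plain numbers for field offsets, and the `"version"` and `"baseline"` top-level properties. `TypeLayout` and `EmitDescriptorJson` are hypothetical names, the sketch does no string escaping, and it treats a size of 0 as "not reported"; none of that is taken from `ManagedDataDescriptorNode`.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one type: a name, an optional size, and
// (field name, offset) pairs.
struct TypeLayout
{
    std::string name;
    uint32_t size; // 0 means "size not reported" in this sketch
    std::vector<std::pair<std::string, uint32_t>> fields;
};

// Emits the compact form. Assumes names need no JSON escaping.
std::string EmitDescriptorJson(const std::vector<TypeLayout>& types)
{
    std::ostringstream out;
    out << "{\"version\":0,\"baseline\":\"empty\",\"types\":{";
    for (size_t i = 0; i < types.size(); ++i)
    {
        const TypeLayout& t = types[i];
        if (i != 0)
            out << ',';
        out << '"' << t.name << "\":{";
        bool first = true;
        if (t.size != 0)
        {
            out << "\"!\":" << t.size; // "!" sigil carries the type size
            first = false;
        }
        for (const auto& field : t.fields)
        {
            if (!first)
                out << ',';
            out << '"' << field.first << "\":" << field.second; // plain number = offset
            first = false;
        }
        out << '}';
    }
    out << "}}";
    return out.str();
}
```

Keeping offsets as bare numbers and reserving a single sigil key for the size keeps the blob small, which matters when it is embedded in every compiled binary.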

The ContractDescriptor C struct layout (magic, flags, descriptor_size, descriptor ptr, pointer_data_count, pad, pointer_data ptr) matches the shared contract-descriptor.h definition.
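
The field order listed above can be written down as a plain struct. In this sketch the field order follows the review text, but the field widths and the comments on each field's purpose are assumptions made for illustration and are not copied from contract-descriptor.h.

```cpp
#include <cstddef>
#include <cstdint>

// Field order follows the review's description; widths are assumed.
struct ContractDescriptor
{
    uint64_t         magic;              // identifies the blob as a descriptor
    uint32_t         flags;
    uint32_t         descriptor_size;    // byte length of the JSON text
    const char*      descriptor;         // pointer to the JSON text
    uint32_t         pointer_data_count; // number of entries in pointer_data
    uint32_t         pad;                // explicit padding before the pointer
    const uintptr_t* pointer_data;       // table of pointers used by the descriptor
};

// With these widths, the fixed header occupies 16 bytes and the explicit
// pad leaves no compiler-inserted gap before pointer_data on 32-bit or
// 64-bit targets.
static_assert(offsetof(ContractDescriptor, descriptor) == 16,
              "magic, flags and descriptor_size occupy 16 bytes");
static_assert(offsetof(ContractDescriptor, pointer_data)
                  == offsetof(ContractDescriptor, pad) + sizeof(uint32_t),
              "pad is immediately followed by pointer_data");
```

Checking the layout with `static_assert` is one way an emitter on the ILC side and a definition on the native side could be kept in agreement at build time.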

✅ WASM exclusion — Consistent

WASM is excluded via if(NOT CLR_CMAKE_TARGET_ARCH_WASM) for both the GC_DESCRIPTOR define and the datadescriptor subdirectory in CMake, and via '$(_targetOS)' != 'browser' for the export in MSBuild targets. This matches the broader WASM exclusion pattern in the NativeAOT Runtime CMakeLists.txt.

✅ GC version initialization — Correct

Adding g_gc_dac_vars.major_version_number and minor_version_number before GC_Initialize matches the CoreCLR pattern and ensures the GC sub-descriptor has version information.

✅ Build system integration — Well structured

The new datadescriptor/CMakeLists.txt correctly uses include(${CLR_DIR}/clrdatadescriptors.cmake), creates separate interface libraries for WKS/SVR GC descriptors, uses EXPORT_VISIBLE only for the main contract descriptor, and properly propagates include directories via nativeaot_runtime_includes.

💡 ManagedDataDescriptorProvider unconditionally added for WASM

ManagedDataDescriptorProvider is always added in Program.cs (lines 266, 278), even for WASM targets where the native datadescriptor isn't built. The ILC-emitted DotNetManagedContractDescriptor symbol is unused dead data on WASM. Non-blocking, but could be gated on !TargetsBrowser for binary size if desired. (Follow-up improvement, not in-scope for this PR.)

💡 DataContractAttribute naming overlap

System.Diagnostics.DataContractAttribute shares its short name with System.Runtime.Serialization.DataContractAttribute. No actual conflict exists (different namespaces, the new one is internal), but it could cause momentary confusion. The naming aligns with cDAC "data contract" terminology so it's appropriate — just noting for awareness.

Generated by Code Review for issue #126972

Member

@MichalStrehovsky MichalStrehovsky left a comment

Looks good otherwise! What is the testing strategy for this? The DotNetRuntimeDebugHeader was pretty much untested because the code to read it lived elsewhere. Do we have the ability to test this in the dotnet/runtime repo?

Comment thread src/coreclr/tools/aot/ILCompiler/Program.cs Outdated
@max-charlamb
Member Author

> Looks good otherwise! What is the testing strategy for this? The DotNetRuntimeDebugHeader was pretty much untested because the code to read it lived elsewhere. Do we have the ability to test this in the dotnet/runtime repo?

Some of these values are read by the existing cDAC contracts; others will be read by new contracts (in a different repo). We don't have automated tests yet, but that is one of the next items I am working on.

- Simplify GetSection to always use ReadOnlyDataSection
- Add Debug.Assert for header size before emitting JSON
- Remove Phase override (default unordered is fine)
- Gate ManagedDataDescriptorProvider on EnableDebugInfo

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 30, 2026 18:33
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 2 comments.

Comment thread src/coreclr/nativeaot/Runtime/datadescriptor/datadescriptor.inc
Comment thread src/coreclr/nativeaot/Runtime/DebugHeader.cpp
Revert GetSection to use DataSection on non-Windows platforms.
Nodes with pointer relocations require writable sections on ELF.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@max-charlamb max-charlamb merged commit e883467 into main May 1, 2026
110 checks passed
@max-charlamb max-charlamb deleted the dev/max-charlamb/managed-type-descriptors branch May 1, 2026 05:02