Instrumentation approach: PLT hooking (GOT patching), NOT kernel syscall level
write/read hooks filter out non-socket fds via a lock-free fd-type cache; non-socket fds pass through at the cost of one atomic load
NOT in scope: sendto/recvfrom (UDP), connect, close, shutdown
Feature activation: explicit opt-in via profiler argument nativesocket
Rate limiting: ~83 events/s (~5000/min) via PID-controller-based subsampling using a standalone RateLimiter class
Separate JFR event type datadog.NativeSocketEvent
Linux required; macOS compiles as no-op stub
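A minimal sketch of what a lock-free fd-type cache for the write/read hooks could look like. The class name `FdTypeCache`, the table size, and the use of `getsockopt(SO_TYPE)` for classification are assumptions for illustration; the PR text does not say how the real cache classifies fds or handles fd reuse.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <sys/socket.h>

// Hypothetical sketch: names and sizes are illustrative, not the profiler's.
class FdTypeCache {
 public:
  // Returns true if fd is a SOCK_STREAM socket. For an fd that was already
  // classified, the hot path is a single relaxed atomic load.
  bool isStreamSocket(int fd) {
    if (fd < 0) return false;
    if (fd >= kMaxFds) return classify(fd) == kStream;  // too large to cache
    uint8_t t = types_[fd].load(std::memory_order_relaxed);
    if (t == kUnknown) {
      t = classify(fd);
      // Racing threads compute the same answer, so a plain store suffices.
      types_[fd].store(t, std::memory_order_relaxed);
    }
    return t == kStream;
  }

  // Forgets a cached answer, e.g. when the caller learns the fd was reused.
  void invalidate(int fd) {
    if (fd >= 0 && fd < kMaxFds) {
      types_[fd].store(kUnknown, std::memory_order_relaxed);
    }
  }

 private:
  enum : int { kMaxFds = 65536 };  // assumed bound
  enum : uint8_t { kUnknown = 0, kStream = 1, kOther = 2 };

  static uint8_t classify(int fd) {
    int type = 0;
    socklen_t len = sizeof(type);
    // getsockopt fails (ENOTSOCK) for regular files, pipes, and similar fds.
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) != 0) return kOther;
    assert(len == sizeof(type));
    return type == SOCK_STREAM ? kStream : kOther;
  }

  std::array<std::atomic<uint8_t>, kMaxFds> types_{};
};
```

A write hook would call `isStreamSocket(fd)` first and forward straight to the original libc function when it returns false. Because close is out of scope for the hooks, a real implementation would need some way to cope with fd numbers being reused; this sketch only exposes `invalidate` for that purpose.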
Sampling Model
Time-weighted Poisson-process sampling: the sampling unit is call duration in TSC ticks, not bytes
Probability of sampling a call of duration d with mean interval T: P = 1 - exp(-d/T)
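Inverse-probability weight: weight = 1/P; invariant: a call contributes weight × duration with probability P and 0 otherwise, so its expected contribution equals its duration, and sum(weight_i × duration_i) over the sampled calls is an unbiased estimator of total I/O time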
Long blocking calls (d >> T) have P ≈ 1 and are virtually always sampled regardless of interval magnitude; short fast calls are down-sampled
Per-thread PoissonSampler instances for outbound (send+write) and inbound (recv+read) directions; both read from a single shared RateLimiter
One shared RateLimiter covers all four hooks combined: ~83 events/s (~5000/min) total budget across both directions. A write-heavy workload raises the interval but does not suppress long blocking reads, because P → 1 when d >> T
PID controller adjusts interval_ticks once per second based on aggregate fire count across all threads
Epoch-based lazy TLS reset: _epoch is bumped on start(); per-thread samplers reinitialise lazily on next hook invocation
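The sampling model above can be sketched as follows, under stated assumptions. The class names match the description, but everything inside them is hypothetical: the dimensionless exponential "budget" countdown, the controller gains `kP`/`kI`/`kD`, the clamping limits, and the use of a `double` interval are illustrative choices, not the PR's actual code. The epoch-based TLS reset is left out for brevity.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <random>

// P(sample) = 1 - exp(-d/T) for a call of `duration` ticks at mean interval T.
inline double sampleProbability(double duration, double interval) {
  assert(duration >= 0.0 && interval > 0.0);
  return -std::expm1(-duration / interval);  // stays accurate when d << T
}

// Hypothetical per-thread sampler: each call consumes d/T from a unit-rate
// exponential budget; the call is sampled when the budget runs out.
class PoissonSampler {
 public:
  explicit PoissonSampler(uint64_t seed) : rng_(seed), budget_(nextExp()) {}

  // Returns weight = 1/P if the call is sampled, 0.0 otherwise.
  double sample(double duration, double interval) {
    double p = sampleProbability(duration, interval);
    budget_ -= duration / interval;
    if (budget_ > 0.0 || p <= 0.0) return 0.0;
    budget_ = nextExp();  // memoryless: restart the countdown after a fire
    return 1.0 / p;
  }

 private:
  double nextExp() {
    double u = (rng_() >> 11) * (1.0 / 9007199254740992.0);  // uniform [0, 1)
    return -std::log1p(-u);                                   // Exp(1) variate
  }
  std::mt19937_64 rng_;
  double budget_;
};

// Assumed controller gains; the source does not state the real ones.
constexpr double kP = 0.3, kI = 0.3, kD = 0.05;

// Hypothetical shared limiter: a positional PID loop on the log of the interval.
class RateLimiter {
 public:
  RateLimiter(double target_per_sec, double base_interval_ticks)
      : target_(target_per_sec), base_(base_interval_ticks),
        interval_(base_interval_ticks) {}

  void recordFire() { fires_.fetch_add(1, std::memory_order_relaxed); }
  double intervalTicks() const { return interval_.load(std::memory_order_relaxed); }

  // Meant to be called once per second from a single control thread.
  void adjust() {
    double observed =
        static_cast<double>(fires_.exchange(0, std::memory_order_relaxed));
    double error = observed / target_ - 1.0;  // > 0 means too many events
    integral_ = std::max(-60.0, std::min(60.0, integral_ + error));  // anti-windup
    double u = kP * error + kI * integral_ + kD * (error - prev_error_);
    prev_error_ = error;
    u = std::max(-20.0, std::min(20.0, u));
    interval_.store(base_ * std::exp(u), std::memory_order_relaxed);
  }

 private:
  const double target_, base_;
  std::atomic<double> interval_;
  std::atomic<uint64_t> fires_{0};
  double integral_ = 0.0, prev_error_ = 0.0;
};
```

In a hook, the flow would be roughly: time the libc call, pass the duration and `intervalTicks()` to the thread's sampler, and when the returned weight is non-zero, call `recordFire()` and emit an event carrying that weight. Summing weight × duration over the emitted events then estimates total I/O time.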
Acceptance Criteria
Socket I/O tracking can be explicitly enabled via profiler argument nativesocket
Instrumentation hooks libc send, recv, write, read via PLT hooking; write/read skip non-socket fds via a lock-free fd-type cache
Sampling is time-weighted (duration in TSC ticks): P = 1 - exp(-duration/interval), weight = 1/(1-exp(-duration/interval))
PID-based rate control targets ~83 events/s (~5000/min) combined across all four hooks; rate is not user-configurable
Outbound (send+write) and inbound (recv+read) use separate per-thread PoissonSampler instances sharing one RateLimiter
datadog.NativeSocketEvent fields: eventThread, stackTrace, startTime, duration, operation (SEND/RECV), remoteAddress (ip:port string), bytesTransferred (u64), weight (float)
Remote address looked up via getpeername() and cached per fd; cache bounded to 65536 entries
UDP (sendto/recvfrom) and connection/close operations are NOT tracked
sum(weight × duration) is positive and within reasonable bounds of wall time
Overall scope: track blocking I/O operations at the libc function level for send, recv, write, read (TCP/SOCK_STREAM only)
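As an illustration of the remoteAddress criterion (an ip:port string obtained through getpeername() and cached per fd), here is a hedged sketch. `formatSockaddr` and `RemoteAddressCache` are hypothetical names; the bracketed `[ip]:port` form for IPv6, the mutex-based synchronisation, and the policy of simply not caching once 65536 entries exist are assumptions, since the description does not specify them.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <netinet/in.h>
#include <string>
#include <sys/socket.h>
#include <unordered_map>

// Formats an IPv4 address as "ip:port" and an IPv6 address as "[ip]:port".
// Returns an empty string for any other address family.
std::string formatSockaddr(const sockaddr* sa) {
  assert(sa != nullptr);
  char ip[INET6_ADDRSTRLEN] = {0};
  if (sa->sa_family == AF_INET) {
    const sockaddr_in* in4 = reinterpret_cast<const sockaddr_in*>(sa);
    if (inet_ntop(AF_INET, &in4->sin_addr, ip, sizeof(ip)) == nullptr) return "";
    return std::string(ip) + ":" + std::to_string(ntohs(in4->sin_port));
  }
  if (sa->sa_family == AF_INET6) {
    const sockaddr_in6* in6 = reinterpret_cast<const sockaddr_in6*>(sa);
    if (inet_ntop(AF_INET6, &in6->sin6_addr, ip, sizeof(ip)) == nullptr) return "";
    return "[" + std::string(ip) + "]:" + std::to_string(ntohs(in6->sin6_port));
  }
  return "";
}

// Hypothetical bounded per-fd cache of formatted peer addresses.
class RemoteAddressCache {
 public:
  // Returns the cached "ip:port" for fd, resolving it on first use.
  // Returns an empty string if the peer cannot be resolved.
  std::string lookup(int fd) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = cache_.find(fd);
    if (it != cache_.end()) return it->second;

    sockaddr_storage ss{};
    socklen_t len = sizeof(ss);
    std::string addr;
    if (getpeername(fd, reinterpret_cast<sockaddr*>(&ss), &len) == 0) {
      addr = formatSockaddr(reinterpret_cast<const sockaddr*>(&ss));
    }
    // Once the bound is reached, new fds are resolved but no longer cached.
    if (!addr.empty() && cache_.size() < kMaxEntries) cache_.emplace(fd, addr);
    return addr;
  }

 private:
  static constexpr std::size_t kMaxEntries = 65536;
  std::mutex mutex_;
  std::unordered_map<int, std::string> cache_;
};
```

For example, a `sockaddr_in` holding 127.0.0.1 and port 8080 formats as `127.0.0.1:8080`. Since close is not hooked, a cached entry can go stale when an fd number is reused, so a real implementation would need an invalidation strategy that this sketch does not attempt.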