Security: Fragile Monkeypatching of asyncio Internals #5326
tang-vu wants to merge 1 commit into livekit:main
The `hook_slow_callbacks` function in `livekit.agents.utils.aio.debug` monkeypatches the private method `asyncio.events.Handle._run`. Relying on private internal APIs of the Python standard library is fragile and can lead to unexpected crashes or behavior changes when Python is updated. In a production environment, this could lead to service instability.
Affected files: debug.py
Signed-off-by: Tang Vu <145498528+tang-vu@users.noreply.github.com>
asyncio.events.Handle._run = instrumented  # type: ignore
loop = asyncio.get_event_loop()
loop.slow_callback_duration = slow_duration
loop.set_debug(True)
🔴 set_debug(True) unconditionally enables full asyncio debug mode, adding significant overhead beyond slow callback detection
The old implementation monkey-patched asyncio.events.Handle._run to precisely detect only slow callbacks, with no other side effects. The new implementation calls loop.set_debug(True) which is required for slow_callback_duration to take effect (CPython only checks slow callbacks inside _run_once when self._debug is True), but this also enables the full asyncio debug mode. This includes: storing creation tracebacks for all tasks/transports (memory overhead), logging destroyed pending tasks, extra resource-cleanup checks, and other diagnostics. The function is named hook_slow_callbacks suggesting it should be safe for production performance monitoring, but full debug mode introduces non-trivial overhead. The old implementation achieved targeted slow callback detection without any of these side effects.
Note the contrast with proc_client.py:63 where set_debug() is conditional on an explicit asyncio_debug flag, indicating the codebase treats debug mode as an opt-in concern separate from slow callback duration.
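The dependency described in this comment can be checked with a small experiment. The following is a minimal sketch, not code from this PR: the helper name `run_blocking_callback` and the 0.05 s threshold are assumptions chosen for illustration. It blocks the loop for about 0.1 s and collects what the standard `asyncio` logger emits, with debug mode on and off.

```python
import asyncio
import logging
import time


class _Collect(logging.Handler):
    """Collects formatted messages emitted on the 'asyncio' logger."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def run_blocking_callback(debug: bool) -> list[str]:
    """Hypothetical helper: run one blocking step and return asyncio log messages."""
    handler = _Collect()
    asyncio_logger = logging.getLogger("asyncio")
    asyncio_logger.addHandler(handler)
    loop = asyncio.new_event_loop()
    try:
        loop.set_debug(debug)
        loop.slow_callback_duration = 0.05  # threshold in seconds (illustrative)

        async def blocking() -> None:
            time.sleep(0.1)  # deliberately blocks the event loop

        loop.run_until_complete(blocking())
    finally:
        loop.close()
        asyncio_logger.removeHandler(handler)
    return handler.messages
```

With `debug=True` the collected messages should include a slow-callback warning of the form "Executing ... took ... seconds"; with `debug=False` the same `slow_callback_duration` setting produces no such warning, which is why the new implementation has to call `set_debug(True)`.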
Prompt for agents
The problem is that loop.set_debug(True) is needed for slow_callback_duration to work via the built-in asyncio mechanism, but it enables the full asyncio debug mode with significant overhead (task creation tracebacks, resource tracking, etc.). The old monkey-patching approach in this function was specifically designed to detect slow callbacks without enabling full debug mode.
Possible approaches:
1. Revert to the targeted monkey-patching approach from the old code, which only wrapped Handle._run with timing logic and logged via the agents logger. This gives slow callback detection without debug mode overhead.
2. If the built-in mechanism is preferred, document clearly that this function enables full asyncio debug mode as a side effect, and possibly rename the function to reflect its broader impact.
3. Consider a hybrid: keep the monkey-patching for slow callback detection but use the agents' own logger instead of relying on asyncio's built-in debug logging.
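Approaches 1 and 3 could look roughly like the sketch below. It is an illustration under stated assumptions, not the code removed by this PR: the function name `hook_slow_callbacks_sketch`, the logger name, and the returned `restore` callable are all hypothetical. It still depends on the private `asyncio.events.Handle._run`, so it carries the same fragility the PR describes, and it would not apply to event loops that use their own handle classes.

```python
import asyncio
import logging
import time
from typing import Any, Callable

# Hypothetical logger name; a real project would use its own logger.
logger = logging.getLogger("slow_callbacks_sketch")


def hook_slow_callbacks_sketch(slow_duration: float) -> Callable[[], None]:
    """Time every Handle._run call; return a function that undoes the patch."""
    # Private API: may change between Python versions.
    original_run = asyncio.events.Handle._run

    def instrumented(self: Any) -> Any:
        start = time.monotonic()
        try:
            return original_run(self)
        finally:
            elapsed = time.monotonic() - start
            if elapsed >= slow_duration:
                logger.warning(
                    "Running %r took too long: %.2f seconds", self, elapsed
                )

    asyncio.events.Handle._run = instrumented  # type: ignore[method-assign]

    def restore() -> None:
        asyncio.events.Handle._run = original_run  # type: ignore[method-assign]

    return restore
```

Returning a `restore` callable is a design choice made here so the patch can be undone in tests; it also makes the global side effect explicit at the call site. Unlike `loop.set_debug(True)`, this sketch leaves asyncio debug mode off.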
Problem
The `hook_slow_callbacks` function in `livekit.agents.utils.aio.debug` monkeypatches the private method `asyncio.events.Handle._run`. Relying on private internal APIs of the Python standard library is fragile and can lead to unexpected crashes or behavior changes when Python is updated. In a production environment, this could lead to service instability.
Severity: medium
File: livekit-agents/livekit/agents/utils/aio/debug.py
Solution
Avoid monkeypatching private members of the `asyncio` module. Use official debugging tools like `loop.set_debug(True)` or custom event loop policies if monitoring callback duration is necessary.
Changes
livekit-agents/livekit/agents/utils/aio/debug.py (modified)
Testing