Summary
executor.execute() can return status: RUNNING with empty output even when the user's worker has thrown an exception. This is correct behavior — it is not a bug — but it is confusing for first-time users who don't know how Conductor's retry system works.
The docs should explain the relationship between wait_for_seconds, task retry delays, and what to do when a workflow appears stuck.
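To make the timing concrete, here is a rough sketch of the arithmetic, not SDK code. It assumes a fixed delay between retries, a worker that fails on every attempt, and a retry count of 3. The retry count and the one-second attempt duration are assumptions for illustration; only the 10-second wait and 60-second retry delay come from this issue. The helper names `retry_timeline` and `status_at` are hypothetical.

```python
def retry_timeline(attempt_seconds, retry_delay_seconds, retry_count):
    """Seconds from workflow start at which each attempt fails.

    Assumes a fixed retry delay and that every attempt fails.
    Ignores queueing and worker polling latency.
    """
    failures = []
    clock = 0.0
    for attempt in range(retry_count + 1):  # first try plus the retries
        clock += attempt_seconds
        failures.append(clock)
        if attempt < retry_count:
            clock += retry_delay_seconds  # wait before the next attempt
    return failures


def status_at(wait_for_seconds, failures):
    """Status a caller would observe when its wait expires."""
    # The workflow only becomes FAILED after the last attempt has failed.
    return "FAILED" if wait_for_seconds >= failures[-1] else "RUNNING"


failures = retry_timeline(attempt_seconds=1, retry_delay_seconds=60, retry_count=3)
print(failures)                  # [1.0, 62.0, 123.0, 184.0]
print(status_at(10, failures))   # RUNNING
print(status_at(200, failures))  # FAILED
```

Under these assumptions the first failure happens after about one second, but the workflow does not reach a terminal state until roughly three minutes later, so a 10-second wait observes RUNNING.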
What's missing
- wait_for_seconds vs retry delay: The default wait_for_seconds=10 is shorter than the default task retry delay (retryDelaySeconds=60). A workflow with a failing task will almost always return RUNNING if you use the default. This is never explained anywhere near the execute() API.
- How to debug a stuck workflow: There's no documented path for "my workflow returned RUNNING and never completed, now what?" New users don't know to:
  - Increase wait_for_seconds to outlast the retry cycle
  - Check the Conductor UI for task failure details
  - Use get_workflow(id, include_tasks=True) to inspect task statuses programmatically
  - Read the TaskHandler background logs for the worker exception traceback
- reason_for_incompletion is deprecated with no documented replacement: The most obvious field to check is deprecated and emits a warning, with no pointer to an alternative.
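As an illustration of the programmatic inspection step, the sketch below filters a task list for failed attempts. Plain dictionaries stand in for whatever get_workflow(id, include_tasks=True) returns. The key names (`ref`, `status`, `retry_count`) and the helper `failing_tasks` are hypothetical and are not the SDK's field names, so treat this as a shape to adapt rather than working client code.

```python
# Task statuses that indicate an attempt did not succeed.
PROBLEM_STATUSES = {"FAILED", "FAILED_WITH_TERMINAL_ERROR", "TIMED_OUT"}


def failing_tasks(tasks):
    """Return (reference name, status, retry count) for unsuccessful attempts."""
    return [
        (task["ref"], task["status"], task.get("retry_count", 0))
        for task in tasks
        if task["status"] in PROBLEM_STATUSES
    ]


# Hypothetical snapshot: the first attempt failed and a retry is queued,
# which is why the workflow as a whole still reports RUNNING.
snapshot = [
    {"ref": "greet_ref", "status": "FAILED", "retry_count": 0},
    {"ref": "greet_ref", "status": "SCHEDULED", "retry_count": 1},
]

print(failing_tasks(snapshot))  # [('greet_ref', 'FAILED', 0)]
```

A failed attempt sitting next to a scheduled retry of the same task is the signature of the situation described here: the worker threw, and the workflow is waiting out the retry delay.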
Suggested doc additions
- In the execute() docstring: note that RUNNING is the normal return when wait_for_seconds expires before the workflow completes, and that task retries extend this window significantly.
- In the quickstart / README: a short "debugging a stuck workflow" section covering the Conductor UI, get_workflow(), and worker logs.
- Mention wait_for_seconds alongside the retry defaults so users can set it appropriately for their task definition.
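A debugging section could also show how to keep checking after execute() has returned RUNNING. The following is a minimal sketch of such a polling loop. `fetch_status` is a hypothetical callable that the reader would back with their own status lookup (for example one built on get_workflow()), and `wait_until_terminal` is not part of the SDK.

```python
import time

# Workflow statuses after which no further progress is expected.
TERMINAL = {"COMPLETED", "FAILED", "TIMED_OUT", "TERMINATED"}


def wait_until_terminal(fetch_status, timeout_seconds, poll_seconds=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it is terminal or the timeout passes.

    Returns the last status seen, which may still be RUNNING on timeout.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        if status in TERMINAL or clock() >= deadline:
            return status
        sleep(poll_seconds)


# Stand-in for a real lookup: two RUNNING polls, then the retries run out.
statuses = iter(["RUNNING", "RUNNING", "FAILED"])
result = wait_until_terminal(lambda: next(statuses), timeout_seconds=5,
                             poll_seconds=0, sleep=lambda seconds: None)
print(result)  # FAILED
```

The timeout passed here should be chosen with the task definition's retry count and delay in mind, for the same reason wait_for_seconds should.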
Background
Originally filed as #41 (closed — behavior is correct, not a bug).