
Add Workflow Streams library #1423

Open
jssmith wants to merge 96 commits into main from contrib/pubsub

Conversation

Contributor

jssmith commented Apr 7, 2026

What was changed

Adds temporalio.contrib.workflow_stream, a reusable primitive for streaming data through Temporal workflows. The module and its integrations with the OpenAI Agents and Google ADK plugins are marked experimental. Plugin streaming is opt-in: callers must set streaming_event_topic to enable publishing.

Why?

Streaming incremental results from long-running workflows (e.g., AI agent token streams, progress updates) is a common need with no built-in solution. This module provides a correct, reusable implementation so users don't have to roll their own poll/signal/dedup logic.

Checklist

  1. Closes — N/A (new contrib module, no existing issue)

  2. How was this tested:

    • 29 pytest tests in tests/contrib/workflow_stream/test_workflow_stream.py covering batching, flush safety, CAN serialization, replay guards, dedup (TTL pruning, truncation), offset-based resumption, max_batch_size, drain, and error handling, plus a payload round-trip prototype test
    • Demo application
    • Shared with prospective users
    • 8-hour load test
  3. Any docs updates needed?

    • Module includes README.md with usage examples and API reference
    • Design doc: DESIGN.md (covers CAN, dedup, and topic semantics)
    • docs.temporal.io updates are prepared on a separate branch and will land soon

jssmith and others added 15 commits April 5, 2026 21:33
A workflow mixin (PubSubMixin) that turns any workflow into a pub/sub
broker. Activities and starters publish via batched signals; external
clients subscribe via long-poll updates exposed as an async iterator.

Key design decisions:
- Payloads are opaque bytes for cross-language compatibility
- Topics are plain strings, no hierarchy or prefix matching
- Global monotonic offsets (not per-topic) for simple continuation
- Batching built into PubSubClient with Nagle-like timer + priority flush
- Structured concurrency: no fire-and-forget tasks, trio-compatible
- Continue-as-new support: drain_pubsub() + get_pubsub_state() + validator
  to cleanly drain polls, plus follow_continues on the subscriber side

Module layout:
  _types.py  — PubSubItem, PublishInput, PollInput, PollResult, PubSubState
  _mixin.py  — PubSubMixin (signal, update, query handlers)
  _client.py — PubSubClient (batcher, async iterator, CAN resilience)

9 E2E integration tests covering: activity publish + subscribe, topic
filtering, offset-based replay, interleaved workflow/activity publish,
priority flush, iterator cancellation, context manager flush, concurrent
subscribers, and mixin coexistence with application signals/queries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
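For orientation, a minimal sketch of the shape this commit describes. Names follow the commit message, but signatures and constructor details are assumptions, not the final API (later commits rename for_workflow to create and the whole module to workflow_stream):

```python
from temporalio import workflow
from temporalio.contrib.pubsub import PubSubClient, PubSubMixin


@workflow.defn
class BrokerWorkflow(PubSubMixin):
    @workflow.init
    def __init__(self) -> None:
        self.init_pubsub()  # installs the publish signal and poll update handlers

    @workflow.run
    async def run(self) -> None:
        self.publish("progress", b"step 1 done")  # workflow-side publish
        await workflow.wait_condition(lambda: False)  # stay alive for subscribers


async def consume(client, workflow_id: str) -> None:
    # External subscriber: long-poll updates surfaced as an async iterator.
    stream = PubSubClient.for_workflow(client, workflow_id)
    async for item in stream.subscribe(topic="progress", from_offset=0):
        print(item.topic, item.data)  # payloads are opaque bytes
```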
PubSubState is now a Pydantic model so it survives serialization through
Pydantic-based data converters when embedded in Any-typed fields. Without
this, continue-as-new would fail with "'dict' object has no attribute 'log'"
because Pydantic deserializes Any fields as plain dicts.

Added two CAN tests:
- test_continue_as_new_any_typed_fails: documents that Any-typed fields
  lose PubSubState type information (negative test)
- test_continue_as_new_properly_typed: verifies CAN works with properly
  typed PubSubState | None fields

Simplified subscribe() exception handling: removed the broad except
Exception clause that tried _follow_continue_as_new() on every error.
Now only catches WorkflowUpdateRPCTimeoutOrCancelledError for CAN follow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
README.md: usage-oriented documentation covering workflow mixin, activity
publishing, subscribing, continue-as-new, and cross-language protocol.

flush() safety: items are now removed from the buffer only after the
signal succeeds. Previously, buffer.clear() ran before the signal,
losing items on failure. Added test_flush_retains_items_on_signal_failure.

init_pubsub() guard: publish() and _pubsub_publish signal handler now
check for initialization and raise a clear RuntimeError instead of a
cryptic AttributeError.

PubSubClient.for_workflow() factory: preferred constructor that takes a
Client + workflow_id. Enables follow_continues in subscribe() without
accessing private WorkflowHandle._client. The handle-based constructor
remains for simple cases that don't need CAN following.

activity_pubsub_client() now uses for_workflow() internally with proper
keyword-only typed arguments instead of **kwargs: object.

CAN test timing: replaced asyncio.sleep(2) with assert_eq_eventually
polling for a different run_id, matching sdk-python test patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_pubsub_poll and _pubsub_offset now call _check_initialized() for a
clear RuntimeError instead of cryptic AttributeError when init_pubsub()
is forgotten.

README CAN example now includes the required imports (@dataclass,
workflow) and the @workflow.init decorator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The poll validator accesses _pubsub_draining, which would AttributeError
if init_pubsub() was never called. Added _check_initialized() guard.

Fixed PubSubState docstring: the field must be typed as PubSubState | None,
not Any. The old docstring incorrectly implied Any-typed fields would work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
get_pubsub_state() and drain_pubsub() now call _check_initialized().
Previously drain_pubsub() could silently set _pubsub_draining on an
uninitialized instance, which init_pubsub() would then reset to False.

New tests:
- test_max_batch_size: verifies auto-flush when buffer reaches limit,
  using max_cached_workflows=0 to also test replay safety
- test_replay_safety: interleaved workflow/activity publish with
  max_cached_workflows=0, proving the mixin is determinism-safe

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Review comments (#@agent: annotations) capture design questions on:
- Topic offset model and information leakage (resolved: global offsets
  with BFF-layer containment, per NATS JetStream model)
- Exactly-once publish delivery (resolved: publisher ID + sequence number
  dedup, per Kafka producer model)
- Flush concurrency (resolved: asyncio.Lock with buffer swap)
- CAN follow behavior, poll rate limiting, activity context detection,
  validator purpose, pyright errors, API ergonomics

DESIGN-ADDENDUM-TOPICS.md: full exploration of per-topic vs global offsets
with industry survey (Kafka, Redis, NATS, PubNub, Google Pub/Sub,
RabbitMQ). Concludes global offsets are correct for workflow-scoped
pub/sub; leakage contained at BFF trust boundary.

DESIGN-ADDENDUM-DEDUP.md: exactly-once delivery via publisher ID +
monotonic sequence number. Workflow dedup state is dict[str, int],
bounded by publisher count. Buffer swap pattern with sequence reuse
on failure. PubSubState carries publisher_sequences through CAN.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Types:
- Remove offset from PubSubItem (global offset is now derived)
- Add publisher_id + sequence to PublishInput for exactly-once dedup
- Add base_offset + publisher_sequences to PubSubState for CAN
- Use Field(default_factory=...) for Pydantic mutable defaults

Mixin:
- Add _pubsub_base_offset for future log truncation support
- Add _pubsub_publisher_sequences for signal deduplication
- Dedup in signal handler: reject if sequence <= last seen
- Poll uses base_offset arithmetic for offset translation
- Class-body type declarations for basedpyright compatibility
- Validator docstring explaining drain/CAN interaction
- Module docstring gives specific init_pubsub() guidance

Client:
- asyncio.Lock + buffer swap for flush concurrency safety
- Publisher ID (uuid) + monotonic sequence for exactly-once delivery
- Sequence advances on failure to prevent data loss when new items
  merge with retry batch (found via Codex review)
- Remove follow_continues param — always follow CAN via describe()
- Configurable poll_interval (default 0.1s) for rate limiting
- Merge activity_pubsub_client() into for_workflow() with auto-detect
- _follow_continue_as_new is async with describe() check

Tests:
- New test_dedup_rejects_duplicate_signal
- Updated flush failure test for new sequence semantics
- All activities use PubSubClient.for_workflow()
- Remove PubSubItem.offset assertions
- poll_interval=0 in test helper for speed

Docs:
- DESIGN-v2.md: consolidated design doc superseding original + addenda
- README.md: updated API reference
- DESIGN-ADDENDUM-DEDUP.md: corrected flush failure semantics

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
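The flush-concurrency decision above (asyncio.Lock plus buffer swap), as a standalone sketch; names are illustrative, and the failure/sequence semantics are superseded by the verified protocol in the next commit:

```python
import asyncio


class Batcher:
    def __init__(self, send) -> None:
        self._send = send  # async callable taking a list of items
        self._buffer: list[bytes] = []
        self._lock = asyncio.Lock()

    def publish(self, data: bytes) -> None:
        self._buffer.append(data)  # synchronous append; no await, no race

    async def flush(self) -> None:
        async with self._lock:  # serialize concurrent flush() callers
            if not self._buffer:
                return
            # Swap before awaiting: publishes that arrive while the send
            # is in flight land in a fresh list instead of being lost.
            batch, self._buffer = self._buffer, []
            await self._send(batch)
```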
Rewrite the client-side dedup algorithm to match the formally verified
TLA+ protocol: failed flushes keep a separate _pending batch and retry
with the same sequence number. Only advance the confirmed sequence on
success. TLC proves NoDuplicates and OrderPreserved for the correct
algorithm, and finds duplicates in the old algorithm.

Add TTL-based pruning of publisher dedup entries during continue-as-new
(default 15 min). Add max_retry_duration (default 600s) to bound client
retries — must be less than publisher_ttl for safety. Both constraints
are formally verified in PubSubDedupTTL.tla.

Add truncate_pubsub() for explicit log prefix truncation. Add
publisher_last_seen timestamps for TTL tracking. Preserve legacy state
without timestamps during upgrade.

API changes: for_workflow→create, flush removed (use priority=True),
poll_interval→poll_cooldown, publisher ID shortened to 16 hex chars.

Includes TLA+ specs (correct, broken, inductive, multi-publisher TTL),
PROOF.md with per-action preservation arguments, scope and limitations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
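The verified algorithm, condensed to a hedged Python sketch (illustrative names, not the module internals):

```python
class DedupPublisher:
    """Client side: a failed flush parks its batch in _pending and retries
    with the SAME sequence; the sequence is confirmed only on success."""

    def __init__(self, send) -> None:
        self._send = send  # async (publisher_id, seq, batch) -> None
        self._publisher_id = "0123456789abcdef"  # 16 hex chars per this commit
        self._sequence = 0  # last confirmed sequence
        self._buffer: list[bytes] = []
        self._pending: list[bytes] | None = None
        self._pending_seq = 0

    async def flush(self) -> bool:
        if self._pending is None and self._buffer:
            self._pending, self._buffer = self._buffer, []
            self._pending_seq = self._sequence + 1
        if self._pending is None:
            return True
        try:
            await self._send(self._publisher_id, self._pending_seq, self._pending)
        except Exception:
            return False  # keep _pending; retry later with the same sequence
        self._sequence = self._pending_seq  # advance only on success
        self._pending = None
        return True


def on_publish(last_seen: dict[str, int], publisher_id: str,
               sequence: int, batch: list[bytes], log: list[bytes]) -> None:
    """Workflow side: reject anything not strictly newer than last seen."""
    if sequence <= last_seen.get(publisher_id, 0):
        return  # duplicate delivery of an already-applied batch
    last_seen[publisher_id] = sequence
    log.extend(batch)
```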
New analysis document evaluates whether publishing should use signals
or updates, examining Temporal's native dedup (Update ID per-run,
request_id for RPCs) vs the application-level (publisher_id, sequence)
protocol. Conclusion: app-level dedup is permanent for signals but
could be dropped for updates once temporal/temporal#6375 is fixed.
Non-blocking flush keeps signals as the right choice for streaming.

Updates DESIGN-v2.md section 6 to be precise about the two Temporal
guarantees that signal ordering relies on: sequential send order and
history-order handler invocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Analyzes deduplication through the end-to-end principle lens. Three
types of duplicates exist in the pipeline, each handled at the layer
that introduces them:

- Type A (duplicate LLM work): belongs at application layer — data
  escapes to consumers before the duplicate exists, so only the
  application can resolve it
- Type B (duplicate signal batches): belongs in pub/sub workflow —
  encapsulates transport details and is the only layer that can
  detect them correctly
- Type C (duplicate SSE delivery): belongs at BFF/browser layer

Concludes the (publisher_id, sequence) protocol is correctly placed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… design

Fill gaps identified during design review:
- Document why per-topic offsets were rejected (trust model, cursor
  portability, unjustified complexity) inline rather than only in historical
  addendum
- Expand BFF section with the four reconnection options considered and
  the decision to use SSE Last-Event-ID with BFF-assigned gapless IDs
- Add poll efficiency characteristics (O(new items) common case)
- Document BFF restart fallback (replay from turn start)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire types (PublishEntry, _WireItem, PollResult, PubSubState) encode
data as base64 strings for cross-language compatibility across all
Temporal SDKs. User-facing types (PubSubItem) use native bytes.

Conversion happens inside handlers:
- Signal handler decodes base64 → bytes on ingest
- Poll handler encodes bytes → base64 on response
- Client publish() accepts bytes, encodes for signal
- Client subscribe() decodes poll response, yields bytes

This means Go/Java/.NET ports get cross-language compat for free since
their JSON serializers encode byte[] as base64 by default.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
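The conversion boundary, sketched (illustrative helpers, not the module's actual functions):

```python
import base64


def encode_for_wire(data: bytes) -> str:
    # Client publish() / poll handler response: native bytes -> base64 text,
    # safe inside JSON payloads in every Temporal SDK.
    return base64.b64encode(data).decode("ascii")


def decode_from_wire(wire_data: str) -> bytes:
    # Signal handler ingest / client subscribe(): base64 text -> bytes.
    return base64.b64decode(wire_data)
```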
jssmith and others added 14 commits April 7, 2026 20:10
Remove the bounded poll wait from PubSubMixin and trim trailing
whitespace from types. Update DESIGN-v2.md with streaming plugin
rationale (no fencing needed, UI handles repeat delivery).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add opt-in streaming code path to both agent framework plugins.
When enabled, the model activity calls the streaming LLM endpoint,
publishes TEXT_DELTA/THINKING_DELTA/TOOL_CALL_START events via
PubSubClient as a side channel, and returns the complete response
for the workflow to process (unchanged interface).

OpenAI Agents SDK:
- ModelActivityParameters.enable_streaming flag
- New invoke_model_activity_streaming method on ModelActivity
- ModelResponse reconstructed from ResponseCompletedEvent
- Uses @_auto_heartbeater for periodic heartbeats
- Routing in _temporal_model_stub (rejects local activities)

Google ADK:
- TemporalModel(streaming=True) constructor parameter
- New invoke_model_streaming activity using stream=True
- Registered in GoogleAdkPlugin

Both use batch_interval=0.1s for near-real-time token delivery.
No pubsub module changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Pydantic BaseModel was introduced as a workaround for Any-typed fields
losing type information during continue-as-new serialization. The actual fix
is using concrete type annotations (PubSubState | None), which the default
data converter handles correctly for dataclasses — no Pydantic dependency
needed.

This removes the pydantic import from the pubsub contrib module entirely,
making it work out of the box with the default data converter. All 18 tests
pass, including both continue-as-new tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
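The pattern the commit lands on, sketched with the module path as it was named at this point in the PR:

```python
from dataclasses import dataclass

from temporalio.contrib.pubsub import PubSubState  # renamed later in this PR


@dataclass
class CarryOver:
    """Argument passed through workflow.continue_as_new(args=[carry_over])."""

    items_processed: int  # your own application state rides along too
    # Concrete annotation, NOT Any: the default converter rebuilds the
    # dataclass from the annotation; an Any field comes back as a plain dict.
    pubsub_state: PubSubState | None = None
```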
Implements DESIGN-ADDENDUM-ITEM-OFFSET.md. The poll handler now annotates
each item with its global offset (base_offset + position in log), enabling
subscribers to track fine-grained consumption progress for truncation.
This is needed for the voice-terminal agent where audio chunks must not be
truncated until actually played, not merely received.

- Add offset field to PubSubItem and _WireItem (default 0)
- Poll handler computes offset from base_offset + log_offset + enumerate index
- subscribe() passes wire_item.offset through to yielded PubSubItem
- Tests: per-item offsets, offsets with topic filtering, offsets after truncation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
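The arithmetic, as a one-function sketch (names per the bullets above):

```python
def item_offsets(base_offset: int, log_offset: int, batch_len: int) -> list[int]:
    # Global offset of each item in a poll response: the truncation base,
    # plus the retained-log position where this response starts, plus the
    # item's index within the response.
    return [base_offset + log_offset + i for i in range(batch_len)]
```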
Documents the motivation and design for adding offset fields to
PubSubItem and _WireItem, enabling subscribers to track consumption
at item granularity rather than batch boundaries. Driven by the
voice-terminal agent's need to truncate only after audio playback,
not just after receipt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three changes:

1. Poll handler: replace ValueError with ApplicationError(non_retryable=True)
   when requested offset has been truncated. This fails the UPDATE (client
   gets the error) without crashing the WORKFLOW TASK — avoids the poison
   pill during replay that caused permanent workflow failures.

2. Poll handler: treat from_offset=0 as "from the beginning of whatever
   exists" (i.e., from base_offset). This lets subscribers recover from
   truncation by resubscribing from 0 without knowing the current base.

3. PubSubClient.subscribe(): catch WorkflowUpdateFailedError with type
   TruncatedOffset and retry from offset 0, auto-recovering.

New tests:
- test_poll_truncated_offset_returns_application_error
- test_poll_offset_zero_after_truncation
- test_subscribe_recovers_from_truncation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
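Client-side recovery, sketched (poll stands in for the actual update call):

```python
from temporalio.client import WorkflowUpdateFailedError
from temporalio.exceptions import ApplicationError


async def poll_with_recovery(poll, from_offset: int):
    try:
        return await poll(from_offset)
    except WorkflowUpdateFailedError as err:
        cause = err.cause
        if isinstance(cause, ApplicationError) and cause.type == "TruncatedOffset":
            # Offset 0 now means "beginning of whatever exists", so the
            # subscriber recovers without knowing the current base offset.
            return await poll(0)
        raise
```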
Verify that PubSubClient can subscribe to events from a different
workflow (same namespace) and that Nexus operations can start pub/sub
broker workflows in a separate namespace with cross-namespace
subscription working end-to-end. No library changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Poll responses now estimate wire size (base64 data + topic) and stop
adding items once the response exceeds 1MB. The new `more_ready` flag
on PollResult tells the subscriber that more data is available, so it
skips the poll_cooldown sleep and immediately re-polls. This avoids
unnecessary latency during big reloads or catch-up scenarios while
keeping individual update payloads within Temporal's recommended limits.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
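The size bounding, sketched (the constant and helper are illustrative):

```python
MAX_RESPONSE_BYTES = 1_000_000  # stay under Temporal's recommended payload limit


def build_poll_response(ready: list[tuple[str, str]]) -> tuple[list[tuple[str, str]], bool]:
    # ready: (topic, base64_data) pairs eligible for this response.
    items: list[tuple[str, str]] = []
    size = 0
    for topic, data in ready:
        size += len(topic) + len(data)  # rough wire-size estimate
        if items and size > MAX_RESPONSE_BYTES:
            return items, True  # more_ready=True: subscriber re-polls immediately
        items.append((topic, data))
    return items, False  # caught up: subscriber sleeps poll_cooldown
```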
Codify the four wire evolution rules that have been followed implicitly
through four addenda: additive-only fields with defaults, immutable
handler names, forward-compatible PubSubState, and no application-level
version negotiation. Includes a precedent table showing all past changes
and reasoning for why version fields in payloads would cause silent data
loss on signals.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After max_retry_duration expires, the client dropped the pending batch
without advancing _sequence. The next batch reused the same sequence
number, which could be silently deduplicated by the workflow if the
timed-out signal was actually delivered — causing permanent data loss
for those items.

The fix advances _sequence to _pending_seq before clearing _pending,
ensuring subsequent batches always get a fresh sequence number.

TLA+ verification:
- Added DropPendingBuggy/DropPendingFixed actions to PubSubDedup.tla
- Added SequenceFreshness invariant: (pending=<<>>) => (confirmed_seq >= wf_last_seq)
- BuggyDropSpec FAILS SequenceFreshness (confirmed_seq=0 < wf_last_seq=1)
- FixedDropSpec PASSES all invariants (489 distinct states)
- NoDuplicates passes for both — the bug causes data loss, not duplicates

Python test:
- test_retry_timeout_sequence_reuse_causes_data_loss demonstrates the
  end-to-end consequence: reused seq=1 is rejected, fresh seq=2 accepted

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
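The fix in miniature, continuing the earlier client sketch (illustrative names):

```python
def drop_pending(self) -> None:
    # Called once max_retry_duration expires. Advance the confirmed
    # sequence to the dropped batch's number first: if the timed-out
    # signal was in fact delivered, the workflow already recorded
    # _pending_seq, and a reused number would be silently deduplicated.
    if self._pending is not None:
        self._sequence = self._pending_seq
        self._pending = None
```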
# Conflicts:
#	temporalio/contrib/google_adk_agents/_model.py
This is a new release with no legacy to support. Changes:

- _mixin.py: Remove ts-is-None fallback that retained publishers without
  timestamps. All publishers always have timestamps, so this was dead code.
- _types.py: Clean up docstrings referencing addendum docs
- DESIGN-v2.md: Remove backward-compat framing, addendum references, and
  historical file listing. Keep the actual evolution rules.
- PROOF.md: "Legacy publisher_id" → "Empty publisher_id"
- README.md: Reference DESIGN-v2.md instead of deleted addendum
- Delete DESIGN.md and 4 DESIGN-ADDENDUM-*.md files (preserved in
  the top-level streaming-comparisons repo)
- Delete stale TLA+ trace .bin files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Simplify the README to focus on essential API patterns. Rename
for_workflow() to create() throughout, condense the topics section,
remove the exactly-once and type-warning sections (these details
belong in DESIGN-v2.md), and update the API reference table with
current parameter signatures. Also fix whitespace alignment in
DESIGN-v2.md diagram.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…de pubsub state

The CAN example only showed pubsub_state being passed through, which could
mislead readers into thinking that's all that's needed. Updated to include
a representative application field (items_processed) to make it clear that
your own workflow state must also be carried across the CAN boundary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jssmith and others added 4 commits April 28, 2026 14:16
Runner.run() and Runner.run_streamed() duplicated ~80 lines of
workflow-only setup: callable-tool rejection, MCP server type
validation, SQLiteSession rejection, RunConfig defaulting,
string-model -> _TemporalModelStub replacement, sandbox
configuration validation, and the recursive _convert_agent walk
over the handoff graph. Drift between the two paths was a
real risk — a fix to one would not automatically apply to the
other.

Extract _prepare_workflow_run, called by both. The helper
mutates kwargs in place (writing back the rewritten run_config)
and returns the converted starting agent. Both call sites then
splat **kwargs into the underlying SDK runner.

Side effect: run() previously forwarded a hand-maintained
whitelist of named kwargs (context, max_turns, hooks, run_config,
previous_response_id, session) and silently dropped the other
RunOptions keys — error_handlers, auto_previous_response_id,
conversation_id. The splat shape forwards the full RunOptions
surface, matching what a non-workflow caller would see.

run_streamed() also tightens its kwargs type from **kwargs: Any
to **kwargs: Unpack[RunOptions[TContext]], matching run().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both openai_agents.ModelActivityParameters and google_adk_agents.
TemporalModel previously defaulted streaming_event_topic to "events".
That meant any workflow using Runner.run_streamed (OpenAI) or
generate_content_async(stream=True) (ADK) would silently publish
every stream event to topic "events" — even if the workflow never
hosted a PubSub broker, in which case the publish signals were
unhandled and dropped.

Pydantic-ai's TemporalModel already defaults the same option to None
(opt-in). This commit aligns the other two plugins with that shape:
publishing is now an explicit opt-in, set the topic to enable.

Tests that exercise the publish path now set
streaming_event_topic="events" explicitly. The OpenAI README's
streaming snippet no longer constructs a PubSub broker (the workflow-
side stream_events() iteration doesn't need one); a follow-up
paragraph documents the explicit OpenAIAgentsPlugin(...) config
required for external publishing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Selected feature name is "Workflow Streams" (see
docs/rename-to-workflow-streams.md and docs/naming-analysis.md in the
streaming-comparisons superrepo). The contrib module, classes, wire-
protocol handlers, and tests are renamed in one atomic change so the
build stays green; cross-module callers in openai_agents and
google_adk_agents are updated in the same commit because they import
WorkflowStreamClient directly.

Module:        temporalio.contrib.pubsub -> temporalio.contrib.workflow_stream
Classes:       PubSub -> WorkflowStream
               PubSubClient -> WorkflowStreamClient
               PubSubState -> WorkflowStreamState
               PubSubItem -> WorkflowStreamItem
               _WireItem -> _WorkflowStreamWireItem
Wire handlers: __temporal_pubsub_publish -> __temporal_workflow_stream_publish
               __temporal_pubsub_poll -> __temporal_workflow_stream_poll
               __temporal_pubsub_offset -> __temporal_workflow_stream_offset
File rename:   _broker.py -> _stream.py (the class is the stream itself,
               not a workflow; "broker" carried pub/sub framing)

Method verbs publish/subscribe stay literal per the rename doc. The
operation-level dataclasses PublishEntry/PublishInput/PollInput/
PollResult/PublisherState are also kept bare for parity with the verbs;
the doc's mapping for PublishEntry is intentionally not followed.

Module path is singular workflow_stream (not plural workflow_streams as
in the rename doc) to match every other single-feature contrib module
in sdk-python (aws, langsmith, opentelemetry, pubsub, pydantic) and
sdk-typescript (activity, client, worker, workflow, contrib-pubsub).
Plurals in both SDKs are reserved for genuine collections.

The wire-handler rename does break compatibility with any in-flight
workflow; per the rename doc that is acceptable since this contrib has
not been publicly released and the demo app rebuilds against the new
SDK in a follow-up PR.

The whitespace-only edit to openai_agents/_mcp.py is a pre-existing
lint failure picked up by ruff --fix during this work; flagged here
because it is unrelated to the rename.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The "v2" suffix was a holdover from when an earlier design was kept
alongside. There is now a single canonical design document; the
filename should match. Title and the file-tree code block inside
the doc are also updated; remaining "v2" references in the body refer
to hypothetical future protocol versions, not to this document.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jssmith changed the title from Add temporalio.contrib.pubsub module to Add Workflow Streams library on Apr 29, 2026
Removed two non-streaming tests already covered by the main test files
(test_hello_world_agent in test_openai.py, test_single_agent in
test_google_adk_agents.py) and a dead TruncatedStreamingTestModel class.

Strengthened the remaining streaming tests:

- OpenAI workflow-side assertion now requires exact ordered match
  against the published list instead of `in` membership.
- OpenAI `streaming_event_topic=None` test registers a WorkflowStream
  and asserts offset==0 to actually prove no publishing occurred.
- ADK StreamingTestModel raises if called with stream=False, so a
  regression that drops the flag fails the test.
- ADK final-result assertion checks `result == "world!"` instead of
  the vacuous `result is not None`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor Author

jssmith commented Apr 29, 2026

Heads up: the module was renamed from pubsub to workflow_stream (commits 5890c58, 57e52f4), so paths in the original threads no longer resolve. Type names changed too: PubSubClient → WorkflowStreamClient, PubSub → WorkflowStream, PubSubItem → WorkflowStreamItem.

jssmith and others added 2 commits April 28, 2026 21:51
Updates the intro paragraph to mention "associated Activities" alongside
workflows, and adds a one-line note in the activity-side section that
the target workflow must construct a WorkflowStream from @workflow.init
or publish signals are dropped. The dropped-signal warning was already
in the OpenAI/ADK plugin docstrings; this restates it where someone
landing on the activity-side example would see it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- DESIGN.md: redraw architecture diagram so handler names and state
  fields fit cleanly inside the box. Layout-only — no information
  changes. Update the in-doc test path reference for the rename below.
- _types.py: drop a stale cross-reference to docs/pubsub-payload-migration.md
  (lives in a different repo and has not been renamed in lockstep).
  Remaining DESIGN.md §5 reference is sufficient.
- Rename tests/contrib/workflow_stream/test_payload_roundtrip_prototype.py
  to test_payload_roundtrip.py. The file is no longer a prototype that
  de-risked the migration; it is the regression guard for the chosen
  Payload wire format. Filename now matches the docstring framing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jssmith and others added 3 commits April 29, 2026 00:21
from_activity() requires an activity scheduled by a workflow. Clarify
the docstring, give a more actionable error message that points at
create() with an explicit workflow id, and add a README example for the
standalone-activity pattern. Three new integration tests exercise
publish, subscribe, and the from_activity misuse error from activities
started directly via Client.start_activity (skipped under the Java
time-skipping server, which does not support that API).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
previous_response_id: str | None
conversation_id: str | None
prompt: Any | None
streaming_event_topic: str | None
Contributor

I would prefer that the non-streaming activity doesn't take these inputs, which don't make sense for it and which it can't use. You can define a separate input as a subclass of this one, I believe.

Contributor Author

Ok, can do that.

Contributor Author

Done in 1f4099a. Base ActivityModelInput no longer carries the streaming-only fields; invoke_model_activity_streaming takes a new StreamingActivityModelInput(ActivityModelInput) subclass with streaming_event_topic: Required[str] and streaming_event_batch_interval: timedelta.
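A sketch of the split described here (Required is in typing on Python 3.11+, typing_extensions before that; base fields beyond those quoted in this thread are omitted):

```python
from datetime import timedelta
from typing import Any, Required, TypedDict


class ActivityModelInput(TypedDict, total=False):
    # Base shape consumed by the non-streaming activity.
    previous_response_id: str | None
    conversation_id: str | None
    prompt: Any | None


class StreamingActivityModelInput(ActivityModelInput, total=False):
    # Streaming-only fields live on the subclass; the topic is mandatory.
    streaming_event_topic: Required[str]
    streaming_event_batch_interval: timedelta
```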


@activity.defn
async def invoke_model_streaming(
llm_request: LlmRequest,
Contributor

Activities should typically have single input dataclasses.

Contributor Author

Right... will fix.

Contributor Author

Done in 7c910de. invoke_model_streaming now takes a single StreamingInvokeInput dataclass (llm_request, streaming_event_topic, streaming_event_batch_interval). Left invoke_model taking LlmRequest directly since that already satisfies "single input" — happy to wrap it too if you prefer full uniformity.
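The resulting input shape, sketched; the LlmRequest annotation is loosened to keep the snippet self-contained, and the default interval is an assumption:

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Any


@dataclass
class StreamingInvokeInput:
    llm_request: Any  # google.adk's LlmRequest in the real module
    streaming_event_topic: str
    streaming_event_batch_interval: timedelta = timedelta(seconds=0.1)
```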

@activity.defn
async def invoke_model_streaming(
llm_request: LlmRequest,
streaming_event_topic: str | None,
Contributor

How does it make sense to invoke streaming without a stream topic, which then just doesn't publish?

Contributor Author

jssmith Apr 29, 2026

Weak reasons - probably not practical ones. We may be able to avoid a potential footgun here.

Contributor Author

Dropped the topic=None branch in both ADK (7c910de) and OpenAI (1f4099a). The streaming path always opens a WorkflowStreamClient and publishes; no-topic now fails fast (see thread on test_openai_streaming.py:211).

@@ -0,0 +1,1403 @@
# Temporal Workflow Streams — Design Document
Contributor

Do we want to check this whole document in? It seems like keeping it up to date could be a burden.

Contributor Author

I've been debating that—I think we can remove it.

Contributor Author

Removed in 22ad024. The canonical guide stays on docs.temporal.io; the long-form design notes are preserved out-of-tree for future reference. README and _types.py no longer reference the file.

activity is configured with ``streaming_event_topic=None``.

Registers a :class:`WorkflowStream` so the test can subscribe and
verify the activity did not publish anything.
Contributor

I guess we may not want to fail because we would have to do so at runtime, potentially well after the workflow has started, but I think at least a warning is in order. It doesn't really make a lot of sense to do this.

Contributor Author

Went stricter than a warning — TemporalOpenAIRunner.run_streamed now raises AgentsWorkflowError before delegating to the agents framework when streaming_event_topic is unset, and same for use_local_activity=True (1f4099a). I'd originally put the check inside _TemporalModelStub.stream_response, but the framework runs the model in a background task and stuffs errors into RunResultStreaming._stored_exception, which gets dropped if the queue completion sentinel is read before the task is observed as done — so the error never surfaced and the workflow silently returned final_output=None. Validating in the runner short-circuits before the framework starts the task. ADK uses ApplicationError(non_retryable=True) in TemporalModel.generate_content_async for the same reason (7c910de). Two new regression tests cover both cases. If you'd rather it be a warning, easy to soften.

normalization."""
async with AgentEnvironment(
model=StreamingTestModel(),
model_params=ModelActivityParameters(
Contributor

I think we should consider a way to push these parameters down to the specific model usage, or at least the runner. At least as an option. @JasonSteving99 - That's not a thing we should address here though.

Contributor Author

Acknowledged as a follow-up — out of scope for this PR.

jssmith and others added 2 commits April 29, 2026 09:29
Adds a Future Work pointer to docs/pubsub-design-analysis/final-flag-prune.md,
which proposes a `final: bool` field on PublishInput so cleanly-exited
publishers can have their dedup PublisherState pruned on a tighter schedule
than the full publisher_ttl. Deferred for now.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Workflow Streams user guide now lives at
https://docs.temporal.io/develop/python/workflows/workflow-stream
and is the primary reference. Replace the long quick-start /
API-reference README with a short motivating summary, key
technical highlights, and a prominent link to the docs site.
DESIGN.md gets one extra sentence at the top reframing it as the
contributor/internals doc and pointing readers at the user docs.
Member

Sushisource left a comment


Overall looking good to me, I only focused on API not impl

Comment on lines +85 to +88
update/signal handlers that read ``WorkflowStream`` state can
observe pre-publish state when both land in the same activation.
Make such handlers ``async`` and ``await asyncio.sleep(0)`` before
reading state. See the "Gotcha" section of this module's
Member

I find this wording very confusing
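For reference, the recipe the docstring is gesturing at, as a hedged sketch (the handler, accessor, and constructor are invented for illustration):

```python
import asyncio

from temporalio import workflow
from temporalio.contrib.workflow_stream import WorkflowStream


@workflow.defn
class MyWorkflow:
    @workflow.init
    def __init__(self) -> None:
        self._stream = WorkflowStream()  # constructor args omitted

    @workflow.update
    async def read_progress(self) -> int:
        # Yield once so a publish that landed in the same activation is
        # applied before this handler reads stream state.
        await asyncio.sleep(0)
        return self._stream.offset()  # hypothetical accessor
```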

Comment on lines +99 to +100
The check inspects the immediate caller's frame and requires the
function name to be ``__init__``.
Member

Imo rather than this it might make more sense for us to just set a local around our own invocation of __init__ from the SDK, but, that involves touching the core SDK so maybe we don't wanna bother for now

Comment on lines +177 to +179
Prunes publisher dedup entries older than ``publisher_ttl``. The
TTL must exceed the ``max_retry_duration`` of any client that
may still be retrying a failed flush.
Member

Not immediately clear what "publisher dedup entries" means

Comment on lines +204 to +205
def drain(self) -> None:
"""Unblock all waiting poll handlers and reject new polls.
Member

Still not a huge fan of this name. Maybe finalize? Not blocking since I don't really have a much better idea

Comment on lines +221 to +224
Replaces the three-line recipe ``drain()`` →
``wait_condition(all_handlers_finished)`` →
``workflow.continue_as_new(args=...)`` for the common case where
the only CAN parameter that varies is ``args``.
Member

Should probably be a code block rather than this odd arrow thing

"""

topic: str
data: Any
Member

I think we might want this to be Payload | Decoded(T), there is an (admittedly very niche, but totally possible) edge case where the user wants T to be Payload, and in that case we might do the wrong thing because of the type confusion?

@@ -0,0 +1,1419 @@
# Temporal Workflow Streams — Design Document

Consolidated design document reflecting the current implementation. This
Member

Do we want to check this whole thing in?

"""Create a stream client from a Temporal client and workflow ID.

Use this when the caller has an explicit ``Client`` and
``workflow_id`` in hand (starters, BFFs, other workflows'
Member

Best friends forever?

)

@classmethod
def from_activity(
Member

This is maybe more like from_within_activity?

while self._pending is not None or self._buffer:
await self._flush()

def publish(self, topic: str, value: Any, force_flush: bool = False) -> None:
Member

The fact that you can publish different Ts to the same topic is something we could rectify with topic handles.

Like:

topic = client.topic(topic, type=T)
topic.publish("hi")

Internally this could enforce that you can't create multiple handles to the same topic with different T.

This is somewhere between the choices we discussed earlier about topics-as-streams or topics-in-streams.

Kind of a big change late in the game, but, I think it'd be nice. We don't have to do it now, but, maybe worth considering iterating on.
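A slightly fuller sketch of that proposal (purely illustrative; nothing in this PR implements it):

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class TopicHandle(Generic[T]):
    def __init__(self, client: "WorkflowStreamClient", name: str, type_: type[T]) -> None:
        self._client, self._name, self._type = client, name, type_

    def publish(self, value: T) -> None:
        # A str handle given an int fails type checking, and the client
        # could reject a second handle to the same topic with a different T.
        self._client.publish(self._name, value)
```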

jssmith and others added 5 commits April 29, 2026 11:09
Reviewer flagged the 1419-line design doc as a maintenance burden — the
implementation has already drifted from a few sections, and keeping it
synchronized in-tree adds churn for every refactor. Move the canonical
design notes out of the SDK; the README continues to point at the
docs.temporal.io guide for users, and the design file is preserved in
the streaming-comparisons project for future reference.

Strip the ``DESIGN.md`` references from README.md and _types.py so no
in-tree pointer remains.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address tconley feedback that (a) the non-streaming activity should not
carry inputs only meaningful to the streaming path, and (b) invoking
streaming with topic=None is a footgun with no real benefit (the
workflow gets the chunked list batched at activity completion either
way; "no-publish streaming" doesn't deliver real-time value to anyone).

Changes:

- Split ``ActivityModelInput`` (TypedDict) into the base shape used by
  ``invoke_model_activity`` and a ``StreamingActivityModelInput`` subclass
  with ``streaming_event_topic: Required[str]`` and the batch interval
  used only by ``invoke_model_activity_streaming``. The streaming
  activity now always opens a ``WorkflowStreamClient``; the
  ``topic is None`` branch and its docstring caveat are removed.

- Validate at the runner before delegating to the agents framework.
  ``TemporalOpenAIRunner.run_streamed`` raises ``AgentsWorkflowError``
  when ``model_params.streaming_event_topic`` is unset, or when
  ``use_local_activity=True`` (local activities have no heartbeat or
  signal channel). Both checks must happen here rather than inside the
  stub's ``stream_response``: the agents framework runs the model in a
  background task and silently captures errors into
  ``RunResultStreaming._stored_exception``, which can be lost when the
  queue completion sentinel is read before the task is observed as
  done — failing in the runner short-circuits before the framework
  starts the task. The stub keeps a defensive guard for direct callers.

- Update ``ModelActivityParameters`` and the integration README so the
  documented contract matches the runtime behavior.

- Replace the ``StreamingWithoutStreamTopicWorkflow`` test with
  ``StreamingRequiresTopicWorkflow`` covering the topic-missing path,
  and add ``test_streaming_rejects_local_activity`` for the
  use_local_activity case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address tconley feedback on the ADK streaming activity — activities
should take a single dataclass input, and invoking streaming without a
topic is a footgun for the same reasons as on the OpenAI side (the
workflow only sees chunks batched at activity completion, so the
"streaming without publishing" path delivers no real-time value).

Changes:

- Wrap ``invoke_model_streaming`` inputs in
  ``StreamingInvokeInput`` (llm_request + streaming_event_topic +
  streaming_event_batch_interval). Drop the ``topic is None`` branch;
  the activity always opens a ``WorkflowStreamClient`` and publishes
  each chunk. ``invoke_model`` (non-streaming) keeps its existing
  ``LlmRequest`` argument since that already satisfies the
  single-input convention.

- Validate in ``TemporalModel.generate_content_async`` before
  scheduling the streaming activity. Raise
  ``ApplicationError(non_retryable=True)`` so the failure surfaces as
  a terminal workflow failure without needing plugin-level
  ``workflow_failure_exception_types`` registration.

- Update the constructor docstring to reflect the now-required topic.

- Add ``StreamingAdkRequiresTopicWorkflow`` plus
  ``test_streaming_requires_topic`` covering the no-topic path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per PR review feedback (Sushisource), rename the
WorkflowStreamClient classmethod and update all call sites,
references, error messages, comments, and tests.

The new name reads more clearly at the call site — it documents
that the method must be invoked from inside an activity rather
than that it builds something derived from one — and matches how
we already describe it in the docstring ("must be called from
within an activity").
Per PR review feedback, drop the special case where
subscribe() with no result_type yields a raw Payload, and instead
delegate to the payload converter's default Any decoding (the
same behavior as signal/update/query handlers without a type
hint). Callers that want the original Payload pass
result_type=temporalio.common.RawValue, mirroring the standard
Temporal convention.

The only caller-visible change for typed callers is the
no-result_type path: a JSON-converter consumer that previously
got back a Payload now gets back a Python dict/list/scalar (or
bytes for binary payloads). Heterogeneous-topic dispatchers
relying on Payload.metadata should switch to result_type=RawValue
and read item.data.payload.metadata.

Adds a regression test covering both default decode (dict) and
RawValue passthrough (Payload bytes preserved).
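The two decoding paths described above, sketched (subscribe's exact signature is an assumption):

```python
import temporalio.common


async def consume_decoded(stream) -> None:
    # Default: payloads decode through the data converter, like an
    # untyped signal/update/query handler argument.
    async for item in stream.subscribe():
        print(item.data)  # dict / list / scalar for JSON payloads


async def consume_raw(stream) -> None:
    # Opt out of decoding with the standard RawValue convention.
    async for item in stream.subscribe(result_type=temporalio.common.RawValue):
        print(item.data.payload.metadata)  # inspect the raw Payload
```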