Summary
DatabaseSessionService.get_session() and list_sessions() are pure read operations, but they execute inside SQLAlchemy's default read-write transactions. On Cloud Spanner, this means they use OCC read-write transactions instead of read-only snapshots, which causes RetryAborted errors when another request is concurrently writing to the same session via append_event().
Observed behavior
When two requests for the same session overlap (e.g. a new turn arrives while the previous turn is still writing events), the get_session() read in the new request can fail with RetryAborted from google.cloud.spanner_dbapi.exceptions. This is a transient Spanner OCC conflict — the read-write transaction's read set intersects with the concurrent append_event() write.
This is surprising because get_session() never writes — it only executes SELECT statements.
Real-world impact
In production workloads with rapid successive turns on the same session, the get_session() call fails with RetryAborted, causing the turn to start with missing session state. Application-level retry works around this, but it shouldn't be necessary for a read-only operation.
Root cause
DatabaseSessionService uses a single async_sessionmaker for all operations:
# database_session_service.py, line ~143
self.database_session_factory: async_sessionmaker[DatabaseSessionFactory] = (
async_sessionmaker(bind=self.db_engine, expire_on_commit=False)
)
Both get_session() and list_sessions() use async with self.database_session_factory() as sql_session: which opens a read-write transaction. On Spanner, read-write transactions participate in OCC and can abort when conflicting with concurrent writes.
The spanner_dbapi connection already supports read_only=True (see google.cloud.spanner_dbapi.connection, line ~78), which switches to snapshot reads that:
- Never abort due to OCC conflicts
- Have lower latency (no commit phase)
- Don't hold locks
Proposed fix
Create a separate async_sessionmaker with read_only=True for read-only operations, and use it in get_session() and list_sessions(). This would only affect the Spanner dialect — other databases (PostgreSQL, MySQL, SQLite) would see no behavioral difference since they don't have Spanner's OCC semantics.
One approach:
# In __init__:
self.database_session_factory = async_sessionmaker(bind=self.db_engine, expire_on_commit=False)
# For Spanner, create a read-only engine/session factory
# Option A: execution_options on the session
# Option B: separate engine with connect_args={"read_only": True}
The cleanest path may be passing execution_options or using SQLAlchemy's Session.execute() with connection-level options, but the exact mechanism depends on how sqlalchemy-spanner propagates read_only to the underlying spanner_dbapi connection.
Environment
- ADK version: 1.23.x
- Backend: Spanner (
spanner+spanner:///...)
- SQLAlchemy-Spanner: 1.14+
Related
Summary
DatabaseSessionService.get_session()andlist_sessions()are pure read operations, but they execute inside SQLAlchemy's default read-write transactions. On Cloud Spanner, this means they use OCC read-write transactions instead of read-only snapshots, which causesRetryAbortederrors when another request is concurrently writing to the same session viaappend_event().Observed behavior
When two requests for the same session overlap (e.g. a new turn arrives while the previous turn is still writing events), the
get_session()read in the new request can fail withRetryAbortedfromgoogle.cloud.spanner_dbapi.exceptions. This is a transient Spanner OCC conflict — the read-write transaction's read set intersects with the concurrentappend_event()write.This is surprising because
get_session()never writes — it only executesSELECTstatements.Real-world impact
In production workloads with rapid successive turns on the same session, the
get_session()call fails withRetryAborted, causing the turn to start with missing session state. Application-level retry works around this, but it shouldn't be necessary for a read-only operation.Root cause
DatabaseSessionServiceuses a singleasync_sessionmakerfor all operations:Both
get_session()andlist_sessions()useasync with self.database_session_factory() as sql_session:which opens a read-write transaction. On Spanner, read-write transactions participate in OCC and can abort when conflicting with concurrent writes.The
spanner_dbapiconnection already supportsread_only=True(seegoogle.cloud.spanner_dbapi.connection, line ~78), which switches to snapshot reads that:Proposed fix
Create a separate
async_sessionmakerwithread_only=Truefor read-only operations, and use it inget_session()andlist_sessions(). This would only affect the Spanner dialect — other databases (PostgreSQL, MySQL, SQLite) would see no behavioral difference since they don't have Spanner's OCC semantics.One approach:
The cleanest path may be passing
execution_optionsor using SQLAlchemy'sSession.execute()with connection-level options, but the exact mechanism depends on howsqlalchemy-spannerpropagatesread_onlyto the underlyingspanner_dbapiconnection.Environment
spanner+spanner:///...)Related
append_event(same underlying Spanner concurrency theme)FOR UPDATElock on app/user states