Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
4b0274b
feat(sdk): update for coordinode-server v0.4.1
polaz Apr 19, 2026
d0b05df
refactor(sdk): drop orphaned HybridResult and clear notebook outputs
polaz Apr 19, 2026
c2ce320
chore(demo): strip non-deterministic notebook metadata
polaz Apr 19, 2026
8d69841
feat(sdk): expose consistency controls and document v0.4 features
polaz Apr 19, 2026
354a32d
fix(demo): install protobuf-compiler before embedded build in Colab
polaz Apr 19, 2026
4175e96
fix(sdk,demo): tighten consistency validation and pin coordinode SDK …
polaz Apr 19, 2026
4f6a797
fix(demo): pin coordinode SDK in 03 notebook Colab branch, surface un…
polaz Apr 19, 2026
c8cea55
fix(demo): remove unpinned coordinode override from Colab install block
polaz Apr 19, 2026
5957b8b
fix(demo): tighten pin condition and hard-fail on unhealthy gRPC port
polaz Apr 19, 2026
443b189
fix(demo,tests,docs): guard apt install, proto stub import, README ex…
polaz Apr 19, 2026
991f6e7
fix(demo): close failed CoordinodeClient before raise; reinstate hard…
polaz Apr 19, 2026
0b7fd95
fix(demo): drop port probe, switch embedded to file-backed persistence
polaz Apr 19, 2026
d53ee73
test(integration): exercise cypher() consistency kwargs; build(demo):…
polaz Apr 19, 2026
c8ae149
style(tests,build): use module-level pytest import; gitignore embedde…
polaz Apr 19, 2026
e938194
fix(demo,tests,docs): restore embedded adapter wrap, tighten test cle…
polaz Apr 19, 2026
8f88a67
docs(coordinode): note CREATE TEXT INDEX prerequisite for hybrid sear…
polaz Apr 19, 2026
10ab38d
fix(demo,build,tests): always pin SDK in Colab, broaden extension ign…
polaz Apr 20, 2026
2cd87d3
docs(demo): expand seed success message to cover embedded + server modes
polaz Apr 20, 2026
15e6a20
build(demo): bump Colab pin to 2cd87d3092f5d0112c662f20309846cec38ea793
polaz Apr 20, 2026
c2d62b0
fix(demo): stable DEMO_TAG in embedded mode and portable temp dir
polaz Apr 20, 2026
ced47e8
build(demo): bump Colab pin to c2d62b0595867651df43c2b3d3fdbec4341d6642
polaz Apr 20, 2026
50ddc08
fix(sdk,demo): validate causal-read precondition; clarify embedded in…
polaz Apr 20, 2026
f6c3f50
build(demo): bump Colab pin to 50ddc08a89a21fca73be007cb22b57a0054225c3
polaz Apr 20, 2026
9f86b93
style(sdk): strip whitespace when normalizing write_concern for causa…
polaz Apr 20, 2026
d62e53b
fix(sdk): validate after_index type before causal-read precondition c…
polaz Apr 21, 2026
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -28,3 +28,4 @@ CLAUDE.md
DEVLOG*.md

**/.ipynb_checkpoints/
.claude/
3 changes: 3 additions & 0 deletions coordinode-embedded/.gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,5 @@
target/
Cargo.lock
*.so
Comment thread
polaz marked this conversation as resolved.
*.pyd
*.dylib
2 changes: 1 addition & 1 deletion coordinode-rs
Submodule coordinode-rs updated 72 files
+1 −1 .github/workflows/release.yml
+6 −0 .release-plz.toml
+161 −0 CHANGELOG.md
+40 −35 Cargo.lock
+1 −1 Cargo.toml
+1 −1 Dockerfile
+7 −4 README.md
+9 −102 crates/coordinode-client/proto_gen/coordinode.v1.query.rs
+374 −0 crates/coordinode-client/proto_gen/google.api.rs
+244 −10 crates/coordinode-client/src/lib.rs
+5 −0 crates/coordinode-core/Cargo.toml
+42 −3 crates/coordinode-core/src/graph/doc_delta.rs
+2 −0 crates/coordinode-core/src/txn/mod.rs
+134 −0 crates/coordinode-core/src/txn/read_consistency.rs
+396 −0 crates/coordinode-core/src/txn/watermark.rs
+4 −0 crates/coordinode-embed/src/db/mod.rs
+3 −0 crates/coordinode-embed/tests/integration/concurrent.rs
+9 −0 crates/coordinode-embed/tests/integration/helpers.rs
+5 −2 crates/coordinode-embed/tests/integration/hnsw.rs
+11 −5 crates/coordinode-embed/tests/integration/text_index.rs
+12 −10 crates/coordinode-query/src/advisor/detectors.rs
+35 −0 crates/coordinode-query/src/advisor/fingerprint.rs
+1 −5 crates/coordinode-query/src/advisor/procedures.rs
+2 −2 crates/coordinode-query/src/advisor/registry.rs
+1 −1 crates/coordinode-query/src/advisor/source.rs
+126 −0 crates/coordinode-query/src/cypher/ast.rs
+67 −1 crates/coordinode-query/src/cypher/cypher.pest
+460 −12 crates/coordinode-query/src/cypher/parser.rs
+25 −0 crates/coordinode-query/src/cypher/semantic.rs
+196 −0 crates/coordinode-query/src/executor/eval.rs
+4,494 −2,639 crates/coordinode-query/src/executor/runner.rs
+1,402 −17 crates/coordinode-query/src/planner/builder.rs
+309 −1 crates/coordinode-query/src/planner/logical.rs
+475 −0 crates/coordinode-query/tests/attach_document.rs
+499 −0 crates/coordinode-query/tests/detach_document.rs
+719 −0 crates/coordinode-query/tests/hybrid_scoring_contract.rs
+2,118 −4 crates/coordinode-query/tests/query_integration.rs
+418 −0 crates/coordinode-query/tests/snapshot_api_contract.rs
+2 −3 crates/coordinode-raft/build.rs
+175 −2 crates/coordinode-raft/src/cluster/mod.rs
+414 −1 crates/coordinode-raft/src/storage.rs
+156 −2 crates/coordinode-search/src/tantivy/mod.rs
+10 −0 crates/coordinode-search/src/tantivy/multi_lang.rs
+142 −0 crates/coordinode-search/src/tantivy/segment_registry.rs
+194 −0 crates/coordinode-search/tests/snapshot_filter.rs
+3 −0 crates/coordinode-server/Cargo.toml
+2 −3 crates/coordinode-server/build.rs
+104 −1 crates/coordinode-server/src/cli.rs
+75 −1 crates/coordinode-server/src/config/mod.rs
+142 −1 crates/coordinode-server/src/grpc/mod.rs
+90 −37 crates/coordinode-server/src/main.rs
+0 −697 crates/coordinode-server/src/services/text.rs
+31 −0 crates/coordinode-storage/src/oplog/manager.rs
+86 −0 crates/coordinode-storage/src/oplog/segment.rs
+21 −8 docs/QUICKSTART.md
+103 −0 docs/cypher/extensions.md
+123 −1 docs/cypher/functions.md
+86 −1 docs/cypher/reference.md
+393 −1 examples/quickstart/hybrid-query.json
+1 −1 proto
+1 −1 rust-toolchain.toml
+73 −11 scripts/gen-api-docs.py
+1 −0 tests/integration/Cargo.toml
+0 −57 tests/integration/proto_gen/coordinode.v1.common.rs
+0 −576 tests/integration/proto_gen/coordinode.v1.graph.rs
+53 −517 tests/integration/proto_gen/coordinode.v1.query.rs
+16 −0 tests/integration/proto_gen/coordinode.v1.replication.rs
+374 −0 tests/integration/proto_gen/google.api.rs
+101 −5 tests/integration/src/harness.rs
+153 −0 tests/integration/tests/client.rs
+43 −0 tests/integration/tests/restart.rs
+227 −0 tests/integration/tests/server.rs
66 changes: 65 additions & 1 deletion coordinode/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,7 @@ db.cypher(
params={"title": "RAG intro", "vec": [0.1] * 384},
)

# Nearest-neighbour search (requires HNSW index — coming in v0.4)
# Nearest-neighbour search
results = db.vector_search(
label="Doc",
property="embedding",
Expand All @@ -117,6 +117,70 @@ for r in results:
print(r.node.id, r.distance)
```

## Hybrid Search (v0.4+)

Fuse BM25 full-text and vector similarity using Cypher scoring functions:

```python
# Full-text scoring (text_score / text_match) requires a TEXT INDEX on the
# queried property — without it those calls return zero/no matches.
db.create_text_index("idx_doc_body", "Doc", "body")

# Reciprocal Rank Fusion of text + vector. Projecting `d AS doc_id` returns the
# internal node id (an integer) — fetch properties explicitly when needed.
rows = db.cypher("""
MATCH (d:Doc)
WHERE text_match(d, $q) OR d.embedding IS NOT NULL
RETURN d AS doc_id,
d.title AS title,
rrf_score(
text_score(d, $q),
vec_score(d.embedding, $vec)
) AS score
ORDER BY score DESC LIMIT 10
""", params={"q": "graph neural network", "vec": [0.1] * 384})
# Full node properties: db.get_node(rows[0]["doc_id"]).
```

Helpers available in Cypher (evaluated server-side in coordinode-rs ≥ v0.4.0):
``text_score``, ``vec_score``, ``doc_score``, ``text_match``, ``rrf_score``,
``hybrid_score``. These are built-in Cypher functions; nothing to import on the
Python side.
Comment thread
coderabbitai[bot] marked this conversation as resolved.

## ATTACH / DETACH DOCUMENT (v0.4+)

Promote a nested property to a graph node (and back):

```python
db.cypher("MATCH (a:Article {id: $id}) DETACH DOCUMENT a.body AS (d:Body)",
params={"id": 1})
db.cypher("MATCH (a:Article {id: $id})-[:HAS_BODY]->(d:Body) "
"ATTACH DOCUMENT d INTO a.body", params={"id": 1})
```

## Consistency Controls

```python
# Majority read for strict freshness. `n AS node_id` returns the integer id;
# use get_node(id) or project explicit properties (e.g. n.email AS email).
db.cypher(
"MATCH (n:Account) RETURN n AS node_id, n.email AS email",
read_concern="majority",
)

# Majority write (required for causal reads)
db.cypher("CREATE (n:Event {t: timestamp()})", write_concern="majority")

# Causal read: see at least state at raft index 42
db.cypher("MATCH (n) RETURN count(n) AS total", after_index=42)
Comment thread
polaz marked this conversation as resolved.
```

Accepted values:

- ``read_concern``: ``local`` (default) · ``majority`` · ``linearizable`` · ``snapshot``
- ``write_concern``: ``w0`` · ``w1`` (default) · ``majority``
- ``read_preference``: ``primary`` (default) · ``primary_preferred`` · ``secondary`` · ``secondary_preferred`` · ``nearest``

## Related Packages

| Package | Description |
Expand Down
2 changes: 0 additions & 2 deletions coordinode/coordinode/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@
CoordinodeClient,
EdgeResult,
EdgeTypeInfo,
HybridResult,
LabelInfo,
NodeResult,
PropertyDefinitionInfo,
Expand All @@ -44,7 +43,6 @@
"EdgeResult",
"VectorResult",
"TextResult",
"HybridResult",
"LabelInfo",
"EdgeTypeInfo",
"PropertyDefinitionInfo",
Expand Down
214 changes: 112 additions & 102 deletions coordinode/coordinode/client.py
Original file line number Diff line number Diff line change
Expand Up @@ -113,23 +113,6 @@ def __repr__(self) -> str:
return f"TextResult(node_id={self.node_id}, score={self.score:.4f}, snippet={self.snippet!r})"


class HybridResult:
"""A single result from hybrid text + vector search (RRF-ranked)."""

def __init__(self, proto_result: Any) -> None:
self.node_id: int = proto_result.node_id
# Combined RRF score: text_weight/(60+rank_text) + vector_weight/(60+rank_vec).
self.score: float = proto_result.score
# NOTE: proto HybridResult carries only node_id + score (no embedded Node
# message). A full node is not included by design — the server returns IDs
# for efficiency. Callers that need node properties should use the client
# API: `client.get_node(self.node_id)`, or match on an application-level
# property in Cypher (e.g. WHERE n.id = <value>).

def __repr__(self) -> str:
return f"HybridResult(node_id={self.node_id}, score={self.score:.6f})"


class PropertyDefinitionInfo:
"""A property definition from the schema (name, type, required, unique)."""

Expand Down Expand Up @@ -270,16 +253,54 @@ async def cypher(
self,
query: str,
params: dict[str, PyValue] | None = None,
*,
read_concern: str | None = None,
write_concern: str | None = None,
read_preference: str | None = None,
after_index: int | None = None,
) -> list[dict[str, Any]]:
"""Execute an OpenCypher query. Returns rows as list of dicts."""
"""Execute an OpenCypher query. Returns rows as list of dicts.

Consistency parameters (all optional; server defaults apply when omitted):

- ``read_concern``: ``"local"`` (default), ``"majority"``, ``"linearizable"``, ``"snapshot"``.
- ``write_concern``: ``"w0"``, ``"w1"`` (default, leader-ack), ``"majority"``. Required
``"majority"`` when using causal reads (``after_index`` > 0).
Comment thread
polaz marked this conversation as resolved.
- ``read_preference``: ``"primary"`` (default), ``"primary_preferred"``, ``"secondary"``,
``"secondary_preferred"``, ``"nearest"``.
- ``after_index``: raft log index for causal reads — returned rows reflect at least
the state at this index.
"""
from coordinode._proto.coordinode.v1.query.cypher_pb2 import ( # type: ignore[import]
ExecuteCypherRequest,
)

# Validate after_index type/range BEFORE any numeric comparison so that
# True (bool is a subclass of int) and "7" (str) produce a clear
# "must be a non-negative integer" error instead of a misleading
# causal-read violation or a raw TypeError.
if after_index is not None and (
not isinstance(after_index, int) or isinstance(after_index, bool) or after_index < 0
):
raise ValueError(f"after_index must be a non-negative integer, got {after_index!r}")
# Causal reads (after_index > 0) are only satisfiable when writes were
# acknowledged by a majority; otherwise the referenced index may never
# replicate and the read would hang. Mirror the server's rejection.
if after_index is not None and after_index > 0 and (write_concern or "").strip().lower() != "majority":
raise ValueError(
"after_index > 0 requires write_concern='majority' — causal reads "
"depend on majority-committed writes. Pass write_concern='majority'."
)
Comment thread
polaz marked this conversation as resolved.
Comment thread
polaz marked this conversation as resolved.
req = ExecuteCypherRequest(
query=query,
parameters=dict_to_props(params or {}),
)
if read_concern is not None or after_index is not None:
req.read_concern.CopyFrom(_make_read_concern(read_concern, after_index))
if write_concern is not None:
req.write_concern.CopyFrom(_make_write_concern(write_concern))
if read_preference is not None:
req.read_preference = _make_read_preference(read_preference)
resp = await self._cypher_stub.ExecuteCypher(req, timeout=self._timeout)
columns = list(resp.columns)
return [{col: from_property_value(val) for col, val in zip(columns, row.values)} for row in resp.rows]
Expand Down Expand Up @@ -776,64 +797,6 @@ async def text_search(
resp = await self._text_stub.TextSearch(req, timeout=self._timeout)
return [TextResult(r) for r in resp.results]

async def hybrid_text_vector_search(
self,
label: str,
text_query: str,
vector: Sequence[float],
*,
limit: int = 10,
text_weight: float = 0.5,
vector_weight: float = 0.5,
vector_property: str = "embedding",
) -> list[HybridResult]:
"""Fuse BM25 text search and cosine vector search using Reciprocal Rank Fusion (RRF).

Runs text and vector searches independently, then combines their ranked
lists::

rrf_score(node) = text_weight / (60 + rank_text)
+ vector_weight / (60 + rank_vec)

Args:
label: Node label to search (e.g. ``"Article"``).
text_query: Full-text query string (same syntax as :meth:`text_search`).
vector: Query embedding vector. Must match the dimensionality stored
in *vector_property*.
limit: Maximum fused results to return (default 10). The server may
apply its own upper bound; pass a reasonable value (e.g. ≤ 1000).
text_weight: Weight for the BM25 component (default 0.5).
vector_weight: Weight for the cosine component (default 0.5).
vector_property: Node property containing the embedding (default
``"embedding"``).

Returns:
List of :class:`HybridResult` ordered by RRF score descending.

Note:
A full-text index covering *label* **must exist** before calling this
method — create one with :meth:`create_text_index` or a
``CREATE TEXT INDEX`` Cypher statement. Calling this method on a
label without a text index returns an empty list.
"""
if not isinstance(limit, int) or isinstance(limit, bool) or limit < 1:
raise ValueError(f"limit must be an integer >= 1, got {limit!r}.")
from coordinode._proto.coordinode.v1.query.text_pb2 import ( # type: ignore[import]
HybridTextVectorSearchRequest,
)

req = HybridTextVectorSearchRequest(
label=label,
text_query=text_query,
vector=[float(v) for v in vector],
limit=limit,
text_weight=text_weight,
vector_weight=vector_weight,
vector_property=vector_property,
)
resp = await self._text_stub.HybridTextVectorSearch(req, timeout=self._timeout)
return [HybridResult(r) for r in resp.results]

async def health(self) -> bool:
Comment thread
polaz marked this conversation as resolved.
from coordinode._proto.coordinode.v1.health.health_pb2 import ( # type: ignore[import]
HealthCheckRequest,
Expand Down Expand Up @@ -907,9 +870,23 @@ def cypher(
self,
query: str,
params: dict[str, PyValue] | None = None,
*,
read_concern: str | None = None,
write_concern: str | None = None,
read_preference: str | None = None,
after_index: int | None = None,
) -> list[dict[str, Any]]:
"""Execute an OpenCypher query. Returns rows as list of dicts."""
return self._run(self._async.cypher(query, params))
"""Execute an OpenCypher query. See :meth:`AsyncCoordinodeClient.cypher` for consistency args."""
return self._run(
self._async.cypher(
query,
params,
read_concern=read_concern,
write_concern=write_concern,
read_preference=read_preference,
after_index=after_index,
)
)

def vector_search(
self,
Expand Down Expand Up @@ -1016,34 +993,67 @@ def text_search(
"""Run a full-text BM25 search over all indexed text properties for *label*."""
return self._run(self._async.text_search(label, query, limit=limit, fuzzy=fuzzy, language=language))

def hybrid_text_vector_search(
self,
label: str,
text_query: str,
vector: Sequence[float],
*,
limit: int = 10,
text_weight: float = 0.5,
vector_weight: float = 0.5,
vector_property: str = "embedding",
) -> list[HybridResult]:
"""Fuse BM25 text search and cosine vector search using RRF ranking."""
return self._run(
self._async.hybrid_text_vector_search(
label,
text_query,
vector,
limit=limit,
text_weight=text_weight,
vector_weight=vector_weight,
vector_property=vector_property,
)
)

def health(self) -> bool:
return self._run(self._async.health())


# ── Consistency helpers ──────────────────────────────────────────────────────


_READ_CONCERN_MAP = {
"local": "READ_CONCERN_LEVEL_LOCAL",
"majority": "READ_CONCERN_LEVEL_MAJORITY",
"linearizable": "READ_CONCERN_LEVEL_LINEARIZABLE",
"snapshot": "READ_CONCERN_LEVEL_SNAPSHOT",
}
_WRITE_CONCERN_MAP = {
"w0": "WRITE_CONCERN_LEVEL_W0",
"w1": "WRITE_CONCERN_LEVEL_W1",
"majority": "WRITE_CONCERN_LEVEL_MAJORITY",
}
_READ_PREFERENCE_MAP = {
"primary": "READ_PREFERENCE_PRIMARY",
"primary_preferred": "READ_PREFERENCE_PRIMARY_PREFERRED",
"secondary": "READ_PREFERENCE_SECONDARY",
"secondary_preferred": "READ_PREFERENCE_SECONDARY_PREFERRED",
"nearest": "READ_PREFERENCE_NEAREST",
}


def _normalize_consistency_key(value: Any, field: str, mapping: dict[str, str]) -> str:
if not isinstance(value, str) or not value.strip():
raise ValueError(f"{field} must be a non-empty string; got {value!r}")
enum_name = mapping.get(value.strip().lower())
if enum_name is None:
raise ValueError(f"invalid {field} {value!r}; expected one of {sorted(mapping)}")
return enum_name


def _make_read_concern(level: str | None, after_index: int | None) -> Any:
from coordinode._proto.coordinode.v1.replication import consistency_pb2 as pb # type: ignore[import]

kwargs: dict[str, Any] = {}
if level is not None:
kwargs["level"] = getattr(pb, _normalize_consistency_key(level, "read_concern", _READ_CONCERN_MAP))
if after_index is not None:
if not isinstance(after_index, int) or isinstance(after_index, bool) or after_index < 0:
raise ValueError(f"after_index must be a non-negative integer, got {after_index!r}")
kwargs["after_index"] = after_index
return pb.ReadConcern(**kwargs)


def _make_write_concern(level: str) -> Any:
from coordinode._proto.coordinode.v1.replication import consistency_pb2 as pb # type: ignore[import]

return pb.WriteConcern(level=getattr(pb, _normalize_consistency_key(level, "write_concern", _WRITE_CONCERN_MAP)))


def _make_read_preference(pref: str) -> Any:
from coordinode._proto.coordinode.v1.replication import consistency_pb2 as pb # type: ignore[import]

return getattr(pb, _normalize_consistency_key(pref, "read_preference", _READ_PREFERENCE_MAP))


# ── Stub factories (deferred import) ─────────────────────────────────────────


Expand Down
4 changes: 2 additions & 2 deletions demo/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,8 @@ Interactive notebooks for LlamaIndex, LangChain, and LangGraph integrations.

> **Note:** First run installs `coordinode-embedded` from source (Rust build, ~5 min).
> Subsequent runs use Colab's pip cache.
> The embedded Colab install is pinned to a specific commit that bundles coordinode-rs v0.3.17; the Colab notebook links above target `main`.
> The Docker Compose stack below uses the CoordiNode **server** image v0.3.17.
> The embedded Colab install is pinned to a specific commit that bundles coordinode-rs v0.4.1; the Colab notebook links above target `main`.
> The Docker Compose stack below uses the CoordiNode **server** image v0.4.1.

## Run locally (Docker Compose)

Expand Down
2 changes: 1 addition & 1 deletion demo/docker-compose.yml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
services:
coordinode:
# Keep version in sync with root docker-compose.yml
image: ghcr.io/structured-world/coordinode:0.3.17
image: ghcr.io/structured-world/coordinode:0.4.1
container_name: demo-coordinode
ports:
- "127.0.0.1:37080:7080" # gRPC (native API) — localhost-only
Expand Down
Loading
Loading