Skip to content

JiT Worker in CLI and RPC infrastructure#2109

Open
rafal-hawrylak wants to merge 1 commit intomainfrom
cli_jit_workers
Open

JiT Worker in CLI and RPC infrastructure#2109
rafal-hawrylak wants to merge 1 commit intomainfrom
cli_jit_workers

Conversation

@rafal-hawrylak
Copy link
Copy Markdown
Collaborator

This change establishes the foundation for executing JiT compilation in an isolated environment. It introduces the worker process management, the RPC bridge for database access during JiT, and the necessary Bazel targets.

@rafal-hawrylak rafal-hawrylak marked this pull request as ready for review March 9, 2026 19:59
@rafal-hawrylak rafal-hawrylak requested a review from a team as a code owner March 9, 2026 19:59
@rafal-hawrylak rafal-hawrylak enabled auto-merge (squash) March 9, 2026 19:59
@rafal-hawrylak rafal-hawrylak self-assigned this Mar 9, 2026
Comment thread cli/api/commands/base_worker.ts Outdated
Comment thread cli/api/commands/jit_compile_child_process.ts Outdated
Comment thread cli/api/commands/jit/compiler.ts
Comment thread cli/api/commands/jit_compile_child_process.ts Outdated
Comment thread cli/api/commands/base_worker.ts
Comment thread cli/api/utils/constants.ts
Comment thread cli/vm/jit_worker.ts
Comment thread cli/vm/jit_worker.ts
Comment thread cli/api/commands/jit/rpc_test.ts Outdated
@@ -0,0 +1,101 @@
import { IDbAdapter, IDbClient } from "df/cli/api/dbadapters";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: is this protobuf bridge protocol the same for GCP and CLI implementations?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both the CLI and GCP implementations follow the dataform.DbAdapter service defined in jit.proto.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my comment below on the issue with "Execute" compatibility

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL on the approach with executeRaw

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread cli/api/commands/jit_compile_child_process.ts Outdated
Comment thread protos/jit.proto Outdated
Comment thread cli/api/commands/jit/rpc.ts Outdated
return new Uint8Array();
}

function mapRowToProto(row: { [key: string]: any }): google.protobuf.IStruct {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

jit.proto declares that struct contains:

// Rows. For BigQuery, see
  // https://docs.cloud.google.com/bigquery/docs/reference/rest/v2/jobs/getQueryResults.

In other words, right now we expect a raw API result of "f,v" JSON struct, not bespoke conversion.

The BigQuery client for Node is strange in respect for this - it forcefully decodes those and removes rows from the original request. We could either:
a) Use googleapis package client instead for JiT
b) Come up with another protocol for encoding rows and implement it both here and in GCP. Let me know on chat if you need code pointers for GCP part.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL on the approach with executeRaw

Copy link
Copy Markdown
Collaborator

@ikholopov-omni ikholopov-omni Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has exactly the same problem. query() inside rawExecute already removes rows from response and only returns decoded ones as first component in the tuple [rows, _, response]

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rafal-hawrylak rafal-hawrylak force-pushed the cli_jit_workers branch 10 times, most recently from a05cf58 to be56edb Compare March 12, 2026 15:50
Comment thread cli/api/commands/jit/rpc.ts
Comment thread cli/api/commands/jit/rpc.ts Outdated
Comment thread cli/api/commands/jit/rpc.ts Outdated
Comment thread cli/api/dbadapters/bigquery.ts Outdated
Comment thread cli/api/dbadapters/bigquery.ts Outdated
Comment thread cli/vm/jit_worker.ts Outdated
Comment thread cli/vm/jit_worker.ts Outdated
Comment thread cli/vm/jit_worker.ts Outdated
@rafal-hawrylak rafal-hawrylak force-pushed the cli_jit_workers branch 5 times, most recently from 9ca1f45 to 0fe90aa Compare March 27, 2026 18:00
Comment thread cli/api/commands/jit/rpc.ts Outdated
Comment thread cli/api/commands/jit/rpc_test.ts
Comment thread cli/api/commands/jit/rpc_test.ts Outdated
Comment thread cli/api/dbadapters/bigquery.ts Outdated
Comment thread cli/api/dbadapters/bigquery.ts Outdated
Comment thread cli/api/dbadapters/bigquery.ts Outdated
Comment thread cli/vm/jit_worker.ts Outdated
@rafal-hawrylak rafal-hawrylak force-pushed the cli_jit_workers branch 5 times, most recently from c6650a7 to 22d3661 Compare March 29, 2026 08:40
Comment thread cli/api/dbadapters/bigquery.ts
@rafal-hawrylak rafal-hawrylak force-pushed the cli_jit_workers branch 2 times, most recently from d512e48 to 38433f2 Compare March 30, 2026 06:56
* feat: add worker process management for JiT compilation

- Introduce base worker and JiT-specific child process logic

- Implement RPC bridge for database access during JiT

- Add VM scripts and loader for isolated execution

- Add unit tests for the RPC mechanism
Comment thread cli/api/dbadapters/bigquery.ts
Comment thread cli/api/dbadapters/bigquery.ts
Comment thread cli/api/dbadapters/bigquery.ts
Comment thread cli/api/commands/jit/rpc.ts Outdated
Copy link
Copy Markdown
Contributor

@kolina kolina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current PR LGTM, there are a couple of questions about diffs in BigQuery adapters.

After they're responded, I can approve from my side

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants