
[SPARK-56413] Add gRPC UDF execution protocol#55657

Open
haiyangsun-db wants to merge 2 commits into apache:master from haiyangsun-db:SPARK-56413

Conversation

@haiyangsun-db
Contributor

@haiyangsun-db haiyangsun-db commented May 3, 2026

What changes were proposed in this pull request?

Adds udf_protocol.proto, the gRPC wire contract between the Spark engine and a
UDF worker process, as described in the SPIP. It sits next to the existing worker_spec.proto.

Defines a Worker service with two RPCs (sketched below):

  • Execute(stream UdfRequest) returns (stream UdfResponse) — one bidirectional
    stream per UDF execution. Lifecycle on the stream: Init → 0..N
    DataRequest / DataResponse → exactly one Finish or Cancel.
    PayloadChunk streams oversized UDF bodies.
  • Manage(WorkerRequest) returns (WorkerResponse) — unary, worker-scoped
    (heartbeat, graceful shutdown).
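
Condensed into a proto sketch from the description above (the authoritative definitions live in udf_protocol.proto):

service Worker {
  // One bidirectional stream per UDF execution. Lifecycle on the stream:
  // Init -> 0..N DataRequest / DataResponse -> exactly one Finish or Cancel.
  rpc Execute(stream UdfRequest) returns (stream UdfResponse);

  // Unary, worker-scoped control plane (heartbeat, graceful shutdown).
  rpc Manage(WorkerRequest) returns (WorkerResponse);
}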

UdfPayload carries the engine-opaque callable bytes plus a format tag,
an eval_type worker-dispatch hint, and optional input/output encoders.
Init carries data_format, schemas, session_conf, task_context, and
timezone (the first key promoted out of session_conf); a reserved field
range absorbs future promotions.
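
A sketch of the two messages from this description; session_conf = 6, timezone = 7, and input_encoder = 6 come from the review quotes below, while all other field numbers and types are assumptions for illustration:

message UdfPayload {
  bytes payload = 1;                  // engine-opaque callable bytes (number assumed)
  string format = 2;                  // format tag for [[payload]] (number assumed)
  int32 eval_type = 3;                // worker-dispatch hint (number and type assumed)
  optional bytes input_encoder = 6;   // quoted in the review below
  optional bytes output_encoder = 7;  // assumed by symmetry with input_encoder
}

message Init {
  // data_format, schemas, and task_context elided; see udf_protocol.proto.
  map<string, string> session_conf = 6;
  optional string timezone = 7;       // first key promoted out of session_conf
  // reserved 8 to 15;                // hypothetical range for future promotions
}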

Also fixes two typos in common.proto (exachanged/bidrectional).

Out of scope

No planning info on the wire (no execution-shape / cardinality enum, no
chained-UDF metadata). Both can be added additively later.

Why are the changes needed?

Spark Connect's UDF support today is Python-only and tied to a Python-specific
socket protocol. Onboarding other client languages requires a structured,
language-neutral wire contract. This PR lands the proto layer; engine and
worker implementations will follow.

Does this PR introduce any user-facing change?

No. Wire contract only; not yet wired into any end-to-end path.

How was this patch tested?

Verified the proto compiles with protoc against common.proto and
worker_spec.proto, and inspected the generated descriptor for field-number
and oneof correctness. End-to-end conformance tests will land with the
engine-side client and first worker implementation.

Was this patch authored or co-authored using generative AI tooling?

Yes

@haiyangsun-db haiyangsun-db marked this pull request as ready for review May 3, 2026 16:13
@haiyangsun-db haiyangsun-db changed the title from "[SPARK-56413] Introduce the grpc protocol for UDF execution." to "[SPARK-56413] Add gRPC UDF execution protocol" May 3, 2026
// with no branch set.
oneof control {
Init init = 1;
PayloadChunk payload = 2;
Contributor


What is the reason that payload does not require a confirmation from the worker?

Contributor Author


PayloadChunks are an addition to the Init message; we can assume they're part of the Init, used only when the payload is large. They don't require a separate response: Init + (PayloadChunk)* maps to one InitResponse.
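
A sketch of the resulting exchange (the PayloadChunk field layout below is hypothetical; only the message itself appears in the quoted oneof):

// client:  Init  PayloadChunk  PayloadChunk  ...   (chunks extend the Init)
// worker:                                          InitResponse (exactly one)
message PayloadChunk {
  bytes data = 1;  // hypothetical: next slice of the oversized UdfPayload
}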

// the engine: the engine forwards [[payload]] and [[format]]
// unchanged, and the worker decodes them per the format the client
// and worker have agreed on.
message UdfPayload {
Contributor


How about payload language? I assume that because the worker is already tied to a specific language, it does not need to know what language this UDF payload is in?

Contributor Author

@haiyangsun-db haiyangsun-db May 4, 2026


Exactly, it does not matter. The worker is already provided by the worker spec; the engine doesn't care what language the payload is in.

// a typed field number from the reserved range right after this
// block and is removed from [[session_conf]]. [[timezone]] below
// is an example of a key that has already been promoted.
map<string, string> session_conf = 6;
Contributor


Why would this and task_context above have no optional prefix?

Contributor Author


A map field cannot be optional in protobuf. We leave it as an empty map when it is not set.
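
For illustration, the protobuf rule in question (the commented-out variant is rejected by protoc):

map<string, string> session_conf = 6;             // OK: absent on the wire decodes as an empty map
// optional map<string, string> session_conf = 6; // does not compile: field labels
//                                                // are not allowed on map fields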


// (Optional) Session timezone, promoted out of [[session_conf]]
// because every eval needs it for timestamp encoding/decoding.
optional string timezone = 7;
Contributor


Is string the canonical type to represent the timezone? I am afraid all kinds of conversion errors may happen with no schema/enum enforcement.

Contributor Author


This is a convention from Spark; the timezone is a string in Spark.

message Heartbeat {}

// Acknowledgment for [[Heartbeat]].
message HeartbeatAck {}
Contributor


Suggested change:
- message HeartbeatAck {}
+ message HeartbeatResponse {}

Just for some consistency?


// (Optional) Session timezone, promoted out of [[session_conf]]
// because every eval needs it for timestamp encoding/decoding.
optional string timezone = 7;
Contributor


We should specify the exact format in which the timezone will be reported, since it's a string.
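
One way to pin this down in the field comment, assuming the value follows Spark's existing spark.sql.session.timeZone convention (a Java ZoneId, i.e. a region-based ID or a fixed offset):

// (Optional) Session timezone, promoted out of [[session_conf]].
// Format: a Java ZoneId string, either a region-based ID such as
// "America/Los_Angeles" or a fixed offset such as "+08:00".
optional string timezone = 7;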

// Packed by the client side of the protocol; opaque to the
// wire protocol. Left unset whenever the worker's built-in
// decoders are sufficient.
optional bytes input_encoder = 6;
Contributor


AFAIK there are no use cases for the custom input/output encoders at the moment. Should we maybe only add them when they are needed?

def cancel(): Unit

/** Closes this session and releases resources. */
override def close(): Unit
Contributor


I think we need to clarify the exact semantics of close/finish and cancel against the background of how we could implement calling them in Spark.

From my current understanding, finish would indicate to the UDF worker that no more input batches are to be sent. The worker would then finish processing the batches it has already received/buffered and respond with a FinishResponse. From the Spark side (e.g. in an operator), we could send multiple batches and then call finish, meaning the batches and the finish message would sit in the client- or server-side gRPC buffer. If the Spark task now gets canceled, we have to wait for the UDF worker to finish, as the finish message was already sent. In this scenario, would we like to support more eager cancellation? This also somewhat depends on the processing time per batch and the buffer size. Alternatively, we can force worker termination via system-level primitives (SIGTERM/SIGKILL).

Another concern is a potential race condition between finish and cancel calls. Cancellations are most likely going to be implemented using a taskInterruptListener on the Spark task running a UDF. The callback invoked from this listener does not necessarily share the whole context of the operator execution, so it might not know whether finish has already been called/queued. Would it make sense for the cancel/finish calls from the WorkerSession to be a no-op in this case? The session has all the state and can know whether it has already been canceled/finished.

Contributor


The same concern exists between init and cancel. If we implement cancellation via a taskInterruptListener, cancel might be called before init was called.
