
Do not auto-retry gRPC-message-size-too-large errors #2604


Open · wants to merge 6 commits into master from grpc-message-too-large

Conversation

maciejdudko

What was changed

  • GrpcRetryer no longer retries a gRPC call when the failure is caused by the message size limit; instead it wraps the error in the new GrpcMessageTooLargeException and throws it (see the sketch after this list).
  • WorkflowWorker catches GrpcMessageTooLargeException when reporting workflow task completion or failure, and reports the task as failed with this error. This does NOT prevent server-side retry of the workflow task, but it does record the error in event history for easier debugging.
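
A minimal sketch of the retry decision described in the first bullet. `GrpcMessageTooLargeException` is the type added by this PR; the retry loop and the detection heuristic below are illustrative assumptions, not the SDK's actual `GrpcRetryer` code:

```java
import io.grpc.Status;
import io.grpc.StatusRuntimeException;
import java.util.function.Supplier;

final class NonRetryableSizeLimitSketch {
  // Calls the supplier, retrying most gRPC failures but surfacing
  // message-size-limit errors immediately instead of retrying them.
  static <T> T callWithRetry(Supplier<T> call, int maxAttempts) {
    for (int attempt = 1; ; attempt++) {
      try {
        return call.get();
      } catch (StatusRuntimeException e) {
        if (isMessageTooLarge(e) || attempt >= maxAttempts) {
          // The PR wraps the error in GrpcMessageTooLargeException at this point
          // and throws it without further retries.
          throw e;
        }
        // Otherwise fall through to the normal retry policy (backoff omitted).
      }
    }
  }

  static boolean isMessageTooLarge(StatusRuntimeException e) {
    String description = e.getStatus().getDescription();
    return e.getStatus().getCode() == Status.Code.RESOURCE_EXHAUSTED
        && description != null
        && (description.startsWith("grpc: received message larger than max")
            || description.startsWith(
                "grpc: received message after decompression larger than max"));
  }
}
```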

Why?

Feature request: temporalio/features#624

Checklist

  1. Closes Do not auto-retry gRPC-message-size-too-large errors #1585

  2. How was this tested: added tests to GrpcSyncRetryerTest, GrpcAsyncRetryerTest and GrpcMessageTooLargeTest.

@maciejdudko requested a review from a team as a code owner on July 21, 2025 22:01
@@ -236,7 +236,9 @@ public static final class Builder {
private Duration nextRetryDelay;
private ApplicationErrorCategory category;

private Builder() {}
Contributor:

I already fixed this in a different PR :)

Author:

Undid the change.

@@ -0,0 +1,7 @@
package io.temporal.internal.retryer;

public class GrpcMessageTooLargeException extends RuntimeException {
Contributor:

Is this type really internal? If a user makes a client call with a payload that is too large, won't that client call throw this exception?

Author:

As per the conversation, the exception is internal, but I changed it to derive from StatusRuntimeException so the user can catch it that way.
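
Based on this discussion, the revised exception might look roughly like the following; the constructor signature is an assumption inferred from the usage shown further down (`new GrpcMessageTooLargeException(status, trailers)`):

```java
import io.grpc.Metadata;
import io.grpc.Status;
import io.grpc.StatusRuntimeException;

// Still lives in an internal package, but extends StatusRuntimeException so that
// user code catching StatusRuntimeException also catches this error.
public class GrpcMessageTooLargeException extends StatusRuntimeException {
  public GrpcMessageTooLargeException(Status status, Metadata trailers) {
    super(status, trailers);
  }
}
```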

result.getRequestRetryOptions(),
workflowTypeScope);
-      } else if (queryCompleted != null) {
+      if (queryCompleted != null) {
sendDirectQueryCompletedResponse(
Contributor:

The direct query response can also fail due to gRPC message size too large, so I think that needs to be under the try as well, no?

Author:

Done.

@maciejdudko force-pushed the grpc-message-too-large branch from b3bad78 to 6444df3 on July 28, 2025 16:40
.startsWith("grpc: received message after decompression larger than max"))) {
return new GrpcMessageTooLargeException(status.withCause(exception), exception.getTrailers());
} else {
return null;
Contributor:

nit: you could just change the return type and return the original StatusRuntimeException, since all the callers just do that anyway.

Author:

The logic is different depending on whether the wrap was successful or not: GrpcMessageTooLargeException is sent as a failure to the server, while any other StatusRuntimeException is just rethrown. Technically it could be an instanceof check, but I think a null check is cleaner.
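
To illustrate the null-check pattern described here, a small sketch follows; the helper and method names are hypothetical (not the PR's actual identifiers), and it reuses the GrpcMessageTooLargeException type sketched above:

```java
import io.grpc.StatusRuntimeException;

final class WrapOrRethrowSketch {
  // Hypothetical helper mirroring the diff above: returns a wrapped exception
  // for message-size-limit errors, otherwise null.
  static GrpcMessageTooLargeException wrapIfMessageTooLarge(StatusRuntimeException e) {
    return null; // detection logic elided; see the startsWith checks in the diff
  }

  static void handleReportingError(StatusRuntimeException e) {
    GrpcMessageTooLargeException tooLarge = wrapIfMessageTooLarge(e);
    if (tooLarge != null) {
      reportWorkflowTaskFailure(tooLarge); // recorded in event history for debugging
    } else {
      throw e; // any other gRPC error keeps the existing rethrow behavior
    }
  }

  static void reportWorkflowTaskFailure(GrpcMessageTooLargeException cause) {
    // Placeholder for the WorkflowWorker task-failure reporting call.
  }
}
```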

@Quinn-With-Two-Ns (Contributor):

Looks good! Did we plan to add a Temporal Cloud test as part of this PR? If not, that is okay, but can we at least manually test against it?

workflowExecution.getWorkflowId(),
tooLargeException,
"Failed to send query response");
RespondQueryTaskCompletedRequest.Builder queryFailedBuilder =
Contributor:

Sorry I missed this in the initial review; we should be failing the workflow task here.

Contributor:

Talked offline: in Java we seem to always use RespondQueryTaskCompletedRequest for queries, even if there was something that would normally fail the workflow task.
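
A rough sketch of answering a query with a failure (rather than failing the workflow task); the builder methods are assumed from the standard generated RespondQueryTaskCompletedRequest API and may not match the PR's exact code:

```java
import com.google.protobuf.ByteString;
import io.temporal.api.enums.v1.QueryResultType;
import io.temporal.api.workflowservice.v1.RespondQueryTaskCompletedRequest;

final class QueryFailureResponseSketch {
  // Builds a "query failed" response carrying the error message, so the query
  // fails cleanly instead of the workflow task being failed.
  static RespondQueryTaskCompletedRequest failedQueryResponse(
      String namespace, ByteString taskToken, Throwable cause) {
    return RespondQueryTaskCompletedRequest.newBuilder()
        .setNamespace(namespace)
        .setTaskToken(taskToken)
        .setCompletedType(QueryResultType.QUERY_RESULT_TYPE_FAILED)
        .setErrorMessage(String.valueOf(cause.getMessage()))
        .build();
  }
}
```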
