What happened?
Overview
Our Dataflow pipeline transfers data from Google Pub/Sub to BigQuery. It uses a custom DynamicDestinations implementation to determine the target BigQuery table from the JSON content of each Pub/Sub message.
We have identified a recurring issue: the pipeline fails when the specified destination table does not exist in BigQuery, producing a 404 Not Found error. The expected behavior is for the pipeline to handle such errors gracefully and retry (or not) according to the configured retry policy.
Despite configuring the pipeline with InsertRetryPolicy.neverRetry(), it still terminates with the same error when it encounters a non-existent table. As a result, the pipeline keeps retrying the invalid inserts, as described in #20211.
Environment
Apache Beam SDK in Java: 2.56.0
Desired behavior
We would like to customize error handling so that retries follow the configured retry policy. For example, we could implement a custom retry policy that specifically avoids retrying inserts that fail with a 404 Not Found error.
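One way to express this is to subclass InsertRetryPolicy and inspect the per-row insert errors. The sketch below is our assumption of what such a policy could look like, not code from the pipeline; the class name is a placeholder, and it assumes BigQuery reports a missing table with the "notFound" error reason:

```java
import com.google.api.services.bigquery.model.ErrorProto;
import org.apache.beam.sdk.io.gcp.bigquery.InsertRetryPolicy;

/**
 * Sketch of a retry policy that retries transient failures but gives up
 * immediately when the insert failed because the table does not exist.
 */
public class NoRetryOnNotFoundPolicy extends InsertRetryPolicy {
  @Override
  public boolean shouldRetry(Context context) {
    if (context.getInsertErrors() != null
        && context.getInsertErrors().getErrors() != null) {
      for (ErrorProto error : context.getInsertErrors().getErrors()) {
        // "notFound" is the reason BigQuery attaches to a 404 on the table.
        if ("notFound".equals(error.getReason())) {
          return false; // do not retry; let the row go to the failed-inserts output
        }
      }
    }
    return true; // retry everything else
  }
}
```

The policy would then be passed via withFailedInsertRetryPolicy(new NoRetryOnNotFoundPolicy()); the issue, however, is that the 404 is currently not routed through the retry policy at all.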
Sample code
Here is a snippet of the code that configures the pipeline to write to BigQuery:
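The snippet itself is not reproduced here; the following is a minimal sketch of a configuration of this shape, assuming streaming inserts and a custom DynamicDestinations class (MessageDestinations and the rows PCollection are placeholder names, not the actual code):

```java
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.Method;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.InsertRetryPolicy;

// Sketch: write Pub/Sub-derived rows to a table chosen per element.
rows.apply("WriteToBigQuery",
    BigQueryIO.writeTableRows()
        // Placeholder for the custom DynamicDestinations that picks the
        // target table from the message's JSON content.
        .to(new MessageDestinations())
        .withMethod(Method.STREAMING_INSERTS)
        .withCreateDisposition(CreateDisposition.CREATE_NEVER)
        .withWriteDisposition(WriteDisposition.WRITE_APPEND)
        // The policy under discussion: failed inserts should never be retried.
        .withFailedInsertRetryPolicy(InsertRetryPolicy.neverRetry()));
```

With CREATE_NEVER, a message routed to a non-existent table triggers the 404 described above.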
Error message
We have masked sensitive information like the project ID and dataset ID in the error message below:
Issue Priority
Priority: 3 (minor)
Issue Components