fix(bigquery): route JOB_CREATION_REQUIRED through fast query path #13437
jinseopkim0 wants to merge 2 commits into
Code Review
This pull request enables fast query support when the job creation mode is set to JOB_CREATION_REQUIRED in QueryRequestInfo. This allows queries with this configuration to execute via the fast query path, returning a TableResult directly (with both JobId and QueryId populated) rather than creating a separate job. Unit and integration tests have been updated to reflect and verify this behavior. I have no feedback to provide.
    && config.getTimePartitioning() == null
    && config.getUserDefinedFunctions() == null
    && config.getWriteDisposition() == null
    && config.getJobCreationMode() != JobCreationMode.JOB_CREATION_REQUIRED;
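
For readers following the thread, the effect of the last clause can be sketched in isolation. This is a minimal standalone model, not the library's code: `FastQueryEligibilitySketch`, `Config`, and both method names are hypothetical, the enum is a local stand-in listing only the two modes named in this thread, and it assumes the `JOB_CREATION_REQUIRED` clause is the one this PR removes.

```java
import java.util.List;

public class FastQueryEligibilitySketch {

    /** Local stand-in for the library enum; only the two values named in this thread. */
    public enum JobCreationMode { JOB_CREATION_REQUIRED, JOB_CREATION_OPTIONAL }

    /** Hypothetical, trimmed-down config holding only the fields visible in the diff. */
    public static final class Config {
        final Object timePartitioning;
        final List<String> userDefinedFunctions;
        final String writeDisposition;
        final JobCreationMode jobCreationMode;

        public Config(Object timePartitioning, List<String> userDefinedFunctions,
                      String writeDisposition, JobCreationMode jobCreationMode) {
            this.timePartitioning = timePartitioning;
            this.userDefinedFunctions = userDefinedFunctions;
            this.writeDisposition = writeDisposition;
            this.jobCreationMode = jobCreationMode;
        }
    }

    /** The three null checks shown in the diff, shared by both variants. */
    static boolean commonChecks(Config c) {
        return c.timePartitioning == null
            && c.userDefinedFunctions == null
            && c.writeDisposition == null;
    }

    /** Variant with the mode clause: JOB_CREATION_REQUIRED rules out the fast path. */
    public static boolean eligibleWithModeClause(Config c) {
        return commonChecks(c)
            && c.jobCreationMode != JobCreationMode.JOB_CREATION_REQUIRED;
    }

    /** Variant without the mode clause: the mode no longer affects eligibility. */
    public static boolean eligibleWithoutModeClause(Config c) {
        return commonChecks(c);
    }

    public static void main(String[] args) {
        Config required = new Config(null, null, null, JobCreationMode.JOB_CREATION_REQUIRED);
        Config optional = new Config(null, null, null, JobCreationMode.JOB_CREATION_OPTIONAL);

        System.out.println(eligibleWithModeClause(required));    // false
        System.out.println(eligibleWithoutModeClause(required)); // true
        System.out.println(eligibleWithModeClause(optional));    // true
        System.out.println(eligibleWithoutModeClause(optional)); // true
    }
}
```

In this model only the `JOB_CREATION_REQUIRED` case changes between the two variants; a config that fails one of the other checks (for example, a non-null write disposition) is ineligible either way.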
qq, wouldn't this just end up always doing a fast query (the default is set to required)? IIUC, I think this should be a fast query only for JobCreationMode.JOB_CREATION_OPTIONAL?
Is there any performance impact or behavioral change if we default to a fast query even when a user explicitly sets JOB_CREATION_REQUIRED?
Thanks for the questions.
> qq, wouldn't this just end up always doing a fast query (the default is set to required)?
Yes, this is intended.
> IIUC, I think this should be a fast query only for JobCreationMode.JOB_CREATION_OPTIONAL?
Fast query should apply to both modes. The BigQuery backend always creates a job in the background and returns a jobReference (including the jobId, see https://docs.cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query#QueryResponse) for all jobs.query requests, so using the fast query path makes sense in either case.
> Is there any performance impact or behavioral change if we default to a fast query even when a user explicitly sets JOB_CREATION_REQUIRED?
The result is faster queries and lower latency. There is no behavioral change, as a job is still created in the background and tracked as expected.
Gotcha, I think faster queries are always better for the user, but something seems a bit weird about JobCreationMode to me. Why even have a JOB_CREATION_OPTIONAL configuration on the client side instead of just handling it under the hood on the server side?
Since we have this configuration, it seems odd to have a customer specify JOB_CREATION_REQUIRED and potentially not get a job back. Could the original issue be solved if fast query runs only when JOB_CREATION_OPTIONAL is specified? It seems like the issue was that the fast query logic was running under the wrong conditions?
Routes queries under JobCreationMode.JOB_CREATION_REQUIRED to the fast query path (jobs.query API / 1 RPC) to avoid the slow fallback path (jobs.insert API / 2 RPCs).
https://docs.cloud.google.com/bigquery/docs/reference/rest/v2/jobs/query#QueryResponse says:
Thus, routing JOB_CREATION_REQUIRED through the fast path is preferred.
b/522363981
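
The difference in round trips described above can be illustrated with a small standalone model. It is a sketch only and makes no real API calls: `QueryRoutingSketch` and `rpcsFor` are hypothetical names, the query is assumed to pass every other fast-path check, and naming `jobs.getQueryResults` as the second RPC on the slow path is an assumption, since the description only says "2 RPCs".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class QueryRoutingSketch {

    /** Local stand-in for the library enum; only the two values named in this PR. */
    public enum JobCreationMode { JOB_CREATION_REQUIRED, JOB_CREATION_OPTIONAL }

    /**
     * Returns the REST methods a client would call for one query, in order.
     * requiredUsesFastPath = false models the old routing, true models this PR.
     */
    public static List<String> rpcsFor(JobCreationMode mode, boolean requiredUsesFastPath) {
        boolean fastPath =
            mode == JobCreationMode.JOB_CREATION_OPTIONAL || requiredUsesFastPath;
        if (fastPath) {
            // Fast path: a single jobs.query call.
            return Collections.singletonList("jobs.query");
        }
        // Slow path: create the job, then fetch its results (second call is assumed).
        return Arrays.asList("jobs.insert", "jobs.getQueryResults");
    }

    public static void main(String[] args) {
        JobCreationMode required = JobCreationMode.JOB_CREATION_REQUIRED;
        System.out.println(rpcsFor(required, false)); // [jobs.insert, jobs.getQueryResults]
        System.out.println(rpcsFor(required, true));  // [jobs.query]
    }
}
```

In this model `JOB_CREATION_OPTIONAL` takes the single-call path under both settings, so only the `JOB_CREATION_REQUIRED` routing changes.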