Backport of #104881 to 25.8: Use explicit flag for secondary on cluster queries by ianton-ru · Pull Request #1875 · Altinity/ClickHouse

ianton-ru · 2026-06-05T14:25:36Z

Backport of ClickHouse#104881 by @tavplubix

Changelog category (leave one):

Not for changelog (changelog entry is not required) (Use explicit flag for secondary on cluster queries ClickHouse/ClickHouse#104881 by @tavplubix)

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Use an explicit flag in Context for secondary DDL/ON CLUSTER queries instead of SECONDARY_QUERY

CI/CD Options

Exclude tests:

Regression jobs to run:


github-actions · 2026-06-05T14:26:32Z

Workflow [PR], commit [2fa783d]

ianton-ru · 2026-06-05T14:28:17Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2fa783dc77


chatgpt-codex-connector · 2026-06-05T14:35:01Z

    query_context->makeQueryContext();
    query_context->setCurrentQueryId(""); // generate random query_id
-    query_context->setQueryKind(ClientInfo::QueryKind::SECONDARY_QUERY);
+    query_context->setDDLOrOnClusterInternal(true);


Preserve worker mode for cluster table functions

When a DDL worker executes CREATE ... ON CLUSTER ... AS SELECT ... FROM fileCluster/urlCluster/s3Cluster, this context is no longer marked as SECONDARY_QUERY, while the *Cluster table functions still use query_kind == SECONDARY_QUERY to choose their worker-local storage path (for example TableFunctionFileCluster.cpp:23 and TableFunctionURLCluster.cpp:13). As a result, each DDL worker can fan out to the whole cluster again instead of reading only its assigned local input, multiplying remote reads and potentially duplicating the data inserted by CTAS on every replica; keep the secondary query kind for execution semantics or update those table-function checks to also recognize this new internal DDL flag.
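The dispatch concern can be modelled in a few lines. The sketch below is an assumption-laden illustration, not ClickHouse code: `ContextModel`, `StorageChoice`, `chooseStorage`, and the two `makeWorkerContext*` helpers are hypothetical stand-ins for the real `Context` and the `*Cluster` table-function checks.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; these are not the real ClickHouse classes.
enum class QueryKind { NO_QUERY, INITIAL_QUERY, SECONDARY_QUERY };

struct ContextModel
{
    QueryKind query_kind = QueryKind::NO_QUERY;
    bool ddl_or_on_cluster_internal = false;
};

enum class StorageChoice { WorkerLocal, ClusterFanOut };

// Models the check described in the review comment:
// only SECONDARY_QUERY selects the worker-local storage path.
StorageChoice chooseStorage(const ContextModel & ctx)
{
    if (ctx.query_kind == QueryKind::SECONDARY_QUERY)
        return StorageChoice::WorkerLocal;
    return StorageChoice::ClusterFanOut;
}

// DDL worker context before the change: marked as a secondary query.
ContextModel makeWorkerContextBefore()
{
    ContextModel ctx;
    ctx.query_kind = QueryKind::SECONDARY_QUERY;
    return ctx;
}

// DDL worker context after the change: only the new flag is set,
// so query_kind keeps its default value.
ContextModel makeWorkerContextAfter()
{
    ContextModel ctx;
    ctx.ddl_or_on_cluster_internal = true;
    return ctx;
}
```

In this model, `chooseStorage(makeWorkerContextBefore())` yields `WorkerLocal`, while `chooseStorage(makeWorkerContextAfter())` yields `ClusterFanOut`, because the check never consults the new flag.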


Good catch, but the upstream master branch still uses query_kind == SECONDARY_QUERY, so this issue is also present there.


ianton-ru · 2026-06-08T10:40:56Z

The failed tests look unstable and unrelated to this PR.

mkmkme

LGTM, pretty much 1-to-1 with the original PR

mkmkme · 2026-06-10T12:30:20Z

The original PR author stated it was irrelevant to DDL.

alsugiliazova · 2026-06-10T13:50:31Z

Audit update for PR #1875

AI audit note: This review comment was generated by AI (claude-opus-4.7).

PR: Altinity/ClickHouse#1875 — Backport of ClickHouse#104881 to 25.8: Use explicit flag for secondary on cluster queries
HEAD: 2fa783dc773f8a971e333aa9de671f3982fdacfb
Base: stable-25.8
Upstream: ClickHouse/ClickHouse#104881 (merged)

Confirmed defects

High — `CREATE TABLE AS s3Cluster()` / `fileCluster()` / `urlCluster()` is broken inside a `Replicated` database

Impact: CREATE TABLE ... AS SELECT FROM s3Cluster(...) (and the analogues for fileCluster, urlCluster, azureBlobStorageCluster, etc.) executed against a Replicated database throws NOT_FOUND_COLUMN_IN_BLOCK on the secondary replica that picks up the DDL task. The PR's own CI reproduces this in Stateless tests (amd_binary, old analyzer, s3 storage, DatabaseReplicated, parallel)/03579_create_table_populate_from_s3 (backported by 53b01c8fdfe from upstream ClickHouse/ClickHouse#84753, "LOGICAL_ERROR: Next task callback is not set for query"). The stack trace runs through DatabaseReplicated::tryEnqueueReplicatedDDL → DDLWorker::tryExecuteQuery → InterpreterCreateQuery::fillTableIfNeeded → Planner::buildPlanForQueryNode → ActionsDAG::appendInputsForUnusedColumns.
The previously backported fix ClickHouse/ClickHouse#85904 ("Fix logical error on creating table as s3Cluster in Replicated database") relies on the DDL worker context having query_kind == SECONDARY_QUERY, so that TableFunctionObjectStorageCluster::executeImpl takes the worker-local StorageObjectStorage branch with distributed_processing = can_use_distributed_iterator (false here, because there is no cluster-function read-task callback). After this PR, DDLTaskBase::makeQueryContext and DatabaseReplicatedTask::makeQueryContext set only setDDLOrOnClusterInternal(true) and drop the setQueryKind(SECONDARY_QUERY) call, so client_info.query_kind keeps the parent context's value (NO_QUERY for the DDL worker thread). The check at src/TableFunctions/TableFunctionObjectStorageCluster.cpp:37 then falls through to the initiator branch and constructs a StorageObjectStorageCluster that re-dispatches the read across the replicas. The Codex bot raised this exact concern on the PR; it was dismissed.
Other *Cluster table functions in 25.8 use the same discrimination (TableFunctionFileCluster.cpp:23, TableFunctionURLCluster.cpp:13, TableFunctionObjectStorage.cpp:203, TableFunctionURL.cpp:100), so the regression is not S3-only.
Anchor: src/Interpreters/DDLTask.cpp (DDLTaskBase::makeQueryContext, DatabaseReplicatedTask::makeQueryContext); affected callers in src/TableFunctions/TableFunction*Cluster.cpp and src/TableFunctions/TableFunctionObjectStorage.cpp.
Trigger: CREATE [OR REPLACE] TABLE t ... AS SELECT * FROM s3Cluster(cluster, url, format) ... executed against a Replicated database (or any DatabaseReplicated stateless test wrapper). Reproduced by tests/queries/0_stateless/03579_create_table_populate_from_s3.sh in the DatabaseReplicated stateless config on this PR's CI.
Why defect: DDLTask::makeQueryContext no longer marks the worker context as SECONDARY_QUERY, but multiple call sites that gate worker-local vs. initiator behavior still check query_kind == SECONDARY_QUERY and were not updated to also accept isDDLOrOnClusterInternal. The Altinity 25.8 branch carries the ClickHouse/ClickHouse#85904 fix ("Fix logical error on creating table as s3Cluster() in Replicated database") and the ClickHouse/ClickHouse#84753 regression test ("LOGICAL_ERROR: Next task callback is not set for query"), both of which depend on the old semantics. Upstream master happens to be broken in the same way (the PR author acknowledged this), but on 25.8 the regression is observable and fails CI.
Fix direction: Either keep setQueryKind(ClientInfo::QueryKind::SECONDARY_QUERY) in DDLTask::makeQueryContext alongside setDDLOrOnClusterInternal(true), or update TableFunctionObjectStorageCluster.cpp, TableFunctionFileCluster.cpp, TableFunctionURLCluster.cpp, TableFunctionObjectStorage.cpp, TableFunctionURL.cpp (and any other query_kind == SECONDARY_QUERY checks that fire from the DDL flow) to also treat context->isDDLOrOnClusterInternal() as the worker branch.
Regression test direction: 03579_create_table_populate_from_s3 already triggers the failure under the DatabaseReplicated config; make it required in CI for this backport and add explicit fileCluster / urlCluster analogues under a Replicated engine.
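The two fix directions can be sketched side by side. This is a minimal model under stated assumptions, not a patch: `ContextModel`, `legacyIsWorkerSide`, `isWorkerSide`, and the `makeWorkerContext*` helpers are hypothetical names, and only the method names `setQueryKind`, `setDDLOrOnClusterInternal`, and `isDDLOrOnClusterInternal` are taken from the discussion above.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the real ClickHouse Context.
enum class QueryKind { NO_QUERY, INITIAL_QUERY, SECONDARY_QUERY };

struct ContextModel
{
    QueryKind query_kind = QueryKind::NO_QUERY;
    bool ddl_or_on_cluster_internal = false;

    void setQueryKind(QueryKind kind) { query_kind = kind; }
    void setDDLOrOnClusterInternal(bool value) { ddl_or_on_cluster_internal = value; }
    bool isDDLOrOnClusterInternal() const { return ddl_or_on_cluster_internal; }
};

// The unchanged call-site check: looks at query_kind only.
bool legacyIsWorkerSide(const ContextModel & ctx)
{
    return ctx.query_kind == QueryKind::SECONDARY_QUERY;
}

// Option A: the DDL worker context keeps SECONDARY_QUERY alongside the new flag,
// so unchanged call sites still take the worker branch.
ContextModel makeWorkerContextKeepingKind()
{
    ContextModel ctx;
    ctx.setQueryKind(QueryKind::SECONDARY_QUERY);
    ctx.setDDLOrOnClusterInternal(true);
    return ctx;
}

// The context as produced by this PR: flag only.
ContextModel makeWorkerContextFlagOnly()
{
    ContextModel ctx;
    ctx.setDDLOrOnClusterInternal(true);
    return ctx;
}

// Option B: call sites accept either signal as "worker side".
bool isWorkerSide(const ContextModel & ctx)
{
    return ctx.query_kind == QueryKind::SECONDARY_QUERY || ctx.isDDLOrOnClusterInternal();
}
```

Option A leaves the table functions untouched but keeps two signals in sync by hand; option B touches every listed call site but lets the flag stand on its own.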

Coverage summary

Scope reviewed: All 13 files in the backport diff; cross-checked against upstream ClickHouse/ClickHouse#104881 ("Use explicit flag for secondary on cluster queries", 15 files). The two missing upstream files (src/Interpreters/InterpreterSystemQuery.cpp::restoreDatabaseFromKeeperPath and src/Storages/ObjectStorage/Utils.cpp::expandPaimonKeeperMacrosIfNeeded) and the missing third hunk in src/Databases/DatabaseReplicated.cpp::registerDatabaseReplicated are correctly omitted because those code paths do not exist in 25.8. All boolean translations (query_kind == SECONDARY_QUERY ↔ isDDLOrOnClusterInternal, query_kind == INITIAL_QUERY ↔ !isDDLOrOnClusterInternal, query_kind != INITIAL_QUERY ↔ isDDLOrOnClusterInternal, query_kind != SECONDARY_QUERY ↔ !isDDLOrOnClusterInternal) match upstream verbatim. Context state propagation (copy constructor, default initializer, getter/setter) is consistent. CI report at https://altinity-build-artifacts.s3.amazonaws.com/PRs/1875/2fa783dc773f8a971e333aa9de671f3982fdacfb/result_pr.json reviewed.
Categories failed: DDL worker context propagation × cluster table function dispatch (root cause: dropped setQueryKind(SECONDARY_QUERY)).
Categories passed: Backport-vs-upstream boolean translation parity; Context member lifecycle/copy; backup internal-flag access checks; Replicated database tryEnqueueReplicatedDDL initial-query check; InterpreterCreateQuery UUID/attach-path checks; InterpreterDropQuery secondary-query detection; Kafka / MergeTree / ObjectStorageQueue / TableZnodeInfo is_replicated_database macro gating.
Not applicable: iterator/reference invalidation, integer overflow/signedness, RAII leaks, multithreaded interleaving (no new shared mutable state introduced), rollback/partial-update (no mutation paths added).
Assumptions/limits: Static reasoning + CI log inspection only; did not run a fresh local DatabaseReplicated build to re-execute 03579_create_table_populate_from_s3. The two remaining CI failures (03707_set_index_bad_get_null_bug plan-text mismatch under ParallelReplicas, and the BackupsWorker::wait fatal in the amd_debug, distributed plan shard) were not analyzed beyond confirming their stack traces are not on the modified code paths.
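For readers unfamiliar with what "Context state propagation" covers, the following sketch shows the three pieces the audit names: a default initializer, a copy constructor that carries the flag, and a getter/setter pair. It assumes a hand-written copy constructor; `ContextModel` and `makeChildContext` are hypothetical names and do not reflect how the real `Context` is laid out.

```cpp
#include <cassert>

// Hypothetical model of a context that carries the internal DDL flag.
class ContextModel
{
public:
    ContextModel() = default;

    // A hand-written copy constructor must list the new member explicitly;
    // omitting it would silently reset the flag in every derived context.
    ContextModel(const ContextModel & other)
        : ddl_or_on_cluster_internal(other.ddl_or_on_cluster_internal)
    {
    }

    void setDDLOrOnClusterInternal(bool value) { ddl_or_on_cluster_internal = value; }
    bool isDDLOrOnClusterInternal() const { return ddl_or_on_cluster_internal; }

private:
    // Default initializer: ordinary contexts are not internal DDL contexts.
    bool ddl_or_on_cluster_internal = false;
};

// Derives a child context from a parent, as a query context might be derived
// from a worker context in this simplified model.
ContextModel makeChildContext(const ContextModel & parent)
{
    return ContextModel(parent);
}
```

A freshly constructed `ContextModel` reports `false`, and a child made from a parent with the flag set reports `true`.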

Merge pull request ClickHouse#104881 from ClickHouse/explicit-seconda…

2fa783d

…ry-on-cluster Use explicit flag for secondary on cluster queries

ianton-ru added 25.8 25.8 Altinity Stable backport Backport labels Jun 5, 2026

chatgpt-codex-connector Bot reviewed Jun 5, 2026

mkmkme approved these changes Jun 10, 2026

ianton-ru wants to merge 1 commit into stable-25.8 from backports/25.8/104881

4 participants
