Skip to content

PHOENIX-7898 :- Cloned PhoenixConnection loses HA group, breaking _HAGroupName tagging on UPSERT SELECT and DELETE#2515

Merged
lokiore merged 3 commits into
apache:PHOENIX-7562-feature-newfrom
lokiore:PHOENIX-7898/clone-haGroup-fix
Jun 12, 2026
Merged

PHOENIX-7898 :- Cloned PhoenixConnection loses HA group, breaking _HAGroupName tagging on UPSERT SELECT and DELETE#2515
lokiore merged 3 commits into
apache:PHOENIX-7562-feature-newfrom
lokiore:PHOENIX-7898/clone-haGroup-fix

Conversation

@lokiore

@lokiore lokiore commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

What changes were proposed in this pull request?

Three localized changes in phoenix-core-client that propagate the HA group through cloned PhoenixConnections, plus an integration test that locks the regression:

  1. PhoenixConnection.java — copy-constructor haGroup propagation. The four copy constructors that take an existing PhoenixConnection as the first argument all delegate to the 9-arg private constructor with null passed for the cloned connection's haGroup field. Replaced with connection.haGroup so cloned connections inherit the parent's HA group. Affected ctors: (PhoenixConnection, boolean, boolean), (PhoenixConnection, MutationState), (PhoenixConnection, Properties), and (PhoenixConnection, ConnectionQueryServices, Properties). The 1-arg (PhoenixConnection) and the (PhoenixConnection, long scn) ctors delegate transitively and inherit the fix.

  2. PhoenixConnection.java — Javadoc disambiguation on the two HA fields. The haGroupName (String) and haGroup (HighAvailabilityGroup) fields share an identical // Only available for connection which is part of HA Connections comment. Replaced with field-specific descriptions: haGroupName is the wire-tag string used for the _HAGroupName HBase attribute and is set on every connection that resolved through an HA route (including cloned connections); haGroup is the resolved group object used for failover orchestration and cluster-role lookups, set on root HA connections and inherited by clones via the copy constructors.

  3. ConnectionQueryServices.java — interface ergonomics. The 3-arg connect(String url, Properties info, HighAvailabilityGroup haGroup) overload is currently abstract. Convert to default delegating to the 2-arg connect(url, info). The three internal implementors (ConnectionQueryServicesImpl, ConnectionlessQueryServicesImpl, DelegateConnectionQueryServices) already override the 3-arg form with a haGroup-using body. The default is for source-compatibility with any 3rd-party ConnectionQueryServices implementor that has not yet adopted the 3-arg overload — those implementors get the pre-existing 2-arg behavior (haGroup discarded) instead of a NoSuchMethodError.

ReplicationLogGroupIT.java — adds testUpsertSelectReplicatesViaCloneConnection to lock the regression. The test exercises an UPSERT SELECT of 250 rows (above the default MUTATE_BATCH_SIZE = 100) so the in-loop flush at UpsertCompiler.upsertSelect fires through the cloned connection's MutationState.send() path — the only path where the missing haGroup field is observable.

JIRA: https://issues.apache.org/jira/browse/PHOENIX-7898

Why are the changes needed?

Cloned PhoenixConnections are produced by MutatingParallelIteratorFactory.newIterator(...) (the server-side coprocessor path for UPSERT SELECT and DELETE), which calls new PhoenixConnection(this.connection) per parallel iterator scope. Because each clone receives haGroup = null, the guard inside MutationState.annotateMutationWithMetadata:

if (connection.getHAGroup() != null && StringUtils.isNotBlank(connection.getHAGroupName())) {
  haGroupName = Bytes.toBytes(connection.getHAGroupName());
}

evaluates false on every mutation flushed through the cloned connection. The _HAGroupName HBase attribute therefore is not attached to those mutations, and the consistent_failover (CCF) server's mutation-block contract — which gates ACTIVE→STANDBY transitions on the presence of that attribute — is not enforced for the affected write paths.

Threshold sensitivity: with MUTATE_BATCH_SIZE = 100 (default), an UPSERT SELECT of N rows triggers floor(N/100) in-loop flushes through the cloned connection's MutationState.send() path. The trailing N mod 100 rows are joined back to the parent connection and committed via the parent's path (parent's haGroup is non-null), so those rows DO get tagged. For N=250: 200 mutations dropped (no _HAGroupName), 50 mutations tagged.

This is a correctness gap on the PHOENIX-7562-feature-new line and must be fixed before that line merges to master.

Does this PR introduce any user-facing change?

No

This is a pre-launch CCF feature branch. No customer-visible behavior change; the fix corrects an internal contract that has not shipped to users. The interface change converts an abstract method to a default, which is source-compatible.

How was this patch tested?

New integration test method testUpsertSelectReplicatesViaCloneConnection in ReplicationLogGroupIT.

Local commands run on PHOENIX-7562-feature-new HEAD 8f9655a69e:

mvn install -pl phoenix-core-client,phoenix-core,phoenix-core-server -am -DskipTests
# BUILD SUCCESS, 5 reactor modules

mvn spotless:apply -DskipTests -q
# silent success — zero drift

mvn -pl phoenix-core failsafe:integration-test failsafe:verify \
  -Dit.test='ReplicationLogGroupIT#testUpsertSelectReplicatesViaCloneConnection' \
  -DfailIfNoTests=false
# pre-fix:  Tests run: 1, Failures: 1 — assertion: target table expected:<250> but was:<50>
# post-fix: Tests run: 1, Failures: 0 — target table = 250

Empirical pre/post-fix dumpTableLogCount on the same test:

pre-fix:  T_source = 250  | T_target =  50   ← 200 dropped via 2x in-loop flush through cloned conn (no _HAGroupName)
post-fix: T_source = 250  | T_target = 250   ← cloned conn inherits haGroup, _HAGroupName tagged

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Opus 4.7)

lokiore added 3 commits June 12, 2026 13:26
…GroupName tagging on UPSERT SELECT and DELETE

The four PhoenixConnection copy constructors that take an existing
PhoenixConnection as the first argument all delegate to the 9-arg
private constructor with `null` passed for the cloned connection's
haGroup field. Cloned connections produced by
MutatingParallelIteratorFactory.newIterator(...) (server-side
coprocessor for UPSERT SELECT and DELETE) therefore evaluate the
MutationState.annotateMutationWithMetadata guard
`if (connection.getHAGroup() != null && ...)` to false, the
_HAGroupName HBase attribute is not attached to those mutations, and
the consistent_failover (CCF) server's mutation-block contract is not
enforced for the affected write paths.

Three localized changes in phoenix-core-client:

1. PhoenixConnection.java -- four copy constructors propagate
   connection.haGroup instead of null. Restores HA-group inheritance
   for cloned connections produced by MutatingParallelIteratorFactory
   and other clone paths.
2. PhoenixConnection.java -- split duplicated
   `// Only available for connection which is part of HA Connections`
   Javadoc on haGroupName / haGroup into field-specific descriptions
   that document each field's distinct role.
3. ConnectionQueryServices.java -- convert the 3-arg
   connect(String, Properties, HighAvailabilityGroup) overload from
   abstract to default, delegating to the 2-arg connect(url, info).
   Preserves source-compatibility for any 3rd-party
   ConnectionQueryServices implementor that has not yet adopted the
   3-arg overload.

Adds testUpsertSelectReplicatesViaCloneConnection in
ReplicationLogGroupIT to lock the regression. The test exercises an
UPSERT SELECT of 250 rows (above the default MUTATE_BATCH_SIZE = 100)
so the in-loop flush at UpsertCompiler.upsertSelect fires through the
cloned connection's MutationState.send() path -- the only path where
the missing haGroup field is observable. Pre-fix: target table
receives 50 of 250 rows (assertion fails). Post-fix: target table
receives 250 of 250 rows (assertion passes).

Generated-by: Claude Code (Opus 4.7)
…py constructors

Replace direct `connection.haGroup` field access with the public
getter `connection.getHAGroup()` in the four PhoenixConnection copy
constructors. Preserves encapsulation and matches the existing
MutationState.annotateMutationWithMetadata guard which already uses
`connection.getHAGroup() != null`.

Generated-by: Claude Code (Opus 4.7)
Removes the redundant PhoenixConnection.haGroupName field; HA group name
is now derived through connection.getHAGroup().getName().

Adds HighAvailabilityGroup.getName() convenience accessor that delegates
to info.getName() so callers in org.apache.phoenix.util and
org.apache.phoenix.execute can access the HA group name without needing
visibility into the package-private HAGroupInfo.

Collapses the paired-gate
  getHAGroup() != null && StringUtils.isNotBlank(getHAGroupName())
to single-gate
  getHAGroup() != null
at both production callsites (ScanUtil.setScanAttributeForHAForPointLookups,
MutationState.annotateMutationWithMetadata). The blank check is impossible
in practice: HAGroupInfo's ctor calls Preconditions.checkNotNull(name) and
HighAvailabilityGroup.getUrlInfo throws HA_INVALID_PROPERTIES for empty
names before any HAGroupInfo is constructed, so a non-null HAGroup with a
blank/null name is structurally impossible.

Removes the now-unused StringUtils import from ScanUtil and MutationState.

Generated-by: Claude Code (Opus 4.7)
@lokiore lokiore force-pushed the PHOENIX-7898/clone-haGroup-fix branch from 114d5df to 7a8771a Compare June 12, 2026 20:31
@lokiore lokiore merged commit 0a3c416 into apache:PHOENIX-7562-feature-new Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants