feat(sitesearch): vendor-neutral Site Search — neutral aggregation + OpenSearch impl + phase-aware router (#35786) #36282
Decouple SiteSearchAPI/SiteSearchWebAPI from Elasticsearch aggregation types so Site Search can be served by OpenSearch in Phase 3.

- Reuse the existing neutral com.dotcms.content.index.domain.Aggregation / AggregationBucket DTOs (from #36026) instead of a new IndexAggregation
- Add neutral DotSearchException (unchecked) to replace ElasticsearchException on the public API surface
- SiteSearchAPI: drop org.elasticsearch.* imports; neutral Aggregation return type; createSiteSearchIndex throws DotSearchException
- SiteSearchWebAPI: remove InternalDateHistogram/StringTerms/Bucket casts and the Joda DateTime import; getFacets distinguishes histogram vs terms by aggregation type and feeds the legacy wrappers neutral buckets
- ESSiteSearchAPI: adapt ES results via Aggregation.from(); ES exception throws -> DotSearchException
- Add date/numeric histogram support to the neutral Aggregation ES factory (also fixes a latent CCE: the old getFacets cast the histogram key to Joda DateTime, which is a java.time.ZonedDateTime in ES 7.x)

OSSiteSearchAPI is deferred to #34609 (not yet in the codebase); Aggregation.fromOS() is already in place for it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
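The type-based dispatch described for getFacets, and the histogram-key handling that avoids the old cast, can be sketched roughly as follows. This is only an illustration: `AggType`, `Bucket`, `Agg`, and `FacetMapper` are hypothetical stand-ins, not the real `Aggregation` / `AggregationBucket` DTOs, whose shape is not shown here.

```java
import java.time.ZonedDateTime;
import java.util.List;

// Hypothetical stand-ins for the neutral DTOs; the real classes live in
// com.dotcms.content.index.domain and may look different.
enum AggType { TERMS, DATE_HISTOGRAM, HISTOGRAM }

record Bucket(Object key, long docCount) {
    // Numeric view of the key; date keys are exposed as epoch milliseconds
    // instead of being cast to a specific date class.
    Number keyAsNumber() {
        if (key instanceof Number n) return n;
        if (key instanceof ZonedDateTime z) return z.toInstant().toEpochMilli();
        throw new IllegalStateException("non-numeric key: " + key);
    }
}

record Agg(String name, AggType type, List<Bucket> buckets) {}

final class FacetMapper {
    // Branch on the declared aggregation type rather than on a vendor class.
    static String describe(Agg agg) {
        return switch (agg.type()) {
            case TERMS -> "terms:" + agg.buckets().size();
            case DATE_HISTOGRAM, HISTOGRAM -> "histogram:" + agg.buckets().size();
        };
    }
}
```

Because the branch is on a neutral enum, the same view-tool code would work whichever engine produced the buckets.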
Claude finished @fabrizzio-dotCMS's task in 4m 35s: Rollback Safety Analysis
Pull Request Unsafe to Rollback!!!
🤖 Bedrock Review —
#35786)

Completes the vendor-neutral Site Search extraction begun in #35786 by adding the OpenSearch implementation and a phase-aware router, so Site Search dual-writes and reads correctly across the ES -> OS migration phases.

- OSSiteSearchAPI: @ApplicationScoped @Default OpenSearch implementation of SiteSearchAPI. Search/aggregations via the generic client -> ContentSearchResponse (mirrors OSSearchAPIImpl); doc put/delete via _doc PUT/DELETE; get via typed client.get(...). Default site-search index resolved from VersionedIndicesAPI (not the deprecated IndiciesAPI). Index names handled in logical space; the .os tag forced by VersionedIndicesAPI is stripped on read.
- SiteSearchAPIImpl: PhaseRouter<SiteSearchAPI> router mirroring IndexAPIImpl and acting as the single fan-out point. Reads -> read provider; doc/index writes -> write fan-out; listIndices/listClosedIndices merge in dual-write; Quartz task methods route to a single provider (fan-out would double-schedule jobs).
- ESSiteSearchAPI: use raw ESIndexAPI instead of the IndexAPI router so the SiteSearch router is the only fan-out point (avoids double dual-write).
- APILocator: SITESEARCH_API now returns SiteSearchAPIImpl.
- OSSiteSearchAPIIntegrationTest: lifecycle, doc round-trip, aggregations, and default-index activation; registered in OpenSearchUpgradeSuite.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
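The routing rules in this commit (reads to one provider, writes fanned out, listings merged, scheduling pinned to one provider) might look roughly like the sketch below. `SimpleRouter` and `FakeProvider` are hypothetical and much simpler than the real `PhaseRouter`, whose API is not shown here; pinning scheduling to the read provider is also just an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical, simplified router; not the real PhaseRouter.
final class SimpleRouter<T> {
    private final T readProvider;
    private final List<T> writeProviders;

    SimpleRouter(T readProvider, List<T> writeProviders) {
        this.readProvider = readProvider;
        this.writeProviders = List.copyOf(writeProviders);
    }

    // Reads go to exactly one provider.
    <R> R read(Function<T, R> op) { return op.apply(readProvider); }

    // Writes fan out to every write provider (two while dual-writing).
    void write(Consumer<T> op) { writeProviders.forEach(op); }

    // Listings are merged across write providers, dropping duplicates.
    List<String> mergeLists(Function<T, List<String>> op) {
        LinkedHashSet<String> merged = new LinkedHashSet<>();
        for (T p : writeProviders) merged.addAll(op.apply(p));
        return new ArrayList<>(merged);
    }

    // Scheduling hits a single provider so a job is never registered twice.
    void single(Consumer<T> op) { op.accept(readProvider); }
}

// Minimal fake provider used only to exercise the sketch.
final class FakeProvider {
    final List<String> docs = new ArrayList<>();
    final List<String> indices;
    int scheduled;
    FakeProvider(List<String> indices) { this.indices = indices; }
}
```

Keeping every fan-out decision in one class is what lets the leaves stay unaware of the migration phase.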
CI (OpenSearch Upgrade Suite) failed: every OSSiteSearchAPIIntegrationTest that creates an index errored with "Failed to parse index settings". The OS impl was loading es-sitesearch-settings.json, whose ES-only token-filter syntax (edgeNGram, side) is rejected by the typed OpenSearch IndexSettings deserializer in OSIndexAPIImpl.createIndex.

Add os-sitesearch-settings.json declaring the same analyzers (standard_content, partial_content, comma_analyzer) in OpenSearch syntax (edge_ngram, no side), and load it from OSSiteSearchAPI.createSiteSearchIndex. The mapping is vendor-neutral and reused as-is.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…l index

The aggregation IT failed: mimeType aggregation hit "Text fields are not optimised ... use a keyword field".

Root cause: createSiteSearchIndex delegated the mapping PUT to MappingOperationsOS, which force-tags the physical name with `.os`. Site search uses untagged logical names, so the mapping landed on a different (`.os`) index while the real index kept the dynamic default mapping (string -> text), breaking keyword aggregations.

Apply the mapping with a raw PUT /<index>/_mapping against the same untagged physical name used by createIndex/search/put, and drop the MappingOperationsOS dependency.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
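To make the shape of that raw mapping call concrete, here is a minimal sketch that only builds the request. It uses the JDK's `java.net.http` API as a stand-in; the PR itself presumably goes through the OpenSearch client, and `MappingRequests`, the base URL, and the index name are made up for illustration.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class MappingRequests {
    // Builds (but does not send) PUT /<index>/_mapping for exactly the index name given,
    // so the mapping targets the same physical index that documents are written to.
    static HttpRequest putMapping(String baseUrl, String indexName, String mappingJson) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + indexName + "/_mapping"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(mappingJson))
                .build();
    }
}
```

For example, `MappingRequests.putMapping("http://localhost:9200", "sitesearch_1", "{\"properties\":{\"mimeType\":{\"type\":\"keyword\"}}}")` would address `/sitesearch_1/_mapping` with no tag added to the name.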
…5786)

Adds SiteSearchWebAPITest covering the view-tool surface affected by the neutral-aggregation refactor: search() (default-index, alias, pagination, empty and error paths) with full SiteSearchResults/SiteSearchResult field assertions; getAggregations() over the neutral Aggregation/AggregationBucket tree (terms, nested top_hits, numeric-histogram getKeyAsNumber); and getFacets() across all three legacy wrappers (string-terms, count-histogram, plain Facet fallback). Registered in MainSuite1b alongside ContentSearchToolTest.

Also a minor List.getFirst() cleanup in SiteSearchAPIImpl.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…apping

Two OpenSearch site-search regressions surfaced by the dual-write fan-out:

1. Shared mutable result across the fan-out. SiteSearchAPIImpl.putToIndex handed the same SiteSearchResult to both leaves. putToIndex mutates the backing map (setKeywords rewrites "keywords" String -> List), so the first leaf (ES) corrupted the input the second leaf (OS) then read, throwing ClassCastException: EmptyList cannot be cast to String and silently dropping every document from OpenSearch. The router now copies the result (and each element of the batch overload) per provider.
2. Mapping fan-out leak. ESSiteSearchAPI.createSiteSearchIndex applied its mapping through the phase-dispatched ESMappingAPIImpl.putMapping, which fanned out a second time to OpenSearch using a .os-tagged physical name that site-search OS indices never use -> HTTP 404. Pinned the ES leaf to IndexTag.ES, restoring the single-fan-out invariant (SiteSearchAPIImpl already drives OSSiteSearchAPI, which owns its own untagged OS index + mapping).

Adds SiteSearchDualWriteRouterIT (registered in OpenSearchUpgradeSuite) which drives the router in Phase 1 dual-write and asserts documents reach OpenSearch (single + batch); the isolated OS-leaf IT cannot reproduce either bug.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
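The first regression and its fix can be reproduced in miniature. The sketch below is an assumption-laden illustration: `FanOut`, `leafPut`, and the plain `Map` document are hypothetical stand-ins for the router, the leaf `putToIndex`, and `SiteSearchResult`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class FanOut {
    // Stand-in for a leaf that normalizes "keywords" in place from String to List.
    // Run twice on the same map, the String cast fails with ClassCastException.
    static void leafPut(Map<String, Object> doc) {
        String raw = (String) doc.get("keywords");
        List<String> parsed = (raw == null || raw.isBlank()) ? List.of() : List.of(raw.split(","));
        doc.put("keywords", parsed);
    }

    // Each provider receives its own shallow copy, so one leaf's mutation
    // cannot leak into the input seen by the next leaf.
    static void putToAll(Map<String, Object> doc, List<Consumer<Map<String, Object>>> leaves) {
        for (Consumer<Map<String, Object>> leaf : leaves) {
            leaf.accept(new HashMap<>(doc));
        }
    }
}
```

A shallow copy is enough in this sketch because the leaf replaces a top-level value; a leaf that mutated nested objects would need a deeper copy.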
What & why
Closes the Site Search portion of the ES → OpenSearch migration (#35786). Site Search is decoupled from Elasticsearch types and given a working OpenSearch backend plus a phase-aware router, so it dual-writes and reads correctly across all migration phases.
Two commits:
- Removes `org.elasticsearch.*` from the `SiteSearchAPI` contract and `SiteSearchWebAPI`, reusing the existing `com.dotcms.content.index.domain.Aggregation` / `AggregationBucket` DTOs (from #36026, "Aggregation return-type change breaks existing VTL templates accessing $results.aggregations") with histogram support, and introduces `DotSearchException`.
- Adds `OSSiteSearchAPI`, the `SiteSearchAPIImpl` phase router, and an integration test.

Changes
| Area | Change |
|---|---|
| `SiteSearchAPI` / `SiteSearchWebAPI` | `getAggregations` / `getFacets` return neutral `Aggregation`; `DotSearchException` added |
| `OSSiteSearchAPI` (new) | `@ApplicationScoped @Default` OpenSearch impl. Search/aggregations via the generic client → `ContentSearchResponse` (mirrors `OSSearchAPIImpl`); doc put/delete via `_doc` PUT/DELETE; get via typed `client.get(...)`. Default index resolved from `VersionedIndicesAPI` (not the deprecated `IndiciesAPI`) |
| `SiteSearchAPIImpl` (new, router) | `PhaseRouter<SiteSearchAPI>` mirroring `IndexAPIImpl`; the single fan-out point. Reads → read provider; doc/index writes → write fan-out; `listIndices` / `listClosedIndices` merge in dual-write; Quartz task methods route to a single provider (fan-out would double-schedule jobs) |
| `ESSiteSearchAPI` | Uses raw `ESIndexAPI` instead of the `IndexAPI` router so the SiteSearch router is the only fan-out point (avoids double dual-write of OS indices) |
| `APILocator` | `SITESEARCH_API` now returns `SiteSearchAPIImpl` |

Design notes
- … `ESSiteSearchAPI` in the enterprise package (license-gated feature). The single `annotated` `beans.xml` covers the merged `target/classes`, so CDI still discovers the `@Default` bean.
- `VersionedIndicesAPI` force-tags `.os` on store/load, so the default is `IndexTag.strip(...)`-ed on read.
- `deactivateIndex` calls `removeVersion(...)` when removing the slot would leave the version empty (`saveIndices` rejects empty).
- `SearchHit` DTO carries no highlights, so OS `search()` returns empty highlight arrays (the ES path is best-effort too); marked `TODO OS`.

Testing
- `./mvnw compile -pl :dotcms-core` → BUILD SUCCESS (Java 25)
- `./mvnw test-compile -pl :dotcms-integration -am` → BUILD SUCCESS
- `OSSiteSearchAPIIntegrationTest` (registered in `OpenSearchUpgradeSuite`) covers lifecycle, doc round-trip, aggregations, and default-index activation. Requires the `opensearch-upgrade` container.

🤖 Generated with Claude Code