Skip to content

36002 workflow fire convert block editor markdown to prosemirror json on save server side#36253

Open
hassandotcms wants to merge 5 commits into
mainfrom
36002-workflow-fire-convert-block-editor-htmlmarkdown-to-prosemirror-json-on-save-server-side
Open

36002 workflow fire convert block editor markdown to prosemirror json on save server side#36253
hassandotcms wants to merge 5 commits into
mainfrom
36002-workflow-fire-convert-block-editor-htmlmarkdown-to-prosemirror-json-on-save-server-side

Conversation

@hassandotcms

Copy link
Copy Markdown
Member

What

Converts Story Block (Block Editor) field values supplied as Markdown to Tiptap/ProseMirror JSON server-side, on the shared content save path. Non-interactive clients (AI agents, headless imports) no longer need a human to open and re-save the contentlet for the field to read back as structured content.

Closes #36002 (Markdown scope; see Scope below).

How

  • TiptapMarkdown.isTiptapDoc / isMarkdownRepresentable — pure discriminators.
  • MapToContentletPopulator.fillFields — the seam shared by the workflow fire endpoints and the content REST API. For a Story Block value:
    • starts with { (JSON) or < (HTML) → stored unchanged;
    • otherwise (Markdown) → converted via TiptapMarkdown.toTiptap.
  • Guards: already-valid JSON passes through untouched; a Markdown update is rejected (400) when the existing stored doc contains rich blocks Markdown can't represent (dotContent, dotVideo, grid, …) instead of silently destroying them; a conversion failure never blocks the save (stores raw + logs).
  • OpenAPI fire note corrected to reflect automatic server-side conversion (regenerated).

Scope

  • Markdown only. HTML is deferred to a follow-up PR. HTML continues to pass through unchanged (no regression).

Behavior change

  • A Markdown overwrite of rich content now returns 400 instead of a silent success that
    corrupted the field. Additive and rollback-safe (stored JSON is read natively by N-1).

Testing

  • Unit (65): TiptapMarkdownDocDetectionTest (13) + existing TiptapMarkdownTest /
    RoundTripContractTest.
  • Integration (5): StoryBlockMarkdownPopulatorTest — convert + GraphQL read-back, JSON
    passthrough, HTML passthrough, primitive replace, rich-overwrite reject.
  • Regression re-run clean: MapToContentletPopulatorTest (20), StoryBlockValidationTest (28).

…minators

Add two pure helpers to TiptapMarkdown that the save path needs to safely
ingest Story Block values:

- isTiptapDoc(String): cheap detector for an already-valid Tiptap/ProseMirror
  document (peeks the first non-whitespace char before parsing), so editor-
  authored JSON can be stored unchanged instead of re-parsed as Markdown.
- isMarkdownRepresentable(String): true only when every block is Markdown-
  expressible, used to refuse a Markdown overwrite that would silently drop
  rich blocks (dotContent, dotVideo, grid, etc.). Marks are ignored on purpose
  (losing a mark loses styling, not content).

Covered by TiptapMarkdownDocDetectionTest (13 cases incl. nested rich blocks,
marks-only docs, malformed/empty/null input).

Refs #36002
…t save path

Wire the converter into MapToContentletPopulator.fillFields, the shared seam
that the workflow fire endpoints and the content REST API all funnel through.
For a Story Block field whose incoming value is Markdown (begins with neither
'{' nor '<'), convert it to a ProseMirror JSON document and store that, so non-
interactive clients (AI agents, headless imports) no longer require a human to
open and re-save the contentlet.

Guards:
- Already-valid Tiptap JSON and (deferred) HTML are stored unchanged.
- A Markdown update is refused when the existing stored document contains rich
  blocks Markdown cannot represent, rather than silently destroying them.
- A conversion failure never blocks the save: the raw value is stored and a
  warning logged (graceful degradation, consistent with #35728).

The converter stays pure; conversion and guards live at the ingestion seam.

Covered by StoryBlockMarkdownPopulatorTest (convert + GraphQL read-back, JSON
passthrough, HTML passthrough, primitive replace, rich-overwrite reject);
registered in MainSuite1b.

Refs #36002
…wn conversion

The fire endpoints' Block Editor note promised Markdown/HTML acceptance but
admitted it only took effect after a human re-saved in the editor — documenting
the exact bug #36002 fixes. Update the shared @operation note to state that
Markdown is converted to ProseMirror JSON automatically on save (and already-
valid JSON is stored unchanged), drop the "converted when opened in the editor"
caveat, and use a Markdown example. Regenerate openapi.yaml (all 6 fire
operations share the constant).

Refs #36002
@claude

claude Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Claude finished @hassandotcms's task in 3m 19s —— View job


Rollback Safety Analysis

  • Read rollback-unsafe categories reference
  • Get full PR diff
  • Analyze diff against all unsafe categories
  • Apply label

Result: ✅ Safe To Rollback

This PR introduces no changes that match any rollback-unsafe category. Analysis against every category:

Category Check Result
C-1 Structural Data Model No runonce task, no column/table migration ✅ Safe
C-2 ES Mapping No changes to ESMappingAPIImpl, mappings, or index tasks ✅ Safe
C-3 Content JSON Model Version No CURRENT_MODEL_VERSION bump, no ImmutableContentlet changes ✅ Safe
C-4 DROP TABLE / DROP COLUMN No DDL drops ✅ Safe
H-1 One-Way Data Migration No UPDATE … SET across rows, no data transformation task ✅ Safe
H-2 RENAME TABLE / COLUMN No renames ✅ Safe
H-3 PK Restructuring No PK changes ✅ Safe
H-4 New Field Type No new FieldType registered ✅ Safe
H-5 Storage Provider No FileStorageAPIImpl or StoragePersistenceProvider changes ✅ Safe
H-6 DROP PROCEDURE / FUNCTION None ✅ Safe
H-7 NOT NULL Column without Default No DDL ✅ Safe
H-8 VTL Viewtool Contract No viewtool or returned-object public API change ✅ Safe
M-1 Column Type Change No DDL ✅ Safe
M-2 Push Publishing Bundle Format No *Bundler.java or bundle XML changes ✅ Safe
M-3 REST/GraphQL API Contract Response shape unchanged; conversion happens before storage — existing JSON passthrough is preserved ✅ Safe
M-4 OSGi Interface No OSGi public interface changes ✅ Safe

Summary

All changes are confined to:

  • MapToContentletPopulator — adds a new processStoryBlockField branch that converts Markdown to ProseMirror JSON before storing; JSON and HTML are passed through unchanged.
  • TiptapMarkdown — adds two pure static helpers (isTiptapDoc, isMarkdownRepresentable).
  • WorkflowResource — OpenAPI annotation text update only.
  • openapi.yaml — regenerated description text, no structural contract change.
  • Tests and Postman/Karate collections — assertion updates to match new stored format.

The stored format for an existing Tiptap JSON document is unchanged — N-1 reads and writes the same ProseMirror JSON it always did. The only behavioral change is that a Markdown payload is now automatically converted server-side rather than stored raw; if N is rolled back, N-1 simply stores Markdown strings unchanged (the previous behavior), which is fully backward compatible. The IllegalArgumentException (400) for Markdown-over-rich-content is a new client-visible behavior, but its absence in N-1 is graceful (clients just get a silent overwrite as before — no N-1 startup or data loss risk).

Label applied: AI: Safe To Rollback

@github-actions

github-actions Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

🤖 Bedrock Review — deepseek.v3.2

[🟠 High] dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:261 — The condition value instanceof String may miss cases where the value is a non-String object that could be valid for a Story Block field (e.g., a JSON object). This could cause unexpected behavior or data loss if a client sends a structured JSON object instead of a JSON string.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:293 — The check trimmed.isEmpty() || trimmed.charAt(0) == '{' || trimmed.charAt(0) == '<' assumes HTML starts with '<', but a valid HTML string could have leading whitespace or be a different case (e.g., <!DOCTYPE>). This may cause HTML to be incorrectly stored as-is without conversion when it should be.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:298 — The method contentlet.getStringProperty(field.getVelocityVarName()) may return null if the property is not set, which could lead to a NullPointerException when passed to TiptapMarkdown.isTiptapDoc(existing). The method isTiptapDoc handles null, but the call !TiptapMarkdown.isMarkdownRepresentable(existing) could still throw if existing is null? Actually, isMarkdownRepresentable also handles null, so this is safe. However, the logic depends on the existing value being a Tiptap doc; if it's null, the condition TiptapMarkdown.isTiptapDoc(existing) will be false, skipping the rich content check. This might allow Markdown to overwrite a null field even if the field previously had rich content? No, because the existing value is null. This is acceptable, but the behavior should be clear.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:121 — The method isTiptapDoc uses value.stripLeading() which is available in Java 11+. dotCMS uses Java 25, so it's fine, but ensure compatibility with the project's minimum Java version if different.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:130 — The JSON parsing with MAPPER.readTree(value) could be expensive for large documents. Since this is called on every save for Story Block fields, consider performance impact. However, the early check trimmed.charAt(0) != '{' mitigates this for non-JSON values.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:152 — The set MARKDOWN_NODE_TYPES includes "dotImage" which is a custom block; ensure that dotImage is indeed representable in Markdown (likely as an image syntax). If not, this could incorrectly allow overwrites that lose information.

[🟠 High] dotCMS/src/main/java/com/dotcms/tiptap/TiptapMarkdown.java:168 — The method isMarkdownRepresentable catches IOException and returns true for malformed JSON. This means a malformed JSON string will be considered representable, potentially allowing a Markdown overwrite that should be rejected. This could lead to data loss. The behavior should be consistent with isTiptapDoc which returns false for malformed JSON.

[🟡 Medium] dotcms-integration/src/test/java/com/dotcms/rest/StoryBlockMarkdownPopulatorTest.java:113 — The test uses Mockito.mock but does not verify any interactions. This is fine for integration, but ensure mocks are properly configured to avoid false positives.

[🟡 Medium] dotcms-integration/src/test/java/com/dotcms/rest/StoryBlockMarkdownPopulatorTest.java:176 — The test expects an IllegalArgumentException with a message containing "rich content". If the exception message changes, the test could fail. Consider matching the exception type only or using a custom matcher.

[🟡 Medium] dotcms-postman/src/main/resources/postman/BringBack.postman_collection.json:115 — The Postman test updates the expectation for the body field from a plain string to a structured JSON object. This is correct for the new behavior, but ensure all related Postman collections are updated accordingly to avoid breaking CI/CD tests.

[🟡 Medium] dotcms-postman/src/main/resources/postman/JsScriptAPI.postman_collection.json:25 — Similar update for body field expectations. Ensure consistency across all API test suites.

[🟡 Medium] test-karate/src/test/java/tests/defaults/CheckingJSONAttributes.feature:71 — The Karate test now expects body to be a structured object. This is correct, but verify that all Karate scenarios relying on the old string format are updated.

[🟠 High] dotCMS/src/main/java/com/dotcms/rest/MapToContentletPopulator.java:314 — The Logger.warn call logs the exception message e.getMessage() which could contain sensitive information or implementation details. Consider logging a generic message or using a sanitized exception.

[🟡 Medium] dotCMS/src/main/java/com/dotcms/rest/api/v1/workflow/WorkflowResource.java:231 — The updated documentation states that Markdown is the only format converted, but the code also handles HTML (by storing as-is). The documentation should reflect that HTML is still accepted but not converted, to avoid confusion.

[🟡 Medium] dotCMS/src/main/webapp/WEB-INF/openapi/openapi.yaml:18675 — The OpenAPI specification updates the description for Block Editor fields. Ensure that the change is consistent across all endpoints (it appears to be updated in multiple places, which is good).


Run: #28038347219 · tokens: in: 11064 · out: 1323 · total: 12387

@hassandotcms hassandotcms requested review from fmontes and wezell June 22, 2026 14:39
@hassandotcms hassandotcms marked this pull request as ready for review June 22, 2026 14:40
#36002 normalizes a plain-text/Markdown Story Block value to a ProseMirror JSON
document on save, so a webPageContent `body` sent as plain text now reads back as
a structured doc (object), not the raw string. Update the two API tests that
asserted the old raw-string round-trip to assert the normalized doc instead,
keeping plain-text input so they still exercise the server-side conversion:

- Karate CheckingJSONAttributes.feature: assert body.type == 'doc' and the
  paragraph text, instead of body == "<raw string>".
- Postman JsScriptAPI: assert body.type == 'doc' and that the surviving text
  segments are present (inline <b> markup is dropped by the Markdown converter),
  across fireNew/fireEdit/firePublish via the JS workflows viewtool.

Refs #36002
…ntent checks

CI surfaced two more API tests that asserted a webPageContent `body` (a Story
Block field) read back as the raw plain-text string. #36002 now normalizes
plain-text to a ProseMirror doc on save, so the field comes back as structured
JSON. Update only the content (webPageContent) assertions to read the text from
the doc — body.content[0].content[0].text — leaving the template `body`
assertions (template markup, not a Story Block field) untouched:

- BringBack: 3 content version checks (create/edit/bring-back).
- VersionableResource: 1 content working-version check.

Determined the complete affected set by parsing every collection's body
assertions against its request endpoint (content vs template), so template,
GraphQL seeded/bundle, and response-body assertions are correctly excluded.

Refs #36002
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI: Safe To Rollback Area : Backend PR changes Java/Maven backend code

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Workflow fire: convert Block Editor HTML/Markdown to ProseMirror JSON on save (server-side)

1 participant