fix(nodes): tolerate doubled-brace JSON output from models like DeepSeek by mjmirza · Pull Request #1085 · ScrapeGraphAI/Scrapegraph-ai

mjmirza · 2026-06-10T08:02:50Z

Summary

GenerateAnswerNode fails with OutputParserException: Invalid json output when the LLM returns its JSON wrapped in doubled braces, e.g. {{"content": "..."}}. This happens reliably with DeepSeek (deepseek/deepseek-chat) and intermittently with other less strict models.

Root cause

The schema-less format_instructions show the expected shape as:

{{"content": "your analysis here"}}

Those doubled braces are LangChain template escaping for a single literal { }. GPT-4o-class models follow the instruction and emit single braces, but some models (notably DeepSeek) copy the doubled braces verbatim into their answer, which is not valid JSON, so JsonOutputParser raises.

Fix

Add TolerantJsonOutputParser, a small JsonOutputParser subclass that, only on the parse-failure path, retries once with a single layer of wrapping braces removed. Behaviour is unchanged for any model already returning valid JSON (the happy path goes straight through the parent parser). It is used in the schema-less branch of GenerateAnswerNode.

scrapegraphai/utils/output_parser.py — new TolerantJsonOutputParser + _strip_doubled_braces helper
scrapegraphai/nodes/generate_answer_node.py — use it in the non-schema branch
tests/utils/output_parser_test.py — 7 unit tests (clean JSON unchanged, doubled-brace recovery, whitespace, irrecoverable output still raises)

Reproduction (before this change)

from scrapegraphai.graphs import SmartScraperGraph
cfg = {"llm": {"api_key": "<deepseek-key>", "model": "deepseek/deepseek-chat"}, "headless": True}
SmartScraperGraph(prompt="Extract the heading", source="https://example.com", config=cfg).run()
# -> OutputParserException: Invalid json output: {{"content": "..."}}

After: returns the parsed dict as expected.

Testing

python -m pytest tests/utils/output_parser_test.py
# 7 passed

Suggested labels: bug, size:S

The default GenerateAnswerNode format_instructions show the expected shape as {{"content": ...}} (LangChain's escaped braces). Strongly instruction-following models emit single braces, but some models (notably DeepSeek) copy the doubled braces verbatim, yielding {{"content": ...}} which JsonOutputParser rejects with 'Invalid json output'. Add TolerantJsonOutputParser, a JsonOutputParser subclass that retries once with a single layer of wrapping braces removed, only on the parse-failure path. Behaviour is unchanged for any model already returning valid JSON. Use it in the schema-less branch of GenerateAnswerNode. Adds unit tests.

## [2.2.0-beta.4](v2.2.0-beta.3...v2.2.0-beta.4) (2026-06-11) ### Bug Fixes * **nodes:** tolerate doubled-brace JSON output from models like DeepSeek ([#1085](#1085)) ([aaa5d2c](aaa5d2c))

github-actions · 2026-06-11T09:18:03Z

🎉 This PR is included in version 2.2.0-beta.4 🎉

The release is available on:

v2.2.0-beta.4
GitHub release

Your semantic-release bot 📦🚀

dosubot Bot added size:M This PR changes 30-99 lines, ignoring generated files. bug Something isn't working labels Jun 10, 2026

VinciGit00 approved these changes Jun 11, 2026

View reviewed changes

dosubot Bot added the lgtm This PR has been approved by a maintainer label Jun 11, 2026

VinciGit00 merged commit aaa5d2c into ScrapeGraphAI:pre/beta Jun 11, 2026
1 of 2 checks passed

github-actions Bot added the released on @dev label Jun 11, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(nodes): tolerate doubled-brace JSON output from models like DeepSeek#1085

fix(nodes): tolerate doubled-brace JSON output from models like DeepSeek#1085
VinciGit00 merged 1 commit into
ScrapeGraphAI:pre/betafrom
mjmirza:fix/tolerant-json-doubled-braces

mjmirza commented Jun 10, 2026

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

mjmirza commented Jun 10, 2026

Uh oh!

Uh oh!

github-actions Bot commented Jun 11, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants