Refactor/replace ropey lineindex #32

Merged
ajuvercr merged 3 commits into main from refactor/replace-ropey-lineindex
Jun 29, 2026
Conversation

@ajuvercr

Collaborator

No description provided.

ajuvercr and others added 2 commits June 29, 2026 11:52
`prefix_diagnostic_helper` sliced the rope with char-indexed `get_slice`
while term spans hold byte offsets. After any multi-byte char earlier on
the line (e.g. an en-dash in a literal) the slice started too far to the
right, turning `rdfs:domain` into `fs:domain` and reporting a phantom
"Undefined prefix fs". Slice by bytes via `get_byte_slice` instead.
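
The mismatch described above can be reproduced with the standard library alone. The sketch below does not use ropey; `slice_by_chars` and `slice_by_bytes` are hypothetical helpers that stand in for a char-indexed and a byte-indexed slice, and the sample line is made up for illustration.

```rust
/// Slice `s` treating `start..end` as char indices.
/// This is the buggy interpretation when the span actually holds byte offsets.
fn slice_by_chars(s: &str, start: usize, end: usize) -> String {
    s.chars()
        .skip(start)
        .take(end.saturating_sub(start))
        .collect()
}

/// Slice `s` treating `start..end` as byte offsets.
/// Returns `None` if the range is out of bounds or splits a character.
fn slice_by_bytes(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

For the line `"1\u{2013}2" rdfs:domain` the en-dash takes three bytes but one char, so `rdfs` starts at byte 8 and char 6. Feeding the byte span to `slice_by_chars` skips two chars too many and yields `fs:domain`, while `slice_by_bytes` returns `rdfs:domain`.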

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds e2e tests asserting that Position.character is a UTF-16 code-unit
count, per the LSP default. These FAIL against the current byte-based
`offset_to_position` and are the red half of the upcoming switch from
ropey to a hand-rolled LineIndex:

  * diagnostic_column_is_utf16_after_two_byte_char  (BMP: byte ≠ utf16)
  * diagnostic_column_is_utf16_after_surrogate_pair (astral: pins utf16
    apart from both byte count and char count)
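
To illustrate why the two tests need different characters, here is a minimal sketch that measures the same prefix three ways. `column_widths` is a hypothetical helper, not part of the test suite; it only uses standard library methods.

```rust
/// Column of the position just after `prefix`, measured three ways:
/// (UTF-8 bytes, Unicode scalar values, UTF-16 code units).
fn column_widths(prefix: &str) -> (usize, usize, usize) {
    (
        prefix.len(),                  // byte count
        prefix.chars().count(),        // char count
        prefix.encode_utf16().count(), // UTF-16 code units
    )
}
```

A two-byte BMP character such as `\u{e9}` gives `(2, 1, 1)`, which separates bytes from UTF-16 but not chars from UTF-16. An astral character such as `\u{1F600}` gives `(4, 1, 2)`, where all three counts differ.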

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ajuvercr ajuvercr force-pushed the refactor/replace-ropey-lineindex branch from 744ebcd to 2f72697 Compare June 29, 2026 09:52
We only ever used ropey for index conversion, never for what it's for:
the server uses full-document sync and rebuilds the buffer from a String
on every change (no incremental edits, no snapshots). Worse, ropey hid
the one decision that actually matters for correctness — which encoding a
column is measured in — and we got it wrong, emitting byte columns where
the LSP default is UTF-16.

Replace `RopeC(ropey::Rope)` with `RopeC(LineIndex)`: a String plus a
line-start table, where every conversion takes an explicit
`PositionEncoding`. `util::{offset_to_position, position_to_offset, ...}`
now route through it with a single `ENCODING` constant (UTF-16).
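
As a rough sketch of the shape described above, assuming a String plus a table of line-start byte offsets: the field names, method signatures and `Option` returns below are illustrative guesses, not the code merged in this PR.

```rust
/// Unit in which a column is measured (variant names are illustrative).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PositionEncoding {
    Utf8,
    Utf16,
    Utf32,
}

/// A text buffer plus the byte offset at which each line starts.
pub struct LineIndex {
    text: String,
    line_starts: Vec<usize>,
}

impl LineIndex {
    pub fn new(text: String) -> Self {
        let mut line_starts = vec![0];
        for (i, b) in text.bytes().enumerate() {
            if b == b'\n' {
                line_starts.push(i + 1); // next line begins after '\n'
            }
        }
        LineIndex { text, line_starts }
    }

    /// Byte offset -> (line, column in `enc` units).
    pub fn offset_to_position(&self, offset: usize, enc: PositionEncoding) -> Option<(usize, usize)> {
        if offset > self.text.len() || !self.text.is_char_boundary(offset) {
            return None;
        }
        // Number of line starts <= offset is at least 1, since line_starts[0] == 0.
        let line = self.line_starts.partition_point(|&s| s <= offset) - 1;
        let prefix = &self.text[self.line_starts[line]..offset];
        let col = match enc {
            PositionEncoding::Utf8 => prefix.len(),
            PositionEncoding::Utf16 => prefix.encode_utf16().count(),
            PositionEncoding::Utf32 => prefix.chars().count(),
        };
        Some((line, col))
    }

    /// (line, column in `enc` units) -> byte offset.
    /// Returns `None` if the column falls inside a character or past the line.
    pub fn position_to_offset(&self, line: usize, col: usize, enc: PositionEncoding) -> Option<usize> {
        let start = *self.line_starts.get(line)?;
        let end = self.line_starts.get(line + 1).copied().unwrap_or(self.text.len());
        let mut units = 0;
        for (i, ch) in self.text[start..end].char_indices() {
            if units == col {
                return Some(start + i);
            }
            if units > col {
                return None; // column points into the middle of a character
            }
            units += match enc {
                PositionEncoding::Utf8 => ch.len_utf8(),
                PositionEncoding::Utf16 => ch.len_utf16(),
                PositionEncoding::Utf32 => 1,
            };
        }
        if units == col { Some(end) } else { None }
    }
}
```

Passing the encoding on every call keeps the unit of a column visible at each call site, which is the property the commit message says ropey obscured.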

This makes the red `position_encoding` tests pass and also fixes the
byte/char mixups found along the way:

  * rename: sliced by char index + mixed byte spans with char counts
  * prefix decl range: built char offsets, fed them to a byte-based
    converter
  * inlay hints: `get_char` (char-indexed) called with a byte offset
  * semantic tokens: char/byte length mixups; now emits UTF-16 columns
    *and* lengths via a single linear pass (no O(line^2) on minified
    single-line JSON-LD)
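
The single linear pass mentioned for semantic tokens could look roughly like the sketch below. It assumes the token spans are byte ranges on one line, sorted, non-overlapping and aligned to char boundaries (it panics otherwise); `utf16_spans` and `utf16_len` are hypothetical names, not the functions in this PR.

```rust
/// UTF-16 code units needed to encode `s`.
fn utf16_len(s: &str) -> u32 {
    s.encode_utf16().count() as u32
}

/// Convert sorted, non-overlapping byte spans on `line` into
/// (UTF-16 column, UTF-16 length) pairs, scanning the line only once.
pub fn utf16_spans(line: &str, spans: &[(usize, usize)]) -> Vec<(u32, u32)> {
    let mut out = Vec::with_capacity(spans.len());
    let mut byte_pos = 0usize; // how far the line has been scanned, in bytes
    let mut col = 0u32;        // UTF-16 units in line[..byte_pos]
    for &(start, end) in spans {
        // Only measure the gap since the previous token, not the whole prefix.
        col += utf16_len(&line[byte_pos..start]);
        let len = utf16_len(&line[start..end]);
        out.push((col, len));
        col += len;
        byte_pos = end;
    }
    out
}
```

Because the running column is carried from one token to the next, each byte of the line is measured at most once. Recomputing the column from the start of the line for every token is what becomes quadratic on a long single-line document.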

ropey is dropped from every crate. LineIndex carries unit tests for line
counting, CRLF, trailing newline, BMP/astral columns and round-trips.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ajuvercr ajuvercr merged commit 81148b6 into main Jun 29, 2026
2 checks passed
@ajuvercr ajuvercr deleted the refactor/replace-ropey-lineindex branch June 29, 2026 12:55