zed-p8/grammars/pico-8-lua/KNOWN_LIMITATIONS.md
kistaro c8ad7e74e7 Document line-significance limitations in the Pico-8 Lua grammar
PICO-8's shorthand `if (cond) stmt [else stmt]` is line-bounded, but
tree-sitter has no built-in newline awareness. Without an external
scanner ( the same mechanism tree-sitter-python uses for INDENT /
DEDENT / NEWLINE ), the grammar greedily binds `else` to the nearest
`if` and takes only one consequence statement for the shorthand body.
Token classification is unaffected, so syntax highlighting renders
identically to a correct parse; only auto-indent and semantic
selection are subtly off, in a code pattern that is uncommon in real
PICO-8 code.

New `grammars/pico-8-lua/KNOWN_LIMITATIONS.md` walks through both
incorrect cases ( the dangling-else mis-bind and the multi-statement
shorthand body ), tabulates which Zed features are and aren't
affected, and sketches the fix. README cross-links it from the
"Known limitations" block and adds it as a prerequisite to the v0.3
LSP work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:23:50 -07:00


Known limitations of tree-sitter-pico8-lua

PICO-8's Lua dialect is line-significant in two places: the body of a shorthand if (cond) ... / while (cond) ... extends to end-of-line, and the optional else of a shorthand if must be on the same line as the opening if. Tree-sitter has no built-in concept of newlines as syntactic tokens, so encoding line-significance correctly requires an external scanner (a C file that emits synthetic line-end tokens, the same mechanism tree-sitter-python uses for INDENT/DEDENT/NEWLINE).

We have intentionally not written that scanner yet. This document tracks the resulting incorrect parses so they aren't forgotten when we revisit the grammar.

1. Dangling-else mis-bind in nested if

-- intended:  outer if/else, with shorthand-if as a single statement
-- inside the outer if's consequence.
if is_noisy then
  if (is_goose()) honk()
else
  toot()
end

The grammar's shorthand if rule uses prec.right on its optional else clause, so it greedily consumes any else that follows, matching the classic "associate else with the nearest if" convention from C and Java. That is wrong for PICO-8, where the line break after honk() should have closed the shorthand. In the resulting too-tightly-bound parse:

  • else is parsed as the shorthand's alternative, not the outer if's.
  • The outer if_statement ends up with no else_statement child.
  • The trailing end still resolves to the outer if_statement, so the source still parses cleanly (no ERROR node).

Unaffected case: the current grammar and a line-aware grammar produce the same (correct) parse here, because the else really is on the same line as the shorthand:

if is_noisy then
  if (is_goose()) honk() else toot()
end

2. Multi-statement shorthand body

-- both statements are conditional in PICO-8.
if (is_falling()) wheeee() splat()

The grammar's shorthand_if_statement rule takes exactly one consequence statement, so this parses as:

  • shorthand_if_statement with consequence wheeee()
  • followed by an unconditional splat() statement

A line-aware grammar would gather every statement up to end-of-line into the shorthand body. The current grammar instead parses the one-line form above exactly like this two-line form:

-- this and the previous example produce the SAME parse tree under
-- the current grammar, which is wrong for the previous example.
if (is_falling()) wheeee()
splat()

What does this break?

The parse is structurally wrong but token classification stays correct, because every keyword and identifier is still itself regardless of which parent node owns it. So:

| Feature | Affected? | Notes |
|---|---|---|
| highlights.scm (syntax highlighting) | No | else is @keyword.conditional whether it's a child of shorthand_if_statement or else_statement. |
| outline.scm (file outline) | No | Doesn't traverse if-bodies. |
| Bracket matching | No | Independent of if/else structure. |
| Injections | No | Independent. |
| indents.scm (auto-indent) | Subtly | A mis-bound else is inside a shorthand_if_statement, which is not an @indent node, so the next line may land at the wrong indent column. |
| Semantic selection ("expand selection") | Subtly | Cursor on toot() expands to shorthand_if_statement instead of else_statement → outer if_statement. |
| folds.scm / textobjects.scm | Potentially | Not currently shipped; would inherit the structural bug if we add them. |
| Static analysis / LSP-style features | Yes | Anything that walks the AST to reason about reachability or scope (e.g. "unreachable code", goto-definition through a conditional branch) will mis-report. None of this is shipped today. |

For v0.2's stated scope (syntax highlighting plus a basic outline), the visible symptom is that auto-indent is occasionally off by one column inside a nested if with an out-of-line else, which is a relatively uncommon code pattern. The fix is therefore deferred until the v0.3 LSP work, which needs a correct AST.

Fixing it later

The canonical approach is an external scanner. Sketch:

  1. Add an external symbol such as _logical_line_end that the scanner emits at every \n not preceded by line-continuation context.
  2. Make shorthand_if_statement take the form seq('if', '(', expr, ')', stmt, optional(seq('else', stmt)), $._logical_line_end), so that an else can only match before _logical_line_end has been emitted.
  3. Allow the shorthand_if_statement consequence to be repeat1(stmt), so a one-line if (x) a() b() puts both calls in the shorthand body.
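
The three steps above can be combined into a single rule. The fragment below is a minimal sketch, not the grammar's actual code: expression and statement are hypothetical stand-ins for whatever the existing grammar names those rules, and the seq / optional / repeat1 helpers are local stubs so the fragment runs under plain Node (in a real grammar.js the tree-sitter CLI supplies them as globals).

```javascript
// Stand-ins for the tree-sitter grammar DSL so this sketch runs under plain Node.
// In a real grammar.js, seq/optional/repeat1 are globals supplied by the CLI.
const seq = (...members) => ({ type: 'SEQ', members });
const optional = (content) => ({ type: 'CHOICE', members: [content, { type: 'BLANK' }] });
const repeat1 = (content) => ({ type: 'REPEAT1', content });

// Step 1: declare the token that the (not yet written) C scanner would emit.
const externals = ($) => [$._logical_line_end];

// Hypothetical rule names: `expression` and `statement` are assumptions,
// standing for whatever the existing grammar calls them.
const rules = {
  shorthand_if_statement: ($) => seq(
    'if', '(', $.expression, ')',
    repeat1($.statement),               // step 3: every statement up to end-of-line
    optional(seq('else', $.statement)), // else can only match before the line end
    $._logical_line_end,                // step 2: the line break closes the shorthand
  ),
};

// Resolve `$.name` to a symbol node, roughly as the real DSL does.
const $ = new Proxy({}, { get: (_, name) => ({ type: 'SYMBOL', name }) });
const rule = rules.shorthand_if_statement($);
console.log(JSON.stringify(rule, null, 2));
```

In a real grammar.js, the externals function and the rule would sit inside the grammar({ ... }) call and the stubs would be dropped. The sketch follows step 2 as written and keeps the else branch to a single statement; whether that branch should also become repeat1 is left open here.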

The scanner needs to be written in C in src/scanner.c, with its tokens registered via the grammar's externals field. tree-sitter-python's scanner is a good reference for the pattern.