parse EOL as a token

Document line-significance limitations in the Pico-8 Lua grammar
PICO-8's shorthand `if (cond) stmt [else stmt]` is line-bounded, but tree-sitter has no built-in newline awareness. Without an external scanner ( the same mechanism tree-sitter-python uses for INDENT / DEDENT / NEWLINE ), the grammar greedily binds `else` to the nearest `if` and takes only one consequence statement for the shorthand body. Token classification is unaffected, so syntax highlighting renders identically to a correct parse; only auto-indent and semantic selection are subtly off, in a code pattern that is uncommon in real PICO-8 code.

New `grammars/pico-8-lua/KNOWN_LIMITATIONS.md` walks through both incorrect cases ( the dangling-else mis-bind and the multi-statement shorthand body ), tabulates which Zed features are and aren't affected, and sketches the fix. README cross-links it from the "Known limitations" block and adds it as a prerequisite to the v0.3 LSP work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 00:16:13 -07:00 · 2026-05-01 15:23:50 -07:00 · 2026-05-01 13:20:57 -07:00 · 2026-05-01 12:57:06 -07:00 · 2026-05-01 12:56:59 -07:00 · 2026-05-01 12:54:15 -07:00
36 changed files with 36147 additions and 1803 deletions
@@ -2,5 +2,14 @@ node_modules/
 build/
 *.wasm

-# tree-sitter-cli scratch
+# tree-sitter-cli scratch and parser-clone caches.
+# tree-sitter-cli creates `grammars/<grammar_name>/` ( underscore form,
+# matching the C identifier ) when it auto-clones a grammar repo for
+# parser binary caching. Our hand-maintained dirs use hyphens, so the
+# underscore variants are always cache directories and safe to ignore.
 .tree-sitter/
+grammars/p8_cart/
+grammars/pico8_lua/
+
+# scratch directory for stuff to show an AI agent or reference in the IDE
+.local/
@@ -1,146 +1,216 @@
 # zed-p8

 A Zed extension for the [PICO-8](https://www.lexaloffle.com/pico-8.php) fantasy
-console. The goal is reasonable editor support for the entire `.p8` cartridge
-file format and for PICO-8's Lua dialect — even where PICO-8 deviates from
+console. Reasonable editor support for the entire `.p8` cartridge file format
+and for PICO-8's Lua dialect — including the parts where PICO-8 deviates from
 standard Lua 5.2 (compound assignments, `?` print shorthand, single-line
-`if (cond) ...`, `!=`, binary literals, peek operators, and so on).
+`if (cond) ...`, `!=`, `0b...` binary literals, integer divide `\`, peek
+operators `@`/`%`/`$`, the rotate/logical-shift family `<<>` / `>><` / `>>>`,
+and `#include`).

-## Status — v0.1 scaffold
+## Status — v0.3 (unreleased)

 Working today:

- A small tree-sitter grammar (`p8_cart`, in this repo's root) that parses
-  the `.p8` cartridge container: the magic header line, `version` line, and
-  the named sections `__lua__`, `__gfx__`, `__gff__`, `__label__`, `__map__`,
-  `__sfx__`, `__music__`. Unknown `__name__` markers are accepted as a
-  fallback `unknown_section`.
- A Zed language definition `Pico-8 Cartridge` (file suffix `.p8`) that uses
-  this grammar for outline, section-marker highlighting, and an outline view.
- An injection that hands the body of the `__lua__` section to Zed's
-  built-in Lua language for syntax highlighting.
+- **`tree-sitter-p8-cart`** ( `grammars/p8-cart/` ) — parses the `.p8`
+  cartridge container: the magic header line, `version` line, and the named
+  sections `__lua__`, `__gfx__`, `__gff__`, `__label__`, `__map__`, `__sfx__`,
+  `__music__`. Unknown `__name__` markers fall through to `unknown_section`.
+- **`tree-sitter-pico8-lua`** ( `grammars/pico-8-lua/` ) — fork of
+  [`tree-sitter-grammars/tree-sitter-lua`](https://github.com/tree-sitter-grammars/tree-sitter-lua)
+  with the PICO-8 dialect added. Handles every dialect form the manual
+  documents: see *Dialect coverage* below. Upstream attribution is preserved
+  in `grammars/pico-8-lua/UPSTREAM-LICENSE.md`.
+- **`Pico-8 Cartridge`** language ( `languages/pico-8-cart/`, suffix `.p8` ) —
+  config, marker highlights, outline view, and an injection that hands the
+  `__lua__` body to the Pico-8 Lua grammar.
+- **`Pico-8 Lua`** language ( `languages/pico-8-lua/`, suffix `.p8lua` ) —
+  config, dialect-aware highlights with PICO-8 builtins recognized, brackets,
+  indents, outline.

-Known limitations:
+The `.p8lua` suffix is the convention for bare Pico-8 Lua source files — the
+ones pulled into a cart via `#include`. Plain `.lua` files are intentionally
+*not* claimed by this extension, so users who keep stock Lua files alongside
+their PICO-8 work continue to get standard Lua treatment.
+
+### Dialect coverage
+
+| Feature | Status |
+|---|---|
+| `!=` ( alias for `~=` ) | ✓ |
+| Compound assignment: `+= -= *= /= %= \= ^= ..= &= \|= ^^= <<= >>= >>>= <<>= >><=` | ✓ |
+| Integer divide `\` and modulo `%` ( binary ) | ✓ |
+| Bitwise XOR `^^` ( binary, in addition to upstream's `~` ) | ✓ |
+| Logical shift right `>>>`, rotate left `<<>`, rotate right `>><` | ✓ |
+| Hex literals with fractional part: `0x11.4000` | ✓ |
+| Binary literals: `0b1010` | ✓ |
+| Memory peek prefix unary: `@addr`, `%addr`, `$addr` | ✓ |
+| Single-line `if (cond) stmt [else stmt]` ( no `then`/`end` ) | ✓ |
+| Single-line `while (cond) stmt` ( no `do`/`end` ) | ✓ |
+| `?` print shorthand statement | ✓ |
+| `#include path` directive | ✓ |
+| `_init` / `_update` / `_update60` / `_draw` highlighted as builtins | ✓ |
+
+### Line-significance (resolved in v0.3)
+
+PICO-8's shorthand `if (cond) ...` and `while (cond) ...` are
+line-bounded: a later-line `else` belongs to an enclosing standard
+`if`, not the shorthand, and a multi-statement single-line shorthand
+body collects every statement on the line. The external scanner emits
+a zero-width `LINE_END` token at `\n` / `\r` / EOF when (and only
+when) the parser is at the body-or-terminator decision point of a
+shorthand statement, so the AST now matches PICO-8 semantics — see
+[`grammars/pico-8-lua/KNOWN_LIMITATIONS.md`](grammars/pico-8-lua/KNOWN_LIMITATIONS.md)
+for the wiring detail and
+[`grammars/pico-8-lua/test/corpus/shorthand_line_end.txt`](grammars/pico-8-lua/test/corpus/shorthand_line_end.txt)
+for the test corpus.
+
+### Known limitations

- **PICO-8 Lua dialect is not fully parsed.** The injected grammar is plain
-  Lua 5.2, which does not understand `?` (print shorthand), `+=` and friends,
-  `!=`, `0b...` literals, the `\` integer-divide operator, the `@`/`%`/`$`
-  peek prefixes, or the single-line `if (cond) stmt` / `while (cond) stmt`
-  forms. Code that uses any of those will show parse-error highlighting in
-  those regions only — surrounding code remains correctly highlighted. See
-  Roadmap below.
 - **No language server.** No completion, hover docs, or diagnostics for
-  PICO-8 builtins yet. See Roadmap.
- **No `.p8.png` support.** Only the plain-text `.p8` format is handled.
+  PICO-8 builtins yet — only a static `function.builtin` highlight on
+  recognized names. See Roadmap.
+- **No `.p8.png` support.** Only the plain-text `.p8` format is handled —
+  the PNG-steganography variant is out of scope for a text-focused IDE
+  extension.
+- **Hex sections are unhighlighted blobs.** `__gfx__`, `__map__`, `__sfx__`,
+  `__gff__`, `__music__`, `__label__` parse as opaque line bodies. Roadmap
+  v0.4 covers per-section highlighters.

 ## Repository layout

 ```
 zed-p8/
  extension.toml                  ← Zed extension manifest
-  grammar.js                 ← tree-sitter-p8-cart grammar source
-  src/                       ← generated parser ( committed; regenerate after grammar.js edits )
-  package.json               ← tree-sitter-cli devDependency
-  tree-sitter.json           ← tree-sitter-cli config ( auto-managed )
+  package.json                    ← workspace root; hosts tree-sitter-cli
+  grammars/
+    p8-cart/                      ← cart-format tree-sitter grammar
+      grammar.js
+      tree-sitter.json
+      src/                        ← generated parser ( committed )
+    pico-8-lua/                   ← Pico-8 Lua dialect tree-sitter grammar
+      grammar.js
+      tree-sitter.json
+      package.json                ← marks this dir as ESM for node
+      src/                        ← generated parser + scanner.c ( committed )
+      UPSTREAM-LICENSE.md         ← MIT, tree-sitter-lua by Munif Tanjim
  languages/
    pico-8-cart/                  ← Pico-8 Cartridge language files
      config.toml
      highlights.scm
+      injections.scm              ← injects pico-8-lua into __lua__ body
+      outline.scm
+    pico-8-lua/                   ← Pico-8 Lua language files
+      config.toml
+      highlights.scm
+      brackets.scm
+      indents.scm
      injections.scm
      outline.scm
  examples/
    hello.p8                      ← minimal test cart
-  references/                ← upstream PICO-8 manual + Zed docs links
+  references/                     ← upstream PICO-8 manual + Zed doc links
 ```

-The cart grammar lives at the repo root rather than as a separate sibling
-repository. This keeps everything in one place during early development; if
-the grammar grows or wants to be reused outside Zed it can be split out
-later — the only file that needs to move with it is `grammar.js` plus the
-generated `src/`, and the `[grammars.p8_cart]` URL in `extension.toml` would
-need updating.
+Both grammars live in subdirectories of this same repository. Zed's
+`[grammars.*]` block supports a `path` field, so the extension manifest
+points each grammar at this repo's git URL plus the relevant subdir.

 ## Local development

-Prerequisites: Node.js (for `tree-sitter-cli`) and Zed. Rust is NOT required
-unless/until we add a language-server harness.
-
-### Edit-and-reload loop
-
-1. Edit `grammar.js`.
-2. Regenerate the parser:
+Prerequisites: Node.js ( for `tree-sitter-cli` ) and Zed. Rust is *not*
+required unless / until we add a language-server harness ( v0.3 ).

 ```sh
-   npx tree-sitter generate
+npm install                        # one-time, installs tree-sitter-cli
 ```

-3. Sanity-check on a real cart:
+### Edit a grammar and reload
+
+1. Edit `grammars/<name>/grammar.js`.
+2. Regenerate from the grammar's directory:

   ```sh
-   npx tree-sitter parse examples/hello.p8
+   ( cd grammars/p8-cart   && npx tree-sitter generate )
+   # or
+   ( cd grammars/pico-8-lua && npx tree-sitter generate )
   ```

-4. Commit. The `[grammars.p8_cart]` block in `extension.toml` references this
-   repo by `file://` URL and pins a commit SHA — Zed clones the grammar
-   from that pinned revision, so changes only take effect after they're
-   committed.
+3. Sanity-check by parsing a sample file:

-5. Update `extension.toml`'s `rev` field to the new SHA, then in Zed run
-   `zed: install dev extension` (or click *Install Dev Extension* on the
-   Extensions page) and select this directory. Reinstall after every commit
-   that should be picked up.
+   ```sh
+   ( cd grammars/p8-cart   && npx tree-sitter parse ../../examples/hello.p8 )
+   ( cd grammars/pico-8-lua && npx tree-sitter parse path/to/file.p8lua )
+   ```
+
+4. Commit the regenerated `src/parser.c` along with the grammar change.
+   Zed clones the grammar repo at the SHA in `extension.toml`, so changes
+   only take effect after a commit.
+
+5. Update the `rev` field of the affected `[grammars.*]` block(s) in
+   `extension.toml` to the new SHA, then in Zed run `zed: install dev
+   extension` ( or *Install Dev Extension* on the Extensions page ) and
+   select this directory. Reinstall after every commit that should be
+   picked up.

   Logs: `zed: open log`. Run `zed --foreground` for live stdout.

-### Editing language queries
+### Edit only language queries

-Files under `languages/pico-8-cart/` (`highlights.scm`, `injections.scm`,
-`outline.scm`) are loaded directly by Zed — no regeneration needed. Reinstall
-the dev extension to pick up changes.
+Files under `languages/*/` ( `highlights.scm`, `injections.scm`, etc. )
+are loaded directly by Zed — no regeneration step needed. Reinstall the
+dev extension to pick up changes.
+
+### Tests
+
+Sample carts live under `examples/`; for ad-hoc checks, parse them from
+the matching grammar directory with `npx tree-sitter parse <file>`.
+
+The cart grammar has a corpus under `grammars/p8-cart/test/corpus/` —
+run `( cd grammars/p8-cart && npx tree-sitter test )`. The corpus
+covers the empty-section skeleton, normal Lua content, the case where
+a Lua identifier resembles a section marker ( e.g. `local __foo__ = 1`
+must remain a `line`, not be re-tokenized as a marker ), and the
+fallback `unknown_section` rule.
+
+The Lua grammar has a corpus under `grammars/pico-8-lua/test/corpus/` —
+run `( cd grammars/pico-8-lua && npx tree-sitter test )`. The corpus
+exercises shorthand `if`/`while` line-end behavior: dangling-else,
+multi-statement bodies, EOF termination, nested same-line shorthands,
+and coexistence with standard `if (parenthesized) then ... end`.

 ## Roadmap

-### v0.2 — PICO-8 Lua dialect grammar
-
-Fork [`tree-sitter-grammars/tree-sitter-lua`](https://github.com/tree-sitter-grammars/tree-sitter-lua)
-into `tree-sitter-pico8` and add the dialect extensions documented in the
-PICO-8 manual:
-
- Compound-assignment operators: `+= -= *= /= \= %= ^= ..= &= |= ^^= <<= >>= >>>= <<>= >><=`
- `!=` as alias for `~=`
- `\` (integer divide) and the rotate / logical-shift operators `<<>` `>><` `>>>`
- Binary literals `0b...` and hex fractional literals `0x1.4p0` style
- Single-line `if (cond) stmt` and `while (cond) stmt` ( no `then`/`do`/`end` )
- `?` as a statement-level shorthand for `print`
- The peek-prefix unary operators `@addr` `%addr` `$addr`
-
-Then add a second language `Pico-8 Lua` here (separate from Zed's built-in
-`Lua`) and switch `injections.scm` to inject `pico-8-lua` instead of `lua`.
-
 ### v0.3 — Language server integration

+The line-significance prerequisite is now satisfied (see *Line-significance*
+above), so LSP features that walk the AST — unreachable-code lint,
+goto-definition through a conditional branch — have a correct structure
+to work against.
+
 Wire up [`japhib/pico8-ls`](https://github.com/japhib/pico8-ls) ( or whichever
 PICO-8 LSP is most maintained at the time ) for:

- Completion of PICO-8 builtins (`spr`, `circfill`, `btn`, `flr`, …)
- Signature help and hover docs sourced from the manual
- Cart-aware analysis ( the LSP already understands `.p8` section markers
-  and only analyzes the `__lua__` body )
- Per-cart diagnostics
+- Completion of PICO-8 builtins ( `spr`, `circfill`, `btn`, `flr`, … ).
+- Signature help and hover docs sourced from the manual.
+- Cart-aware analysis — the LSP already understands `.p8` section markers
+  and only analyzes the `__lua__` body.
+- Per-cart diagnostics.

-This will require a Rust component ( the `zed_extension_api` crate ) to
-download the language-server binary and define
-`language_server_command` — see [Zed's developing-extensions docs](https://zed.dev/docs/extensions/developing-extensions).
+This requires a Rust component ( the `zed_extension_api` crate ) that
+downloads the LSP binary and defines `language_server_command`.
+See [Zed's developing-extensions docs](https://zed.dev/docs/extensions/developing-extensions).

 ### v0.4 — Polish

 - LuaCATS / EmmyLua stub file enumerating PICO-8's ~110 globals, for users
  who'd rather wire up `lua-language-server` against their `#include`-d
-  `.lua` files.
- Highlight rules for hex sections (`__gfx__`, `__map__`, `__sfx__`, etc.)
-  so palette indices and note pitches show up distinctly.
- Snippets for common idioms (`for x=0,127 do ... end`, the `_init`/`_update`/
-  `_draw` triad).
+  `.p8lua` files.
+- Per-section highlighters for the hex blocks: `__gfx__` colored by palette
+  index, `__sfx__` / `__music__` parsed as note streams, `__map__` as tile
+  indices, `__gff__` as flag bytes.
+- Snippets for common idioms ( the `_init` / `_update` / `_draw` triad,
+  `for x=0,127 do … end`, palette swap setup, etc. ).

 ## References

@@ -148,7 +218,12 @@ download the language-server binary and define
 - Zed extension docs: see links in `references/zed-doc-links.md`
 - Cart file-format spec ( community wiki, not in the official manual ):
  https://pico-8.fandom.com/wiki/P8FileFormat
+- Upstream Lua grammar: https://github.com/tree-sitter-grammars/tree-sitter-lua
+  ( MIT, by Munif Tanjim — preserved in `grammars/pico-8-lua/UPSTREAM-LICENSE.md` )

 ## License

-0BSD — see `LICENSE`.
+The cart grammar and the Zed extension files are 0BSD ( see `LICENSE` ).
+The PICO-8 Lua grammar is a fork of MIT-licensed `tree-sitter-lua`; the
+upstream license is preserved at `grammars/pico-8-lua/UPSTREAM-LICENSE.md`
+and applies to the derived files in that directory.
@@ -1,14 +1,24 @@
 pico-8 cartridge // http://www.pico-8.com
 version 42
 __lua__
-- hello cartridge
+-- hello cartridge — exercises Pico-8 dialect features
+#include shared.p8lua
+
 function _init()
 cls()
+ t = 0
+end
+
+function _update()
+ t += 1
+ if (btn(4)) sweep_palette()
+ move()
 end

 function _draw()
 cls(1)
- print("hello pico-8", 30, 60, 7)
+ draw_blob()
+ ?"frame:"..t, 0, 120, 7
 end
 __gfx__
 00000000111111110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
@@ -0,0 +1,30 @@
+-- shared.p8lua: included into hello.p8 via #include
+-- demonstrates Pico-8 dialect features outside a __lua__ section.
+
+local x, y = 64, 64
+
+function move()
+ if (btn(0)) x-=1
+ if (btn(1)) x+=1
+ if (btn(2)) y-=1
+ if (btn(3)) y+=1
+ x = mid(0, x, 127)
+ y = mid(0, y, 127)
+end
+
+function draw_blob()
+ cls(1)
+ circfill(x, y, 5, 7)
+ ?"x="..x..",y="..y, 0, 0, 7
+end
+
+-- peek/poke via prefix shorthands
+function sweep_palette()
+ for i=0,15 do
+  poke(0x5f10+i, %0x5f10 ^^ i)
+ end
+end
+
+-- single-line while with compound assignment body
+local i = 0
+while (i < 8) i += 1
@@ -1,15 +1,25 @@
 id = "pico-8"
 name = "Pico-8"
-version = "0.0.1"
+version = "0.0.2"
 schema_version = 1
 authors = ["Kistaro Windrider <kistaro@gmail.com>"]
 description = "Pico-8 cartridge (.p8) and Lua dialect support for Zed"
 repository = "https://github.com/kistaro/zed-p8"

-# The .p8 cart container grammar lives at the root of this repo.
-# During local development, set `repository` to `file://` + the absolute
-# path of your clone and pin `rev` to a committed SHA. When publishing,
-# this should point at a public Git URL.
+# Both grammars live in this same repository, in subdirectories under
+# `grammars/`. Zed's [grammars.*] block supports a `path` field for
+# exactly this layout. During local development, `repository` is a
+# `file://` URL pointing at this clone and `rev` is pinned to a
+# committed SHA — Zed clones the grammar at that revision rather than
+# reading the working tree, so changes only take effect after a commit
+# + a `rev` bump in this file.
+
 [grammars.p8_cart]
 repository = "file:///Users/norberg/gitea-repos/zed-p8"
-rev = "3f209efa897558e8ecd7aa3612846dc12798b0bb"
+rev = "7557a34c89de5a994cc06025c8122dfa3a5af8cf"
+path = "grammars/p8-cart"
+
+[grammars.pico8_lua]
+repository = "file:///Users/norberg/gitea-repos/zed-p8"
+rev = "7557a34c89de5a994cc06025c8122dfa3a5af8cf"
+path = "grammars/pico-8-lua"
@@ -18,16 +18,31 @@
 module.exports = grammar({
  name: 'p8_cart',

-  // Whitespace is significant inside hex sections, so we don't skip it.
+  // Whitespace is significant inside hex sections, so we don't skip it
+  // globally. Tolerance for stray leading blanks before the magic header
+  // is added explicitly via the `repeat($._blank_line)` at the top of
+  // `cartridge` ( see below ).
  extras: $ => [],

  rules: {
    cartridge: $ => seq(
+      // Tolerate stray whitespace / blank lines before the magic header.
+      // Real PICO-8 carts begin with the header on byte 0, but allowing
+      // a leading run of blanks ( a ) lets the `tree-sitter test` corpus
+      // framework, which prepends a newline to each fixture, run cleanly
+      // and ( b ) keeps the parser robust against a hand-edited cart that
+      // gained an accidental blank line up top.
+      repeat($._blank_line),
      optional($.header),
      optional($.version),
      repeat($.section),
    ),

+    // A line that has no content other than horizontal whitespace and a
+    // newline. Hidden ( underscore prefix ) so it does not appear in the
+    // syntax tree.
+    _blank_line: $ => token(/[ \t]*\n/),
+
    header: $ => /pico-8 cartridge \/\/[^\n]*\n/,
    version: $ => /version[ \t]+\d+\n/,
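The leading-blank tolerance added in this hunk can be checked in isolation. The sketch below reuses the `_blank_line` and `header` patterns from `grammar.js` as sticky JavaScript regexes. `headerOffset` is a hypothetical helper, not part of the grammar, and it only approximates what the generated lexer does.

```javascript
// Same patterns as the grammar's `_blank_line` and `header` tokens. The
// sticky flag (y) forces each match to start exactly at `lastIndex`.
const BLANK_LINE = /[ \t]*\n/y;
const HEADER = /pico-8 cartridge \/\/[^\n]*\n/y;

// Hypothetical helper: skips a leading run of blank lines and returns the
// offset at which the magic header starts, or -1 if no header follows.
function headerOffset(text) {
  let pos = 0;
  for (;;) {
    BLANK_LINE.lastIndex = pos;
    const match = BLANK_LINE.exec(text);
    if (match === null) break;
    pos += match[0].length; // always >= 1, since the newline is required
  }
  HEADER.lastIndex = pos;
  return HEADER.test(text) ? pos : -1;
}

const fixture = '\npico-8 cartridge // http://www.pico-8.com\nversion 42\n';
console.log(headerOffset(fixture));
```

Under this model, a header indented by spaces is still rejected (the helper returns -1), because `_blank_line` only matches when the run of blanks ends in a newline.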

@@ -5,6 +5,13 @@
    "cartridge": {
      "type": "SEQ",
      "members": [
+        {
+          "type": "REPEAT",
+          "content": {
+            "type": "SYMBOL",
+            "name": "_blank_line"
+          }
+        },
        {
          "type": "CHOICE",
          "members": [
@@ -38,6 +45,13 @@
        }
      ]
    },
+    "_blank_line": {
+      "type": "TOKEN",
+      "content": {
+        "type": "PATTERN",
+        "value": "[ \\t]*\\n"
+      }
+    },
    "header": {
      "type": "PATTERN",
      "value": "pico-8 cartridge \\/\\/[^\\n]*\\n"
@@ -0,0 +1,109 @@
+==================
+empty cart skeleton
+==================
+
+pico-8 cartridge // http://www.pico-8.com
+version 42
+__lua__
+__gfx__
+__map__
+__sfx__
+__music__
+
+---
+
+(cartridge
+  (header)
+  (version)
+  (section (lua_section (lua_marker)))
+  (section (gfx_section (gfx_marker)))
+  (section (map_section (map_marker)))
+  (section (sfx_section (sfx_marker)))
+  (section (music_section (music_marker))))
+
+==================
+cart with lua content
+==================
+
+pico-8 cartridge // http://www.pico-8.com
+version 42
+__lua__
+function _draw()
+ cls()
+end
+__gfx__
+00000000
+
+---
+
+(cartridge
+  (header)
+  (version)
+  (section
+    (lua_section
+      (lua_marker)
+      (lua_content
+        (line)
+        (line)
+        (line))))
+  (section
+    (gfx_section
+      (gfx_marker)
+      (body
+        (line)))))
+
+==================
+lua identifier resembling section marker
+==================
+
+pico-8 cartridge // http://www.pico-8.com
+version 42
+__lua__
+local __foo__ = 1
+local s = "__lua__"
+__gfx__
+00
+
+---
+
+(cartridge
+  (header)
+  (version)
+  (section
+    (lua_section
+      (lua_marker)
+      (lua_content
+        (line)
+        (line))))
+  (section
+    (gfx_section
+      (gfx_marker)
+      (body
+        (line)))))
+
+==================
+unknown section name
+==================
+
+pico-8 cartridge // http://www.pico-8.com
+version 42
+__lua__
+__future_section__
+opaque body
+__gfx__
+00
+
+---
+
+(cartridge
+  (header)
+  (version)
+  (section (lua_section (lua_marker)))
+  (section
+    (unknown_section
+      (section_marker)
+      (body (line))))
+  (section
+    (gfx_section
+      (gfx_marker)
+      (body (line)))))
@@ -0,0 +1,73 @@
+# Known limitations of `tree-sitter-pico8-lua`
+
+This document used to track parse incorrectness around PICO-8's
+line-significant shorthand `if (cond) ...` / `while (cond) ...`
+constructs. As of v0.3 the external scanner emits a `LINE_END` token
+when the parser is at the body-or-terminator decision point of a
+shorthand statement and the next byte is `\n` / `\r` / EOF, so the body
+of a shorthand is correctly bounded to its source line.
+
+There are no other known parse-incorrectness issues at this time.
+Removing this file (or leaving it as a brief stub) is fine once you're
+confident no documentation links still point at the old limitation
+sections.
+
+## How line-significance is wired up (for reference)
+
+PICO-8 deviates from standard Lua in two places where a newline is
+syntactically significant:
+
+- `if (cond) <stmts...>` — the consequence (and any same-line `else`
+  alternative) extends to end-of-line, not to a matching `end`.
+- `while (cond) <stmts...>` — same line-bounded body as the
+  shorthand `if`.
+
+Tree-sitter has no built-in concept of newlines as syntactic tokens
+when `/\s/` is in `extras` (and we want it there: every other
+construct treats whitespace transparently). The canonical fix is an
+**external scanner** that gates a synthetic terminator token on
+`valid_symbols`. We do exactly that:
+
+- `src/scanner.c` exposes a `LINE_END` external symbol. The scanner
+  looks at the raw lookahead before the lexer has a chance to skip
+  extras, and emits `LINE_END` only when the parser actually expects
+  one (i.e., `valid_symbols[LINE_END] == true`). At any other
+  position, the scanner's LINE_END branch returns false, and the `\n`
+  falls through to be eaten silently by the `/\s/` extras pattern.
+- `LINE_END` is **zero-width** — the scanner does not consume the
+  newline. This matters for nested shorthands: `if (a) if (b) c()\nd()`
+  has to terminate BOTH shorthands at the same `\n`. With a zero-width
+  terminator, each enclosing shorthand sees the same `\n` in turn and
+  reduces. Once no shorthand is on the stack, `LINE_END` is no longer
+  in `valid_symbols`, the scanner returns false, and the `\n` is
+  consumed by extras. The emit chain is bounded by static nesting
+  depth, so there's no infinite-loop risk despite the zero width.
+
+The shorthand rules in `grammar.js` end with `$._line_end`; the body
+and the optional `else` alternative are both `$.statement, repeat($.statement)`,
+allowing PICO-8's multi-statement single-line bodies
+(`if (falling) wheeee() splat()`).
+
+The cross-language pattern is "external scanner + valid_symbols-gated
+terminator," same as `tree-sitter-r` (the closest analogue) and
+similar in spirit to Ruby's paired `_line_break` / `_no_line_break`
+hint tokens. Removing `/\s/` from `extras` or adding per-rule extras
+is **not** necessary for this style of line-significance; only
+Python-style INDENT/DEDENT requires the heavier refactor.
+
+## Test coverage
+
+`test/corpus/shorthand_line_end.txt` exercises:
+
+- Single- and multi-statement shorthand bodies, terminated by `\n` and
+  by EOF.
+- Same-line `else` (single- and multi-statement alternative).
+- The historical dangling-else case (shorthand inside a standard `if`,
+  with `else` on a later line — must bind to the outer `if`).
+- Line comment trailing the shorthand body (the comment is in extras
+  and the trailing `\n` still triggers `LINE_END`).
+- Shorthand inside a `do`-block (the `\n` before the closing `end`
+  terminates the shorthand cleanly).
+- Nested shorthand `if`s on the same line (one `\n` must close both).
+- Coexistence with standard `if (parenthesized) then ... end` — the
+  GLR conflict resolves on whether `then` follows.
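The zero-width, `valid_symbols`-gated behaviour described in this file can be modelled outside tree-sitter. The sketch below is a toy JavaScript model, not the contents of `src/scanner.c`. `scanLineEnd` and `closeShorthands` are hypothetical names, and skipping spaces and tabs before the newline check is an assumption about how the real scanner treats trailing blanks.

```javascript
// Toy model only: the real logic lives in src/scanner.c and uses the
// tree-sitter C lexer API. All names here are hypothetical.

// Returns true when a zero-width LINE_END should be emitted at `pos`.
function scanLineEnd(source, pos, validSymbols) {
  // Gate on what the parser expects: outside a shorthand, never emit.
  if (!validSymbols.LINE_END) return false;
  let i = pos;
  // Assumption: horizontal whitespace before the newline is skipped first.
  while (i < source.length && (source[i] === ' ' || source[i] === '\t')) i++;
  // Emit at \n, \r, or end of input. Nothing is consumed.
  return i >= source.length || source[i] === '\n' || source[i] === '\r';
}

// Counts the LINE_END tokens emitted at `pos` while `depth` shorthand
// statements are open. Each emission closes one shorthand, so the loop is
// bounded by the nesting depth even though the token has zero width.
function closeShorthands(source, pos, depth) {
  let emitted = 0;
  while (scanLineEnd(source, pos, { LINE_END: depth > 0 })) {
    depth--;
    emitted++;
  }
  return emitted;
}

const source = 'if (a) if (b) c()\nd()';
const newlineAt = source.indexOf('\n');
console.log(closeShorthands(source, newlineAt, 2)); // two shorthands open
console.log(closeShorthands(source, newlineAt, 0)); // none open
```

With two shorthands open the model emits two terminators at the same `\n`. With none open it emits nothing and the newline is left for the `/\s/` extras, which is the bounded emit chain described above.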
@@ -0,0 +1,21 @@
+The MIT License (MIT)
+
+Copyright (c) 2021 Munif Tanjim
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
@@ -0,0 +1,619 @@
+/**
+ * @file PICO-8 Lua grammar for tree-sitter
+ *
+ * Forked from tree-sitter-lua 0.5.0 by Munif Tanjim ( MIT — see
+ * UPSTREAM-LICENSE.md ). This fork adds the PICO-8 dialect extensions
+ * documented in the PICO-8 manual:
+ *
+ *   - != as alias for ~=
+ *   - Integer divide:  \
+ *   - Bitwise XOR (binary): ^^
+ *   - Logical shift right: >>>
+ *   - Rotate left:  <<>
+ *   - Rotate right: >><
+ *   - Compound-assignment statements: += -= *= /= %= \= ^= ..= &= |= ^^=
+ *                                     <<= >>= >>>= <<>= >><=
+ *   - Memory peek prefix unary operators:  @addr  %addr  $addr
+ *     ( these coexist with binary % for modulo )
+ *   - Single-line  if (cond) stmt [else stmt]   — no `then`/`end`
+ *   - Single-line  while (cond) stmt            — no `do`/`end`
+ *   - Statement-level print shorthand: `?` followed by an expression list
+ *   - `#include path` directive
+ */
+
+/// <reference types="tree-sitter-cli/dsl" />
+// @ts-check
+
+const PREC = {
+  OR: 1,         // or
+  AND: 2,        // and
+  COMPARE: 3,    // < > <= >= ~= == !=
+  BIT_OR: 4,     // |
+  BIT_NOT: 5,    // ~ ^^
+  BIT_AND: 6,    // &
+  BIT_SHIFT: 7,  // << >> >>> <<> >><
+  CONCAT: 8,     // ..
+  PLUS: 9,       // + -
+  MULTI: 10,     // * / // % \
+  UNARY: 11,     // not # - ~ @ $ %
+  POWER: 12,     // ^
+};
+
+const list_seq = (rule, separator, trailing_separator = false) =>
+  trailing_separator
+    ? seq(rule, repeat(seq(separator, rule)), optional(separator))
+    : seq(rule, repeat(seq(separator, rule)));
+
+const optional_block = ($) => alias(optional($._block), $.block);
+
+// namelist ::= Name {',' Name}
+const name_list = ($) => list_seq(field('name', $.identifier), ',');
+
+const COMPOUND_ASSIGN_OPERATORS = [
+  '+=', '-=', '*=', '/=', '%=', '\\=', '^=', '..=',
+  '&=', '|=', '^^=',
+  '<<=', '>>=', '>>>=', '<<>=', '>><=',
+];
+
+export default grammar({
+  name: 'pico8_lua',
+
+  extras: ($) => [$.comment, /\s/],
+
+  externals: ($) => [
+    $._block_comment_start,
+    $._block_comment_content,
+    $._block_comment_end,
+
+    $._block_string_start,
+    $._block_string_content,
+    $._block_string_end,
+
+    // PICO-8 line-significance: terminates the body of `if (cond) ...` /
+    // `while (cond) ...` shorthand. The scanner emits this only when the
+    // parser is at a state expecting it; everywhere else a newline falls
+    // through to /\s/ in extras and is skipped. See src/scanner.c.
+    $._line_end,
+  ],
+
+  supertypes: ($) => [$.statement, $.expression, $.declaration, $.variable],
+
+  word: ($) => $.identifier,
+
+  // `if (cond) ...` is ambiguous between a standard if where the condition
+  // is a parenthesized_expression and a shorthand if. Same for while. The
+  // ambiguity resolves by what follows the closing `)` ( `then`/`do` for
+  // the standard form, anything else for the shorthand ).
+  conflicts: ($) => [
+    [$.parenthesized_expression, $.shorthand_if_statement],
+    [$.parenthesized_expression, $.shorthand_while_statement],
+  ],
+
+  rules: {
+    // chunk ::= block
+    chunk: ($) =>
+      seq(
+        optional($.hash_bang_line),
+        repeat($.statement),
+        optional($.return_statement)
+      ),
+
+    hash_bang_line: (_) => /#![^\n]*/,
+
+    // block ::= {stat} [retstat]
+    _block: ($) =>
+      choice(
+        seq(repeat1($.statement), optional($.return_statement)),
+        seq(repeat($.statement), $.return_statement)
+      ),
+
+    statement: ($) =>
+      choice(
+        $.empty_statement,
+        $.assignment_statement,
+        $.compound_assignment_statement,
+        $.function_call,
+        $.label_statement,
+        $.break_statement,
+        $.goto_statement,
+        $.do_statement,
+        $.while_statement,
+        $.shorthand_while_statement,
+        $.repeat_statement,
+        $.if_statement,
+        $.shorthand_if_statement,
+        $.for_statement,
+        $.declaration,
+        $.print_shorthand_statement,
+        $.include_statement,
+      ),
+
+    // retstat ::= return [explist] [';']
+    return_statement: ($) =>
+      seq(
+        'return',
+        optional(alias($._expression_list, $.expression_list)),
+        optional(';')
+      ),
+
+    empty_statement: (_) => ';',
+
+    assignment_statement: ($) =>
+      seq(
+        alias($._variable_assignment_varlist, $.variable_list),
+        field('operator', '='),
+        alias($._variable_assignment_explist, $.expression_list)
+      ),
+    _variable_assignment_varlist: ($) =>
+      list_seq(field('name', $.variable), ','),
+    _variable_assignment_explist: ($) =>
+      list_seq(field('value', $.expression), ','),
+
+    // PICO-8 compound assignment: var OP= expr (single statement, single line).
+    compound_assignment_statement: ($) =>
+      seq(
+        field('name', $.variable),
+        field('operator', choice(...COMPOUND_ASSIGN_OPERATORS)),
+        field('value', $.expression)
+      ),
+
+    label_statement: ($) => seq('::', $.identifier, '::'),
+
+    break_statement: (_) => 'break',
+
+    goto_statement: ($) => seq('goto', $.identifier),
+
+    do_statement: ($) => seq('do', field('body', optional_block($)), 'end'),
+
+    while_statement: ($) =>
+      seq(
+        'while',
+        field('condition', $.expression),
+        'do',
+        field('body', optional_block($)),
+        'end'
+      ),
+
+    // PICO-8 single-line:  while (cond) stmt {stmt}
+    // Body extends to end-of-line (or EOF). The $._line_end terminator
+    // is emitted by the external scanner when it sees \n/\r/EOF at a
+    // position where the parser expects line-end; until then, additional
+    // statements on the same line accumulate into the body.
+    shorthand_while_statement: ($) =>
+      seq(
+        'while',
+        '(',
+        field('condition', $.expression),
+        ')',
+        field('body', $.statement),
+        repeat(field('body', $.statement)),
+        $._line_end
+      ),
+
+    repeat_statement: ($) =>
+      seq(
+        'repeat',
+        field('body', optional_block($)),
+        'until',
+        field('condition', $.expression)
+      ),
+
+    if_statement: ($) =>
+      seq(
+        'if',
+        field('condition', $.expression),
+        'then',
+        field('consequence', optional_block($)),
+        repeat(field('alternative', $.elseif_statement)),
+        optional(field('alternative', $.else_statement)),
+        'end'
+      ),
+    elseif_statement: ($) =>
+      seq(
+        'elseif',
+        field('condition', $.expression),
+        'then',
+        field('consequence', optional_block($))
+      ),
+    else_statement: ($) => seq('else', field('body', optional_block($))),
+
+    // PICO-8 single-line:  if (cond) stmt {stmt} [else stmt {stmt}]
+    // Both the consequence and the alternative extend to end-of-line.
+    // The $._line_end terminator (emitted by the external scanner on
+    // \n/\r/EOF) prevents a later-line `else` from binding to a
+    // shorthand `if` on a previous line, matching PICO-8 semantics.
+    shorthand_if_statement: ($) =>
+      seq(
+        'if',
+        '(',
+        field('condition', $.expression),
+        ')',
+        field('consequence', $.statement),
+        repeat(field('consequence', $.statement)),
+        optional(
+          seq(
+            'else',
+            field('alternative', $.statement),
+            repeat(field('alternative', $.statement))
+          )
+        ),
+        $._line_end
+      ),
+
+    for_statement: ($) =>
+      seq(
+        'for',
+        field('clause', choice($.for_generic_clause, $.for_numeric_clause)),
+        'do',
+        field('body', optional_block($)),
+        'end'
+      ),
+    for_generic_clause: ($) =>
+      seq(
+        alias($._name_list, $.variable_list),
+        'in',
+        alias($._expression_list, $.expression_list)
+      ),
+    for_numeric_clause: ($) =>
+      seq(
+        field('name', $.identifier),
+        field('operator', '='),
+        field('start', $.expression),
+        ',',
+        field('end', $.expression),
+        optional(seq(',', field('step', $.expression)))
+      ),
+    _name_list: ($) => name_list($),
+
+    declaration: ($) =>
+      choice(
+        $.function_declaration,
+        field(
+          'local_declaration',
+          alias($._local_function_declaration, $.function_declaration)
+        ),
+        field('local_declaration', $.variable_declaration),
+      ),
+    function_declaration: ($) =>
+      seq('function', field('name', $._function_name), $._function_body),
+    _local_function_declaration: ($) =>
+      seq('local', 'function', field('name', $.identifier), $._function_body),
+    _function_name: ($) =>
+      choice(
+        $._function_name_prefix_expression,
+        alias(
+          $._function_name_method_index_expression,
+          $.method_index_expression
+        )
+      ),
+    _function_name_prefix_expression: ($) =>
+      choice(
+        $.identifier,
+        alias($._function_name_dot_index_expression, $.dot_index_expression)
+      ),
+    _function_name_dot_index_expression: ($) =>
+      seq(
+        field('table', $._function_name_prefix_expression),
+        '.',
+        field('field', $.identifier)
+      ),
+    _function_name_method_index_expression: ($) =>
+      seq(
+        field('table', $._function_name_prefix_expression),
+        ':',
+        field('method', $.identifier)
+      ),
+
+    variable_declaration: ($) =>
+      seq(
+        'local',
+        choice(
+          alias($._att_name_list, $.variable_list),
+          alias($._variable_assignment, $.assignment_statement)
+        )
+      ),
+    _variable_assignment: ($) =>
+      seq(
+        alias($._att_name_list, $.variable_list),
+        field('operator', '='),
+        alias($._variable_assignment_explist, $.expression_list)
+      ),
+
+    _att_name_list: ($) =>
+      seq(
+        optional(field('attribute', alias($._attrib, $.attribute))),
+        list_seq(
+          seq(
+            field('name', $.identifier),
+            optional(field('attribute', alias($._attrib, $.attribute)))
+          ),
+          ','
+        ),
+      ),
+    _attrib: ($) => seq('<', $.identifier, '>'),
+
+    _expression_list: ($) => list_seq($.expression, ','),
+
+    // PICO-8 print shorthand:  ? expr {, expr}
+    print_shorthand_statement: ($) =>
+      seq(
+        field('directive', '?'),
+        list_seq(field('argument', $.expression), ',')
+      ),
+
+    // PICO-8 include directive:  #include path
+    // Lexed as a single token ( `#include` plus at least one space or
+    // tab, lexical precedence 2 ) so that a bare `#` ( unary length ) and
+    // `#x` before an identifier still parse as length expressions.
+    include_statement: ($) =>
+      seq(
+        field('directive', alias(token(prec(2, /#include[ \t]+/)), '#include')),
+        field('path', alias(/[^\n\r]*/, $.include_path))
+      ),
+
+    expression: ($) =>
+      choice(
+        $.nil,
+        $.false,
+        $.true,
+        $.number,
+        $.string,
+        $.vararg_expression,
+        $.function_definition,
+        $.variable,
+        $.function_call,
+        $.parenthesized_expression,
+        $.table_constructor,
+        $.binary_expression,
+        $.unary_expression
+      ),
+
+    nil: (_) => 'nil',
+    false: (_) => 'false',
+    true: (_) => 'true',
+
+    number: (_) => {
+      function number_literal(digits, exponent_marker, exponent_digits) {
+        return seq(
+          choice(
+            seq(optional(digits), optional('.'), digits),
+            seq(digits, optional('.'), optional(digits))
+          ),
+          optional(
+            seq(
+              choice(
+                exponent_marker.toLowerCase(),
+                exponent_marker.toUpperCase()
+              ),
+              seq(optional(choice('-', '+')), exponent_digits)
+            )
+          )
+        );
+      }
+
+      const decimal_digits = /[0-9]+/;
+      const decimal_literal = number_literal(decimal_digits, 'e', decimal_digits);
+
+      const hex_digits = /[a-fA-F0-9]+/;
+      const hex_literal = seq(
+        choice('0x', '0X'),
+        number_literal(hex_digits, 'p', decimal_digits)
+      );
+
+      const bin_digits = /[01]+/;
+      const bin_literal = seq(choice('0b', '0B'), bin_digits);
+
+      return token(choice(decimal_literal, hex_literal, bin_literal));
+    },
+
+    string: ($) => choice($._quote_string, $._block_string),
+
+    _quote_string: ($) =>
+      choice(
+        seq(
+          field('start', alias('"', '"')),
+          field(
+            'content',
+            optional(alias($._doublequote_string_content, $.string_content))
+          ),
+          field('end', alias('"', '"'))
+        ),
+        seq(
+          field('start', alias("'", "'")),
+          field(
+            'content',
+            optional(alias($._singlequote_string_content, $.string_content))
+          ),
+          field('end', alias("'", "'"))
+        )
+      ),
+
+    _doublequote_string_content: ($) =>
+      repeat1(choice(token.immediate(prec(1, /[^"\\]+/)), $.escape_sequence)),
+
+    _singlequote_string_content: ($) =>
+      repeat1(choice(token.immediate(prec(1, /[^'\\]+/)), $.escape_sequence)),
+
+    _block_string: ($) =>
+      seq(
+        field('start', alias($._block_string_start, '[[')),
+        field('content', alias($._block_string_content, $.string_content)),
+        field('end', alias($._block_string_end, ']]'))
+      ),
+
+    escape_sequence: () =>
+      token.immediate(
+        seq(
+          '\\',
+          choice(
+            /[\nabfnrtv\\'"]/,
+            /z\s*/,
+            /[0-9]{1,3}/,
+            /x[0-9a-fA-F]{2}/,
+            /u\{[0-9a-fA-F]+\}/
+          )
+        )
+      ),
+
+    vararg_expression: (_) => '...',
+
+    function_definition: ($) => seq('function', $._function_body),
+    _function_body: ($) =>
+      seq(
+        field('parameters', $.parameters),
+        field('body', optional_block($)),
+        'end'
+      ),
+    parameters: ($) => seq('(', optional($._parameter_list), ')'),
+    _parameter_list: ($) =>
+      choice(
+        seq(name_list($), optional(seq(',', $._vararg_parameter))),
+        $._vararg_parameter
+      ),
+    _vararg_parameter: ($) =>
+      seq($.vararg_expression, optional(field('name', $.identifier))),
+
+    _prefix_expression: ($) =>
+      prec(1, choice($.variable, $.function_call, $.parenthesized_expression)),
+
+    variable: ($) =>
+      choice($.identifier, $.bracket_index_expression, $.dot_index_expression),
+    bracket_index_expression: ($) =>
+      seq(
+        field('table', $._prefix_expression),
+        '[',
+        field('field', $.expression),
+        ']'
+      ),
+    dot_index_expression: ($) =>
+      seq(
+        field('table', $._prefix_expression),
+        '.',
+        field('field', $.identifier)
+      ),
+
+    function_call: ($) =>
+      seq(
+        field('name', choice($._prefix_expression, $.method_index_expression)),
+        field('arguments', $.arguments)
+      ),
+    method_index_expression: ($) =>
+      seq(
+        field('table', $._prefix_expression),
+        ':',
+        field('method', $.identifier)
+      ),
+    arguments: ($) =>
+      choice(
+        seq('(', optional(list_seq($.expression, ',')), ')'),
+        $.table_constructor,
+        $.string
+      ),
+
+    parenthesized_expression: ($) => seq('(', $.expression, ')'),
+
+    table_constructor: ($) => seq('{', optional($._field_list), '}'),
+    _field_list: ($) => list_seq($.field, $._field_sep, true),
+    _field_sep: (_) => choice(',', ';'),
+    field: ($) =>
+      choice(
+        seq(
+          '[',
+          field('name', $.expression),
+          ']',
+          field('operator', '='),
+          field('value', $.expression)
+        ),
+        seq(field('name', $.identifier), '=', field('value', $.expression)),
+        field('value', $.expression)
+      ),
+
+    binary_expression: ($) =>
+      choice(
+        ...[
+          ['or', PREC.OR],
+          ['and', PREC.AND],
+          ['<', PREC.COMPARE],
+          ['<=', PREC.COMPARE],
+          ['==', PREC.COMPARE],
+          ['~=', PREC.COMPARE],
+          ['!=', PREC.COMPARE],   // PICO-8 alias for ~=
+          ['>=', PREC.COMPARE],
+          ['>', PREC.COMPARE],
+          ['|', PREC.BIT_OR],
+          ['~', PREC.BIT_NOT],    // bitwise xor (Lua 5.3 binary form)
+          ['^^', PREC.BIT_NOT],   // PICO-8 bitwise xor
+          ['&', PREC.BIT_AND],
+          ['<<', PREC.BIT_SHIFT],
+          ['>>', PREC.BIT_SHIFT],
+          ['>>>', PREC.BIT_SHIFT],  // PICO-8 logical shift right
+          ['<<>', PREC.BIT_SHIFT],  // PICO-8 rotate left
+          ['>><', PREC.BIT_SHIFT],  // PICO-8 rotate right
+          ['+', PREC.PLUS],
+          ['-', PREC.PLUS],
+          ['*', PREC.MULTI],
+          ['/', PREC.MULTI],
+          ['//', PREC.MULTI],
+          ['%', PREC.MULTI],
+          ['\\', PREC.MULTI],     // PICO-8 integer divide
+        ].map(([operator, precedence]) =>
+          prec.left(
+            precedence,
+            seq(
+              field('left', $.expression),
+              field('operator', operator),
+              field('right', $.expression)
+            )
+          )
+        ),
+        ...[
+          ['..', PREC.CONCAT],
+          ['^', PREC.POWER],
+        ].map(([operator, precedence]) =>
+          prec.right(
+            precedence,
+            seq(
+              field('left', $.expression),
+              field('operator', operator),
+              field('right', $.expression)
+            )
+          )
+        )
+      ),
+
+    unary_expression: ($) =>
+      prec.left(
+        PREC.UNARY,
+        seq(
+          // @ % $ are PICO-8 peek prefixes ( @x = peek, %x = peek2,
+          // $x = peek4 ). % collides lexically with binary modulo; the
+          // parser tells the two apart by position ( operand vs. operator ).
+          field('operator', choice('not', '#', '-', '~', '@', '$', '%')),
+          field('operand', $.expression),
+        )
+      ),
+
+    identifier: (_) => {
+      // PICO-8 dialect carves out !, ?, @, $ as operator tokens, so they
+      // are not valid in identifiers ( upstream allowed them ).
+      const identifier_start =
+        /[^\p{Control}\s!?@$+\-*/%^#&~|<>=(){}\[\];:,.\\'"\d]/;
+      const identifier_continue =
+        /[^\p{Control}\s!?@$+\-*/%^#&~|<>=(){}\[\];:,.\\'"]*/;
+      return token(seq(identifier_start, identifier_continue));
+    },
+
+    comment: ($) =>
+      choice(
+        seq(
+          field('start', '--'),
+          field('content', alias(/[^\r\n]*/, $.comment_content))
+        ),
+        seq(
+          field('start', alias($._block_comment_start, '[[')),
+          field('content', alias($._block_comment_content, $.comment_content)),
+          field('end', alias($._block_comment_end, ']]'))
+        )
+      ),
+  },
+});
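The nested `choice`/`seq` structure of the `number` rule above can be hard to read at a glance. The sketch below restates it as a plain JavaScript regular expression so the accepted shapes are easy to try out. The regex constants and the `isNumberLiteral` helper are illustrative assumptions, not part of this commit, and they test whole strings rather than reproducing tree-sitter's longest-match lexing.

```javascript
// Hypothetical restatement of the `number` token as a regex (not generated
// from the grammar). Mirrors number_literal(digits, marker, exponent_digits):
//   ( [digits] ['.'] digits | digits ['.'] [digits] ) [ marker [+-] digits ]
const DEC = String.raw`(?:\d*\.?\d+|\d+\.?\d*)(?:[eE][-+]?\d+)?`;

// Hex literals reuse the same shape with hex digits and a `p` exponent.
const HEX_DIGIT = '[a-fA-F0-9]';
const HEX = String.raw`0[xX](?:${HEX_DIGIT}*\.?${HEX_DIGIT}+|${HEX_DIGIT}+\.?${HEX_DIGIT}*)(?:[pP][-+]?\d+)?`;

// Binary literals, as the rule is written, take digits only:
// no fraction and no exponent.
const BIN = '0[bB][01]+';

const NUMBER = new RegExp(`^(?:${HEX}|${BIN}|${DEC})$`);

// Returns true when `text` as a whole has the shape of one number token.
function isNumberLiteral(text) {
  return NUMBER.test(text);
}

console.log(isNumberLiteral('0x7f.8'));   // hex with a fractional part
console.log(isNumberLiteral('0b1010'));   // binary digits only
console.log(isNumberLiteral('0b1010.1')); // not a single token under this rule
```

Under this reading, a binary literal followed by `.1` is not one `number` token. Whether that matters depends on how often fractional binary literals appear in the carts being edited.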
@@ -0,0 +1,7 @@
+{
+  "name": "tree-sitter-pico8-lua",
+  "version": "0.0.1",
+  "description": "tree-sitter grammar for the PICO-8 Lua dialect (forked from tree-sitter-lua)",
+  "type": "module",
+  "license": "MIT"
+}
@@ -0,0 +1,230 @@
+#include <stdio.h>
+#include "tree_sitter/alloc.h"
+#include "tree_sitter/parser.h"
+#include <wctype.h>
+
+enum TokenType {
+  BLOCK_COMMENT_START,
+  BLOCK_COMMENT_CONTENT,
+  BLOCK_COMMENT_END,
+
+  BLOCK_STRING_START,
+  BLOCK_STRING_CONTENT,
+  BLOCK_STRING_END,
+
+  // PICO-8 line-significance: terminates the body of `if (cond) ...` /
+  // `while (cond) ...` shorthand. Emitted only when the parser expects it
+  // (see scan() — this token is gated on valid_symbols[LINE_END]) so that
+  // newlines outside of shorthand contexts continue to fall through to
+  // extras and be skipped silently.
+  LINE_END,
+};
+
+static inline void consume(TSLexer *lexer) { lexer->advance(lexer, false); }
+
+static inline void skip(TSLexer *lexer) { lexer->advance(lexer, true); }
+
+static inline bool consume_char(char c, TSLexer *lexer) {
+  if (lexer->lookahead != c) {
+    return false;
+  }
+
+  consume(lexer);
+  return true;
+}
+
+static inline uint8_t consume_and_count_char(char c, TSLexer *lexer) {
+  uint8_t count = 0;
+  while (lexer->lookahead == c) {
+    ++count;
+    consume(lexer);
+  }
+  return count;
+}
+
+static inline void skip_whitespaces(TSLexer *lexer) {
+  while (iswspace(lexer->lookahead)) {
+    skip(lexer);
+  }
+}
+
+typedef struct {
+  char ending_char;
+  uint8_t level_count;
+} Scanner;
+
+static inline void reset_state(Scanner *scanner) {
+  scanner->ending_char = 0;
+  scanner->level_count = 0;
+}
+
+void *tree_sitter_pico8_lua_external_scanner_create() {
+  Scanner *scanner = ts_calloc(1, sizeof(Scanner));
+  return scanner;
+}
+
+void tree_sitter_pico8_lua_external_scanner_destroy(void *payload) {
+  Scanner *scanner = (Scanner *)payload;
+  ts_free(scanner);
+}
+
+unsigned tree_sitter_pico8_lua_external_scanner_serialize(void *payload, char *buffer) {
+  Scanner *scanner = (Scanner *)payload;
+  buffer[0] = scanner->ending_char;
+  buffer[1] = (char)scanner->level_count;
+  return 2;
+}
+
+void tree_sitter_pico8_lua_external_scanner_deserialize(void *payload, const char *buffer, unsigned length) {
+  Scanner *scanner = (Scanner *)payload;
+  if (length == 0) return;
+  scanner->ending_char = buffer[0];
+  if (length == 1) return;
+  scanner->level_count = buffer[1];
+}
+
+static bool scan_block_start(Scanner *scanner, TSLexer *lexer) {
+  if (consume_char('[', lexer)) {
+    uint8_t level = consume_and_count_char('=', lexer);
+
+    if (consume_char('[', lexer)) {
+      scanner->level_count = level;
+      return true;
+    }
+  }
+
+  return false;
+}
+
+static bool scan_block_end(Scanner *scanner, TSLexer *lexer) {
+  if (consume_char(']', lexer)) {
+    uint8_t level = consume_and_count_char('=', lexer);
+
+    if (scanner->level_count == level && consume_char(']', lexer)) {
+      return true;
+    }
+  }
+
+  return false;
+}
+
+static bool scan_block_content(Scanner *scanner, TSLexer *lexer) {
+  while (lexer->lookahead != 0) {
+    if (lexer->lookahead == ']') {
+      lexer->mark_end(lexer);
+
+      if (scan_block_end(scanner, lexer)) {
+        return true;
+      }
+    } else {
+      consume(lexer);
+    }
+  }
+
+  return false;
+}
+
+static bool scan_comment_start(Scanner *scanner, TSLexer *lexer) {
+  if (consume_char('-', lexer) && consume_char('-', lexer)) {
+    lexer->mark_end(lexer);
+
+    if (scan_block_start(scanner, lexer)) {
+      lexer->mark_end(lexer);
+      lexer->result_symbol = BLOCK_COMMENT_START;
+      return true;
+    }
+  }
+
+  return false;
+}
+
+static bool scan_comment_content(Scanner *scanner, TSLexer *lexer) {
+  if (scanner->ending_char == 0) { // block comment
+    if (scan_block_content(scanner, lexer)) {
+      lexer->result_symbol = BLOCK_COMMENT_CONTENT;
+      return true;
+    }
+
+    return false;
+  }
+
+  while (lexer->lookahead != 0) {
+    if (lexer->lookahead == scanner->ending_char) {
+      reset_state(scanner);
+      lexer->result_symbol = BLOCK_COMMENT_CONTENT;
+      return true;
+    }
+
+    consume(lexer);
+  }
+
+  return false;
+}
+
+bool tree_sitter_pico8_lua_external_scanner_scan(void *payload, TSLexer *lexer, const bool *valid_symbols) {
+  Scanner *scanner = (Scanner *)payload;
+
+  // LINE_END must be checked before any whitespace-skipping path below,
+  // because the bytes that signal it (\n, \r, EOF) would otherwise be
+  // consumed as extras and be invisible to us. The check is also
+  // intentionally placed before the block_string / block_comment branches
+  // so that those branches' skip_whitespaces() can't eat our newline.
+  //
+  // The scanner emits LINE_END only when the parser's current state lists
+  // it as valid (i.e., we're at the body-or-terminator decision point of a
+  // shorthand_if_statement / shorthand_while_statement). Everywhere else,
+  // \n falls through to the /\s/ extras pattern and is skipped silently,
+  // so this branch is invisible to the rest of the grammar.
+  //
+  // LINE_END is intentionally zero-width: we do NOT consume the newline.
+  // That lets nested shorthands on the same line each see the same \n and
+  // close in turn (e.g. `if (a) if (b) c()\nd()` — the \n must terminate
+  // BOTH shorthands so that `d()` is a top-level statement). Once every
+  // enclosing shorthand has reduced, LINE_END is no longer in any parser
+  // state's valid_symbols, the scanner returns false, and the trailing
+  // \n is consumed by /\s/ in extras as usual. There is no infinite-loop
+  // risk: each LINE_END shift reduces one shorthand statement, so the
+  // emit chain is bounded by static nesting depth.
+  if (valid_symbols[LINE_END] &&
+      (lexer->lookahead == '\n' || lexer->lookahead == '\r' ||
+       lexer->lookahead == 0)) {
+    lexer->result_symbol = LINE_END;
+    return true;
+  }
+
+  if (valid_symbols[BLOCK_STRING_END] && scan_block_end(scanner, lexer)) {
+    reset_state(scanner);
+    lexer->result_symbol = BLOCK_STRING_END;
+    return true;
+  }
+
+  if (valid_symbols[BLOCK_STRING_CONTENT] && scan_block_content(scanner, lexer)) {
+    lexer->result_symbol = BLOCK_STRING_CONTENT;
+    return true;
+  }
+
+  if (valid_symbols[BLOCK_COMMENT_END] && scanner->ending_char == 0 && scan_block_end(scanner, lexer)) {
+    reset_state(scanner);
+    lexer->result_symbol = BLOCK_COMMENT_END;
+    return true;
+  }
+
+  if (valid_symbols[BLOCK_COMMENT_CONTENT] && scan_comment_content(scanner, lexer)) {
+    return true;
+  }
+
+  skip_whitespaces(lexer);
+
+  if (valid_symbols[BLOCK_STRING_START] && scan_block_start(scanner, lexer)) {
+    lexer->result_symbol = BLOCK_STRING_START;
+    return true;
+  }
+
+  if (valid_symbols[BLOCK_COMMENT_START]) {
+    if (scan_comment_start(scanner, lexer)) {
+      return true;
+    }
+  }
+
+  return false;
+}
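The zero-width behaviour described in the `scan()` comment above (one newline closing several nested shorthands in turn) can be illustrated with a small model. This is a sketch in JavaScript under simplifying assumptions: `scanLineEnd` and `closeShorthands` are hypothetical names, the parser is reduced to a counter of open shorthand statements, and nothing here calls the real tree-sitter API.

```javascript
// Toy model of the LINE_END gate in scan(): the token is reported only when
// the parser says it is valid AND the lookahead is \n, \r, or end of input.
// The position is never advanced, which is what makes the token zero-width.
function scanLineEnd(lineEndValid, lookahead) {
  const atLineBoundary =
    lookahead === '\n' || lookahead === '\r' || lookahead === null;
  return lineEndValid && atLineBoundary ? 'LINE_END' : null;
}

// Simulates the parser repeatedly asking the scanner for a token while
// `depth` shorthand statements are still open at `source[pos]`.
// Assumption: LINE_END is valid exactly while at least one shorthand is open.
function closeShorthands(source, pos, depth) {
  const emitted = [];
  let open = depth;
  for (;;) {
    const lookahead = pos < source.length ? source[pos] : null;
    const token = scanLineEnd(open > 0, lookahead);
    if (token === null) break;
    emitted.push(token);
    open -= 1; // each LINE_END lets one shorthand statement reduce
  }
  // `pos` is unchanged: the newline is left for the /\s/ extras rule.
  return { emitted, pos, open };
}

const source = 'if (a) if (b) c()\nd()';
const newlineAt = source.indexOf('\n');
console.log(closeShorthands(source, newlineAt, 2));
```

In this model the loop ends after as many emissions as there are open shorthands, which matches the comment's argument that the emit chain is bounded by nesting depth.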
@@ -0,0 +1,54 @@
+#ifndef TREE_SITTER_ALLOC_H_
+#define TREE_SITTER_ALLOC_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+
+// Allow clients to override allocation functions
+#ifdef TREE_SITTER_REUSE_ALLOCATOR
+
+extern void *(*ts_current_malloc)(size_t size);
+extern void *(*ts_current_calloc)(size_t count, size_t size);
+extern void *(*ts_current_realloc)(void *ptr, size_t size);
+extern void (*ts_current_free)(void *ptr);
+
+#ifndef ts_malloc
+#define ts_malloc  ts_current_malloc
+#endif
+#ifndef ts_calloc
+#define ts_calloc  ts_current_calloc
+#endif
+#ifndef ts_realloc
+#define ts_realloc ts_current_realloc
+#endif
+#ifndef ts_free
+#define ts_free    ts_current_free
+#endif
+
+#else
+
+#ifndef ts_malloc
+#define ts_malloc  malloc
+#endif
+#ifndef ts_calloc
+#define ts_calloc  calloc
+#endif
+#ifndef ts_realloc
+#define ts_realloc realloc
+#endif
+#ifndef ts_free
+#define ts_free    free
+#endif
+
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif // TREE_SITTER_ALLOC_H_
@@ -0,0 +1,291 @@
+#ifndef TREE_SITTER_ARRAY_H_
+#define TREE_SITTER_ARRAY_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "./alloc.h"
+
+#include <assert.h>
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+#include <string.h>
+
+#ifdef _MSC_VER
+#pragma warning(push)
+#pragma warning(disable : 4101)
+#elif defined(__GNUC__) || defined(__clang__)
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wunused-variable"
+#endif
+
+#define Array(T)       \
+  struct {             \
+    T *contents;       \
+    uint32_t size;     \
+    uint32_t capacity; \
+  }
+
+/// Initialize an array.
+#define array_init(self) \
+  ((self)->size = 0, (self)->capacity = 0, (self)->contents = NULL)
+
+/// Create an empty array.
+#define array_new() \
+  { NULL, 0, 0 }
+
+/// Get a pointer to the element at a given `index` in the array.
+#define array_get(self, _index) \
+  (assert((uint32_t)(_index) < (self)->size), &(self)->contents[_index])
+
+/// Get a pointer to the first element in the array.
+#define array_front(self) array_get(self, 0)
+
+/// Get a pointer to the last element in the array.
+#define array_back(self) array_get(self, (self)->size - 1)
+
+/// Clear the array, setting its size to zero. Note that this does not free any
+/// memory allocated for the array's contents.
+#define array_clear(self) ((self)->size = 0)
+
+/// Reserve `new_capacity` elements of space in the array. If `new_capacity` is
+/// less than the array's current capacity, this function has no effect.
+#define array_reserve(self, new_capacity) \
+  _array__reserve((Array *)(self), array_elem_size(self), new_capacity)
+
+/// Free any memory allocated for this array. Note that this does not free any
+/// memory allocated for the array's contents.
+#define array_delete(self) _array__delete((Array *)(self))
+
+/// Push a new `element` onto the end of the array.
+#define array_push(self, element)                            \
+  (_array__grow((Array *)(self), 1, array_elem_size(self)), \
+   (self)->contents[(self)->size++] = (element))
+
+/// Increase the array's size by `count` elements.
+/// New elements are zero-initialized.
+#define array_grow_by(self, count) \
+  do { \
+    if ((count) == 0) break; \
+    _array__grow((Array *)(self), count, array_elem_size(self)); \
+    memset((self)->contents + (self)->size, 0, (count) * array_elem_size(self)); \
+    (self)->size += (count); \
+  } while (0)
+
+/// Append all elements from one array to the end of another.
+#define array_push_all(self, other)                                       \
+  array_extend((self), (other)->size, (other)->contents)
+
+/// Append `count` elements to the end of the array, reading their values from the
+/// `contents` pointer.
+#define array_extend(self, count, contents)                    \
+  _array__splice(                                               \
+    (Array *)(self), array_elem_size(self), (self)->size, \
+    0, count,  contents                                        \
+  )
+
+/// Remove `old_count` elements from the array starting at the given `index`. At
+/// the same index, insert `new_count` new elements, reading their values from the
+/// `new_contents` pointer.
+#define array_splice(self, _index, old_count, new_count, new_contents)  \
+  _array__splice(                                                       \
+    (Array *)(self), array_elem_size(self), _index,                \
+    old_count, new_count, new_contents                                 \
+  )
+
+/// Insert one `element` into the array at the given `index`.
+#define array_insert(self, _index, element) \
+  _array__splice((Array *)(self), array_elem_size(self), _index, 0, 1, &(element))
+
+/// Remove one element from the array at the given `index`.
+#define array_erase(self, _index) \
+  _array__erase((Array *)(self), array_elem_size(self), _index)
+
+/// Pop the last element off the array, returning the element by value.
+#define array_pop(self) ((self)->contents[--(self)->size])
+
+/// Assign the contents of one array to another, reallocating if necessary.
+#define array_assign(self, other) \
+  _array__assign((Array *)(self), (const Array *)(other), array_elem_size(self))
+
+/// Swap one array with another
+#define array_swap(self, other) \
+  _array__swap((Array *)(self), (Array *)(other))
+
+/// Get the size of the array contents
+#define array_elem_size(self) (sizeof *(self)->contents)
+
+/// Search a sorted array for a given `needle` value, using the given `compare`
+/// callback to determine the order.
+///
+/// If an existing element is found to be equal to `needle`, then the `index`
+/// out-parameter is set to the existing value's index, and the `exists`
+/// out-parameter is set to true. Otherwise, `index` is set to an index where
+/// `needle` should be inserted in order to preserve the sorting, and `exists`
+/// is set to false.
+#define array_search_sorted_with(self, compare, needle, _index, _exists) \
+  _array__search_sorted(self, 0, compare, , needle, _index, _exists)
+
+/// Search a sorted array for a given `needle` value, using integer comparisons
+/// of a given struct field (specified with a leading dot) to determine the order.
+///
+/// See also `array_search_sorted_with`.
+#define array_search_sorted_by(self, field, needle, _index, _exists) \
+  _array__search_sorted(self, 0, _compare_int, field, needle, _index, _exists)
+
+/// Insert a given `value` into a sorted array, using the given `compare`
+/// callback to determine the order.
+#define array_insert_sorted_with(self, compare, value) \
+  do { \
+    unsigned _index, _exists; \
+    array_search_sorted_with(self, compare, &(value), &_index, &_exists); \
+    if (!_exists) array_insert(self, _index, value); \
+  } while (0)
+
+/// Insert a given `value` into a sorted array, using integer comparisons of
+/// a given struct field (specified with a leading dot) to determine the order.
+///
+/// See also `array_search_sorted_by`.
+#define array_insert_sorted_by(self, field, value) \
+  do { \
+    unsigned _index, _exists; \
+    array_search_sorted_by(self, field, (value) field, &_index, &_exists); \
+    if (!_exists) array_insert(self, _index, value); \
+  } while (0)
+
+// Private
+
+typedef Array(void) Array;
+
+/// This is not what you're looking for, see `array_delete`.
+static inline void _array__delete(Array *self) {
+  if (self->contents) {
+    ts_free(self->contents);
+    self->contents = NULL;
+    self->size = 0;
+    self->capacity = 0;
+  }
+}
+
+/// This is not what you're looking for, see `array_erase`.
+static inline void _array__erase(Array *self, size_t element_size,
+                                uint32_t index) {
+  assert(index < self->size);
+  char *contents = (char *)self->contents;
+  memmove(contents + index * element_size, contents + (index + 1) * element_size,
+          (self->size - index - 1) * element_size);
+  self->size--;
+}
+
+/// This is not what you're looking for, see `array_reserve`.
+static inline void _array__reserve(Array *self, size_t element_size, uint32_t new_capacity) {
+  if (new_capacity > self->capacity) {
+    if (self->contents) {
+      self->contents = ts_realloc(self->contents, new_capacity * element_size);
+    } else {
+      self->contents = ts_malloc(new_capacity * element_size);
+    }
+    self->capacity = new_capacity;
+  }
+}
+
+/// This is not what you're looking for, see `array_assign`.
+static inline void _array__assign(Array *self, const Array *other, size_t element_size) {
+  _array__reserve(self, element_size, other->size);
+  self->size = other->size;
+  memcpy(self->contents, other->contents, self->size * element_size);
+}
+
+/// This is not what you're looking for, see `array_swap`.
+static inline void _array__swap(Array *self, Array *other) {
+  Array swap = *other;
+  *other = *self;
+  *self = swap;
+}
+
+/// This is not what you're looking for, see `array_push` or `array_grow_by`.
+static inline void _array__grow(Array *self, uint32_t count, size_t element_size) {
+  uint32_t new_size = self->size + count;
+  if (new_size > self->capacity) {
+    uint32_t new_capacity = self->capacity * 2;
+    if (new_capacity < 8) new_capacity = 8;
+    if (new_capacity < new_size) new_capacity = new_size;
+    _array__reserve(self, element_size, new_capacity);
+  }
+}
+
+/// This is not what you're looking for, see `array_splice`.
+static inline void _array__splice(Array *self, size_t element_size,
+                                 uint32_t index, uint32_t old_count,
+                                 uint32_t new_count, const void *elements) {
+  uint32_t new_size = self->size + new_count - old_count;
+  uint32_t old_end = index + old_count;
+  uint32_t new_end = index + new_count;
+  assert(old_end <= self->size);
+
+  _array__reserve(self, element_size, new_size);
+
+  char *contents = (char *)self->contents;
+  if (self->size > old_end) {
+    memmove(
+      contents + new_end * element_size,
+      contents + old_end * element_size,
+      (self->size - old_end) * element_size
+    );
+  }
+  if (new_count > 0) {
+    if (elements) {
+      memcpy(
+        (contents + index * element_size),
+        elements,
+        new_count * element_size
+      );
+    } else {
+      memset(
+        (contents + index * element_size),
+        0,
+        new_count * element_size
+      );
+    }
+  }
+  self->size += new_count - old_count;
+}
+
+/// A binary search routine, based on Rust's `std::slice::binary_search_by`.
+/// This is not what you're looking for, see `array_search_sorted_with` or `array_search_sorted_by`.
+#define _array__search_sorted(self, start, compare, suffix, needle, _index, _exists) \
+  do { \
+    *(_index) = start; \
+    *(_exists) = false; \
+    uint32_t size = (self)->size - *(_index); \
+    if (size == 0) break; \
+    int comparison; \
+    while (size > 1) { \
+      uint32_t half_size = size / 2; \
+      uint32_t mid_index = *(_index) + half_size; \
+      comparison = compare(&((self)->contents[mid_index] suffix), (needle)); \
+      if (comparison <= 0) *(_index) = mid_index; \
+      size -= half_size; \
+    } \
+    comparison = compare(&((self)->contents[*(_index)] suffix), (needle)); \
+    if (comparison == 0) *(_exists) = true; \
+    else if (comparison < 0) *(_index) += 1; \
+  } while (0)
+
+/// Helper macro for the `_sorted_by` routines below. This takes the left (existing)
+/// parameter by reference in order to work with the generic sorting function above.
+#define _compare_int(a, b) ((int)*(a) - (int)(b))
+
+#ifdef _MSC_VER
+#pragma warning(pop)
+#elif defined(__GNUC__) || defined(__clang__)
+#pragma GCC diagnostic pop
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  // TREE_SITTER_ARRAY_H_
@@ -0,0 +1,266 @@
+#ifndef TREE_SITTER_PARSER_H_
+#define TREE_SITTER_PARSER_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdbool.h>
+#include <stdint.h>
+#include <stdlib.h>
+
+#define ts_builtin_sym_error ((TSSymbol)-1)
+#define ts_builtin_sym_end 0
+#define TREE_SITTER_SERIALIZATION_BUFFER_SIZE 1024
+
+#ifndef TREE_SITTER_API_H_
+typedef uint16_t TSStateId;
+typedef uint16_t TSSymbol;
+typedef uint16_t TSFieldId;
+typedef struct TSLanguage TSLanguage;
+#endif
+
+typedef struct {
+  TSFieldId field_id;
+  uint8_t child_index;
+  bool inherited;
+} TSFieldMapEntry;
+
+typedef struct {
+  uint16_t index;
+  uint16_t length;
+} TSFieldMapSlice;
+
+typedef struct {
+  bool visible;
+  bool named;
+  bool supertype;
+} TSSymbolMetadata;
+
+typedef struct TSLexer TSLexer;
+
+struct TSLexer {
+  int32_t lookahead;
+  TSSymbol result_symbol;
+  void (*advance)(TSLexer *, bool);
+  void (*mark_end)(TSLexer *);
+  uint32_t (*get_column)(TSLexer *);
+  bool (*is_at_included_range_start)(const TSLexer *);
+  bool (*eof)(const TSLexer *);
+  void (*log)(const TSLexer *, const char *, ...);
+};
+
+typedef enum {
+  TSParseActionTypeShift,
+  TSParseActionTypeReduce,
+  TSParseActionTypeAccept,
+  TSParseActionTypeRecover,
+} TSParseActionType;
+
+typedef union {
+  struct {
+    uint8_t type;
+    TSStateId state;
+    bool extra;
+    bool repetition;
+  } shift;
+  struct {
+    uint8_t type;
+    uint8_t child_count;
+    TSSymbol symbol;
+    int16_t dynamic_precedence;
+    uint16_t production_id;
+  } reduce;
+  uint8_t type;
+} TSParseAction;
+
+typedef struct {
+  uint16_t lex_state;
+  uint16_t external_lex_state;
+} TSLexMode;
+
+typedef union {
+  TSParseAction action;
+  struct {
+    uint8_t count;
+    bool reusable;
+  } entry;
+} TSParseActionEntry;
+
+typedef struct {
+  int32_t start;
+  int32_t end;
+} TSCharacterRange;
+
+struct TSLanguage {
+  uint32_t version;
+  uint32_t symbol_count;
+  uint32_t alias_count;
+  uint32_t token_count;
+  uint32_t external_token_count;
+  uint32_t state_count;
+  uint32_t large_state_count;
+  uint32_t production_id_count;
+  uint32_t field_count;
+  uint16_t max_alias_sequence_length;
+  const uint16_t *parse_table;
+  const uint16_t *small_parse_table;
+  const uint32_t *small_parse_table_map;
+  const TSParseActionEntry *parse_actions;
+  const char * const *symbol_names;
+  const char * const *field_names;
+  const TSFieldMapSlice *field_map_slices;
+  const TSFieldMapEntry *field_map_entries;
+  const TSSymbolMetadata *symbol_metadata;
+  const TSSymbol *public_symbol_map;
+  const uint16_t *alias_map;
+  const TSSymbol *alias_sequences;
+  const TSLexMode *lex_modes;
+  bool (*lex_fn)(TSLexer *, TSStateId);
+  bool (*keyword_lex_fn)(TSLexer *, TSStateId);
+  TSSymbol keyword_capture_token;
+  struct {
+    const bool *states;
+    const TSSymbol *symbol_map;
+    void *(*create)(void);
+    void (*destroy)(void *);
+    bool (*scan)(void *, TSLexer *, const bool *symbol_whitelist);
+    unsigned (*serialize)(void *, char *);
+    void (*deserialize)(void *, const char *, unsigned);
+  } external_scanner;
+  const TSStateId *primary_state_ids;
+};
+
+static inline bool set_contains(TSCharacterRange *ranges, uint32_t len, int32_t lookahead) {
+  uint32_t index = 0;
+  uint32_t size = len - index;
+  while (size > 1) {
+    uint32_t half_size = size / 2;
+    uint32_t mid_index = index + half_size;
+    TSCharacterRange *range = &ranges[mid_index];
+    if (lookahead >= range->start && lookahead <= range->end) {
+      return true;
+    } else if (lookahead > range->end) {
+      index = mid_index;
+    }
+    size -= half_size;
+  }
+  TSCharacterRange *range = &ranges[index];
+  return (lookahead >= range->start && lookahead <= range->end);
+}
+
+/*
+ *  Lexer Macros
+ */
+
+#ifdef _MSC_VER
+#define UNUSED __pragma(warning(suppress : 4101))
+#else
+#define UNUSED __attribute__((unused))
+#endif
+
+#define START_LEXER()           \
+  bool result = false;          \
+  bool skip = false;            \
+  UNUSED                        \
+  bool eof = false;             \
+  int32_t lookahead;            \
+  goto start;                   \
+  next_state:                   \
+  lexer->advance(lexer, skip);  \
+  start:                        \
+  skip = false;                 \
+  lookahead = lexer->lookahead;
+
+#define ADVANCE(state_value) \
+  {                          \
+    state = state_value;     \
+    goto next_state;         \
+  }
+
+#define ADVANCE_MAP(...)                                              \
+  {                                                                   \
+    static const uint16_t map[] = { __VA_ARGS__ };                    \
+    for (uint32_t i = 0; i < sizeof(map) / sizeof(map[0]); i += 2) {  \
+      if (map[i] == lookahead) {                                      \
+        state = map[i + 1];                                           \
+        goto next_state;                                              \
+      }                                                               \
+    }                                                                 \
+  }
+
+#define SKIP(state_value) \
+  {                       \
+    skip = true;          \
+    state = state_value;  \
+    goto next_state;      \
+  }
+
+#define ACCEPT_TOKEN(symbol_value)     \
+  result = true;                       \
+  lexer->result_symbol = symbol_value; \
+  lexer->mark_end(lexer);
+
+#define END_STATE() return result;
+
+/*
+ *  Parse Table Macros
+ */
+
+#define SMALL_STATE(id) ((id) - LARGE_STATE_COUNT)
+
+#define STATE(id) id
+
+#define ACTIONS(id) id
+
+#define SHIFT(state_value)            \
+  {{                                  \
+    .shift = {                        \
+      .type = TSParseActionTypeShift, \
+      .state = (state_value)          \
+    }                                 \
+  }}
+
+#define SHIFT_REPEAT(state_value)     \
+  {{                                  \
+    .shift = {                        \
+      .type = TSParseActionTypeShift, \
+      .state = (state_value),         \
+      .repetition = true              \
+    }                                 \
+  }}
+
+#define SHIFT_EXTRA()                 \
+  {{                                  \
+    .shift = {                        \
+      .type = TSParseActionTypeShift, \
+      .extra = true                   \
+    }                                 \
+  }}
+
+#define REDUCE(symbol_name, children, precedence, prod_id) \
+  {{                                                       \
+    .reduce = {                                            \
+      .type = TSParseActionTypeReduce,                     \
+      .symbol = symbol_name,                               \
+      .child_count = children,                             \
+      .dynamic_precedence = precedence,                    \
+      .production_id = prod_id                             \
+    },                                                     \
+  }}
+
+#define RECOVER()                    \
+  {{                                 \
+    .type = TSParseActionTypeRecover \
+  }}
+
+#define ACCEPT_INPUT()              \
+  {{                                \
+    .type = TSParseActionTypeAccept \
+  }}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif  // TREE_SITTER_PARSER_H_
@@ -0,0 +1,249 @@
+================================================================
+shorthand if — single statement body, terminated by newline
+================================================================
+
+if (cond) honk()
+toot()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_if_statement
+    condition: (identifier)
+    consequence: (function_call
+      name: (identifier)
+      arguments: (arguments)))
+  (function_call
+    name: (identifier)
+    arguments: (arguments)))
+
+================================================================
+shorthand if — single statement body, terminated by EOF
+================================================================
+
+if (cond) honk()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_if_statement
+    condition: (identifier)
+    consequence: (function_call
+      name: (identifier)
+      arguments: (arguments))))
+
+================================================================
+shorthand if — multi-statement body collected into shorthand
+================================================================
+
+if (is_falling()) wheeee() splat()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_if_statement
+    condition: (function_call
+      name: (identifier)
+      arguments: (arguments))
+    consequence: (function_call
+      name: (identifier)
+      arguments: (arguments))
+    consequence: (function_call
+      name: (identifier)
+      arguments: (arguments))))
+
+================================================================
+shorthand if — same-line else
+================================================================
+
+if (cond) honk() else toot()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_if_statement
+    condition: (identifier)
+    consequence: (function_call
+      name: (identifier)
+      arguments: (arguments))
+    alternative: (function_call
+      name: (identifier)
+      arguments: (arguments))))
+
+================================================================
+shorthand if — same-line multi-statement else
+================================================================
+
+if (cond) honk() else toot() squawk()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_if_statement
+    condition: (identifier)
+    consequence: (function_call
+      name: (identifier)
+      arguments: (arguments))
+    alternative: (function_call
+      name: (identifier)
+      arguments: (arguments))
+    alternative: (function_call
+      name: (identifier)
+      arguments: (arguments))))
+
+================================================================
+shorthand if nested in standard if — `else` on later line binds
+to OUTER if, not the shorthand (PICO-8 line-significance)
+================================================================
+
+if is_noisy then
+  if (is_goose()) honk()
+else
+  toot()
+end
+
+----------------------------------------------------------------
+
+(chunk
+  (if_statement
+    condition: (identifier)
+    consequence: (block
+      (shorthand_if_statement
+        condition: (function_call
+          name: (identifier)
+          arguments: (arguments))
+        consequence: (function_call
+          name: (identifier)
+          arguments: (arguments))))
+    alternative: (else_statement
+      body: (block
+        (function_call
+          name: (identifier)
+          arguments: (arguments))))))
+
+================================================================
+shorthand if — line comment between body and newline still
+terminates the shorthand at the newline (line comment is in
+extras and is attached to the deepest enclosing node)
+================================================================
+
+if (cond) honk() -- inline
+toot()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_if_statement
+    condition: (identifier)
+    consequence: (function_call
+      name: (identifier)
+      arguments: (arguments))
+    (comment
+      content: (comment_content)))
+  (function_call
+    name: (identifier)
+    arguments: (arguments)))
+
+================================================================
+shorthand if inside a do-block — newline before `end` terminates
+shorthand, then `end` closes the do-block
+================================================================
+
+do
+  if (cond) honk()
+end
+
+----------------------------------------------------------------
+
+(chunk
+  (do_statement
+    body: (block
+      (shorthand_if_statement
+        condition: (identifier)
+        consequence: (function_call
+          name: (identifier)
+          arguments: (arguments))))))
+
+================================================================
+shorthand while — multi-statement body, terminated by newline
+================================================================
+
+while (running) tick() draw()
+cleanup()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_while_statement
+    condition: (identifier)
+    body: (function_call
+      name: (identifier)
+      arguments: (arguments))
+    body: (function_call
+      name: (identifier)
+      arguments: (arguments)))
+  (function_call
+    name: (identifier)
+    arguments: (arguments)))
+
+================================================================
+shorthand while — single statement body, terminated by EOF
+================================================================
+
+while (cond) tick()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_while_statement
+    condition: (identifier)
+    body: (function_call
+      name: (identifier)
+      arguments: (arguments))))
+
+================================================================
+nested shorthand ifs on the same line — a single newline must
+terminate BOTH shorthands (otherwise the outer one greedily
+absorbs the next-line statement)
+================================================================
+
+if (a) if (b) c()
+d()
+
+----------------------------------------------------------------
+
+(chunk
+  (shorthand_if_statement
+    condition: (identifier)
+    consequence: (shorthand_if_statement
+      condition: (identifier)
+      consequence: (function_call
+        name: (identifier)
+        arguments: (arguments))))
+  (function_call
+    name: (identifier)
+    arguments: (arguments)))
+
+================================================================
+standard if with parenthesized condition coexists with shorthand
+— GLR resolves on the token after `)` (then vs statement)
+================================================================
+
+if (cond) then a() end
+if (cond) a()
+
+----------------------------------------------------------------
+
+(chunk
+  (if_statement
+    condition: (parenthesized_expression
+      (identifier))
+    consequence: (block
+      (function_call
+        name: (identifier)
+        arguments: (arguments))))
+  (shorthand_if_statement
+    condition: (identifier)
+    consequence: (function_call
+      name: (identifier)
+      arguments: (arguments))))
@@ -0,0 +1,27 @@
+{
+  "$schema": "https://tree-sitter.github.io/tree-sitter/assets/schemas/config.schema.json",
+  "grammars": [
+    {
+      "name": "pico8_lua",
+      "scope": "source.pico8-lua",
+      "path": ".",
+      "file-types": [
+        "p8lua"
+      ],
+      "injection-regex": "^pico-?8[-_ ]?lua$"
+    }
+  ],
+  "metadata": {
+    "version": "0.0.1",
+    "license": "MIT",
+    "description": "PICO-8 Lua dialect grammar (forked from tree-sitter-lua)"
+  },
+  "bindings": {
+    "c": true,
+    "go": false,
+    "node": true,
+    "python": false,
+    "rust": false,
+    "swift": false
+  }
+}
@@ -1,11 +1,10 @@
-; Inject Lua syntax highlighting into the __lua__ section.
+; Hand the body of the __lua__ section to the Pico-8 Lua grammar, whose
+; dialect-aware parser handles compound-assignment statements, the `?`
+; print shorthand, single-line `if (cond) stmt`, peek prefixes, etc.
 ;
-; NOTE: This injects Zed's built-in Lua grammar, which does not
-; understand Pico-8 dialect extensions ( ?, +=, !=, single-line
-; `if (cond) stmt`, binary literals, memory peek prefix operators,
-; etc. ). Code that uses those forms will produce parse errors
-; locally, with degraded highlighting in those regions only — the
-; rest of the file will still render correctly. A future Pico-8
-; Lua grammar fork will replace this; see README for status.
+; The injection.language string must match the target language's `name`
+; field in its config.toml. Zed case-folds via UniCase but does not
+; treat hyphens as equivalent to spaces, so this has to be the literal
+; "Pico-8 Lua" ( with a space ).
 ((lua_content) @injection.content
- (#set! injection.language "lua"))
+ (#set! injection.language "Pico-8 Lua"))
@@ -0,0 +1,4 @@
+("(" @open ")" @close)
+("[" @open "]" @close)
+("{" @open "}" @close)
+("\"" @open "\"" @close)
@@ -0,0 +1,16 @@
+name = "Pico-8 Lua"
+grammar = "pico8_lua"
+path_suffixes = ["p8lua"]
+line_comments = ["-- "]
+block_comment = ["--[[", "]]"]
+hard_tabs = false
+tab_size = 1
+autoclose_before = ";:.,=}])>"
+brackets = [
+  { start = "{", end = "}", close = true, newline = true },
+  { start = "[", end = "]", close = true, newline = true },
+  { start = "(", end = ")", close = true, newline = true },
+  { start = "\"", end = "\"", close = true, newline = false, not_in = ["string"] },
+  { start = "'", end = "'", close = true, newline = false, not_in = ["string", "comment"] },
+  { start = "[[", end = "]]", close = true, newline = true },
+]
@@ -0,0 +1,215 @@
+; --- Keywords ---
+
+"return" @keyword.return
+
+[
+  "goto"
+  "in"
+  "local"
+] @keyword
+
+(label_statement) @label
+(break_statement) @keyword
+
+(do_statement
+  ["do" "end"] @keyword)
+
+(while_statement
+  ["while" "do" "end"] @keyword.repeat)
+(shorthand_while_statement
+  "while" @keyword.repeat)
+
+(repeat_statement
+  ["repeat" "until"] @keyword.repeat)
+
+(if_statement
+  ["if" "elseif" "else" "then" "end"] @keyword.conditional)
+(elseif_statement
+  ["elseif" "then" "end"] @keyword.conditional)
+(else_statement
+  ["else" "end"] @keyword.conditional)
+(shorthand_if_statement
+  ["if" "else"] @keyword.conditional)
+
+(for_statement
+  ["for" "do" "end"] @keyword.repeat)
+
+(function_declaration
+  ["function" "end"] @keyword.function)
+(function_definition
+  ["function" "end"] @keyword.function)
+
+; --- PICO-8 dialect: print shorthand and #include ---
+
+(print_shorthand_statement
+  directive: "?" @function.builtin)
+
+(include_statement
+  directive: _ @keyword.directive)
+(include_statement
+  path: (include_path) @string.special.path)
+
+; --- Operators ---
+
+(binary_expression
+  operator: _ @operator)
+
+(unary_expression
+  operator: _ @operator)
+
+(compound_assignment_statement
+  operator: _ @operator)
+
+"=" @operator
+
+[
+  "and"
+  "not"
+  "or"
+] @keyword.operator
+
+; --- Punctuation ---
+
+[
+  ";"
+  ":"
+  ","
+  "."
+] @punctuation.delimiter
+
+[
+  "("
+  ")"
+  "["
+  "]"
+  "{"
+  "}"
+] @punctuation.bracket
+
+; --- Variables and constants ---
+
+(identifier) @variable
+
+(variable_list
+  (attribute
+    "<" @punctuation.bracket
+    (identifier) @attribute
+    ">" @punctuation.bracket))
+
+((identifier) @constant
+  (#match? @constant "^[A-Z][A-Z_0-9]*$"))
+
+(vararg_expression) @constant
+(nil) @constant.builtin
+
+[
+  (false)
+  (true)
+] @boolean
+
+; PICO-8 callback hooks — recognized by name regardless of definition site.
+((identifier) @function.builtin
+  (#any-of? @function.builtin
+    "_init" "_update" "_update60" "_draw"))
+
+; --- Tables ---
+
+(field
+  name: (identifier) @property)
+
+(dot_index_expression
+  field: (identifier) @property)
+
+(table_constructor
+  ["{" "}"] @constructor)
+
+; --- Functions ---
+
+(parameters
+  (identifier) @variable.parameter)
+
+(function_declaration
+  name: [
+    (identifier) @function
+    (dot_index_expression
+      field: (identifier) @function)
+  ])
+
+(function_declaration
+  name: (method_index_expression
+    method: (identifier) @function.method))
+
+(assignment_statement
+  (variable_list
+    .
+    name: [
+      (identifier) @function
+      (dot_index_expression
+        field: (identifier) @function)
+    ])
+  (expression_list
+    .
+    value: (function_definition)))
+
+(table_constructor
+  (field
+    name: (identifier) @function
+    value: (function_definition)))
+
+(function_call
+  name: [
+    (identifier) @function.call
+    (dot_index_expression
+      field: (identifier) @function.call)
+    (method_index_expression
+      method: (identifier) @function.method.call)
+  ])
+
+; --- PICO-8 builtin globals ---
+
+(function_call
+  (identifier) @function.builtin
+  (#any-of? @function.builtin
+    ; system
+    "load" "save" "ls" "run" "stop" "assert" "reset" "info" "flip" "printh"
+    "time" "t" "stat" "extcmd" "holdframe" "_set_fps"
+    ; graphics
+    "clip" "pset" "pget" "sget" "sset" "fget" "fset" "print" "cursor"
+    "color" "cls" "camera" "circ" "circfill" "oval" "ovalfill" "line"
+    "rect" "rectfill" "rrect" "rrectfill" "pal" "palt" "spr" "sspr" "fillp"
+    ; tables
+    "add" "del" "deli" "count" "all" "foreach" "pairs" "ipairs" "next"
+    ; input
+    "btn" "btnp"
+    ; audio
+    "sfx" "music"
+    ; map
+    "mget" "mset" "map" "tline"
+    ; memory
+    "peek" "poke" "peek2" "poke2" "peek4" "poke4" "memcpy" "reload"
+    "cstore" "memset"
+    ; math
+    "max" "min" "mid" "flr" "ceil" "cos" "sin" "atan2" "sqrt" "abs"
+    "rnd" "srand"
+    ; bitwise (named forms — operator forms covered by @operator)
+    "band" "bor" "bxor" "bnot" "shl" "shr" "lshr" "rotl" "rotr"
+    ; custom menu
+    "menuitem"
+    ; strings / conversion
+    "tostr" "tonum" "chr" "ord" "sub" "split" "type"
+    ; cart data
+    "cartdata" "dget" "dset"
+    ; metatables
+    "setmetatable" "getmetatable" "rawset" "rawget" "rawequal" "rawlen"
+    ; coroutines
+    "cocreate" "coresume" "costatus" "yield"
+    ; error
+    "error" "pcall" "xpcall"))
+
+; --- Misc ---
+
+(comment) @comment
+(hash_bang_line) @comment.documentation
+(number) @number
+(string) @string
+(escape_sequence) @string.escape
@@ -0,0 +1,20 @@
+; Indent inside any block-bearing or bracketed construct.
+(do_statement) @indent
+(while_statement) @indent
+(repeat_statement) @indent
+(for_statement) @indent
+(if_statement) @indent
+(elseif_statement) @indent
+(else_statement) @indent
+(function_declaration) @indent
+(function_definition) @indent
+(table_constructor) @indent
+(parenthesized_expression) @indent
+(arguments) @indent
+
+; The closing keywords/brackets sit at the parent's indent level.
+"end" @end
+"until" @end
+"}" @end
+")" @end
+"]" @end
@@ -0,0 +1,2 @@
+; ( reserved for future use — e.g. injecting hex blob content into peek/poke
+;   string literals, or shader code in spr() comments )
@@ -0,0 +1,12 @@
+; Top-level functions appear in the outline.
+(function_declaration
+  "function" @context
+  name: (_) @name
+  parameters: (parameters) @context.extra) @item
+
+; Local functions:  local function foo()
+(declaration
+  (function_declaration
+    "function" @context
+    name: (identifier) @name
+    parameters: (parameters) @context.extra)) @item
@@ -1,15 +1,8 @@
 {
-  "name": "tree-sitter-p8-cart",
-  "version": "0.0.1",
-  "description": "tree-sitter grammar for the PICO-8 .p8 cartridge text format",
-  "main": "bindings/node",
-  "types": "bindings/node",
-  "keywords": [
-    "tree-sitter",
-    "parser",
-    "pico-8",
-    "pico8"
-  ],
+  "name": "zed-p8-workspace",
+  "version": "0.0.2",
+  "private": true,
+  "description": "Workspace root for the zed-p8 extension; hosts tree-sitter-cli for the grammars under grammars/",
  "license": "0BSD",
  "devDependencies": {
    "tree-sitter-cli": "^0.24.7"
Author	SHA1	Message	Date
kistaro	64e5467062	parse EOL as a token	2026-05-15 00:16:13 -07:00
kistaro	c8ad7e74e7	Document line-significance limitations in the Pico-8 Lua grammar PICO-8's shorthand `if (cond) stmt [else stmt]` is line-bounded, but tree-sitter has no built-in newline awareness. Without an external scanner ( the same mechanism tree-sitter-python uses for INDENT / DEDENT / NEWLINE ), the grammar greedily binds `else` to the nearest `if` and takes only one consequence statement for the shorthand body. Token classification is unaffected, so syntax highlighting renders identically to a correct parse; only auto-indent and semantic selection are subtly off, in a code pattern that is uncommon in real PICO-8 code. New `grammars/pico-8-lua/KNOWN_LIMITATIONS.md` walks through both incorrect cases ( the dangling-else mis-bind and the multi-statement shorthand body ), tabulates which Zed features are and aren't affected, and sketches the fix. README cross-links it from the "Known limitations" block and adds it as a prerequisite to the v0.3 LSP work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 15:23:50 -07:00
kistaro	ba2dd6a9a2	Cart injection: use the literal language name "Pico-8 Lua" Zed resolves `injection.language` by matching it against the target language's `name` field ( config.toml ) via UniCase, which folds case but does NOT treat `-` as equivalent to ` `. The previous string "pico-8-lua" therefore did not resolve to any registered language and the entire __lua__ section rendered with zero highlights inside Zed. Per `LanguageRegistry::language_for_name_or_extension` in crates/language/src/language_registry.rs, only the `name` field and `path_suffixes` are consulted — the directory under languages/, the grammar `name`, the `scope`, and the tree-sitter.json `injection-regex` field are all ignored. ( `injection-regex` is a Helix/Neovim convention; Zed's production code never reads it. ) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 13:20:57 -07:00
kistaro	446a7972a4	Repin grammar revs to the leading-blanks fix commit Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 12:57:06 -07:00
kistaro	7557a34c89	Cart grammar: tolerate leading blank lines before the magic header `extras: $ => []` in the cart grammar made the parser fail at byte 0 on any whitespace-only or empty line before `pico-8 cartridge //...`. Real PICO-8 carts always start with the header at byte 0 so this rarely surfaced in production, but it ( a ) broke the `tree-sitter test` corpus harness, which prepends a newline to each fixture, and ( b ) would mis-flag a hand-edited cart that gained an accidental blank line up top. Fix: prefix the `cartridge` rule with `repeat($._blank_line)` and add a hidden `_blank_line` token matching `[ \t]*\n`. Junk content before the header ( a non-blank, non-magic line ) still produces an ERROR. Restores the test corpus that was dropped in v0.1 ( previously failing on this same edge case ) and adds a fixture for the unknown_section fallback while the corpus is being rebuilt. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 12:56:59 -07:00
kistaro	445aac4e30	Pin grammar revs to v0.2 commit Both [grammars.*] blocks now reference the v0.2 SHA, so a fresh `zed: install dev extension` clones each grammar from the right revision via its `path` subdirectory. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 12:54:15 -07:00
kistaro	39d77a8cae	v0.2: Pico-8 Lua dialect grammar and language Reorganize into grammars/<name>/ subdirs ( Zed's [grammars.*] supports a `path` field, so both grammars ship from this repo without a sibling-repo split ). Vendor tree-sitter-lua as the fork base for tree-sitter-pico8-lua; upstream MIT license preserved at grammars/pico-8-lua/UPSTREAM-LICENSE.md. Dialect features added: != as ~= alias, \ integer divide, ^^ binary xor, >>> / <<> / >>< shifts and rotates, compound-assignment statements, memory peek prefixes @ % $ (% coexists with binary modulo), single-line `if (cond) stmt [else stmt]` and `while (cond) stmt`, statement-level print shorthand ?, and `#include path` directives. Identifier rule no longer accepts ! ? @ $ ( upstream did ). Pico-8 Lua language ( languages/pico-8-lua/, suffix .p8lua ) ships highlights with the full ~110 PICO-8 builtins as @function.builtin. The cart injection now hands __lua__ bodies to pico-8-lua, so .p8 carts and bare .p8lua files share the dialect-aware grammar. Examples updated to exercise the dialect end-to-end. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 12:50:41 -07:00
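
The newest commit in the table, "parse EOL as a token", is listed without its implementation. As a rough illustration of the external-scanner idea that the line-significance commit message mentions, the C sketch below recognizes a line end over a mock lexer. Everything in it ( `MockLexer`, `mock_lexer`, `scan_eol`, `TOKEN_EOL` ) is hypothetical and only stands in for the real `TSLexer` callbacks ( `lookahead`, `advance`, `mark_end`, `eof` ); it is not the grammar's actual scanner.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical token id. A real external scanner would take this from the
 * order of the grammar's `externals` list. */
enum { TOKEN_EOL = 0 };

/* Minimal stand-in for the lexer: a cursor over a NUL-terminated string.
 * This is an assumption made to keep the sketch self-contained. */
typedef struct {
  const char *src;
  size_t pos;        /* current read position */
  size_t token_end;  /* position recorded by the last mark_end */
  int result_symbol; /* token id chosen by the scanner, -1 if none */
} MockLexer;

static MockLexer mock_lexer(const char *src) {
  assert(src != NULL);
  MockLexer lx = { src, 0, 0, -1 };
  return lx;
}

static int mock_lookahead(const MockLexer *lx) {
  return (unsigned char)lx->src[lx->pos];
}

static bool mock_eof(const MockLexer *lx) {
  return lx->src[lx->pos] == '\0';
}

static void mock_advance(MockLexer *lx) {
  if (!mock_eof(lx)) lx->pos++;
}

static void mock_mark_end(MockLexer *lx) {
  lx->token_end = lx->pos;
}

/* Try to recognize an end-of-line token at the cursor.
 * Skips spaces, tabs, and carriage returns, then succeeds if the next
 * character is '\n' or the input has ended. The newline itself is not
 * consumed: the token end is marked just before it. */
static bool scan_eol(MockLexer *lx) {
  while (mock_lookahead(lx) == ' ' || mock_lookahead(lx) == '\t' ||
         mock_lookahead(lx) == '\r') {
    mock_advance(lx);
  }
  if (mock_eof(lx) || mock_lookahead(lx) == '\n') {
    mock_mark_end(lx);
    lx->result_symbol = TOKEN_EOL;
    return true;
  }
  return false;
}
```

In this sketch the newline is deliberately left unconsumed, an assumption made so that nested shorthands on one line ( as in the `if (a) if (b) c()` corpus case ) could each observe the same line end; a real scanner would need its own policy for that.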