Compare commits

7 Commits

Author SHA1 Message Date
kistaro 64e5467062 parse EOL as a token 2026-05-15 00:16:13 -07:00
kistaro c8ad7e74e7 Document line-significance limitations in the Pico-8 Lua grammar
PICO-8's shorthand `if (cond) stmt [else stmt]` is line-bounded, but
tree-sitter has no built-in newline awareness. Without an external
scanner ( the same mechanism tree-sitter-python uses for INDENT /
DEDENT / NEWLINE ), the grammar greedily binds `else` to the nearest
`if` and takes only one consequence statement for the shorthand body.
Token classification is unaffected, so syntax highlighting renders
identically to a correct parse; only auto-indent and semantic
selection are subtly off, in a code pattern that is uncommon in real
PICO-8 code.

New `grammars/pico-8-lua/KNOWN_LIMITATIONS.md` walks through both
incorrect cases ( the dangling-else mis-bind and the multi-statement
shorthand body ), tabulates which Zed features are and aren't
affected, and sketches the fix. README cross-links it from the
"Known limitations" block and adds it as a prerequisite to the v0.3
LSP work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 15:23:50 -07:00
kistaro ba2dd6a9a2 Cart injection: use the literal language name "Pico-8 Lua"
Zed resolves `injection.language` by matching it against the target
language's `name` field ( config.toml ) via UniCase, which folds case
but does NOT treat `-` as equivalent to ` `. The previous string
"pico-8-lua" therefore did not resolve to any registered language and
the entire __lua__ section rendered with zero highlights inside Zed.

Per `LanguageRegistry::language_for_name_or_extension` in
crates/language/src/language_registry.rs, only the `name` field and
`path_suffixes` are consulted — the directory under languages/, the
grammar `name`, the `scope`, and the tree-sitter.json `injection-regex`
field are all ignored. ( `injection-regex` is a Helix/Neovim
convention; Zed's production code never reads it. )
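
The mismatch can be illustrated with a small sketch. It approximates the case-insensitive comparison with JavaScript's `toLowerCase()`, which is an assumption made for illustration (UniCase uses Unicode case folding, and this is not Zed's code). `namesMatch` is a hypothetical helper.

```javascript
// Hypothetical helper: approximates a case-insensitive name comparison.
// Assumption: toLowerCase() stands in for Unicode case folding here.
function namesMatch(injectionLanguage, registeredName) {
  return injectionLanguage.toLowerCase() === registeredName.toLowerCase();
}

console.log(namesMatch("pico-8-lua", "Pico-8 Lua")); // false: "-" is not " "
console.log(namesMatch("pico-8 lua", "Pico-8 Lua")); // true: only case differs
```

Case folding changes letter case only, so the hyphenated form can never match a name that contains a space.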

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 13:20:57 -07:00
kistaro 446a7972a4 Repin grammar revs to the leading-blanks fix commit
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:57:06 -07:00
kistaro 7557a34c89 Cart grammar: tolerate leading blank lines before the magic header
`extras: $ => []` in the cart grammar made the parser fail at byte 0
on any whitespace-only or empty line before `pico-8 cartridge //...`.
Real PICO-8 carts always start with the header at byte 0 so this
rarely surfaced in production, but it ( a ) broke the `tree-sitter
test` corpus harness, which prepends a newline to each fixture, and
( b ) would mis-flag a hand-edited cart that gained an accidental
blank line up top.

Fix: prefix the `cartridge` rule with `repeat($._blank_line)` and add
a hidden `_blank_line` token matching `[ \t]*\n`. Junk content before
the header ( a non-blank, non-magic line ) still produces an ERROR.
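
A rough model of the behaviour described above, as a plain string matcher rather than the generated parser. The two regexes copy the patterns from the grammar, anchored with `^`; `headerAfterBlanks` is a hypothetical helper that exists only for this illustration.

```javascript
// Patterns copied from the cart grammar, anchored so they apply at the
// start of the remaining input.
const BLANK_LINE = /^[ \t]*\n/;
const HEADER = /^pico-8 cartridge \/\/[^\n]*\n/;

// Hypothetical helper: skip leading blank lines, then report whether the
// magic header is the next thing in the input.
function headerAfterBlanks(source) {
  let rest = source;
  let blanks = 0;
  let m;
  // Each match consumes at least the "\n", so the loop always terminates.
  while ((m = BLANK_LINE.exec(rest)) !== null) {
    rest = rest.slice(m[0].length);
    blanks += 1;
  }
  return { blanks, headerFound: HEADER.test(rest) };
}

const cart = "\n \t\npico-8 cartridge // http://www.pico-8.com\nversion 42\n";
console.log(headerAfterBlanks(cart)); // { blanks: 2, headerFound: true }
console.log(headerAfterBlanks("junk\n" + cart)); // { blanks: 0, headerFound: false }
```

The second call mirrors the junk-before-header case: a non-blank first line is not consumed by the blank-line pattern, so the header is never reached.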

Restores the test corpus that was dropped in v0.1 ( previously failing
on this same edge case ) and adds a fixture for the unknown_section
fallback while the corpus is being rebuilt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:56:59 -07:00
kistaro 445aac4e30 Pin grammar revs to v0.2 commit
Both [grammars.*] blocks now reference the v0.2 SHA, so a fresh
`zed: install dev extension` clones each grammar from the right
revision via its `path` subdirectory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:54:15 -07:00
kistaro 39d77a8cae v0.2: Pico-8 Lua dialect grammar and language
Reorganize into grammars/<name>/ subdirs ( Zed's [grammars.*] supports
a `path` field, so both grammars ship from this repo without a sibling-
repo split ). Vendor tree-sitter-lua as the fork base for tree-sitter-
pico8-lua; upstream MIT license preserved at grammars/pico-8-lua/
UPSTREAM-LICENSE.md.

Dialect features added: != as ~= alias, \ integer divide, ^^ binary xor,
>>> / <<> / >>< shifts and rotates, compound-assignment statements,
memory peek prefixes @ % $ (% coexists with binary modulo), single-line
`if (cond) stmt [else stmt]` and `while (cond) stmt`, statement-level
print shorthand ?, and `#include path` directives. Identifier rule no
longer accepts ! ? @ $ ( upstream did ).

Pico-8 Lua language ( languages/pico-8-lua/, suffix .p8lua ) ships
highlights with the full ~110 PICO-8 builtins as @function.builtin.
The cart injection now hands __lua__ bodies to pico-8-lua, so .p8 carts
and bare .p8lua files share the dialect-aware grammar. Examples updated
to exercise the dialect end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:50:41 -07:00
36 changed files with 36147 additions and 1803 deletions
+10 -1
@@ -2,5 +2,14 @@ node_modules/
build/
*.wasm
# tree-sitter-cli scratch
# tree-sitter-cli scratch and parser-clone caches.
# tree-sitter-cli creates `grammars/<grammar_name>/` ( underscore form,
# matching the C identifier ) when it auto-clones a grammar repo for
# parser binary caching. Our hand-maintained dirs use hyphens, so the
# underscore variants are always cache.
.tree-sitter/
grammars/p8_cart/
grammars/pico8_lua/
# scratch directory for stuff to show an AI agent or reference in the IDE
.local/
+162 -87
@@ -1,146 +1,216 @@
# zed-p8
A Zed extension for the [PICO-8](https://www.lexaloffle.com/pico-8.php) fantasy
console. The goal is reasonable editor support for the entire `.p8` cartridge
file format and for PICO-8's Lua dialect — even where PICO-8 deviates from
console. It aims for reasonable editor support for the entire `.p8` cartridge file format
and for PICO-8's Lua dialect — including the parts where PICO-8 deviates from
standard Lua 5.2 (compound assignments, `?` print shorthand, single-line
`if (cond) ...`, `!=`, binary literals, peek operators, and so on).
`if (cond) ...`, `!=`, `0b...` binary literals, integer divide `\`, peek
operators `@`/`%`/`$`, the rotate/logical-shift family `<<>` / `>><` / `>>>`,
and `#include`).
## Status — v0.1 scaffold
## Status — v0.3 (unreleased)
Working today:
- A small tree-sitter grammar (`p8_cart`, in this repo's root) that parses
the `.p8` cartridge container: the magic header line, `version` line, and
the named sections `__lua__`, `__gfx__`, `__gff__`, `__label__`, `__map__`,
`__sfx__`, `__music__`. Unknown `__name__` markers are accepted as a
fallback `unknown_section`.
- A Zed language definition `Pico-8 Cartridge` (file suffix `.p8`) that uses
this grammar for section-marker highlighting and an outline view.
- An injection that hands the body of the `__lua__` section to Zed's
built-in Lua language for syntax highlighting.
- **`tree-sitter-p8-cart`** ( `grammars/p8-cart/` ) — parses the `.p8`
cartridge container: the magic header line, `version` line, and the named
sections `__lua__`, `__gfx__`, `__gff__`, `__label__`, `__map__`, `__sfx__`,
`__music__`. Unknown `__name__` markers fall through to `unknown_section`.
- **`tree-sitter-pico8-lua`** ( `grammars/pico-8-lua/` ) — fork of
[`tree-sitter-grammars/tree-sitter-lua`](https://github.com/tree-sitter-grammars/tree-sitter-lua)
with the PICO-8 dialect added. Handles every dialect form the manual
documents: see *Dialect coverage* below. Upstream attribution is preserved
in `grammars/pico-8-lua/UPSTREAM-LICENSE.md`.
- **`Pico-8 Cartridge`** language ( `languages/pico-8-cart/`, suffix `.p8` ) —
config, marker highlights, outline view, and an injection that hands the
`__lua__` body to the Pico-8 Lua grammar.
- **`Pico-8 Lua`** language ( `languages/pico-8-lua/`, suffix `.p8lua` ) —
config, dialect-aware highlights with PICO-8 builtins recognized, brackets,
indents, outline.
Known limitations:
The `.p8lua` suffix is the convention for bare Pico-8 Lua source files — the
ones pulled into a cart via `#include`. Plain `.lua` files are intentionally
*not* claimed by this extension, so users who keep stock Lua files alongside
their PICO-8 work continue to get standard Lua treatment.
### Dialect coverage
| Feature | Status |
|---|---|
| `!=` ( alias for `~=` ) | ✓ |
| Compound assignment: `+= -= *= /= %= \= ^= ..= &= \|= ^^= <<= >>= >>>= <<>= >><=` | ✓ |
| Integer divide `\` and modulo `%` ( binary ) | ✓ |
| Bitwise XOR `^^` ( binary, in addition to upstream's `~` ) | ✓ |
| Logical shift right `>>>`, rotate left `<<>`, rotate right `>><` | ✓ |
| Hex literals with fractional part: `0x11.4000` | ✓ |
| Binary literals: `0b1010` | ✓ |
| Memory peek prefix unary: `@addr`, `%addr`, `$addr` | ✓ |
| Single-line `if (cond) stmt [else stmt]` ( no `then`/`end` ) | ✓ |
| Single-line `while (cond) stmt` ( no `do`/`end` ) | ✓ |
| `?` print shorthand statement | ✓ |
| `#include path` directive | ✓ |
| `_init` / `_update` / `_update60` / `_draw` highlighted as builtins | ✓ |
### Line-significance (resolved in v0.3)
PICO-8's shorthand `if (cond) ...` and `while (cond) ...` are
line-bounded: a later-line `else` belongs to an enclosing standard
`if`, not the shorthand, and a multi-statement single-line shorthand
body collects every statement on the line. The external scanner emits
a zero-width `LINE_END` token at `\n` / `\r` / EOF when (and only
when) the parser is at the body-or-terminator decision point of a
shorthand statement, so the AST now matches PICO-8 semantics — see
[`grammars/pico-8-lua/KNOWN_LIMITATIONS.md`](grammars/pico-8-lua/KNOWN_LIMITATIONS.md)
for the wiring detail and
[`grammars/pico-8-lua/test/corpus/shorthand_line_end.txt`](grammars/pico-8-lua/test/corpus/shorthand_line_end.txt)
for the test corpus.
### Known limitations
- **PICO-8 Lua dialect is not fully parsed.** The injected grammar is plain
Lua 5.2, which does not understand `?` (print shorthand), `+=` and friends,
`!=`, `0b...` literals, the `\` integer-divide operator, the `@`/`%`/`$`
peek prefixes, or the single-line `if (cond) stmt` / `while (cond) stmt`
forms. Code that uses any of those will show parse-error highlighting in
those regions only — surrounding code remains correctly highlighted. See
Roadmap below.
- **No language server.** No completion, hover docs, or diagnostics for
PICO-8 builtins yet. See Roadmap.
- **No `.p8.png` support.** Only the plain-text `.p8` format is handled.
PICO-8 builtins yet — only a static `function.builtin` highlight on
recognized names. See Roadmap.
- **No `.p8.png` support.** Only the plain-text `.p8` format is handled —
the PNG-steganography variant is out of scope for a text-focused IDE
extension.
- **Hex sections are unhighlighted blobs.** `__gfx__`, `__map__`, `__sfx__`,
`__gff__`, `__music__`, `__label__` parse as opaque line bodies. Roadmap
v0.4 covers per-section highlighters.
## Repository layout
```
zed-p8/
extension.toml ← Zed extension manifest
grammar.js ← tree-sitter-p8-cart grammar source
src/ ← generated parser ( committed; regenerate after grammar.js edits )
package.json ← tree-sitter-cli devDependency
tree-sitter.json ← tree-sitter-cli config ( auto-managed )
package.json ← workspace root; hosts tree-sitter-cli
grammars/
p8-cart/ ← cart-format tree-sitter grammar
grammar.js
tree-sitter.json
src/ ← generated parser ( committed )
pico-8-lua/ ← Pico-8 Lua dialect tree-sitter grammar
grammar.js
tree-sitter.json
package.json ← marks this dir as ESM for node
src/ ← generated parser + scanner.c ( committed )
UPSTREAM-LICENSE.md ← MIT, tree-sitter-lua by Munif Tanjim
languages/
pico-8-cart/ ← Pico-8 Cartridge language files
config.toml
highlights.scm
injections.scm ← injects pico-8-lua into __lua__ body
outline.scm
pico-8-lua/ ← Pico-8 Lua language files
config.toml
highlights.scm
brackets.scm
indents.scm
injections.scm
outline.scm
examples/
hello.p8 ← minimal test cart
references/ ← upstream PICO-8 manual + Zed docs links
references/ ← upstream PICO-8 manual + Zed doc links
```
The cart grammar lives at the repo root rather than as a separate sibling
repository. This keeps everything in one place during early development; if
the grammar grows or wants to be reused outside Zed it can be split out
later — the only file that needs to move with it is `grammar.js` plus the
generated `src/`, and the `[grammars.p8_cart]` URL in `extension.toml` would
need updating.
Both grammars live in subdirectories of this same repository. Zed's
`[grammars.*]` block supports a `path` field, so the extension manifest
points each grammar at this repo's git URL plus the relevant subdir.
## Local development
Prerequisites: Node.js (for `tree-sitter-cli`) and Zed. Rust is NOT required
unless/until we add a language-server harness.
### Edit-and-reload loop
1. Edit `grammar.js`.
2. Regenerate the parser:
Prerequisites: Node.js ( for `tree-sitter-cli` ) and Zed. Rust is *not*
required unless / until we add a language-server harness ( v0.3 ).
```sh
npx tree-sitter generate
npm install # one-time, installs tree-sitter-cli
```
3. Sanity-check on a real cart:
### Edit a grammar and reload
1. Edit `grammars/<name>/grammar.js`.
2. Regenerate from the grammar's directory:
```sh
npx tree-sitter parse examples/hello.p8
( cd grammars/p8-cart && npx tree-sitter generate )
# or
( cd grammars/pico-8-lua && npx tree-sitter generate )
```
4. Commit. The `[grammars.p8_cart]` block in `extension.toml` references this
repo by `file://` URL and pins a commit SHA — Zed clones the grammar
from that pinned revision, so changes only take effect after they're
committed.
3. Sanity-check by parsing a sample file:
5. Update `extension.toml`'s `rev` field to the new SHA, then in Zed run
`zed: install dev extension` (or click *Install Dev Extension* on the
Extensions page) and select this directory. Reinstall after every commit
that should be picked up.
```sh
( cd grammars/p8-cart && npx tree-sitter parse ../../examples/hello.p8 )
( cd grammars/pico-8-lua && npx tree-sitter parse path/to/file.p8lua )
```
4. Commit the regenerated `src/parser.c` along with the grammar change.
Zed clones the grammar repo at the SHA in `extension.toml`, so changes
only take effect after a commit.
5. Update the `rev` field of the affected `[grammars.*]` block(s) in
`extension.toml` to the new SHA, then in Zed run `zed: install dev
extension` ( or *Install Dev Extension* on the Extensions page ) and
select this directory. Reinstall after every commit that should be
picked up.
Logs: `zed: open log`. Run `zed --foreground` for live stdout.
### Editing language queries
### Edit only language queries
Files under `languages/pico-8-cart/` (`highlights.scm`, `injections.scm`,
`outline.scm`) are loaded directly by Zed — no regeneration needed. Reinstall
the dev extension to pick up changes.
Files under `languages/*/` ( `highlights.scm`, `injections.scm`, etc. )
are loaded directly by Zed — no regeneration step needed. Reinstall the
dev extension to pick up changes.
### Tests
Sample carts live under `examples/`; parse them directly with
`tree-sitter parse <file>` for ad-hoc checks.
The cart grammar has a corpus under `grammars/p8-cart/test/corpus/` —
run `( cd grammars/p8-cart && npx tree-sitter test )`. The corpus
covers the empty-section skeleton, normal Lua content, the case where
a Lua identifier resembles a section marker ( e.g. `local __foo__ = 1`
must remain a `line`, not be re-tokenized as a marker ), and the
fallback `unknown_section` rule.
The Lua grammar has a corpus under `grammars/pico-8-lua/test/corpus/` —
run `( cd grammars/pico-8-lua && npx tree-sitter test )`. The corpus
exercises shorthand `if`/`while` line-end behavior: dangling-else,
multi-statement bodies, EOF termination, nested same-line shorthands,
and coexistence with standard `if (parenthesized) then ... end`.
## Roadmap
### v0.2 — PICO-8 Lua dialect grammar
Fork [`tree-sitter-grammars/tree-sitter-lua`](https://github.com/tree-sitter-grammars/tree-sitter-lua)
into `tree-sitter-pico8` and add the dialect extensions documented in the
PICO-8 manual:
- Compound-assignment operators: `+= -= *= /= \= %= ^= ..= &= |= ^^= <<= >>= >>>= <<>= >><=`
- `!=` as alias for `~=`
- `\` (integer divide) and the rotate / logical-shift operators `<<>` `>><` `>>>`
- Binary literals `0b...` and hex fractional literals `0x1.4p0` style
- Single-line `if (cond) stmt` and `while (cond) stmt` ( no `then`/`do`/`end` )
- `?` as a statement-level shorthand for `print`
- The peek-prefix unary operators `@addr` `%addr` `$addr`
Then add a second language `Pico-8 Lua` here (separate from Zed's built-in
`Lua`) and switch `injections.scm` to inject `pico-8-lua` instead of `lua`.
### v0.3 — Language server integration
The line-significance prerequisite is now satisfied (see *Line-significance*
above), so LSP features that walk the AST — unreachable-code lint,
goto-definition through a conditional branch — have a correct structure
to work against.
Wire up [`japhib/pico8-ls`](https://github.com/japhib/pico8-ls) ( or whichever
PICO-8 LSP is most maintained at the time ) for:
- Completion of PICO-8 builtins (`spr`, `circfill`, `btn`, `flr`, …)
- Signature help and hover docs sourced from the manual
- Cart-aware analysis ( the LSP already understands `.p8` section markers
and only analyzes the `__lua__` body )
- Per-cart diagnostics
- Completion of PICO-8 builtins ( `spr`, `circfill`, `btn`, `flr`, … ).
- Signature help and hover docs sourced from the manual.
- Cart-aware analysis: the LSP already understands `.p8` section markers
and only analyzes the `__lua__` body.
- Per-cart diagnostics.
This will require a Rust component ( the `zed_extension_api` crate ) to
download the language-server binary and define
`language_server_command` — see [Zed's developing-extensions docs](https://zed.dev/docs/extensions/developing-extensions).
This requires a Rust component ( the `zed_extension_api` crate ) that
downloads the LSP binary and defines `language_server_command`.
See [Zed's developing-extensions docs](https://zed.dev/docs/extensions/developing-extensions).
### v0.4 — Polish
- LuaCATS / EmmyLua stub file enumerating PICO-8's ~110 globals, for users
who'd rather wire up `lua-language-server` against their `#include`-d
`.lua` files.
- Highlight rules for hex sections (`__gfx__`, `__map__`, `__sfx__`, etc.)
so palette indices and note pitches show up distinctly.
- Snippets for common idioms (`for x=0,127 do ... end`, the `_init`/`_update`/
`_draw` triad).
`.p8lua` files.
- Per-section highlighters for the hex blocks: `__gfx__` colored by palette
index, `__sfx__` / `__music__` parsed as note streams, `__map__` as tile
indices, `__gff__` as flag bytes.
- Snippets for common idioms ( the `_init` / `_update` / `_draw` triad,
`for x=0,127 do … end`, palette swap setup, etc. ).
## References
@@ -148,7 +218,12 @@ download the language-server binary and define
- Zed extension docs: see links in `references/zed-doc-links.md`
- Cart file-format spec ( community wiki, not in the official manual ):
https://pico-8.fandom.com/wiki/P8FileFormat
- Upstream Lua grammar: https://github.com/tree-sitter-grammars/tree-sitter-lua
( MIT, by Munif Tanjim — preserved in `grammars/pico-8-lua/UPSTREAM-LICENSE.md` )
## License
0BSD; see `LICENSE`.
The cart grammar and the Zed extension files are 0BSD ( see `LICENSE` ).
The PICO-8 Lua grammar is a fork of MIT-licensed `tree-sitter-lua`; the
upstream license is preserved at `grammars/pico-8-lua/UPSTREAM-LICENSE.md`
and applies to the derived files in that directory.
+12 -2
@@ -1,14 +1,24 @@
pico-8 cartridge // http://www.pico-8.com
version 42
__lua__
-- hello cartridge
-- hello cartridge — exercises Pico-8 dialect features
#include shared.p8lua
function _init()
cls()
t = 0
end
function _update()
t += 1
if (btn(4)) sweep_palette()
move()
end
function _draw()
cls(1)
print("hello pico-8", 30, 60, 7)
draw_blob()
?"frame:"..t, 0, 120, 7
end
__gfx__
00000000111111110000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+30
@@ -0,0 +1,30 @@
-- shared.p8lua: included into hello.p8 via #include
-- demonstrates Pico-8 dialect features outside a __lua__ section.
local x, y = 64, 64
function move()
if (btn(0)) x-=1
if (btn(1)) x+=1
if (btn(2)) y-=1
if (btn(3)) y+=1
x = mid(0, x, 127)
y = mid(0, y, 127)
end
function draw_blob()
cls(1)
circfill(x, y, 5, 7)
?"x="..x..",y="..y, 0, 0, 7
end
-- peek/poke via prefix shorthands
function sweep_palette()
for i=0,15 do
poke(0x5f10+i, %0x5f10 ^^ i)
end
end
-- single-line while with compound assignment body
local i = 0
while (i < 8) i += 1
+16 -6
@@ -1,15 +1,25 @@
id = "pico-8"
name = "Pico-8"
version = "0.0.1"
version = "0.0.2"
schema_version = 1
authors = ["Kistaro Windrider <kistaro@gmail.com>"]
description = "Pico-8 cartridge (.p8) and Lua dialect support for Zed"
repository = "https://github.com/kistaro/zed-p8"
# The .p8 cart container grammar lives at the root of this repo.
# During local development, set `repository` to `file://` + the absolute
# path of your clone and pin `rev` to a committed SHA. When publishing,
# this should point at a public Git URL.
# Both grammars live in this same repository, in subdirectories under
# `grammars/`. Zed's [grammars.*] block supports a `path` field for
# exactly this layout. During local development, `repository` is a
# `file://` URL pointing at this clone and `rev` is pinned to a
# committed SHA — Zed clones the grammar at that revision rather than
# reading the working tree, so changes only take effect after a commit
# + a `rev` bump in this file.
[grammars.p8_cart]
repository = "file:///Users/norberg/gitea-repos/zed-p8"
rev = "3f209efa897558e8ecd7aa3612846dc12798b0bb"
rev = "7557a34c89de5a994cc06025c8122dfa3a5af8cf"
path = "grammars/p8-cart"
[grammars.pico8_lua]
repository = "file:///Users/norberg/gitea-repos/zed-p8"
rev = "7557a34c89de5a994cc06025c8122dfa3a5af8cf"
path = "grammars/pico-8-lua"
+16 -1
@@ -18,16 +18,31 @@
module.exports = grammar({
name: 'p8_cart',
// Whitespace is significant inside hex sections, so we don't skip it.
// Whitespace is significant inside hex sections, so we don't skip it
// globally. Tolerance for stray leading blanks before the magic header
// is added explicitly via the `repeat($._blank_line)` at the top of
// `cartridge` ( see below ).
extras: $ => [],
rules: {
cartridge: $ => seq(
// Tolerate stray whitespace / blank lines before the magic header.
// Real PICO-8 carts begin with the header on byte 0, but allowing
// a leading run of blanks ( a ) lets the `tree-sitter test` corpus
// framework, which prepends a newline to each fixture, run cleanly
// and ( b ) keeps the parser robust against a hand-edited cart that
// gained an accidental blank line up top.
repeat($._blank_line),
optional($.header),
optional($.version),
repeat($.section),
),
// A line that has no content other than horizontal whitespace and a
// newline. Hidden ( underscore prefix ) so it does not appear in the
// syntax tree.
_blank_line: $ => token(/[ \t]*\n/),
header: $ => /pico-8 cartridge \/\/[^\n]*\n/,
version: $ => /version[ \t]+\d+\n/,
@@ -5,6 +5,13 @@
"cartridge": {
"type": "SEQ",
"members": [
{
"type": "REPEAT",
"content": {
"type": "SYMBOL",
"name": "_blank_line"
}
},
{
"type": "CHOICE",
"members": [
@@ -38,6 +45,13 @@
}
]
},
"_blank_line": {
"type": "TOKEN",
"content": {
"type": "PATTERN",
"value": "[ \\t]*\\n"
}
},
"header": {
"type": "PATTERN",
"value": "pico-8 cartridge \\/\\/[^\\n]*\\n"
File diff suppressed because it is too large
+109
@@ -0,0 +1,109 @@
==================
empty cart skeleton
==================
pico-8 cartridge // http://www.pico-8.com
version 42
__lua__
__gfx__
__map__
__sfx__
__music__
---
(cartridge
(header)
(version)
(section (lua_section (lua_marker)))
(section (gfx_section (gfx_marker)))
(section (map_section (map_marker)))
(section (sfx_section (sfx_marker)))
(section (music_section (music_marker))))
==================
cart with lua content
==================
pico-8 cartridge // http://www.pico-8.com
version 42
__lua__
function _draw()
cls()
end
__gfx__
00000000
---
(cartridge
(header)
(version)
(section
(lua_section
(lua_marker)
(lua_content
(line)
(line)
(line))))
(section
(gfx_section
(gfx_marker)
(body
(line)))))
==================
lua identifier resembling section marker
==================
pico-8 cartridge // http://www.pico-8.com
version 42
__lua__
local __foo__ = 1
local s = "__lua__"
__gfx__
00
---
(cartridge
(header)
(version)
(section
(lua_section
(lua_marker)
(lua_content
(line)
(line))))
(section
(gfx_section
(gfx_marker)
(body
(line)))))
==================
unknown section name
==================
pico-8 cartridge // http://www.pico-8.com
version 42
__lua__
__future_section__
opaque body
__gfx__
00
---
(cartridge
(header)
(version)
(section (lua_section (lua_marker)))
(section
(unknown_section
(section_marker)
(body (line))))
(section
(gfx_section
(gfx_marker)
(body (line)))))
+73
@@ -0,0 +1,73 @@
# Known limitations of `tree-sitter-pico8-lua`
This document used to track parse incorrectness around PICO-8's
line-significant shorthand `if (cond) ...` / `while (cond) ...`
constructs. As of v0.3 the external scanner emits a `LINE_END` token
when the parser is at the body-or-terminator decision point of a
shorthand statement and the next byte is `\n` / `\r` / EOF, so the body
of a shorthand is correctly bounded to its source line.
There are no other known parse-incorrectness issues at this time.
Removing this file (or leaving it as a brief stub) is fine once you're
confident no documentation links still point at the old limitation
sections.
## How line-significance is wired up (for reference)
PICO-8 deviates from standard Lua in two places where a newline is
syntactically significant:
- `if (cond) <stmts...>` — the consequence (and any same-line `else`
alternative) extends to end-of-line, not to a matching `end`.
- `while (cond) <stmts...>` — same line-bounded body as the
shorthand `if`.
Tree-sitter has no built-in concept of newlines as syntactic tokens
when `/\s/` is in `extras` (and we want it there: every other
construct treats whitespace transparently). The canonical fix is an
**external scanner** that gates a synthetic terminator token on
`valid_symbols`. We do exactly that:
- `src/scanner.c` exposes a `LINE_END` external symbol. The scanner
looks at the raw lookahead before the lexer has a chance to skip
extras, and emits `LINE_END` only when the parser actually expects
one (i.e., `valid_symbols[LINE_END] == true`). At any other
position, the scanner's LINE_END branch returns false, and the `\n`
falls through to be eaten silently by the `/\s/` extras pattern.
- `LINE_END` is **zero-width** — the scanner does not consume the
newline. This matters for nested shorthands: `if (a) if (b) c()\nd()`
has to terminate BOTH shorthands at the same `\n`. With a zero-width
terminator, each enclosing shorthand sees the same `\n` in turn and
reduces. Once no shorthand is on the stack, `LINE_END` is no longer
in `valid_symbols`, the scanner returns false, and the `\n` is
consumed by extras. The emit chain is bounded by static nesting
depth, so there's no infinite-loop risk despite the zero width.
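
The gating and the nested-termination argument above can be sketched as a toy model. This is not the code in `src/scanner.c`: `scanLineEnd` and `lineEndsEmitted` are hypothetical names, `null` stands in for EOF, and the boolean `lineEndValid` stands in for `valid_symbols[LINE_END]`.

```javascript
// Toy model of the LINE_END decision (hypothetical, not src/scanner.c).
// lookahead: the next raw character, or null at EOF.
// lineEndValid: stand-in for valid_symbols[LINE_END].
function scanLineEnd(lookahead, lineEndValid) {
  const atLineBoundary =
    lookahead === "\n" || lookahead === "\r" || lookahead === null;
  // Zero-width: the model reports a token but never advances the input.
  return lineEndValid && atLineBoundary;
}

// Count how many LINE_END tokens a single line boundary yields when
// `depth` shorthand statements are open on that line.
function lineEndsEmitted(depth, lookahead) {
  let open = depth;
  let emitted = 0;
  while (scanLineEnd(lookahead, open > 0)) {
    open -= 1; // each LINE_END lets one enclosing shorthand reduce
    emitted += 1;
  }
  return emitted;
}

console.log(lineEndsEmitted(2, "\n")); // 2: `if (a) if (b) c()` closes both
console.log(lineEndsEmitted(0, "\n")); // 0: newline is left to extras
```

Because `open` only decreases and the token is valid only while `open > 0`, the loop runs at most `depth` times, which is the bounded emit chain described in the second bullet.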
The shorthand rules in `grammar.js` end with `$._line_end`; the body
and the optional `else` alternative are both `$.statement, repeat($.statement)`,
allowing PICO-8's multi-statement single-line bodies
(`if (falling) wheeee() splat()`).
The cross-language pattern is "external scanner + valid_symbols-gated
terminator," same as `tree-sitter-r` (the closest analogue) and
similar in spirit to Ruby's paired `_line_break` / `_no_line_break`
hint tokens. Reaching for `\s` removal or per-rule extras is **not**
necessary for this style of line-significance; only Python-style
INDENT/DEDENT requires the heavier refactor.
## Test coverage
`test/corpus/shorthand_line_end.txt` exercises:
- Single- and multi-statement shorthand bodies, terminated by `\n` and
by EOF.
- Same-line `else` (single- and multi-statement alternative).
- The historical dangling-else case (shorthand inside a standard `if`,
with `else` on a later line — must bind to the outer `if`).
- Line comment trailing the shorthand body (the comment is in extras
and the trailing `\n` still triggers `LINE_END`).
- Shorthand inside a `do`-block (the `\n` before the closing `end`
terminates the shorthand cleanly).
- Nested shorthand `if`s on the same line (one `\n` must close both).
- Coexistence with standard `if (parenthesized) then ... end` — the
GLR conflict resolves on whether `then` follows.
+21
@@ -0,0 +1,21 @@
The MIT License (MIT)
Copyright (c) 2021 Munif Tanjim
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
+619
@@ -0,0 +1,619 @@
/**
* @file PICO-8 Lua grammar for tree-sitter
*
* Forked from tree-sitter-lua 0.5.0 by Munif Tanjim ( MIT — see
* UPSTREAM-LICENSE.md ). This fork adds the PICO-8 dialect extensions
* documented in the PICO-8 manual:
*
* - != as alias for ~=
* - Integer divide: \
* - Bitwise XOR (binary): ^^
* - Logical shift right: >>>
* - Rotate left: <<>
* - Rotate right: >><
* - Compound-assignment statements: += -= *= /= %= \= ^= ..= &= |= ^^=
* <<= >>= >>>= <<>= >><=
* - Memory peek prefix unary operators: @addr %addr $addr
* ( these coexist with binary % for modulo )
* - Single-line if (cond) stmt [else stmt] — no `then`/`end`
* - Single-line while (cond) stmt — no `do`/`end`
* - Statement-level print shorthand: `?` followed by an expression list
* - `#include path` directive
*/
/// <reference types="tree-sitter-cli/dsl" />
// @ts-check
const PREC = {
OR: 1, // or
AND: 2, // and
COMPARE: 3, // < > <= >= ~= == !=
BIT_OR: 4, // |
BIT_NOT: 5, // ~ ^^
BIT_AND: 6, // &
BIT_SHIFT: 7, // << >> >>> <<> >><
CONCAT: 8, // ..
PLUS: 9, // + -
MULTI: 10, // * / // % \
UNARY: 11, // not # - ~ @ $ %
POWER: 12, // ^
};
const list_seq = (rule, separator, trailing_separator = false) =>
trailing_separator
? seq(rule, repeat(seq(separator, rule)), optional(separator))
: seq(rule, repeat(seq(separator, rule)));
const optional_block = ($) => alias(optional($._block), $.block);
// namelist ::= Name {',' Name}
const name_list = ($) => list_seq(field('name', $.identifier), ',');
const COMPOUND_ASSIGN_OPERATORS = [
'+=', '-=', '*=', '/=', '%=', '\\=', '^=', '..=',
'&=', '|=', '^^=',
'<<=', '>>=', '>>>=', '<<>=', '>><=',
];
export default grammar({
name: 'pico8_lua',
extras: ($) => [$.comment, /\s/],
externals: ($) => [
$._block_comment_start,
$._block_comment_content,
$._block_comment_end,
$._block_string_start,
$._block_string_content,
$._block_string_end,
// PICO-8 line-significance: terminates the body of `if (cond) ...` /
// `while (cond) ...` shorthand. The scanner emits this only when the
// parser is at a state expecting it; everywhere else a newline falls
// through to /\s/ in extras and is skipped. See src/scanner.c.
$._line_end,
],
supertypes: ($) => [$.statement, $.expression, $.declaration, $.variable],
word: ($) => $.identifier,
// `if (cond) ...` is ambiguous between a standard if where the condition
// is a parenthesized_expression and a shorthand if. Same for while. The
// ambiguity resolves by what follows the closing `)` ( `then`/`do` for
// the standard form, anything else for the shorthand ).
conflicts: ($) => [
[$.parenthesized_expression, $.shorthand_if_statement],
[$.parenthesized_expression, $.shorthand_while_statement],
],
rules: {
// chunk ::= block
chunk: ($) =>
seq(
optional($.hash_bang_line),
repeat($.statement),
optional($.return_statement)
),
hash_bang_line: (_) => /#![^\n]*/,
// block ::= {stat} [retstat]
_block: ($) =>
choice(
seq(repeat1($.statement), optional($.return_statement)),
seq(repeat($.statement), $.return_statement)
),
statement: ($) =>
choice(
$.empty_statement,
$.assignment_statement,
$.compound_assignment_statement,
$.function_call,
$.label_statement,
$.break_statement,
$.goto_statement,
$.do_statement,
$.while_statement,
$.shorthand_while_statement,
$.repeat_statement,
$.if_statement,
$.shorthand_if_statement,
$.for_statement,
$.declaration,
$.print_shorthand_statement,
$.include_statement,
),
// retstat ::= return [explist] [';']
return_statement: ($) =>
seq(
'return',
optional(alias($._expression_list, $.expression_list)),
optional(';')
),
empty_statement: (_) => ';',
assignment_statement: ($) =>
seq(
alias($._variable_assignment_varlist, $.variable_list),
field('operator', '='),
alias($._variable_assignment_explist, $.expression_list)
),
_variable_assignment_varlist: ($) =>
list_seq(field('name', $.variable), ','),
_variable_assignment_explist: ($) =>
list_seq(field('value', $.expression), ','),
// PICO-8 compound assignment: var OP= expr (single statement, single line).
compound_assignment_statement: ($) =>
seq(
field('name', $.variable),
field('operator', choice(...COMPOUND_ASSIGN_OPERATORS)),
field('value', $.expression)
),
label_statement: ($) => seq('::', $.identifier, '::'),
break_statement: (_) => 'break',
goto_statement: ($) => seq('goto', $.identifier),
do_statement: ($) => seq('do', field('body', optional_block($)), 'end'),
while_statement: ($) =>
seq(
'while',
field('condition', $.expression),
'do',
field('body', optional_block($)),
'end'
),
// PICO-8 single-line: while (cond) stmt {stmt}
// Body extends to end-of-line (or EOF). The $._line_end terminator
// is emitted by the external scanner when it sees \n/\r/EOF at a
// position where the parser expects line-end; until then, additional
// statements on the same line accumulate into the body.
shorthand_while_statement: ($) =>
seq(
'while',
'(',
field('condition', $.expression),
')',
field('body', $.statement),
repeat(field('body', $.statement)),
$._line_end
),
repeat_statement: ($) =>
seq(
'repeat',
field('body', optional_block($)),
'until',
field('condition', $.expression)
),
if_statement: ($) =>
seq(
'if',
field('condition', $.expression),
'then',
field('consequence', optional_block($)),
repeat(field('alternative', $.elseif_statement)),
optional(field('alternative', $.else_statement)),
'end'
),
elseif_statement: ($) =>
seq(
'elseif',
field('condition', $.expression),
'then',
field('consequence', optional_block($))
),
else_statement: ($) => seq('else', field('body', optional_block($))),
// PICO-8 single-line: if (cond) stmt {stmt} [else stmt {stmt}]
// Both the consequence and the alternative extend to end-of-line.
// The $._line_end terminator (emitted by the external scanner on
// \n/\r/EOF) prevents a later-line `else` from binding to a
// shorthand `if` on a previous line, matching PICO-8 semantics.
shorthand_if_statement: ($) =>
seq(
'if',
'(',
field('condition', $.expression),
')',
field('consequence', $.statement),
repeat(field('consequence', $.statement)),
optional(
seq(
'else',
field('alternative', $.statement),
repeat(field('alternative', $.statement))
)
),
$._line_end
),
for_statement: ($) =>
seq(
'for',
field('clause', choice($.for_generic_clause, $.for_numeric_clause)),
'do',
field('body', optional_block($)),
'end'
),
for_generic_clause: ($) =>
seq(
alias($._name_list, $.variable_list),
'in',
alias($._expression_list, $.expression_list)
),
for_numeric_clause: ($) =>
seq(
field('name', $.identifier),
field('operator', '='),
field('start', $.expression),
',',
field('end', $.expression),
optional(seq(',', field('step', $.expression)))
),
_name_list: ($) => name_list($),
declaration: ($) =>
choice(
$.function_declaration,
field(
'local_declaration',
alias($._local_function_declaration, $.function_declaration)
),
field('local_declaration', $.variable_declaration),
),
function_declaration: ($) =>
seq('function', field('name', $._function_name), $._function_body),
_local_function_declaration: ($) =>
seq('local', 'function', field('name', $.identifier), $._function_body),
_function_name: ($) =>
choice(
$._function_name_prefix_expression,
alias(
$._function_name_method_index_expression,
$.method_index_expression
)
),
_function_name_prefix_expression: ($) =>
choice(
$.identifier,
alias($._function_name_dot_index_expression, $.dot_index_expression)
),
_function_name_dot_index_expression: ($) =>
seq(
field('table', $._function_name_prefix_expression),
'.',
field('field', $.identifier)
),
_function_name_method_index_expression: ($) =>
seq(
field('table', $._function_name_prefix_expression),
':',
field('method', $.identifier)
),
variable_declaration: ($) =>
seq(
'local',
choice(
alias($._att_name_list, $.variable_list),
alias($._variable_assignment, $.assignment_statement)
)
),
_variable_assignment: ($) =>
seq(
alias($._att_name_list, $.variable_list),
field('operator', '='),
alias($._variable_assignment_explist, $.expression_list)
),
_att_name_list: ($) =>
seq(
optional(field('attribute', alias($._attrib, $.attribute))),
list_seq(
seq(
field('name', $.identifier),
optional(field('attribute', alias($._attrib, $.attribute)))
),
','
),
),
_attrib: ($) => seq('<', $.identifier, '>'),
_expression_list: ($) => list_seq($.expression, ','),
// PICO-8 print shorthand: ? expr {, expr}
print_shorthand_statement: ($) =>
seq(
field('directive', '?'),
list_seq(field('argument', $.expression), ',')
),
// PICO-8 include directive: #include path
// `#include` plus its trailing spaces/tabs is lexed as a single token with
// lexical precedence 2, so a bare `#` ( the unary length operator, as in
// `#x` ) continues to parse as length-of-expression.
include_statement: ($) =>
seq(
field('directive', alias(token(prec(2, /#include[ \t]+/)), '#include')),
field('path', alias(/[^\n\r]*/, $.include_path))
),
expression: ($) =>
choice(
$.nil,
$.false,
$.true,
$.number,
$.string,
$.vararg_expression,
$.function_definition,
$.variable,
$.function_call,
$.parenthesized_expression,
$.table_constructor,
$.binary_expression,
$.unary_expression
),
nil: (_) => 'nil',
false: (_) => 'false',
true: (_) => 'true',
number: (_) => {
function number_literal(digits, exponent_marker, exponent_digits) {
return seq(
choice(
seq(optional(digits), optional('.'), digits),
seq(digits, optional('.'), optional(digits))
),
optional(
seq(
choice(
exponent_marker.toLowerCase(),
exponent_marker.toUpperCase()
),
seq(optional(choice('-', '+')), exponent_digits)
)
)
);
}
const decimal_digits = /[0-9]+/;
const decimal_literal = number_literal(decimal_digits, 'e', decimal_digits);
const hex_digits = /[a-fA-F0-9]+/;
const hex_literal = seq(
choice('0x', '0X'),
number_literal(hex_digits, 'p', decimal_digits)
);
const bin_digits = /[01]+/;
const bin_literal = seq(choice('0b', '0B'), bin_digits);
return token(choice(decimal_literal, hex_literal, bin_literal));
},
string: ($) => choice($._quote_string, $._block_string),
_quote_string: ($) =>
choice(
seq(
field('start', alias('"', '"')),
field(
'content',
optional(alias($._doublequote_string_content, $.string_content))
),
field('end', alias('"', '"'))
),
seq(
field('start', alias("'", "'")),
field(
'content',
optional(alias($._singlequote_string_content, $.string_content))
),
field('end', alias("'", "'"))
)
),
_doublequote_string_content: ($) =>
repeat1(choice(token.immediate(prec(1, /[^"\\]+/)), $.escape_sequence)),
_singlequote_string_content: ($) =>
repeat1(choice(token.immediate(prec(1, /[^'\\]+/)), $.escape_sequence)),
_block_string: ($) =>
seq(
field('start', alias($._block_string_start, '[[')),
field('content', alias($._block_string_content, $.string_content)),
field('end', alias($._block_string_end, ']]'))
),
escape_sequence: () =>
token.immediate(
seq(
'\\',
choice(
/[\nabfnrtv\\'"]/,
/z\s*/,
/[0-9]{1,3}/,
/x[0-9a-fA-F]{2}/,
/u\{[0-9a-fA-F]+\}/
)
)
),
vararg_expression: (_) => '...',
function_definition: ($) => seq('function', $._function_body),
_function_body: ($) =>
seq(
field('parameters', $.parameters),
field('body', optional_block($)),
'end'
),
parameters: ($) => seq('(', optional($._parameter_list), ')'),
_parameter_list: ($) =>
choice(
seq(name_list($), optional(seq(',', $._vararg_parameter))),
$._vararg_parameter
),
_vararg_parameter: ($) =>
seq($.vararg_expression, optional(field('name', $.identifier))),
_prefix_expression: ($) =>
prec(1, choice($.variable, $.function_call, $.parenthesized_expression)),
variable: ($) =>
choice($.identifier, $.bracket_index_expression, $.dot_index_expression),
bracket_index_expression: ($) =>
seq(
field('table', $._prefix_expression),
'[',
field('field', $.expression),
']'
),
dot_index_expression: ($) =>
seq(
field('table', $._prefix_expression),
'.',
field('field', $.identifier)
),
function_call: ($) =>
seq(
field('name', choice($._prefix_expression, $.method_index_expression)),
field('arguments', $.arguments)
),
method_index_expression: ($) =>
seq(
field('table', $._prefix_expression),
':',
field('method', $.identifier)
),
arguments: ($) =>
choice(
seq('(', optional(list_seq($.expression, ',')), ')'),
$.table_constructor,
$.string
),
parenthesized_expression: ($) => seq('(', $.expression, ')'),
table_constructor: ($) => seq('{', optional($._field_list), '}'),
_field_list: ($) => list_seq($.field, $._field_sep, true),
_field_sep: (_) => choice(',', ';'),
field: ($) =>
choice(
seq(
'[',
field('name', $.expression),
']',
field('operator', '='),
field('value', $.expression)
),
seq(field('name', $.identifier), '=', field('value', $.expression)),
field('value', $.expression)
),
binary_expression: ($) =>
choice(
...[
['or', PREC.OR],
['and', PREC.AND],
['<', PREC.COMPARE],
['<=', PREC.COMPARE],
['==', PREC.COMPARE],
['~=', PREC.COMPARE],
['!=', PREC.COMPARE], // PICO-8 alias for ~=
['>=', PREC.COMPARE],
['>', PREC.COMPARE],
['|', PREC.BIT_OR],
['~', PREC.BIT_NOT], // bitwise xor (Lua 5.3 binary form)
['^^', PREC.BIT_NOT], // PICO-8 bitwise xor
['&', PREC.BIT_AND],
['<<', PREC.BIT_SHIFT],
['>>', PREC.BIT_SHIFT],
['>>>', PREC.BIT_SHIFT], // PICO-8 logical shift right
['<<>', PREC.BIT_SHIFT], // PICO-8 rotate left
['>><', PREC.BIT_SHIFT], // PICO-8 rotate right
['+', PREC.PLUS],
['-', PREC.PLUS],
['*', PREC.MULTI],
['/', PREC.MULTI],
['//', PREC.MULTI],
['%', PREC.MULTI],
['\\', PREC.MULTI], // PICO-8 integer divide
].map(([operator, precedence]) =>
prec.left(
precedence,
seq(
field('left', $.expression),
field('operator', operator),
field('right', $.expression)
)
)
),
...[
['..', PREC.CONCAT],
['^', PREC.POWER],
].map(([operator, precedence]) =>
prec.right(
precedence,
seq(
field('left', $.expression),
field('operator', operator),
field('right', $.expression)
)
)
)
),
unary_expression: ($) =>
prec.left(
PREC.UNARY,
seq(
// @ $ % are PICO-8 peek prefixes ( peek / peek4 / peek2 ).
// % is also binary modulo; the parser tells the two apart by
// position ( prefix vs. infix ), the same way it handles `-` and `~`.
field('operator', choice('not', '#', '-', '~', '@', '$', '%')),
field('operand', $.expression),
)
),
identifier: (_) => {
// PICO-8 dialect carves out !, ?, @, $ as operator tokens, so they
// are not valid in identifiers ( upstream allowed them ).
const identifier_start =
/[^\p{Control}\s!?@$+\-*/%^#&~|<>=(){}\[\];:,.\\'"\d]/;
const identifier_continue =
/[^\p{Control}\s!?@$+\-*/%^#&~|<>=(){}\[\];:,.\\'"]*/;
return token(seq(identifier_start, identifier_continue));
},
comment: ($) =>
choice(
seq(
field('start', '--'),
field('content', alias(/[^\r\n]*/, $.comment_content))
),
seq(
field('start', alias($._block_comment_start, '[[')),
field('content', alias($._block_comment_content, $.comment_content)),
field('end', alias($._block_comment_end, ']]'))
)
),
},
});
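The PREC table and the `prec.left` / `prec.right` wrappers above decide how operator chains group. The sketch below is one way to see what those numbers imply, using a small precedence-climbing evaluator over a subset of the operators (`+`, `-`, `*`, `^`, unary `-`). It is not how tree-sitter works internally (tree-sitter compiles precedence into its parse tables), and `Cursor`, `parse_expr`, `parse_operand`, and `eval_expr` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <ctype.h>

/* Levels copied from the PREC table; only a subset of operators is modelled. */
enum { PREC_PLUS = 9, PREC_MULTI = 10, PREC_UNARY = 11, PREC_POWER = 12 };

typedef struct {
    const char *p; /* cursor into the expression text */
} Cursor;

static void skip_spaces(Cursor *c) {
    while (*c->p == ' ') c->p++;
}

/* Integer power; exponents <= 0 yield 1, which is enough for this sketch. */
static long ipow(long base, long exp) {
    long result = 1;
    while (exp-- > 0) result *= base;
    return result;
}

static int binary_prec(char op) {
    switch (op) {
    case '+': case '-': return PREC_PLUS;
    case '*': return PREC_MULTI;
    case '^': return PREC_POWER;
    default: return 0; /* not a binary operator in this subset */
    }
}

static long parse_expr(Cursor *c, int min_prec);

/* An operand is a decimal number or a unary minus applied to an expression. */
static long parse_operand(Cursor *c) {
    skip_spaces(c);
    if (*c->p == '-') {
        c->p++;
        /* Only operators above PREC_UNARY (here: ^) join the operand. */
        return -parse_expr(c, PREC_UNARY);
    }
    long value = 0;
    while (isdigit((unsigned char)*c->p)) {
        value = value * 10 + (*c->p - '0');
        c->p++;
    }
    return value;
}

static long parse_expr(Cursor *c, int min_prec) {
    long left = parse_operand(c);
    for (;;) {
        skip_spaces(c);
        char op = *c->p;
        int prec = binary_prec(op);
        if (prec == 0 || prec < min_prec) return left;
        c->p++;
        /* prec.left: the right side must bind tighter.
         * prec.right (^): the same level may recurse on the right. */
        int next_min = (op == '^') ? prec : prec + 1;
        long right = parse_expr(c, next_min);
        if (op == '+') left += right;
        else if (op == '-') left -= right;
        else if (op == '*') left *= right;
        else left = ipow(left, right);
    }
}

/* Hypothetical entry point: evaluate a whole expression string. */
static long eval_expr(const char *text) {
    assert(text);
    Cursor c = { text };
    return parse_expr(&c, 0);
}
```

With these levels, `eval_expr("2 ^ 3 ^ 2")` groups as `2 ^ (3 ^ 2)` because `^` is right-associative, and `eval_expr("-2 ^ 2")` groups as `-(2 ^ 2)` because POWER (12) outranks UNARY (11). The same `-` character is read as unary in operand position and as binary between operands, which is also how the grammar separates prefix `%` from modulo.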
+7
@@ -0,0 +1,7 @@
{
"name": "tree-sitter-pico8-lua",
"version": "0.0.1",
"description": "tree-sitter grammar for the PICO-8 Lua dialect (forked from tree-sitter-lua)",
"type": "module",
"license": "MIT"
}
File diff suppressed because it is too large
File diff suppressed because it is too large
File diff suppressed because it is too large
+230
@@ -0,0 +1,230 @@
#include <stdio.h>
#include "tree_sitter/alloc.h"
#include "tree_sitter/parser.h"
#include <wctype.h>
enum TokenType {
BLOCK_COMMENT_START,
BLOCK_COMMENT_CONTENT,
BLOCK_COMMENT_END,
BLOCK_STRING_START,
BLOCK_STRING_CONTENT,
BLOCK_STRING_END,
// PICO-8 line-significance: terminates the body of `if (cond) ...` /
// `while (cond) ...` shorthand. Emitted only when the parser expects it
// (see scan() — this token is gated on valid_symbols[LINE_END]) so that
// newlines outside of shorthand contexts continue to fall through to
// extras and be skipped silently.
LINE_END,
};
static inline void consume(TSLexer *lexer) { lexer->advance(lexer, false); }
static inline void skip(TSLexer *lexer) { lexer->advance(lexer, true); }
static inline bool consume_char(char c, TSLexer *lexer) {
if (lexer->lookahead != c) {
return false;
}
consume(lexer);
return true;
}
static inline uint8_t consume_and_count_char(char c, TSLexer *lexer) {
uint8_t count = 0;
while (lexer->lookahead == c) {
++count;
consume(lexer);
}
return count;
}
static inline void skip_whitespaces(TSLexer *lexer) {
while (iswspace(lexer->lookahead)) {
skip(lexer);
}
}
typedef struct {
char ending_char;
uint8_t level_count;
} Scanner;
static inline void reset_state(Scanner *scanner) {
scanner->ending_char = 0;
scanner->level_count = 0;
}
void *tree_sitter_pico8_lua_external_scanner_create(void) {
Scanner *scanner = ts_calloc(1, sizeof(Scanner));
return scanner;
}
void tree_sitter_pico8_lua_external_scanner_destroy(void *payload) {
Scanner *scanner = (Scanner *)payload;
ts_free(scanner);
}
unsigned tree_sitter_pico8_lua_external_scanner_serialize(void *payload, char *buffer) {
Scanner *scanner = (Scanner *)payload;
buffer[0] = scanner->ending_char;
buffer[1] = (char)scanner->level_count;
return 2;
}
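void tree_sitter_pico8_lua_external_scanner_deserialize(void *payload, const char *buffer, unsigned length) {
Scanner *scanner = (Scanner *)payload;
// Clear stale state first: tree-sitter may call this with length == 0
// to put the scanner back into its initial state.
reset_state(scanner);
if (length == 0) return;
scanner->ending_char = buffer[0];
if (length == 1) return;
scanner->level_count = (uint8_t)buffer[1];
}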
static bool scan_block_start(Scanner *scanner, TSLexer *lexer) {
if (consume_char('[', lexer)) {
uint8_t level = consume_and_count_char('=', lexer);
if (consume_char('[', lexer)) {
scanner->level_count = level;
return true;
}
}
return false;
}
static bool scan_block_end(Scanner *scanner, TSLexer *lexer) {
if (consume_char(']', lexer)) {
uint8_t level = consume_and_count_char('=', lexer);
if (scanner->level_count == level && consume_char(']', lexer)) {
return true;
}
}
return false;
}
static bool scan_block_content(Scanner *scanner, TSLexer *lexer) {
while (lexer->lookahead != 0) {
if (lexer->lookahead == ']') {
lexer->mark_end(lexer);
if (scan_block_end(scanner, lexer)) {
return true;
}
} else {
consume(lexer);
}
}
return false;
}
static bool scan_comment_start(Scanner *scanner, TSLexer *lexer) {
if (consume_char('-', lexer) && consume_char('-', lexer)) {
lexer->mark_end(lexer);
if (scan_block_start(scanner, lexer)) {
lexer->mark_end(lexer);
lexer->result_symbol = BLOCK_COMMENT_START;
return true;
}
}
return false;
}
static bool scan_comment_content(Scanner *scanner, TSLexer *lexer) {
if (scanner->ending_char == 0) { // block comment
if (scan_block_content(scanner, lexer)) {
lexer->result_symbol = BLOCK_COMMENT_CONTENT;
return true;
}
return false;
}
while (lexer->lookahead != 0) {
if (lexer->lookahead == scanner->ending_char) {
reset_state(scanner);
lexer->result_symbol = BLOCK_COMMENT_CONTENT;
return true;
}
consume(lexer);
}
return false;
}
bool tree_sitter_pico8_lua_external_scanner_scan(void *payload, TSLexer *lexer, const bool *valid_symbols) {
Scanner *scanner = (Scanner *)payload;
// LINE_END must be checked before any whitespace-skipping path below,
// because the bytes that signal it (\n, \r, EOF) would otherwise be
// consumed as extras and be invisible to us. The check is also
// intentionally placed before the block_string / block_comment branches
// so that those branches' skip_whitespaces() can't eat our newline.
//
// The scanner emits LINE_END only when the parser's current state lists
// it as valid (i.e., we're at the body-or-terminator decision point of a
// shorthand_if_statement / shorthand_while_statement). Everywhere else,
// \n falls through to the /\s/ extras pattern and is skipped silently,
// so this branch is invisible to the rest of the grammar.
//
// LINE_END is intentionally zero-width: we do NOT consume the newline.
// That lets nested shorthands on the same line each see the same \n and
// close in turn (e.g. `if (a) if (b) c()\nd()` — the \n must terminate
// BOTH shorthands so that `d()` is a top-level statement). Once every
// enclosing shorthand has reduced, LINE_END is no longer in any parser
// state's valid_symbols, the scanner returns false, and the trailing
// \n is consumed by /\s/ in extras as usual. There is no infinite-loop
// risk: each LINE_END shift reduces one shorthand statement, so the
// emit chain is bounded by static nesting depth.
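if (valid_symbols[LINE_END]) {
// Skip trailing spaces/tabs first: the scanner runs before extras are
// skipped, so `if (a) b() \n` would otherwise hide the newline from us.
while (lexer->lookahead == ' ' || lexer->lookahead == '\t') {
skip(lexer);
}
if (lexer->lookahead == '\n' || lexer->lookahead == '\r' || lexer->lookahead == 0) {
lexer->result_symbol = LINE_END;
return true;
}
}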
if (valid_symbols[BLOCK_STRING_END] && scan_block_end(scanner, lexer)) {
reset_state(scanner);
lexer->result_symbol = BLOCK_STRING_END;
return true;
}
if (valid_symbols[BLOCK_STRING_CONTENT] && scan_block_content(scanner, lexer)) {
lexer->result_symbol = BLOCK_STRING_CONTENT;
return true;
}
if (valid_symbols[BLOCK_COMMENT_END] && scanner->ending_char == 0 && scan_block_end(scanner, lexer)) {
reset_state(scanner);
lexer->result_symbol = BLOCK_COMMENT_END;
return true;
}
if (valid_symbols[BLOCK_COMMENT_CONTENT] && scan_comment_content(scanner, lexer)) {
return true;
}
skip_whitespaces(lexer);
if (valid_symbols[BLOCK_STRING_START] && scan_block_start(scanner, lexer)) {
lexer->result_symbol = BLOCK_STRING_START;
return true;
}
if (valid_symbols[BLOCK_COMMENT_START]) {
if (scan_comment_start(scanner, lexer)) {
return true;
}
}
return false;
}
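The zero-width LINE_END behaviour described in the scan() comments can be tried outside tree-sitter with a small stand-in. The sketch below is not the real scanner: `MockLexer`, `mock_at_line_end`, `mock_close_shorthands`, and `mock_newline_left_for_extras` are hypothetical names, the mock models only the current character, and it assumes the parser asks for LINE_END once per open shorthand statement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for TSLexer: just a string and a cursor. */
typedef struct {
    const char *src;
    size_t pos;
} MockLexer;

/* The newline / end-of-input test from the LINE_END branch, gated on
 * whether the parser currently accepts LINE_END. Consumes nothing. */
static bool mock_at_line_end(const MockLexer *lexer, bool line_end_valid) {
    assert(lexer);
    char c = lexer->src[lexer->pos];
    return line_end_valid && (c == '\n' || c == '\r' || c == '\0');
}

/*
 * Simulates the parser requesting LINE_END once per open shorthand
 * statement. Returns how many zero-width tokens were emitted. The cursor
 * never advances, so every request sees the same newline.
 */
static int mock_close_shorthands(const MockLexer *lexer, int open_shorthands) {
    int emitted = 0;
    while (open_shorthands > 0 && mock_at_line_end(lexer, true)) {
        --open_shorthands; /* each token lets one shorthand reduce */
        ++emitted;
    }
    return emitted;
}

/* Once no shorthand is open, LINE_END is no longer valid, the scanner
 * declines, and the newline is still there for the whitespace extras. */
static bool mock_newline_left_for_extras(const MockLexer *lexer) {
    return !mock_at_line_end(lexer, false) && lexer->src[lexer->pos] == '\n';
}
```

For the nested example from the comment, `if (a) if (b) c()\nd()`, placing the cursor on the `\n` (index 17) and calling `mock_close_shorthands(&lexer, 2)` emits two tokens without moving the cursor, which is the property that lets both shorthands close on one newline. The emit count is bounded by the number of open shorthands, matching the no-infinite-loop argument in the comment.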
@@ -0,0 +1,54 @@
#ifndef TREE_SITTER_ALLOC_H_
#define TREE_SITTER_ALLOC_H_
#ifdef __cplusplus
extern "C" {
#endif
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
// Allow clients to override allocation functions
#ifdef TREE_SITTER_REUSE_ALLOCATOR
extern void *(*ts_current_malloc)(size_t size);
extern void *(*ts_current_calloc)(size_t count, size_t size);
extern void *(*ts_current_realloc)(void *ptr, size_t size);
extern void (*ts_current_free)(void *ptr);
#ifndef ts_malloc
#define ts_malloc ts_current_malloc
#endif
#ifndef ts_calloc
#define ts_calloc ts_current_calloc
#endif
#ifndef ts_realloc
#define ts_realloc ts_current_realloc
#endif
#ifndef ts_free
#define ts_free ts_current_free
#endif
#else
#ifndef ts_malloc
#define ts_malloc malloc
#endif
#ifndef ts_calloc
#define ts_calloc calloc
#endif
#ifndef ts_realloc
#define ts_realloc realloc
#endif
#ifndef ts_free
#define ts_free free
#endif
#endif
#ifdef __cplusplus
}
#endif
#endif // TREE_SITTER_ALLOC_H_
+291
@@ -0,0 +1,291 @@
#ifndef TREE_SITTER_ARRAY_H_
#define TREE_SITTER_ARRAY_H_
#ifdef __cplusplus
extern "C" {
#endif
#include "./alloc.h"
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#ifdef _MSC_VER
#pragma warning(push)
#pragma warning(disable : 4101)
#elif defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"
#endif
#define Array(T) \
struct { \
T *contents; \
uint32_t size; \
uint32_t capacity; \
}
/// Initialize an array.
#define array_init(self) \
((self)->size = 0, (self)->capacity = 0, (self)->contents = NULL)
/// Create an empty array.
#define array_new() \
{ NULL, 0, 0 }
/// Get a pointer to the element at a given `index` in the array.
#define array_get(self, _index) \
(assert((uint32_t)(_index) < (self)->size), &(self)->contents[_index])
/// Get a pointer to the first element in the array.
#define array_front(self) array_get(self, 0)
/// Get a pointer to the last element in the array.
#define array_back(self) array_get(self, (self)->size - 1)
/// Clear the array, setting its size to zero. Note that this does not free any
/// memory allocated for the array's contents.
#define array_clear(self) ((self)->size = 0)
/// Reserve `new_capacity` elements of space in the array. If `new_capacity` is
/// less than the array's current capacity, this function has no effect.
#define array_reserve(self, new_capacity) \
_array__reserve((Array *)(self), array_elem_size(self), new_capacity)
/// Free any memory allocated for this array. Note that this does not free any
/// memory allocated for the array's contents.
#define array_delete(self) _array__delete((Array *)(self))
/// Push a new `element` onto the end of the array.
#define array_push(self, element) \
(_array__grow((Array *)(self), 1, array_elem_size(self)), \
(self)->contents[(self)->size++] = (element))
/// Increase the array's size by `count` elements.
/// New elements are zero-initialized.
#define array_grow_by(self, count) \
do { \
if ((count) == 0) break; \
_array__grow((Array *)(self), count, array_elem_size(self)); \
memset((self)->contents + (self)->size, 0, (count) * array_elem_size(self)); \
(self)->size += (count); \
} while (0)
/// Append all elements from one array to the end of another.
#define array_push_all(self, other) \
array_extend((self), (other)->size, (other)->contents)
/// Append `count` elements to the end of the array, reading their values from the
/// `contents` pointer.
#define array_extend(self, count, contents) \
_array__splice( \
(Array *)(self), array_elem_size(self), (self)->size, \
0, count, contents \
)
/// Remove `old_count` elements from the array starting at the given `index`. At
/// the same index, insert `new_count` new elements, reading their values from the
/// `new_contents` pointer.
#define array_splice(self, _index, old_count, new_count, new_contents) \
_array__splice( \
(Array *)(self), array_elem_size(self), _index, \
old_count, new_count, new_contents \
)
/// Insert one `element` into the array at the given `index`.
#define array_insert(self, _index, element) \
_array__splice((Array *)(self), array_elem_size(self), _index, 0, 1, &(element))
/// Remove one element from the array at the given `index`.
#define array_erase(self, _index) \
_array__erase((Array *)(self), array_elem_size(self), _index)
/// Pop the last element off the array, returning the element by value.
#define array_pop(self) ((self)->contents[--(self)->size])
/// Assign the contents of one array to another, reallocating if necessary.
#define array_assign(self, other) \
_array__assign((Array *)(self), (const Array *)(other), array_elem_size(self))
/// Swap one array with another
#define array_swap(self, other) \
_array__swap((Array *)(self), (Array *)(other))
/// Get the size of the array contents
#define array_elem_size(self) (sizeof *(self)->contents)
/// Search a sorted array for a given `needle` value, using the given `compare`
/// callback to determine the order.
///
/// If an existing element is found to be equal to `needle`, then the `index`
/// out-parameter is set to the existing value's index, and the `exists`
/// out-parameter is set to true. Otherwise, `index` is set to an index where
/// `needle` should be inserted in order to preserve the sorting, and `exists`
/// is set to false.
#define array_search_sorted_with(self, compare, needle, _index, _exists) \
_array__search_sorted(self, 0, compare, , needle, _index, _exists)
/// Search a sorted array for a given `needle` value, using integer comparisons
/// of a given struct field (specified with a leading dot) to determine the order.
///
/// See also `array_search_sorted_with`.
#define array_search_sorted_by(self, field, needle, _index, _exists) \
_array__search_sorted(self, 0, _compare_int, field, needle, _index, _exists)
/// Insert a given `value` into a sorted array, using the given `compare`
/// callback to determine the order.
#define array_insert_sorted_with(self, compare, value) \
do { \
unsigned _index, _exists; \
array_search_sorted_with(self, compare, &(value), &_index, &_exists); \
if (!_exists) array_insert(self, _index, value); \
} while (0)
/// Insert a given `value` into a sorted array, using integer comparisons of
/// a given struct field (specified with a leading dot) to determine the order.
///
/// See also `array_search_sorted_by`.
#define array_insert_sorted_by(self, field, value) \
do { \
unsigned _index, _exists; \
array_search_sorted_by(self, field, (value) field, &_index, &_exists); \
if (!_exists) array_insert(self, _index, value); \
} while (0)
// Private
typedef Array(void) Array;
/// This is not what you're looking for, see `array_delete`.
static inline void _array__delete(Array *self) {
if (self->contents) {
ts_free(self->contents);
self->contents = NULL;
self->size = 0;
self->capacity = 0;
}
}
/// This is not what you're looking for, see `array_erase`.
static inline void _array__erase(Array *self, size_t element_size,
uint32_t index) {
assert(index < self->size);
char *contents = (char *)self->contents;
memmove(contents + index * element_size, contents + (index + 1) * element_size,
(self->size - index - 1) * element_size);
self->size--;
}
/// This is not what you're looking for, see `array_reserve`.
static inline void _array__reserve(Array *self, size_t element_size, uint32_t new_capacity) {
if (new_capacity > self->capacity) {
if (self->contents) {
self->contents = ts_realloc(self->contents, new_capacity * element_size);
} else {
self->contents = ts_malloc(new_capacity * element_size);
}
self->capacity = new_capacity;
}
}
/// This is not what you're looking for, see `array_assign`.
static inline void _array__assign(Array *self, const Array *other, size_t element_size) {
_array__reserve(self, element_size, other->size);
self->size = other->size;
memcpy(self->contents, other->contents, self->size * element_size);
}
/// This is not what you're looking for, see `array_swap`.
static inline void _array__swap(Array *self, Array *other) {
Array swap = *other;
*other = *self;
*self = swap;
}
/// This is not what you're looking for, see `array_push` or `array_grow_by`.
static inline void _array__grow(Array *self, uint32_t count, size_t element_size) {
uint32_t new_size = self->size + count;
if (new_size > self->capacity) {
uint32_t new_capacity = self->capacity * 2;
if (new_capacity < 8) new_capacity = 8;
if (new_capacity < new_size) new_capacity = new_size;
_array__reserve(self, element_size, new_capacity);
}
}
/// This is not what you're looking for, see `array_splice`.
static inline void _array__splice(Array *self, size_t element_size,
uint32_t index, uint32_t old_count,
uint32_t new_count, const void *elements) {
uint32_t new_size = self->size + new_count - old_count;
uint32_t old_end = index + old_count;
uint32_t new_end = index + new_count;
assert(old_end <= self->size);
_array__reserve(self, element_size, new_size);
char *contents = (char *)self->contents;
if (self->size > old_end) {
memmove(
contents + new_end * element_size,
contents + old_end * element_size,
(self->size - old_end) * element_size
);
}
if (new_count > 0) {
if (elements) {
memcpy(
(contents + index * element_size),
elements,
new_count * element_size
);
} else {
memset(
(contents + index * element_size),
0,
new_count * element_size
);
}
}
self->size += new_count - old_count;
}
/// A binary search routine, based on Rust's `std::slice::binary_search_by`.
/// This is not what you're looking for, see `array_search_sorted_with` or `array_search_sorted_by`.
#define _array__search_sorted(self, start, compare, suffix, needle, _index, _exists) \
do { \
*(_index) = start; \
*(_exists) = false; \
uint32_t size = (self)->size - *(_index); \
if (size == 0) break; \
int comparison; \
while (size > 1) { \
uint32_t half_size = size / 2; \
uint32_t mid_index = *(_index) + half_size; \
comparison = compare(&((self)->contents[mid_index] suffix), (needle)); \
if (comparison <= 0) *(_index) = mid_index; \
size -= half_size; \
} \
comparison = compare(&((self)->contents[*(_index)] suffix), (needle)); \
if (comparison == 0) *(_exists) = true; \
else if (comparison < 0) *(_index) += 1; \
} while (0)
/// Helper macro for the `_sorted_by` routines below. This takes the left (existing)
/// parameter by reference in order to work with the generic sorting function above.
#define _compare_int(a, b) ((int)*(a) - (int)(b))
#ifdef _MSC_VER
#pragma warning(pop)
#elif defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic pop
#endif
#ifdef __cplusplus
}
#endif
#endif // TREE_SITTER_ARRAY_H_
@@ -0,0 +1,266 @@
#ifndef TREE_SITTER_PARSER_H_
#define TREE_SITTER_PARSER_H_
#ifdef __cplusplus
extern "C" {
#endif
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#define ts_builtin_sym_error ((TSSymbol)-1)
#define ts_builtin_sym_end 0
#define TREE_SITTER_SERIALIZATION_BUFFER_SIZE 1024
#ifndef TREE_SITTER_API_H_
typedef uint16_t TSStateId;
typedef uint16_t TSSymbol;
typedef uint16_t TSFieldId;
typedef struct TSLanguage TSLanguage;
#endif
typedef struct {
TSFieldId field_id;
uint8_t child_index;
bool inherited;
} TSFieldMapEntry;
typedef struct {
uint16_t index;
uint16_t length;
} TSFieldMapSlice;
typedef struct {
bool visible;
bool named;
bool supertype;
} TSSymbolMetadata;
typedef struct TSLexer TSLexer;
struct TSLexer {
int32_t lookahead;
TSSymbol result_symbol;
void (*advance)(TSLexer *, bool);
void (*mark_end)(TSLexer *);
uint32_t (*get_column)(TSLexer *);
bool (*is_at_included_range_start)(const TSLexer *);
bool (*eof)(const TSLexer *);
void (*log)(const TSLexer *, const char *, ...);
};
typedef enum {
TSParseActionTypeShift,
TSParseActionTypeReduce,
TSParseActionTypeAccept,
TSParseActionTypeRecover,
} TSParseActionType;
typedef union {
struct {
uint8_t type;
TSStateId state;
bool extra;
bool repetition;
} shift;
struct {
uint8_t type;
uint8_t child_count;
TSSymbol symbol;
int16_t dynamic_precedence;
uint16_t production_id;
} reduce;
uint8_t type;
} TSParseAction;
typedef struct {
uint16_t lex_state;
uint16_t external_lex_state;
} TSLexMode;
typedef union {
TSParseAction action;
struct {
uint8_t count;
bool reusable;
} entry;
} TSParseActionEntry;
typedef struct {
int32_t start;
int32_t end;
} TSCharacterRange;
struct TSLanguage {
uint32_t version;
uint32_t symbol_count;
uint32_t alias_count;
uint32_t token_count;
uint32_t external_token_count;
uint32_t state_count;
uint32_t large_state_count;
uint32_t production_id_count;
uint32_t field_count;
uint16_t max_alias_sequence_length;
const uint16_t *parse_table;
const uint16_t *small_parse_table;
const uint32_t *small_parse_table_map;
const TSParseActionEntry *parse_actions;
const char * const *symbol_names;
const char * const *field_names;
const TSFieldMapSlice *field_map_slices;
const TSFieldMapEntry *field_map_entries;
const TSSymbolMetadata *symbol_metadata;
const TSSymbol *public_symbol_map;
const uint16_t *alias_map;
const TSSymbol *alias_sequences;
const TSLexMode *lex_modes;
bool (*lex_fn)(TSLexer *, TSStateId);
bool (*keyword_lex_fn)(TSLexer *, TSStateId);
TSSymbol keyword_capture_token;
struct {
const bool *states;
const TSSymbol *symbol_map;
void *(*create)(void);
void (*destroy)(void *);
bool (*scan)(void *, TSLexer *, const bool *symbol_whitelist);
unsigned (*serialize)(void *, char *);
void (*deserialize)(void *, const char *, unsigned);
} external_scanner;
const TSStateId *primary_state_ids;
};
static inline bool set_contains(TSCharacterRange *ranges, uint32_t len, int32_t lookahead) {
uint32_t index = 0;
uint32_t size = len - index;
while (size > 1) {
uint32_t half_size = size / 2;
uint32_t mid_index = index + half_size;
TSCharacterRange *range = &ranges[mid_index];
if (lookahead >= range->start && lookahead <= range->end) {
return true;
} else if (lookahead > range->end) {
index = mid_index;
}
size -= half_size;
}
TSCharacterRange *range = &ranges[index];
return (lookahead >= range->start && lookahead <= range->end);
}
/*
* Lexer Macros
*/
#ifdef _MSC_VER
#define UNUSED __pragma(warning(suppress : 4101))
#else
#define UNUSED __attribute__((unused))
#endif
#define START_LEXER() \
bool result = false; \
bool skip = false; \
UNUSED \
bool eof = false; \
int32_t lookahead; \
goto start; \
next_state: \
lexer->advance(lexer, skip); \
start: \
skip = false; \
lookahead = lexer->lookahead;
#define ADVANCE(state_value) \
{ \
state = state_value; \
goto next_state; \
}
#define ADVANCE_MAP(...) \
{ \
static const uint16_t map[] = { __VA_ARGS__ }; \
for (uint32_t i = 0; i < sizeof(map) / sizeof(map[0]); i += 2) { \
if (map[i] == lookahead) { \
state = map[i + 1]; \
goto next_state; \
} \
} \
}
#define SKIP(state_value) \
{ \
skip = true; \
state = state_value; \
goto next_state; \
}
#define ACCEPT_TOKEN(symbol_value) \
result = true; \
lexer->result_symbol = symbol_value; \
lexer->mark_end(lexer);
#define END_STATE() return result;
/*
* Parse Table Macros
*/
#define SMALL_STATE(id) ((id) - LARGE_STATE_COUNT)
#define STATE(id) id
#define ACTIONS(id) id
#define SHIFT(state_value) \
{{ \
.shift = { \
.type = TSParseActionTypeShift, \
.state = (state_value) \
} \
}}
#define SHIFT_REPEAT(state_value) \
{{ \
.shift = { \
.type = TSParseActionTypeShift, \
.state = (state_value), \
.repetition = true \
} \
}}
#define SHIFT_EXTRA() \
{{ \
.shift = { \
.type = TSParseActionTypeShift, \
.extra = true \
} \
}}
#define REDUCE(symbol_name, children, precedence, prod_id) \
{{ \
.reduce = { \
.type = TSParseActionTypeReduce, \
.symbol = symbol_name, \
.child_count = children, \
.dynamic_precedence = precedence, \
.production_id = prod_id \
}, \
}}
#define RECOVER() \
{{ \
.type = TSParseActionTypeRecover \
}}
#define ACCEPT_INPUT() \
{{ \
.type = TSParseActionTypeAccept \
}}
#ifdef __cplusplus
}
#endif
#endif // TREE_SITTER_PARSER_H_
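For reference, a minimal standalone sketch of how the `set_contains` helper in the header above searches a sorted, non-overlapping range table. The `TSCharacterRange` type and the function body mirror the header; the `ident_start` table and the `is_ident_start` wrapper are hypothetical names introduced only for illustration and are not part of the generated parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same layout as the header's TSCharacterRange: an inclusive [start, end] range. */
typedef struct {
  int32_t start;
  int32_t end;
} TSCharacterRange;

/* Binary search over ranges sorted by start; mirrors the header's helper.
 * Assumes len >= 1, since the final check always reads ranges[index]. */
static inline bool set_contains(TSCharacterRange *ranges, uint32_t len, int32_t lookahead) {
  uint32_t index = 0;
  uint32_t size = len - index;
  while (size > 1) {
    uint32_t half_size = size / 2;
    uint32_t mid_index = index + half_size;
    TSCharacterRange *range = &ranges[mid_index];
    if (lookahead >= range->start && lookahead <= range->end) {
      return true;
    } else if (lookahead > range->end) {
      index = mid_index; /* target lies to the right of the midpoint */
    }
    size -= half_size;
  }
  TSCharacterRange *range = &ranges[index];
  return (lookahead >= range->start && lookahead <= range->end);
}

/* Hypothetical table: characters that may start an identifier, sorted ascending. */
static TSCharacterRange ident_start[] = {
  {'A', 'Z'},
  {'_', '_'},
  {'a', 'z'},
};

/* Hypothetical wrapper so callers do not need to pass the table length. */
bool is_ident_start(int32_t c) {
  uint32_t len = (uint32_t)(sizeof(ident_start) / sizeof(ident_start[0]));
  return set_contains(ident_start, len, c);
}
```

With this table, `is_ident_start('q')` probes the midpoint entry `_`, moves right, and matches `a`..`z`, while `is_ident_start('1')` falls through to the final check against the first range and is rejected.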
@@ -0,0 +1,249 @@
================================================================
shorthand if — single statement body, terminated by newline
================================================================
if (cond) honk()
toot()
----------------------------------------------------------------
(chunk
(shorthand_if_statement
condition: (identifier)
consequence: (function_call
name: (identifier)
arguments: (arguments)))
(function_call
name: (identifier)
arguments: (arguments)))
================================================================
shorthand if — single statement body, terminated by EOF
================================================================
if (cond) honk()
----------------------------------------------------------------
(chunk
(shorthand_if_statement
condition: (identifier)
consequence: (function_call
name: (identifier)
arguments: (arguments))))
================================================================
shorthand if — multi-statement body collected into shorthand
================================================================
if (is_falling()) wheeee() splat()
----------------------------------------------------------------
(chunk
(shorthand_if_statement
condition: (function_call
name: (identifier)
arguments: (arguments))
consequence: (function_call
name: (identifier)
arguments: (arguments))
consequence: (function_call
name: (identifier)
arguments: (arguments))))
================================================================
shorthand if — same-line else
================================================================
if (cond) honk() else toot()
----------------------------------------------------------------
(chunk
(shorthand_if_statement
condition: (identifier)
consequence: (function_call
name: (identifier)
arguments: (arguments))
alternative: (function_call
name: (identifier)
arguments: (arguments))))
================================================================
shorthand if — same-line multi-statement else
================================================================
if (cond) honk() else toot() squawk()
----------------------------------------------------------------
(chunk
(shorthand_if_statement
condition: (identifier)
consequence: (function_call
name: (identifier)
arguments: (arguments))
alternative: (function_call
name: (identifier)
arguments: (arguments))
alternative: (function_call
name: (identifier)
arguments: (arguments))))
================================================================
shorthand if nested in standard if — `else` on later line binds
to OUTER if, not the shorthand (PICO-8 line-significance)
================================================================
if is_noisy then
if (is_goose()) honk()
else
toot()
end
----------------------------------------------------------------
(chunk
(if_statement
condition: (identifier)
consequence: (block
(shorthand_if_statement
condition: (function_call
name: (identifier)
arguments: (arguments))
consequence: (function_call
name: (identifier)
arguments: (arguments))))
alternative: (else_statement
body: (block
(function_call
name: (identifier)
arguments: (arguments))))))
================================================================
shorthand if — line comment between body and newline still
terminates the shorthand at the newline (line comment is in
extras and is attached to the deepest enclosing node)
================================================================
if (cond) honk() -- inline
toot()
----------------------------------------------------------------
(chunk
(shorthand_if_statement
condition: (identifier)
consequence: (function_call
name: (identifier)
arguments: (arguments))
(comment
content: (comment_content)))
(function_call
name: (identifier)
arguments: (arguments)))
================================================================
shorthand if inside a do-block — newline before `end` terminates
shorthand, then `end` closes the do-block
================================================================
do
if (cond) honk()
end
----------------------------------------------------------------
(chunk
(do_statement
body: (block
(shorthand_if_statement
condition: (identifier)
consequence: (function_call
name: (identifier)
arguments: (arguments))))))
================================================================
shorthand while — multi-statement body, terminated by newline
================================================================
while (running) tick() draw()
cleanup()
----------------------------------------------------------------
(chunk
(shorthand_while_statement
condition: (identifier)
body: (function_call
name: (identifier)
arguments: (arguments))
body: (function_call
name: (identifier)
arguments: (arguments)))
(function_call
name: (identifier)
arguments: (arguments)))
================================================================
shorthand while — single statement body, terminated by EOF
================================================================
while (cond) tick()
----------------------------------------------------------------
(chunk
(shorthand_while_statement
condition: (identifier)
body: (function_call
name: (identifier)
arguments: (arguments))))
================================================================
nested shorthand ifs on the same line — a single newline must
terminate BOTH shorthands (otherwise the outer one greedily
absorbs the next-line statement)
================================================================
if (a) if (b) c()
d()
----------------------------------------------------------------
(chunk
(shorthand_if_statement
condition: (identifier)
consequence: (shorthand_if_statement
condition: (identifier)
consequence: (function_call
name: (identifier)
arguments: (arguments))))
(function_call
name: (identifier)
arguments: (arguments)))
================================================================
standard if with parenthesized condition coexists with shorthand
— GLR resolves on the token after `)` (then vs statement)
================================================================
if (cond) then a() end
if (cond) a()
----------------------------------------------------------------
(chunk
(if_statement
condition: (parenthesized_expression
(identifier))
consequence: (block
(function_call
name: (identifier)
arguments: (arguments))))
(shorthand_if_statement
condition: (identifier)
consequence: (function_call
name: (identifier)
arguments: (arguments))))
+27
@@ -0,0 +1,27 @@
{
"$schema": "https://tree-sitter.github.io/tree-sitter/assets/schemas/config.schema.json",
"grammars": [
{
"name": "pico8_lua",
"scope": "source.pico8-lua",
"path": ".",
"file-types": [
"p8lua"
],
"injection-regex": "^pico-?8[-_ ]?lua$"
}
],
"metadata": {
"version": "0.0.1",
"license": "MIT",
"description": "PICO-8 Lua dialect grammar (forked from tree-sitter-lua)"
},
"bindings": {
"c": true,
"go": false,
"node": true,
"python": false,
"rust": false,
"swift": false
}
}
+8 -9
@@ -1,11 +1,10 @@
-; Inject Lua syntax highlighting into the __lua__ section.
+; Hand the body of the __lua__ section to the Pico-8 Lua grammar so the
+; dialect-aware parser parses it (compound-assignment statements, the `?`
+; print shorthand, single-line `if (cond) stmt`, peek prefixes, etc.).
 ;
-; NOTE: This injects Zed's built-in Lua grammar, which does not
-; understand Pico-8 dialect extensions ( ?, +=, !=, single-line
-; `if (cond) stmt`, binary literals, memory peek prefix operators,
-; etc. ). Code that uses those forms will produce parse errors
-; locally, with degraded highlighting in those regions only — the
-; rest of the file will still render correctly. A future Pico-8
-; Lua grammar fork will replace this; see README for status.
+; The injection.language string must match the target language's `name`
+; field in its config.toml. Zed case-folds via UniCase but does not
+; treat hyphens as equivalent to spaces, so this has to be the literal
+; "Pico-8 Lua" ( with a space ).
 ((lua_content) @injection.content
-(#set! injection.language "lua"))
+(#set! injection.language "Pico-8 Lua"))
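The matching rule described in the comment above can be illustrated with a small C sketch. This is not Zed's code: Zed is written in Rust and, per the comment, compares names via UniCase. `names_match` is a hypothetical helper that folds ASCII case only and, like the behavior described, does not treat `-` as equivalent to a space.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical helper: case-insensitive (ASCII-only) equality.
 * Hyphens and spaces are compared as distinct characters. */
bool names_match(const char *a, const char *b) {
  while (*a && *b) {
    if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
      return false;
    }
    a++;
    b++;
  }
  /* Equal only if both strings ended at the same position. */
  return *a == *b;
}
```

Under this rule `"pico-8 lua"` matches the registered name `"Pico-8 Lua"`, but the earlier injection string `"pico-8-lua"` does not, which is consistent with the missing highlights the commit message describes.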
+4
@@ -0,0 +1,4 @@
("(" @open ")" @close)
("[" @open "]" @close)
("{" @open "}" @close)
("\"" @open "\"" @close)
+16
@@ -0,0 +1,16 @@
name = "Pico-8 Lua"
grammar = "pico8_lua"
path_suffixes = ["p8lua"]
line_comments = ["-- "]
block_comment = ["--[[", "]]"]
hard_tabs = false
tab_size = 1
autoclose_before = ";:.,=}])>"
brackets = [
{ start = "{", end = "}", close = true, newline = true },
{ start = "[", end = "]", close = true, newline = true },
{ start = "(", end = ")", close = true, newline = true },
{ start = "\"", end = "\"", close = true, newline = false, not_in = ["string"] },
{ start = "'", end = "'", close = true, newline = false, not_in = ["string", "comment"] },
{ start = "[[", end = "]]", close = true, newline = true },
]
+215
@@ -0,0 +1,215 @@
; --- Keywords ---
"return" @keyword.return
[
"goto"
"in"
"local"
] @keyword
(label_statement) @label
(break_statement) @keyword
(do_statement
["do" "end"] @keyword)
(while_statement
["while" "do" "end"] @keyword.repeat)
(shorthand_while_statement
"while" @keyword.repeat)
(repeat_statement
["repeat" "until"] @keyword.repeat)
(if_statement
["if" "elseif" "else" "then" "end"] @keyword.conditional)
(elseif_statement
["elseif" "then" "end"] @keyword.conditional)
(else_statement
["else" "end"] @keyword.conditional)
(shorthand_if_statement
["if" "else"] @keyword.conditional)
(for_statement
["for" "do" "end"] @keyword.repeat)
(function_declaration
["function" "end"] @keyword.function)
(function_definition
["function" "end"] @keyword.function)
; --- PICO-8 dialect: print shorthand and #include ---
(print_shorthand_statement
directive: "?" @function.builtin)
(include_statement
directive: _ @keyword.directive)
(include_statement
path: (include_path) @string.special.path)
; --- Operators ---
(binary_expression
operator: _ @operator)
(unary_expression
operator: _ @operator)
(compound_assignment_statement
operator: _ @operator)
"=" @operator
[
"and"
"not"
"or"
] @keyword.operator
; --- Punctuation ---
[
";"
":"
","
"."
] @punctuation.delimiter
[
"("
")"
"["
"]"
"{"
"}"
] @punctuation.bracket
; --- Variables and constants ---
(identifier) @variable
(variable_list
(attribute
"<" @punctuation.bracket
(identifier) @attribute
">" @punctuation.bracket))
((identifier) @constant
(#match? @constant "^[A-Z][A-Z_0-9]*$"))
(vararg_expression) @constant
(nil) @constant.builtin
[
(false)
(true)
] @boolean
; PICO-8 callback hooks — recognized by name regardless of definition site.
((identifier) @function.builtin
(#any-of? @function.builtin
"_init" "_update" "_update60" "_draw"))
; --- Tables ---
(field
name: (identifier) @property)
(dot_index_expression
field: (identifier) @property)
(table_constructor
["{" "}"] @constructor)
; --- Functions ---
(parameters
(identifier) @variable.parameter)
(function_declaration
name: [
(identifier) @function
(dot_index_expression
field: (identifier) @function)
])
(function_declaration
name: (method_index_expression
method: (identifier) @function.method))
(assignment_statement
(variable_list
.
name: [
(identifier) @function
(dot_index_expression
field: (identifier) @function)
])
(expression_list
.
value: (function_definition)))
(table_constructor
(field
name: (identifier) @function
value: (function_definition)))
(function_call
name: [
(identifier) @function.call
(dot_index_expression
field: (identifier) @function.call)
(method_index_expression
method: (identifier) @function.method.call)
])
; --- PICO-8 builtin globals ---
(function_call
(identifier) @function.builtin
(#any-of? @function.builtin
; system
"load" "save" "ls" "run" "stop" "assert" "reset" "info" "flip" "printh"
"time" "t" "stat" "extcmd" "holdframe" "_set_fps"
; graphics
"clip" "pset" "pget" "sget" "sset" "fget" "fset" "print" "cursor"
"color" "cls" "camera" "circ" "circfill" "oval" "ovalfill" "line"
"rect" "rectfill" "rrect" "rrectfill" "pal" "palt" "spr" "sspr" "fillp"
; tables
"add" "del" "deli" "count" "all" "foreach" "pairs" "ipairs" "next"
; input
"btn" "btnp"
; audio
"sfx" "music"
; map
"mget" "mset" "map" "tline"
; memory
"peek" "poke" "peek2" "poke2" "peek4" "poke4" "memcpy" "reload"
"cstore" "memset"
; math
"max" "min" "mid" "flr" "ceil" "cos" "sin" "atan2" "sqrt" "abs"
"rnd" "srand"
; bitwise (named forms — operator forms covered by @operator)
"band" "bor" "bxor" "bnot" "shl" "shr" "lshr" "rotl" "rotr"
; custom menu
"menuitem"
; strings / conversion
"tostr" "tonum" "chr" "ord" "sub" "split" "type"
; cart data
"cartdata" "dget" "dset"
; metatables
"setmetatable" "getmetatable" "rawset" "rawget" "rawequal" "rawlen"
; coroutines
"cocreate" "coresume" "costatus" "yield"
; error
"error" "pcall" "xpcall"))
; --- Misc ---
(comment) @comment
(hash_bang_line) @comment.documentation
(number) @number
(string) @string
(escape_sequence) @string.escape
+20
@@ -0,0 +1,20 @@
; Indent inside any block-bearing construct.
(do_statement) @indent
(while_statement) @indent
(repeat_statement) @indent
(for_statement) @indent
(if_statement) @indent
(elseif_statement) @indent
(else_statement) @indent
(function_declaration) @indent
(function_definition) @indent
(table_constructor) @indent
(parenthesized_expression) @indent
(arguments) @indent
; The closing keywords/brackets sit at the parent's indent level.
"end" @end
"until" @end
"}" @end
")" @end
"]" @end
+2
@@ -0,0 +1,2 @@
; ( reserved for future use — e.g. injecting hex blob content into peek/poke
; string literals, or shader code in spr() comments )
+12
@@ -0,0 +1,12 @@
; Top-level functions appear in the outline.
(function_declaration
"function" @context
name: (_) @name
parameters: (parameters) @context.extra) @item
; Local functions: local function foo()
(declaration
(function_declaration
"function" @context
name: (identifier) @name
parameters: (parameters) @context.extra)) @item
+4 -11
@@ -1,15 +1,8 @@
 {
-"name": "tree-sitter-p8-cart",
-"version": "0.0.1",
-"description": "tree-sitter grammar for the PICO-8 .p8 cartridge text format",
-"main": "bindings/node",
-"types": "bindings/node",
-"keywords": [
-"tree-sitter",
-"parser",
-"pico-8",
-"pico8"
-],
+"name": "zed-p8-workspace",
+"version": "0.0.2",
+"private": true,
+"description": "Workspace root for the zed-p8 extension; hosts tree-sitter-cli for the grammars under grammars/",
 "license": "0BSD",
 "devDependencies": {
 "tree-sitter-cli": "^0.24.7"
-1683
File diff suppressed because it is too large