Add v0.1 scaffold: tree-sitter-p8-cart grammar and Zed extension

The grammar parses the .p8 cartridge container (header, version, and
the named __lua__/__gfx__/__gff__/__label__/__map__/__sfx__/__music__
sections, plus a fallback unknown_section). The Zed language definition
hands the __lua__ body to Zed's built-in Lua via injections.scm, so
standard Lua code highlights correctly today. PICO-8-specific syntax
(?, +=, !=, single-line if, etc.) falls back to error highlighting
in those regions only; see the README roadmap for the dialect grammar
fork.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 11:50:47 -07:00
parent b4191384e0
commit 3f209efa89
20 changed files with 6450 additions and 1 deletions
@@ -0,0 +1,74 @@
/**
 * tree-sitter grammar for the PICO-8 .p8 cartridge text format.
 *
 * The .p8 format is a flat text container divided into named sections
 * delimited by lines of the form `__name__`. The first section is
 * always `__lua__` and contains the cart's Lua source; the remaining
 * sections (`__gfx__`, `__gff__`, `__label__`, `__map__`, `__sfx__`,
 * `__music__`) hold hex-encoded asset data. The file begins with a
 * fixed magic header line and a `version N` line.
 *
 * This grammar is intentionally minimal: it parses the section
 * structure and exposes each section's body as a single named node
 * so that injection queries (see languages/pico8-cart/injections.scm)
 * can hand the contents off to other languages, most importantly
 * Lua for the `__lua__` section.
 */
module.exports = grammar({
  name: 'p8_cart',

  // Whitespace is significant inside hex sections, so we don't skip it.
  extras: $ => [],

  rules: {
    cartridge: $ => seq(
      optional($.header),
      optional($.version),
      repeat($.section),
    ),

    header: $ => /pico-8 cartridge \/\/[^\n]*\n/,
    version: $ => /version[ \t]+\d+\n/,

    section: $ => choice(
      $.lua_section,
      $.gfx_section,
      $.gff_section,
      $.label_section,
      $.map_section,
      $.sfx_section,
      $.music_section,
      $.unknown_section,
    ),

    lua_section: $ => seq($.lua_marker, optional($.lua_content)),
    gfx_section: $ => seq($.gfx_marker, optional($.body)),
    gff_section: $ => seq($.gff_marker, optional($.body)),
    label_section: $ => seq($.label_marker, optional($.body)),
    map_section: $ => seq($.map_marker, optional($.body)),
    sfx_section: $ => seq($.sfx_marker, optional($.body)),
    music_section: $ => seq($.music_marker, optional($.body)),
    unknown_section: $ => seq($.section_marker, optional($.body)),

    lua_marker: $ => token(prec(2, '__lua__\n')),
    gfx_marker: $ => token(prec(2, '__gfx__\n')),
    gff_marker: $ => token(prec(2, '__gff__\n')),
    label_marker: $ => token(prec(2, '__label__\n')),
    map_marker: $ => token(prec(2, '__map__\n')),
    sfx_marker: $ => token(prec(2, '__sfx__\n')),
    music_marker: $ => token(prec(2, '__music__\n')),
    section_marker: $ => token(prec(1, /__[a-z][a-z0-9_]*__\n/)),

    lua_content: $ => repeat1($.line),
    body: $ => repeat1($.line),

    // A single physical line. The lexer prefers section markers over
    // generic lines via the precedence above, so a line that happens
    // to be exactly `__name__\n` will tokenize as a marker, not a line.
    line: $ => choice(
      token(prec(0, /[^\n]*\n/)),
      token(prec(0, /[^\n]+/)), // final line with no trailing newline
    ),
  },
});
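
The section structure the grammar describes can be illustrated outside tree-sitter with a short plain-JavaScript sketch. This is not part of the commit and is not how tree-sitter itself parses: `splitCart`, `KNOWN`, and the sample cart text (including the made-up `__meta__` section) are hypothetical names chosen for illustration. The sketch only reuses the same regular expressions as the `header`, `version`, marker, and `line` rules to show which lines end up in which section.

```javascript
// Hypothetical helper, not part of this commit: splits .p8 text along the
// same boundaries as the grammar (header, version, marker-delimited sections).
const KNOWN = new Set(['lua', 'gfx', 'gff', 'label', 'map', 'sfx', 'music']);

function splitCart(text) {
  // Keep each line's trailing newline, mirroring the two `line` alternatives.
  const lines = text.match(/[^\n]*\n|[^\n]+/g) || [];
  const cart = { header: null, version: null, sections: [] };
  let i = 0;

  // Both the header and the version line are optional, as in `cartridge`.
  if (i < lines.length && /^pico-8 cartridge \/\/[^\n]*\n$/.test(lines[i])) {
    cart.header = lines[i++];
  }
  if (i < lines.length && /^version[ \t]+\d+\n$/.test(lines[i])) {
    cart.version = lines[i++];
  }

  for (; i < lines.length; i++) {
    // A line that is exactly `__name__\n` is a marker, never body text.
    const m = /^__([a-z][a-z0-9_]*)__\n$/.exec(lines[i]);
    if (m) {
      const type = KNOWN.has(m[1]) ? `${m[1]}_section` : 'unknown_section';
      cart.sections.push({ type, marker: lines[i], body: '' });
    } else if (cart.sections.length > 0) {
      cart.sections[cart.sections.length - 1].body += lines[i];
    }
    // Lines before the first marker fit no rule in the grammar; this
    // sketch simply drops them.
  }
  return cart;
}

const sample =
  'pico-8 cartridge // http://www.pico-8.com\n' +
  'version 41\n' +
  '__lua__\n' +
  'x += 1\n' +
  '__gfx__\n' +
  '00ff\n' +
  '__meta__\n';

const cart = splitCart(sample);
console.log(cart.sections.map(s => s.type));
```

In this sample the `x += 1` line lands in the body of the `lua_section` entry, which corresponds to the region the commit message says is handed to Zed's Lua grammar, and the unrecognised `__meta__` marker falls through to `unknown_section` in the same way the lower-precedence `section_marker` token is meant to.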