Generate TextMate grammar from Tree-sitter grammar #2
Comments
I spent a while searching for a proper converter between the two. It looks like a hard task, since I couldn't find a single one. I do have a few ideas for automatically updating parts of the grammar, though, and I'll create a draft PR in a moment if you want to share your thoughts. |
Thanks for your interest. Yes, it will be a bit tricky to write a generic tree-sitter to TextMate grammar converter. Anyway, feel free to open a draft PR if you have something to share. |
I'm just gonna give my thoughts here before I commit to any coding. The https://github.com/charmbracelet/tree-sitter-vhs/blob/main/src/grammar.json file is easy to parse and contains a lot of information that we can scrape. For example, the setting rule:
"setting": {
  "type": "CHOICE",
  "members": [
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "Shell" }, { "type": "SYMBOL", "name": "string" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "FontFamily" }, { "type": "SYMBOL", "name": "string" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "FontSize" }, { "type": "SYMBOL", "name": "float" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "Framerate" }, { "type": "SYMBOL", "name": "integer" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "PlaybackSpeed" }, { "type": "SYMBOL", "name": "float" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "Height" }, { "type": "SYMBOL", "name": "integer" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "LetterSpacing" }, { "type": "SYMBOL", "name": "float" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "TypingSpeed" }, { "type": "SYMBOL", "name": "time" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "LineHeight" }, { "type": "SYMBOL", "name": "float" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "Padding" }, { "type": "SYMBOL", "name": "float" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "Theme" }, { "type": "CHOICE", "members": [{ "type": "SYMBOL", "name": "json" }, { "type": "SYMBOL", "name": "string" }] }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "LoopOffset" }, { "type": "SEQ", "members": [{ "type": "SYMBOL", "name": "float" }, { "type": "CHOICE", "members": [{ "type": "STRING", "value": "%" }, { "type": "BLANK" }] }] }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "Width" }, { "type": "SYMBOL", "name": "integer" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "BorderRadius" }, { "type": "SYMBOL", "name": "integer" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "Margin" }, { "type": "SYMBOL", "name": "integer" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "MarginFill" }, { "type": "SYMBOL", "name": "string" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "WindowBar" }, { "type": "SYMBOL", "name": "string" }] },
    { "type": "SEQ", "members": [{ "type": "STRING", "value": "WindowBarSize" }, { "type": "SYMBOL", "name": "integer" }] }
  ]
}
Super easy to extract from:
#!/usr/bin/env node
// './tree-sitter.json' is presumably a local copy of src/grammar.json
const data = require('./tree-sitter.json');
const settings = [];
for (const setting of data.rules.setting.members) {
  // The first member of each SEQ is the STRING node holding the setting name.
  settings.push(setting.members[0].value);
}
console.log(settings);
Output:
[
'Shell', 'FontFamily',
'FontSize', 'Framerate',
'PlaybackSpeed', 'Height',
'LetterSpacing', 'TypingSpeed',
'LineHeight', 'Padding',
'Theme', 'LoopOffset',
'Width', 'BorderRadius',
'Margin', 'MarginFill',
'WindowBar', 'WindowBarSize'
]
We don't have to write something like this for every little bit, but it could be a good way to easily update some parts. A workflow that runs once a week could check if anything has changed and update it automatically. |
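To make the "update some parts" idea concrete, here is a minimal sketch of turning the extracted setting names into a single TextMate-style match rule. It assumes a trimmed, inlined copy of the `setting` rule instead of reading the real file, and the scope name `keyword.other.setting.vhs` and the helper names are hypothetical, not taken from this repo.

```javascript
// Trimmed stand-in for the "setting" rule in grammar.json (assumption: only
// three of the eighteen alternatives are kept to keep the sketch short).
const grammarJson = {
  rules: {
    setting: {
      type: 'CHOICE',
      members: [
        { type: 'SEQ', members: [{ type: 'STRING', value: 'Shell' }, { type: 'SYMBOL', name: 'string' }] },
        { type: 'SEQ', members: [{ type: 'STRING', value: 'FontSize' }, { type: 'SYMBOL', name: 'float' }] },
        { type: 'SEQ', members: [{ type: 'STRING', value: 'WindowBarSize' }, { type: 'SYMBOL', name: 'integer' }] },
      ],
    },
  },
};

// Escape characters that are special in a regular expression.
const escapeRegex = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Collect the leading STRING of every SEQ alternative of the setting rule.
function settingNames(g) {
  return g.rules.setting.members
    .filter((m) => m.type === 'SEQ' && m.members[0].type === 'STRING')
    .map((m) => m.members[0].value);
}

// Build one match rule covering all setting names.
// The scope name is a hypothetical placeholder.
function settingsRule(names) {
  return {
    name: 'keyword.other.setting.vhs',
    match: '\\b(?:' + names.map(escapeRegex).join('|') + ')\\b',
  };
}

const rule = settingsRule(settingNames(grammarJson));
console.log(JSON.stringify(rule, null, 2));
```

A scheduled workflow could regenerate this one rule and commit only when the serialized result differs from what is already in the grammar file.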
The file https://github.com/charmbracelet/tree-sitter-vhs/blob/main/src/grammar.json gets generated from https://github.com/charmbracelet/tree-sitter-vhs/blob/main/grammar.js. I think we can use the latter to generate the TextMate grammar. |
I noticed, but it seems harder to scrape/generate it from a JS file... I'll take another look. |
You don't have to scrape it. Think of how tree-sitter itself must use this file to generate the resulting JSON. Can we override the global functions used in grammar.js (project, seq, choice, repeat, etc.) and use the same file to generate a TextMate grammar instead of a tree-sitter grammar? |
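A rough sketch of what overriding those globals could look like, under stated assumptions: the stand-in functions below are written from scratch and only imitate the node shapes visible in grammar.json (the `PATTERN` type for regexes is assumed), the `keywords` helper is hypothetical, and a three-rule grammar is inlined instead of loading the real grammar.js.

```javascript
// Turn bare strings and RegExps into nodes, leave existing nodes alone.
function normalize(x) {
  if (typeof x === 'string') return { type: 'STRING', value: x };
  if (x instanceof RegExp) return { type: 'PATTERN', value: x.source };
  return x;
}

// Stand-ins for the DSL globals; each returns a plain node object.
const seq = (...members) => ({ type: 'SEQ', members: members.map(normalize) });
const choice = (...members) => ({ type: 'CHOICE', members: members.map(normalize) });
const repeat = (content) => ({ type: 'REPEAT', content: normalize(content) });
const optional = (content) => choice(content, { type: 'BLANK' });

// `$.anything` yields a SYMBOL node naming the referenced rule.
const $ = new Proxy({}, { get: (_, name) => ({ type: 'SYMBOL', name }) });

// Replacement for grammar(): evaluate every rule function with the proxy.
function grammar({ name, rules }) {
  const out = {};
  for (const [rule, fn] of Object.entries(rules)) out[rule] = normalize(fn($));
  return { name, rules: out };
}

// Hypothetical consumer: collect every literal keyword under a node.
function keywords(node, acc = []) {
  if (node.type === 'STRING') acc.push(node.value);
  const children = node.members ?? (node.content ? [node.content] : []);
  for (const child of children) keywords(child, acc);
  return acc;
}

// Inlined excerpt in the same style as grammar.js.
const g = grammar({
  name: 'vhs',
  rules: {
    sleep: $ => seq('Sleep', $.time),
    enter: $ => seq('Enter', optional($.speed), optional($.integer)),
    time: $ => /\d*\.?\d+m?s?/,
  },
});

console.log(keywords(g.rules.enter));
console.log(JSON.stringify(g.rules.sleep));
```

With the real file, these functions would have to be installed as globals before loading grammar.js. The tree itself is the easy part; deciding which TextMate scope each rule should get still has to come from somewhere else.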
Totally, that's why I said scrape/generate. The only issue I'm noticing is just naming certain patterns and rulesets. I'll give it a go tonight and see what gives. |
I'm gonna be honest this is pretty difficult. A lot of it has to be hard-coded into the functions and it might honestly be easier to just do it by hand.
module.exports = grammar({
  name: 'vhs',
  rules: {
    program: $ => repeat(choice($.command, $.comment)),
    command: $ => choice(
      $.control,
      $.alt,
      $.hide,
      $.show,
      $.output,
      $.sleep,
      $.type,
      $.backspace,
      $.down,
      $.enter,
      $.escape,
      $.left,
      $.right,
      $.set,
      $.space,
      $.tab,
      $.up,
      $.pageup,
      $.pagedown,
    ),
    control: $ => /Ctrl\+[A-Z]/,
    alt: $ => /Alt\+[A-Z]/,
    hide: $ => seq('Hide'),
    show: $ => seq('Show'),
    output: $ => seq('Output', $.path),
    set: $ => seq('Set', $.setting),
    sleep: $ => seq('Sleep', $.time),
    type: $ => seq('Type', optional($.speed), repeat1($.string)),
    backspace: $ => seq('Backspace', optional($.speed), optional($.integer)),
    down: $ => seq('Down', optional($.speed), optional($.integer)),
    enter: $ => seq('Enter', optional($.speed), optional($.integer)),
    escape: $ => seq('Escape', optional($.speed), optional($.integer)),
    left: $ => seq('Left', optional($.speed), optional($.integer)),
    right: $ => seq('Right', optional($.speed), optional($.integer)),
    space: $ => seq('Space', optional($.speed), optional($.integer)),
    tab: $ => seq('Tab', optional($.speed), optional($.integer)),
    up: $ => seq('Up', optional($.speed), optional($.integer)),
    pageup: $ => seq('PageUp', optional($.speed), optional($.integer)),
    pagedown: $ => seq('PageDown', optional($.speed), optional($.integer)),
    setting: $ => choice(
      seq('Shell', $.string),
      seq('FontFamily', $.string),
      seq('FontSize', $.float),
      seq('Framerate', $.integer),
      seq('PlaybackSpeed', $.float),
      seq('Height', $.integer),
      seq('LetterSpacing', $.float),
      seq('TypingSpeed', $.time),
      seq('LineHeight', $.float),
      seq('Padding', $.float),
      seq('Theme', choice($.json, $.string)),
      seq('LoopOffset', seq($.float, optional('%'))),
      seq('Width', $.integer),
      seq('BorderRadius', $.integer),
      seq('Margin', $.integer),
      seq('MarginFill', $.string),
      seq('WindowBar', $.string),
      seq('WindowBarSize', $.integer),
    ),
    string: $ => choice(/"[^"]*"/, /'[^']*'/, /`[^`]*`/),
    comment: $ => /#.*/,
    float: $ => /\d*\.?\d+/,
    integer: $ => /\d+/,
    json: $ => /\{.*\}/,
    path: $ => /[\.\-\/A-Za-z0-9%]+/,
    speed: $ => seq('@', $.time),
    time: $ => /\d*\.?\d+m?s?/,
  }
});
There all of the types ( |
No worries. Thanks for looking into this @uncenter, I really appreciate you spending time on it. I know this is a bit tricky. We can definitely do this by parsing the JSON, and we are already maintaining this repo by hand. I see this as a coding exercise and want to solve it by writing a good-enough parser. Let me look into this and come up with a small writeup on how this can be achieved, maybe with some example code. If it looks achievable, maybe you can pick up from there. This can become a good learning experience for both of us, if you are up for it :) |
Totally! I would love to figure this out, I'm just totally stumped/lost. |
I gave it a shot here: https://github.com/griimick/vscode-vhs/blob/treesitter-textmate/generate.js I found out that the tokens generated by the tree-sitter grammar are less detailed than those in the TextMate grammar in this repo. Tree-sitter tokens also do not map directly to the highlight definitions. On top of that, TextMate uses Ruby-style regexes, and I don't think JS regexes can always be converted to them, since the two flavors are not fully compatible. Knowing all this, I am now inclined to maintain the rules manually. If someone still wants to give it a shot, feel free. |
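To illustrate the mapping gap, here is a small sketch of a hand-written table from tree-sitter rule names to TextMate scopes, with anything missing from the table reported for manual handling. The table contents and the `assignScopes` helper are hypothetical; the scope names follow common TextMate naming conventions rather than what this repo actually uses, and the sketch does not check whether a rule's JS regex is valid for TextMate's engine.

```javascript
// Hypothetical hand-maintained table: tree-sitter rule name -> TextMate scope.
const SCOPES = {
  comment: 'comment.line.number-sign.vhs',
  integer: 'constant.numeric.integer.vhs',
  float: 'constant.numeric.float.vhs',
  string: 'string.quoted.vhs',
};

// Split rule names into those with a known scope and those that still
// need a manual decision.
function assignScopes(ruleNames) {
  const mapped = {};
  const unmapped = [];
  for (const name of ruleNames) {
    if (Object.prototype.hasOwnProperty.call(SCOPES, name)) {
      mapped[name] = SCOPES[name];
    } else {
      unmapped.push(name);
    }
  }
  return { mapped, unmapped };
}

const { mapped, unmapped } = assignScopes(['comment', 'integer', 'time', 'json']);
console.log(mapped);
console.log(unmapped);
```

Every rule that lands in `unmapped` is a decision a generator cannot make on its own, which is the part that ends up being maintained by hand either way.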
Exactly my thinking. At least we tried 😅... |
The maintenance effort will drop drastically if we can generate the TextMate grammar, which is used by VS Code, from the official VHS tree-sitter grammar.