lezer-python-regex

A Lezer grammar for parsing Python regular expressions with incremental parsing support and TypeScript definitions.

Install

npm i lezer-python-regex

Features

  • Basic patterns, character classes, quantifiers, groups
  • Lookarounds, backreferences, conditionals, alternation
  • Inline flags, embedded comments, escape sequences
  • Named groups (?P<name>...), atomic groups (?>...) (added in Python 3.11)
  • Full Python regex syntax compliance

Usage

Basic

import { parser } from "lezer-python-regex";

// String.raw keeps the backslashes; a plain template literal would turn \w into w
const tree = parser.parse(String.raw`(?P<word>\w+)\s+(?P=word)`);
console.log(tree.toString());
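
Python regex patterns are dense with backslashes, and an ordinary JavaScript template literal silently drops the backslash from escapes it does not recognize (`\w` becomes `w`). A minimal sketch of the difference, using only the built-in `String.raw`; the variable names are illustrative:

```typescript
// Plain template literal: `\w` and `\s` are not recognized escapes,
// so the backslashes are dropped before the parser ever sees them.
const cooked = `(?P<word>\w+)\s+(?P=word)`;

// String.raw keeps every backslash exactly as typed.
const raw = String.raw`(?P<word>\w+)\s+(?P=word)`;

console.log(cooked); // (?P<word>w+)s+(?P=word)
console.log(raw);    // (?P<word>\w+)\s+(?P=word)
```

Patterns read from user input or a file are unaffected; this only matters for patterns written as literals in source code.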

With CodeMirror

import { parser, pythonRegexHighlighting } from "lezer-python-regex";
import { HighlightStyle, LRLanguage, syntaxHighlighting } from "@codemirror/language";

const pythonRegexLanguage = LRLanguage.define({
  parser,
  languageData: { name: "python-regex" },
});

const highlightStyle = HighlightStyle.define([pythonRegexHighlighting]);
const extensions = [pythonRegexLanguage, syntaxHighlighting(highlightStyle)];

Tree Navigation

import { parser } from "lezer-python-regex";
import * as terms from "lezer-python-regex";

const tree = parser.parse(`(?P<email>[^@]+@[^@]+)`);
const cursor = tree.cursor();

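// Find named groups. The callback receives a reusable node reference,
// so read its offsets right away instead of storing the object.
cursor.iterate((node) => {
  if (node.type.id === terms.NamedCapturingGroup) {
    console.log("Named group found:", node.name, node.from, node.to);
  }
});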
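
Nodes carry only offsets, so recovering the matched text means slicing the original pattern. The helper below is a sketch, not part of the package: `textOfNodes` and the two interfaces are hypothetical names that describe just the subset of the Lezer `Tree` shape it relies on (`iterate` with an `enter` callback), and the node name `NamedCapturingGroup` is assumed to match the term constant used above.

```typescript
// Minimal structural types: only the parts of a Lezer tree this helper uses.
interface NodeRef {
  name: string;
  from: number;
  to: number;
}
interface TreeLike {
  iterate(spec: { enter(node: NodeRef): boolean | void }): void;
}

// Collect the source text covered by every node with the given name.
function textOfNodes(tree: TreeLike, source: string, name: string): string[] {
  const found: string[] = [];
  tree.iterate({
    enter(node) {
      // Copy the text immediately; node refs may be reused during iteration.
      if (node.name === name) found.push(source.slice(node.from, node.to));
    },
  });
  return found;
}

// Usage (assumes `parser` from "lezer-python-regex"; not executed here):
//   const pattern = "(?P<email>[^@]+@[^@]+)";
//   textOfNodes(parser.parse(pattern), pattern, "NamedCapturingGroup");
```

Matching on `node.name` rather than the numeric term id keeps the helper independent of the generated constants, at the cost of a string comparison per node.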

Error Handling

import { parser } from "lezer-python-regex";

function parseWithErrors(pattern: string) {
  const tree = parser.parse(pattern);
  const errors: { from: number; to: number; message: string }[] = [];

  tree.cursor().iterate((node) => {
    if (node.type.isError) {
      errors.push({
        from: node.from,
        to: node.to,
        message: `Syntax error at ${node.from}-${node.to}`,
      });
    }
  });

  return { tree, errors };
}
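
The `from`/`to` offsets make it straightforward to point at the offending span. A sketch of one way to render them; `underline` is a hypothetical helper, not part of the package, and it assumes a single-line pattern:

```typescript
interface ErrorRange {
  from: number;
  to: number;
}

// Render the pattern with a row of carets under each error range.
// Zero-width ranges (which may occur when something is missing) get one caret.
function underline(pattern: string, errors: ErrorRange[]): string {
  // One extra slot so an error at the very end of the pattern is visible.
  const marks = new Array<string>(pattern.length + 1).fill(" ");
  for (const { from, to } of errors) {
    const end = Math.max(to, from + 1);
    for (let i = from; i < end && i < marks.length; i++) marks[i] = "^";
  }
  return `${pattern}\n${marks.join("").trimEnd()}`;
}

console.log(underline("(abc", [{ from: 4, to: 4 }]));
// (abc
//     ^
```

The `errors` array returned by `parseWithErrors` can be passed in directly, since each entry already has `from` and `to`.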

API

Exports

  • parser - Lezer parser instance
  • pythonRegexHighlighting - CodeMirror syntax highlighting
  • Grammar terms - Node type constants for tree navigation

Types

parser.parse(input: string, fragments?: TreeFragment[], ranges?: {from: number, to: number}[]): Tree
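
The `fragments` argument is what enables incremental parsing: fragments of a previous tree, adjusted for the edit, let the parser reuse unchanged nodes. A rough sketch of describing an edit as a `{ fromA, toA, fromB, toB }` range, the shape that `TreeFragment.applyChanges` from `@lezer/common` expects. `changedRange` is a hypothetical helper, and treating every edit as a single contiguous change is an assumption made for brevity:

```typescript
// Same shape as ChangedRange in @lezer/common.
interface ChangedRange {
  fromA: number; // start of the change in the old text
  toA: number;   // end of the change in the old text
  fromB: number; // start of the change in the new text
  toB: number;   // end of the change in the new text
}

// Describe the difference between two strings as one contiguous edit
// by trimming their common prefix and suffix.
function changedRange(oldText: string, newText: string): ChangedRange {
  const max = Math.min(oldText.length, newText.length);
  let start = 0;
  while (start < max && oldText[start] === newText[start]) start++;
  let endA = oldText.length;
  let endB = newText.length;
  while (endA > start && endB > start && oldText[endA - 1] === newText[endB - 1]) {
    endA--;
    endB--;
  }
  return { fromA: start, toA: endA, fromB: start, toB: endB };
}

// Usage with the real parser (not executed here):
//   import { TreeFragment } from "@lezer/common";
//   const oldTree = parser.parse(oldText);
//   const fragments = TreeFragment.applyChanges(
//     TreeFragment.addTree(oldTree),
//     [changedRange(oldText, newText)],
//   );
//   const newTree = parser.parse(newText, fragments);

console.log(changedRange("a+b", "a+?b")); // { fromA: 2, toA: 2, fromB: 2, toB: 3 }
```

Inside CodeMirror this bookkeeping is handled by the editor's language integration, so a manual diff like this is mainly relevant when using the parser on its own.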

Development

git clone https://github.com/Sec-ant/lezer-python-regex
cd lezer-python-regex
pnpm install
pnpm build
pnpm test

Test categories: basic patterns, character classes, quantifiers, groups, lookarounds, backreferences, conditionals, alternation, comments, complex patterns, edge cases.

Commands:

  • pnpm test:run - Run all tests
  • pnpm test:ui - Interactive test UI

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Add tests in tests/fixtures/
  4. Ensure tests pass
  5. Submit a pull request

License

MIT
