@RPG-Alex RPG-Alex commented Oct 16, 2025

This PR refactors the internal representation of comments in the tokenizer and parser and adds leading comment support for AlterTable, CreateTable, and ColumnDef.

The SingleLine and MultiLine comment variants previously defined in the Whitespace enum are now represented by a new, dedicated Comment enum. Comment is used for both interstitial (whitespace) comments and leading comments. Leading comment support is added to AlterTable, CreateTable, and ColumnDef, so that a leading comment is parsed alongside the associated struct (see the example below).

Rationale

This change improves the semantic clarity of comment handling in the SQL tokenizer and parser.
As discussed in #2065, comments preceding a table or column definition may serve as inline documentation, and should be distinguishable from interstitial (whitespace) comments.
For example:

-- Leading comment for table users
CREATE TABLE IF NOT EXISTS users (
  id BIGINT PRIMARY KEY,
  -- Leading comment for table field name
  name TEXT NOT NULL -- interstitial comment ignored by parser
);

Term Definitions

Interstitial Comment: a comment preceded by something other than a comma or semicolon (a comment preceded by nothing is a leading comment). For example:

CREATE -- comment trailing the create keyword 
TABLE -- comment trailing the table keyword 
my_table

Leading Comment: a comment that is preceded by either nothing, a comma, or a semicolon.
The variants currently covered include single-line comments:

-- a comment preceding the create table statement 
CREATE TABLE IF NOT EXISTS users (...

and multi-line comments:

/* a multi-line comment 
Preceding this table */
CREATE TABLE IF NOT EXISTS users (...

By separating comment handling from generic whitespace, the parser can now support more context-aware interpretations and contribute to a lossless syntax tree, addressing #175 and complementing PR #189.
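The two definitions above can be read as a rule over the token that precedes a comment. The following is a minimal sketch of that rule only, not the code in this PR: `Tok`, `CommentKind`, and `classify` are hypothetical names, whitespace tokens are assumed to have been filtered out already, and only the immediately preceding token is inspected (how consecutive comments are treated is not specified in the description).

```rust
/// Hypothetical, simplified token type used only for this illustration.
/// Whitespace tokens are assumed to have been filtered out beforehand.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Word(String),
    Comma,
    SemiColon,
    Comment(String),
}

/// Hypothetical classification result mirroring the two term definitions.
#[derive(Debug, PartialEq)]
enum CommentKind {
    Leading,
    Interstitial,
}

/// Classifies the comment at `idx` by the token immediately before it.
/// Returns `None` if `idx` is out of range or does not point at a comment.
fn classify(tokens: &[Tok], idx: usize) -> Option<CommentKind> {
    if !matches!(tokens.get(idx)?, Tok::Comment(_)) {
        return None;
    }
    // `prev` is `None` when the comment is the very first token.
    let prev = idx.checked_sub(1).and_then(|i| tokens.get(i));
    Some(match prev {
        // Preceded by nothing, a comma, or a semicolon: leading comment.
        None | Some(Tok::Comma) | Some(Tok::SemiColon) => CommentKind::Leading,
        // Preceded by anything else: interstitial comment.
        Some(_) => CommentKind::Interstitial,
    })
}
```

Under these assumptions, a comment at the start of the input or directly after a comma classifies as `Leading`, while a comment directly after a keyword such as `CREATE` classifies as `Interstitial`.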

Summary of Changes

  • Added: Comment enum encapsulating SingleLine and MultiLine comment variants.
  • Refactored:
    • Whitespace now includes an InterstitialComment(Comment) variant.
    • Tokenizer.rs now emits Comment values instead of Whitespace::SingleLineComment / Whitespace::MultiLineComment.
    • parser/mod.rs now implements LeadingComment parsing for CreateTable, ColumnDef, and AlterTable.
    • Added a leading_comment: Option<Comment> field to CreateTable, ColumnDef, and AlterTable, for example:
pub struct ColumnDef {
    pub name: Ident,
    pub data_type: DataType,
    pub options: Vec<ColumnOptionDef>,
    /// Leading comment for the column.
    pub leading_comment: Option<Comment>,
}
  • Added logic for differentiating InterstitialComment and LeadingComment.
  • Propagated refactors to all dependent parser components.
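To make the shape of the refactor concrete, here is a rough sketch of how a dedicated Comment enum and the new Whitespace variant might fit together. Only the names Comment, SingleLine, MultiLine, Whitespace, and InterstitialComment come from the summary above. The variant payloads, the other Whitespace variants, and the Display implementation are assumptions made for illustration and may differ from the actual diff.

```rust
use std::fmt;

/// Sketch of a dedicated comment type. The payload shapes are assumed,
/// not taken from the PR diff.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Comment {
    /// A single-line comment: `prefix` is the marker (e.g. `--`),
    /// `comment` is the text that follows it.
    SingleLine { prefix: String, comment: String },
    /// A multi-line comment body, without the `/*` and `*/` delimiters.
    MultiLine(String),
}

/// Simplified whitespace type: comments are no longer separate variants
/// here but are wrapped in a single `InterstitialComment` variant.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Whitespace {
    Space,
    Newline,
    Tab,
    InterstitialComment(Comment),
}

impl fmt::Display for Comment {
    /// Renders the comment back to source text (assumed formatting).
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Comment::SingleLine { prefix, comment } => write!(f, "{}{}", prefix, comment),
            Comment::MultiLine(body) => write!(f, "/*{}*/", body),
        }
    }
}
```

With this layout the same Comment value can be stored either in Whitespace::InterstitialComment or in an Option<Comment> field such as leading_comment, so both kinds of comment share one representation.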

@RPG-Alex RPG-Alex changed the title from "Single Line and Multi Line Comment Support" to "Leading comment support added for AlterTable, CreateTable and ColumnDef`" Oct 16, 2025
@RPG-Alex RPG-Alex changed the title from "Leading comment support added for AlterTable, CreateTable and ColumnDef`" to "Leading comment support added for AlterTable, CreateTable and ColumnDef" Oct 16, 2025
@RPG-Alex RPG-Alex changed the title from "Leading comment support added for AlterTable, CreateTable and ColumnDef" to "Leading comment support added for AlterTable, CreateTable, and ColumnDef" Oct 16, 2025