refactor(message): rewrite parse logic#84
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR refactors the message parsing logic by moving the complex text processing from the UserMessage.toFragments() method to a new dedicated parse.ts module. The main purpose is to improve code organization and maintainability by separating parsing concerns from the message class.
Key changes:
- Replaces the
toFragments()method with anodesgetter that uses the new parser - Introduces a new parsing system with
Nodetypes instead ofFragmenttypes - Updates component interfaces to use the new node structure
Reviewed Changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/lib/message/user-message.ts |
Removes complex parsing logic and Fragment types, adds nodes getter using new parser |
src/lib/message/parse.ts |
New parsing module with Node types and parse function for message text processing |
src/lib/message/index.ts |
Exports the new parse module |
src/lib/handlers/eventsub/* |
Updates method calls from addAutoModMetadata to setAutoMod |
src/lib/components/message/Message.svelte |
Updates to use Node types instead of Fragment types |
src/lib/components/Emote.svelte |
Renames overlays prop to layers |
package.json |
Adds tldts dependency for URL parsing |
Files not reviewed (1)
- pnpm-lock.yaml: Language not supported
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| }; | ||
|
|
||
| const url = URL.parse(`https://${part.replace(/^https?:\/\/|\.$/i, "")}`); | ||
| const result = url ? parseTld(url.hostname) : null; |
There was a problem hiding this comment.
The URL parsing logic is flawed. The regex /^https?:\/\/|\.$/i will remove trailing dots from all text, not just URLs. For example, 'Hello.' would become 'Hello' and incorrectly be parsed as a URL. Consider checking if the part starts with 'http' before applying URL parsing logic.
| const result = url ? parseTld(url.hostname) : null; | |
| let url: URL | null = null; | |
| let result: ReturnType<typeof parseTld> | null = null; | |
| // Only try to parse as URL if part looks like a URL or domain | |
| if ( | |
| /^https?:\/\//i.test(part) || | |
| // domain-like: contains a dot, no spaces, not just a number | |
| (/\./.test(part) && !/\s/.test(part) && !/^\d+$/.test(part)) | |
| ) { | |
| const cleaned = part.replace(/^https?:\/\/|\.$/i, ""); | |
| try { | |
| url = URL.parse(`https://${cleaned}`); | |
| result = url ? parseTld(url.hostname) : null; | |
| } catch { | |
| url = null; | |
| result = null; | |
| } | |
| } |
| return hasPrefix && hasBits; | ||
| }); | ||
|
|
||
| const ircEmote = ircEmotes.find((e) => e.code === part); |
There was a problem hiding this comment.
IRC emotes are being matched by token text rather than by their position ranges. This could cause incorrect matching when the same emote code appears multiple times in a message, as the first occurrence found will be used regardless of position.
| const ircEmote = ircEmotes.find((e) => e.code === part); | |
| const ircEmote = ircEmotes.find((e) => e.start === start && e.end === end); |
| }); | ||
| } | ||
| } | ||
| } else if (ircEmote) { |
There was a problem hiding this comment.
The IRC emote handling creates a node with ircEmote.range.start and ircEmote.range.end positions, but these positions may not align with the current token's position. This could result in overlapping or misaligned nodes when the emote spans multiple tokens or partial tokens.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Closes #77
Just need to reimplement automod boundary marking