Skip to content

[Bug] databricks-aibi-dashboards: queryLines array elements fuse without separator → invalid SQL deployed #501

@dbalistair

Description

@dbalistair

Summary

When building an AI/BI dashboard via the databricks-aibi-dashboards skill, dataset SQL is constructed as an array of strings under datasets[].queryLines. Lakeview concatenates those array elements with empty string (no newline, no space) when it executes the dataset. If any element ends with an identifier and the next element begins with a SQL keyword, the result is invalid SQL with fused tokens.

The skill's reference example (4-examples.md) happens to use trailing spaces on most lines, which sidesteps the problem cosmetically — but the convention is never explained as load-bearing. A reader picks up the structure and naturally writes lines without trailing whitespace, then ships broken SQL.

Reproduction

Build a dataset whose queryLines looks like this:

"queryLines": [
  "SELECT a.Id, a.Name, a.OwnerId",
  "FROM some_catalog.some_schema.some_table a",
  "JOIN other_table h ON a.OwnerId = h.user_id",
  "LEFT JOIN third_table p ON p.user_id = h.user_id",
  "LEFT JOIN fourth_table c ON c.account_id = a.Id",
  "WHERE COALESCE(a.IsDeleted, FALSE) = FALSE"
]

After deployment, fetching the dashboard back via manage_dashboard get and concatenating queryLines with empty string (which is what Lakeview does at execution) yields:

... ON a.OwnerId = h.user_idLEFT JOIN third_table p ON p.user_id = h.user_idLEFT JOIN fourth_table c ON c.account_id = a.IdWHERE COALESCE(a.IsDeleted, FALSE) = FALSE

user_idLEFT, IdWHERE — fused tokens. Dashboard renders with a SQL parse error in every widget that references the dataset.

Lines that happen to end with ,, (, or ) are immune (those characters are token delimiters in SQL). Lines that end with an identifier (alias, column, table) and are followed by a line starting with a keyword (LEFT, WHERE, JOIN, ORDER, GROUP, HAVING, UNION, etc.) are the failure case.

Expected behavior

Either:

  • Lakeview should join queryLines with newlines, OR
  • The skill should prevent broken array shapes from being deployed.

Actual behavior

Skill happily deploys an array of strings whose empty-string concatenation produces invalid SQL. The post-deploy widget error is generic ("Invalid widget definition" / SQL parse error), so the cause isn't obvious from the symptom.

The MANDATORY VALIDATION WORKFLOW in SKILL.md says to test every dataset query via execute_sql before deployment. That step passes — the SQL is valid when assembled in a SQL editor — but the queryLines array form post-join is what actually executes, and that form is never tested.

Suggested fixes (in order of effort)

  1. Docs warning, easy — add an explicit note in SKILL.md and 1-widget-specifications.md:

    WARNING: queryLines array elements are concatenated with empty string. Each element must end with whitespace, a newline, or a token delimiter (, ( )), OR the next element must begin with whitespace. Otherwise identifiers will fuse with adjacent keywords. Safest: pass the entire SQL as a single multi-line string in a one-element array with \n newlines preserved.

  2. Skill validation, medium — in the validation workflow, after building the dataset JSON, do an empty-string concat of queryLines and run that exact string through execute_sql. Catches the fusion case automatically.

  3. MCP tool normalization, robust — in manage_dashboard create_or_update, detect adjacent queryLines elements where last-char + first-char would fuse identifiers (regex: \w$ followed by ^\w) and either:

    • Auto-insert \n between them, OR
    • Always rewrite multi-element queryLines to a single-element array internally.

    This is the most resilient fix because no skill reader has to know the trap exists.

Workaround

Pass the entire SQL as one multi-line string in a one-element queryLines array:

"queryLines": ["WITH cte AS (\n  SELECT ...\n)\nSELECT ...\nFROM cte"]

Lakeview then splits on newlines for storage but preserves them, so the empty-string concat at execution time still yields valid SQL.

Environment

  • Skill: databricks-aibi-dashboards
  • MCP tool: manage_dashboard action create_or_update
  • Workflow followed: get_table_stats_and_schemaexecute_sql (passed) → manage_dashboard create_or_update (deployed broken SQL silently)
  • Detected by inspecting the deployed dashboard via manage_dashboard get and concatenating queryLines manually

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions