docs(iceberg): update JSON schema translation behavior#1603
docs(iceberg): update JSON schema translation behavior#1603nvartolomei wants to merge 1 commit intomainfrom
Conversation
📝 WalkthroughWalkthroughThe change updates JSON Schema validation guidance documentation for Iceberg schema specification. It clarifies and expands supported JSON Schema features: Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Deploy Preview for redpanda-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
modules/manage/pages/iceberg/specify-iceberg-schema.adoc (3)
313-313: Consider adding inline clarification for "exclusiveoneOfnullable pattern".The term "exclusive
oneOfnullable pattern" is used here but only defined later in the "Keyword behavior" section (Line 347). Readers encountering this in the table may be confused.Consider adding a brief inline example or parenthetical clarification, such as:
The `null` type is only supported as a nullability marker, either in a `type` array (for example, `["string", "null"]`) or in an exclusive `oneOf` nullable pattern (where exactly one branch is `{"type":"null"}` and the other is a non-null schema).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc` at line 313, Add a short parenthetical clarification to the sentence about the `null` type so readers immediately understand "exclusive `oneOf` nullable pattern" without jumping to the later section; for example, append a brief phrase after "oneOf nullable pattern" such as "(i.e., an exclusive oneOf where exactly one branch is {"type":"null"} and the other branch is a non-null schema)" referencing the `null` type and `oneOf` examples so the meaning is clear in-place.
329-329: Minor grammar improvement: remove redundant "object".The phrase "A schema-valued
additionalPropertiesobject" is slightly awkward. The word "object" at the end is redundant sinceadditionalPropertiesitself is the keyword.✍️ Suggested rewording
Consider one of these alternatives:
Option 1:
-Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. A schema-valued `additionalProperties` object is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object. +Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. A schema-valued `additionalProperties` is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object.Option 2:
-Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. A schema-valued `additionalProperties` object is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object. +Use `properties` to define `struct` fields and constrain their types. `additionalProperties: false` is supported for closed objects. When `additionalProperties` is a schema object, it is translated to an Iceberg `map<string, T>`. You cannot combine `properties` and schema-valued `additionalProperties` in the same object.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc` at line 329, The sentence "A schema-valued `additionalProperties` object is translated to an Iceberg `map<string, T>`." uses a redundant trailing "object"; update the wording to remove it so it reads e.g. "A schema-valued `additionalProperties` is translated to an Iceberg `map<string, T>`." Make this change wherever this exact sentence appears (referencing the `additionalProperties` keyword and the Iceberg `map<string, T>` translation) to improve grammar and clarity.
354-355: Improve technical terminology for JSON Schema features.Two phrases use non-standard or ambiguous terminology:
Line 354: "Conditional typing (
if,then,else,dependencieskeywords)" - The term "conditional typing" is not standard JSON Schema terminology. The standard term is "conditional subschemas" or "schema dependencies".Line 355: "Boolean JSON Schema combinations" - This is non-standard terminology. The standard terms are "schema combinators" or "schema composition keywords".
📚 Suggested improvements
-* Conditional typing (`if`, `then`, `else`, `dependencies` keywords) -* Boolean JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns) +* Conditional subschemas (`if`, `then`, `else`, `dependencies` keywords) +* Schema combinators (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)Or alternatively:
-* Conditional typing (`if`, `then`, `else`, `dependencies` keywords) -* Boolean JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns) +* Conditional schema application (`if`, `then`, `else`, `dependencies` keywords) +* Boolean schema composition (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc` around lines 354 - 355, Replace the non-standard phrases in the bulleted list: change "Conditional typing (`if`, `then`, `else`, `dependencies` keywords)" to use standard JSON Schema terminology such as "Conditional subschemas (`if`, `then`, `else`) and schema dependencies (`dependencies`)" and change "Boolean JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)" to "Schema combinators / composition keywords (`allOf`, `anyOf`, `oneOf`)" so the document uses "conditional subschemas" and "schema combinators" consistently (update the two phrase occurrences shown in the diff).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc`:
- Line 346: The phrase "references to unknown keywords" in the sentence about
$ref is ambiguous; update the wording around the $ref/$id sentence to enumerate
the exact unsupported cases (e.g., references to external resources, references
to non-existent schema $id values, undefined anchors or unresolvable
fragments/JSON Pointer fragments) and keep the note that a root-level $ref
schema is not supported; reference the $ref and $id tokens as the symbols to
edit so the text explicitly says "References to external resources, non-existent
$id values, or undefined anchors/fragments are not supported" (or similar
concise wording).
---
Nitpick comments:
In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc`:
- Line 313: Add a short parenthetical clarification to the sentence about the
`null` type so readers immediately understand "exclusive `oneOf` nullable
pattern" without jumping to the later section; for example, append a brief
phrase after "oneOf nullable pattern" such as "(i.e., an exclusive oneOf where
exactly one branch is {"type":"null"} and the other branch is a non-null
schema)" referencing the `null` type and `oneOf` examples so the meaning is
clear in-place.
- Line 329: The sentence "A schema-valued `additionalProperties` object is
translated to an Iceberg `map<string, T>`." uses a redundant trailing "object";
update the wording to remove it so it reads e.g. "A schema-valued
`additionalProperties` is translated to an Iceberg `map<string, T>`." Make this
change wherever this exact sentence appears (referencing the
`additionalProperties` keyword and the Iceberg `map<string, T>` translation) to
improve grammar and clarity.
- Around line 354-355: Replace the non-standard phrases in the bulleted list:
change "Conditional typing (`if`, `then`, `else`, `dependencies` keywords)" to
use standard JSON Schema terminology such as "Conditional subschemas (`if`,
`then`, `else`) and schema dependencies (`dependencies`)" and change "Boolean
JSON Schema combinations (`allOf`, `anyOf`, and non-nullable `oneOf` patterns)"
to "Schema combinators / composition keywords (`allOf`, `anyOf`, `oneOf`)" so
the document uses "conditional subschemas" and "schema combinators" consistently
(update the two phrase occurrences shown in the diff).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 73bd4385-5e4f-4e87-b541-1003e1b95486
📒 Files selected for processing (1)
modules/manage/pages/iceberg/specify-iceberg-schema.adoc
|
|
||
| Keyword behavior: | ||
|
|
||
| * The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to unknown keywords are not supported. A root-level `$ref` schema is not supported. |
There was a problem hiding this comment.
Clarify "references to unknown keywords".
The phrase "references to unknown keywords" is unclear. In JSON Schema, $ref typically references schema resources (via $id, anchors, or URIs), not keywords directly.
This phrase might mean:
- References to non-existent schema IDs
- References to undefined anchors or fragments
- Something else
📝 Suggested clarification
Consider more specific wording, such as:
-The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to unknown keywords are not supported. A root-level `$ref` schema is not supported.
+The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to undefined schema resources are not supported. A root-level `$ref` schema is not supported.Or if you specifically mean JSON Pointer fragments:
References to external resources and unresolvable fragments are not supported.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| * The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to unknown keywords are not supported. A root-level `$ref` schema is not supported. | |
| * The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to undefined schema resources are not supported. A root-level `$ref` schema is not supported. |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/manage/pages/iceberg/specify-iceberg-schema.adoc` at line 346, The
phrase "references to unknown keywords" in the sentence about $ref is ambiguous;
update the wording around the $ref/$id sentence to enumerate the exact
unsupported cases (e.g., references to external resources, references to
non-existent schema $id values, undefined anchors or unresolvable fragments/JSON
Pointer fragments) and keep the note that a root-level $ref schema is not
supported; reference the $ref and $id tokens as the symbols to edit so the text
explicitly says "References to external resources, non-existent $id values, or
undefined anchors/fragments are not supported" (or similar concise wording).
|
|
||
| Keyword behavior: | ||
|
|
||
| * The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to unknown keywords are not supported. A root-level `$ref` schema is not supported. |
There was a problem hiding this comment.
| * The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to unknown keywords are not supported. A root-level `$ref` schema is not supported. | |
| * The `$ref` keyword is supported for internal references resolved from schema resources declared in the same document (using `$id`), including relative and absolute URI forms. References to external resources and references to undefined schema resources are not supported. A root-level `$ref` schema is not supported. |
Description
Resolves https://redpandadata.atlassian.net/browse/DOC-1957
Review deadline: v26.1 release
Page previews
Checks