The --collapse-root-models switch can cause "Cannot parse for target version Python 3.xx" errors #2161

Open
smcl opened this issue Nov 11, 2024 · 0 comments


Describe the bug
I can reliably cause datamodel-code-generator to error when processing a jsonschema file which is otherwise valid, and which can be made to succeed by tweaking the schema very slightly in a way that doesn't fundamentally alter it.

To Reproduce

Example schema:

{
    "$ref": "#/definitions/LogicalExpression",
    "$schema": "http://json-schema.org/draft-07/schema#",
    "definitions": {
      "ValueExpression": {
        "title": "ValueExpression",
        "anyOf": [
          {
            "$ref": "#/definitions/ConditionalValueExpression"
          }
        ]
      },
      "ConditionalValueExpression": {
        "additionalProperties": false,
        "title": "ConditionalValueExpression",
        "properties": {
          "default": {
            "$ref": "#/definitions/ValueExpression"
          }
        },
        "type": "object"
      },
      "LogicalExpression": {
        "title": "LogicalExpression",
        "anyOf": [
          {
            "$ref": "#/definitions/ValueExpression"
          },
          {
            "type": "string"
          }
        ]
      } 
    }
  }
  

Used commandline:

$ datamodel-codegen --input schema2.json --output model.py --collapse-root-models --target-python-version=3.10
The input file type was determined to be: jsonschema
This can be specified explicitly with the `--input-file-type` option.
Traceback (most recent call last):
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\__main__.py", line 476, in main
    generate(
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\__init__.py", line 485, in generate
    results = parser.parse()
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\parser\base.py", line 1474, in parse
    body = code_formatter.format_code(body)
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\format.py", line 238, in format_code
    code = self.apply_black(code)
  File "C:\Users\sean.mclemon\src\scratch\scm-schema-test\venv\lib\site-packages\datamodel_code_generator\format.py", line 246, in apply_black
    return black.format_str(
  File "src\black\__init__.py", line 1204, in format_str
  File "src\black\__init__.py", line 1218, in _format_str_once
  File "src\black\parsing.py", line 98, in lib2to3_parse
black.parsing.InvalidInput: Cannot parse for target version Python 3.10: 9:20:     __root__: Union[, str] = Field(..., title='Expression')

Expected behavior

The command should complete successfully and generate classes in model.py. Interestingly, the schema can be adjusted very slightly, in a way that is functionally identical, and code generation then succeeds without any issue. If you swap the order of the two entries inside the anyOf in LogicalExpression (as below), code generation succeeds.

      "LogicalExpression": {
        "title": "LogicalExpression",
        "anyOf": [
          {
            "type": "string"
          },
          {
            "$ref": "#/definitions/ValueExpression"
          }
        ]
      } 

I know the schema may not make sense and may look stupid but I trimmed down a fairly large json schema to get a minimal example that reproduces the problem. And, as I said, it can be adjusted very trivially so that code generation succeeds.

Version:

  • OS: Windows 10
  • Python version: 3.10.11
  • datamodel-code-generator version: 0.26.3

Additional context

Note that while I say the schema can be adjusted so that it works, this isn't a feasible workaround for us. We are consuming a pretty large schema file that is generated automatically, so identifying and rearranging every part of the schema that triggers the issue would be really difficult.
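For anyone wanting to experiment, the reordering could in principle be scripted as a pre-processing step. The following is only a sketch: the helper name `refs_last` is made up for illustration, and it assumes that moving `$ref` entries to the end of every anyOf avoids the error in general, which I have only observed for the minimal example above.

```python
import json


def refs_last(node):
    """Return a copy of a schema with "$ref" entries moved to the end of each anyOf list.

    Hypothetical helper, not part of datamodel-code-generator.
    """
    if isinstance(node, dict):
        out = {key: refs_last(value) for key, value in node.items()}
        any_of = out.get("anyOf")
        if isinstance(any_of, list):
            # sorted() is stable: non-$ref entries keep their relative order,
            # and $ref entries (key True) are moved after them.
            out["anyOf"] = sorted(
                any_of, key=lambda entry: isinstance(entry, dict) and "$ref" in entry
            )
        return out
    if isinstance(node, list):
        return [refs_last(item) for item in node]
    return node


if __name__ == "__main__":
    fragment = {
        "title": "LogicalExpression",
        "anyOf": [{"$ref": "#/definitions/ValueExpression"}, {"type": "string"}],
    }
    print(json.dumps(refs_last(fragment), indent=2))
```

The order of entries in an anyOf does not affect validation, so the rewritten schema should accept the same documents. Whether it sidesteps the bug for every schema is untested.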

And when I say it errors/succeeds: it looks like the code itself is generated, and the failure only happens when the datamodel-codegen tool invokes black to reformat it, because the type Union[, str] isn't valid Python. So during the collapse process I guess types get flattened and removed, but an empty entry is still left in their place. And presumably when the order of the anyOf is reversed we end up with Union[str, ], which is syntactically fine.
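The syntax difference between the two forms can be checked with the standard library alone, without involving black or datamodel-code-generator. This is a small sketch using `ast.parse`; the helper name `parses` is made up for illustration, and the second line is my guess at what the generated code looks like with the reversed anyOf, not captured output.

```python
import ast


def parses(source: str) -> bool:
    """Return True if the source is syntactically valid Python."""
    try:
        ast.parse(source)  # only parses; names like Union and Field are never evaluated
        return True
    except SyntaxError:
        return False


# The line from the traceback: the empty first element is a syntax error.
failing = "__root__: Union[, str] = Field(..., title='Expression')"

# Presumed output with the reversed anyOf: a trailing comma in a subscript is allowed.
passing = "__root__: Union[str, ] = Field(..., title='Expression')"

if __name__ == "__main__":
    print(parses(failing), parses(passing))
```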
