Pyspark: from_json is missing MapType as a type hint for schema argument #55900

@Julian-J-S

When using PySpark, we try to use concrete types rather than strings everywhere, to improve code quality and maintainability.

We recently came across this:

from pyspark.sql import functions as F
from pyspark.sql import types as T

_ = F.from_json(
    F.col("value"),
    schema=T.MapType(T.StringType(), T.StringType())  # >>> TYPE ERROR
)

This code works! ✅

However, type checkers are not happy: 🛑
Type Error: Argument to function from_json is incorrect: Expected ArrayType | StructType | Column | str, found MapType.

Is there a reason that MapType is not included in the type hint for the schema argument?
The from_json documentation even includes a concrete example that uses a MAP schema:

df.select(sf.from_json(df.value, "MAP<STRING,INT>").alias("json")).show()
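
A possible workaround until the annotation is widened: since `str` is already part of the accepted union, the schema can be passed as a DDL string, just like the documentation example above. This is only a minimal sketch; `map_ddl` and `parse_json_map` are hypothetical helper names that are not part of PySpark, and the column name is an assumption.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Only needed for the annotation below; not imported at runtime.
    from pyspark.sql import Column


def map_ddl(key_type: str, value_type: str) -> str:
    """Build a DDL schema string such as 'MAP<STRING,INT>' (hypothetical helper)."""
    return f"MAP<{key_type},{value_type}>"


def parse_json_map(
    column_name: str, key_type: str = "STRING", value_type: str = "STRING"
) -> "Column":
    """Parse a JSON string column into a map column using a DDL schema string."""
    # Imported lazily so map_ddl stays usable without a Spark installation.
    from pyspark.sql import functions as F

    # `str` is one of the annotated schema types, so type checkers accept this call.
    return F.from_json(F.col(column_name), schema=map_ddl(key_type, value_type))


print(map_ddl("STRING", "INT"))
```

With an active Spark session, `df.select(parse_json_map("value").alias("json"))` would be the equivalent of the failing call above. The trade-off is that this gives up the concrete `MapType` object in favor of a string, which is exactly what the issue would like to avoid, so it is a stopgap rather than a fix.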
