When using PySpark, we try to use concrete types everywhere instead of strings to improve quality and maintainability.
We recently came across this:
from pyspark.sql import functions as F
from pyspark.sql import types as T

_ = F.from_json(
    F.col("value"),
    schema=T.MapType(T.StringType(), T.StringType()),  # >>> TYPE ERROR
)
This code works! ✅
However, type checkers are not happy: 🛑
Type Error: Argument to function from_json is incorrect: Expected ArrayType | StructType | Column | str, found MapType.
Is there a reason that MapType is not included in the type hint for the schema parameter?
There is even a concrete example using MAP in the function documentation of from_json:
df.select(sf.from_json(df.value, "MAP<STRING,INT>").alias("json")).show()