For string fields, I am trying to get the length. My current code does not return anything:

    public static async Task<(string Name, Type ClrType, int Length, int Precision, int Scale)[]> GetParquetSchema(string parquetFilePath)
    {
        using (Stream fs = System.IO.File.OpenRead(parquetFilePath))
        {
            using (ParquetReader reader = await ParquetReader.CreateAsync(fs))
            {
                // Fetch the data fields once instead of once per use
                DataField[] fields = reader.Schema.GetDataFields();
                var schema = new (string Name, Type ClrType, int Length, int Precision, int Scale)[fields.Length];
                int i = 0;
                foreach (DataField df in fields)
                {
                    // Populate the array with the Name, ClrType, Length, Precision and Scale of the DataField
                    schema[i] = (df.Name, df.ClrType, df.ThriftSchemaElement.Type_length, df.ThriftSchemaElement.Precision, df.ThriftSchemaElement.Scale);
                    i++;
                }
                return schema;
            }
        }
    }

If the column/field length cannot be found in DataField.ThriftSchemaElement.Type_length, where can I find it?
Answered by
aloneguid
Apr 6, 2023
Strings are variable length. Do you mean the number of values?
I switched to a constant, MAX_VARCHAR_LEN:

    public static async Task<(string Name, Type ClrType, int Length, int Precision, int Scale)[]> GetParquetSchema(string parquetFilePath)
    {
        using (Stream fs = System.IO.File.OpenRead(parquetFilePath))
        {
            using (ParquetReader reader = await ParquetReader.CreateAsync(fs))
            {
                // Fetch the data fields once instead of once per use
                DataField[] fields = reader.Schema.GetDataFields();
                var schema = new (string Name, Type ClrType, int Length, int Precision, int Scale)[fields.Length];
                int i = 0;
                foreach (DataField df in fields)
                {
                    // Populate the array with the Name, ClrType, MAX_VARCHAR_LEN (until a better alternative is found), Precision and Scale of the DataField
                    schema[i] = (df.Name, df.ClrType, MAX_VARCHAR_LEN, df.ThriftSchemaElement.Precision, df.ThriftSchemaElement.Scale);
                    i++;
                }
                return schema;
            }
        }
    }
Correct. Unlike relational tables, strings in Parquet are not limited by length. Hope you find a good solution.
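Since the schema carries no declared length for strings, one possible alternative to a fixed constant is to scan the data and take the longest value actually observed in each string column. The following is only a sketch under assumptions, not something proposed in this thread: the class and method names (ParquetStringLengths, MaxStringLength, GetMaxStringLengthsAsync) are hypothetical, the namespaces assume Parquet.Net 4.x, and the scan reads every string column in full, so it may be slow on large files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Parquet;
using Parquet.Data;
using Parquet.Schema; // assumption: Parquet.Net 4.x namespace layout

// Hypothetical helper class, not part of Parquet.Net.
public static class ParquetStringLengths
{
    // Returns the length of the longest string in the sequence.
    // Null entries are skipped; an empty sequence yields 0.
    public static int MaxStringLength(IEnumerable<string> values)
    {
        int max = 0;
        foreach (string s in values)
        {
            if (s != null && s.Length > max)
            {
                max = s.Length;
            }
        }
        return max;
    }

    // Scans every row group and records, per string column,
    // the longest value observed anywhere in the file.
    public static async Task<Dictionary<string, int>> GetMaxStringLengthsAsync(string parquetFilePath)
    {
        var result = new Dictionary<string, int>();

        using (Stream fs = File.OpenRead(parquetFilePath))
        using (ParquetReader reader = await ParquetReader.CreateAsync(fs))
        {
            // Only string-typed fields are of interest here.
            DataField[] stringFields = reader.Schema.GetDataFields()
                .Where(f => f.ClrType == typeof(string))
                .ToArray();

            for (int g = 0; g < reader.RowGroupCount; g++)
            {
                using (ParquetRowGroupReader rowGroup = reader.OpenRowGroupReader(g))
                {
                    foreach (DataField field in stringFields)
                    {
                        DataColumn column = await rowGroup.ReadColumnAsync(field);
                        int groupMax = MaxStringLength(column.Data.Cast<string>());

                        // Keep the largest value seen across row groups.
                        int current;
                        result.TryGetValue(field.Name, out current);
                        result[field.Name] = Math.Max(current, groupMax);
                    }
                }
            }
        }

        return result;
    }
}
```

Note that string.Length counts UTF-16 code units, not bytes, so a byte-based VARCHAR size in the target database would need a different measure (for example the UTF-8 byte count). The observed maximum also only describes the current file; later files may contain longer values.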