How can I read JSON, CSV or Parquet with a schema? #3785
Is it possible to read a file while passing a schema, like in Spark?
Answered by
HaoYang670
Nov 9, 2022
Yes. You could try this to read https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/tests/example.csv with a schema:

    // Arrow types are taken from DataFusion's re-export, so no separate
    // `arrow` dependency is needed.
    use datafusion::arrow::datatypes::{DataType, Field, Schema};
    use datafusion::arrow::util::pretty;
    use datafusion::error::Result;
    use datafusion::prelude::*;

    #[tokio::main]
    async fn main() -> Result<()> {
        let ctx = SessionContext::new();
        let schema = generate_schema();

        // register csv file with the execution context, passing the
        // explicit schema instead of letting it be inferred
        ctx.register_csv(
            "example",
            "/home/remziy/learning/datafusion/datafusion/core/tests/example.csv",
            CsvReadOptions::new().has_header(true).schema(&schema),
        )
        .await?;

        // execute the query
        let df = ctx.sql("SELECT * from example").await?;
        let results = df.collect().await?;

        // print the results with data type
        println!("{:?}", results);

        // print the table
        pretty::print_batches(&results)?;
        Ok(())
    }

    // Build the schema for the three columns of example.csv.
    fn generate_schema() -> Schema {
        let field_a = Field::new("a", DataType::Int32, false);
        let field_b = Field::new("b", DataType::UInt16, false);
        let field_c = Field::new("c", DataType::Float32, false);
        Schema::new(vec![field_a, field_b, field_c])
    }

And this is the output I get: [output not included]
Answer selected by
Jefffrey
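
To illustrate what passing a schema changes compared with letting the reader infer types, here is a minimal standard-library sketch. It is not DataFusion's implementation; `Row`, `parse_field`, `read_with_schema` and the sample data are hypothetical names and values chosen for illustration. It only assumes the same column types as the schema in the answer (`a: Int32`, `b: UInt16`, `c: Float32`).

```rust
use std::str::FromStr;

/// One row, typed like the schema in the answer: a: Int32, b: UInt16, c: Float32.
#[derive(Debug, PartialEq)]
struct Row {
    a: i32,
    b: u16,
    c: f32,
}

/// Parse one field into the type the schema declares, naming the column on failure.
fn parse_field<T: FromStr>(raw: &str, column: &str) -> Result<T, String> {
    raw.trim()
        .parse::<T>()
        .map_err(|_| format!("column `{}`: cannot read {:?} as the declared type", column, raw))
}

/// Read CSV text that starts with a header line, applying the fixed schema to every row.
fn read_with_schema(csv: &str) -> Result<Vec<Row>, String> {
    let mut rows = Vec::new();
    // Skip the header: column types come from the schema, not from the file.
    for line in csv.lines().skip(1).filter(|l| !l.trim().is_empty()) {
        let fields: Vec<&str> = line.split(',').collect();
        if fields.len() != 3 {
            return Err(format!("expected 3 fields, found {}", fields.len()));
        }
        rows.push(Row {
            a: parse_field(fields[0], "a")?,
            b: parse_field(fields[1], "b")?,
            c: parse_field(fields[2], "c")?,
        });
    }
    Ok(rows)
}

fn main() {
    // Hypothetical sample data, not the contents of example.csv.
    let csv = "a,b,c\n1,2,3.5\n4,5,6.0\n";
    match read_with_schema(csv) {
        Ok(rows) => println!("{:?}", rows),
        Err(e) => eprintln!("schema error: {}", e),
    }
}
```

With a declared schema, a value that does not fit its column (for example a negative number in the unsigned column `b`) is reported as an error rather than silently changing the column's type, which is the main reason to supply a schema up front.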