5 changes: 1 addition & 4 deletions Getting_Started/Part_2_Reading_Spatial_Files.ipynb
@@ -92,7 +92,6 @@
"\n",
"- GeoParquet from an S3 bucket\n",
"- GeoJSON from the notebook's local file storage\n",
"- Shapefile\n",
"- A CSV file with latitude and longitude stored in two columns\n",
"\n",
"In all these examples, we are loading the data into an Apache Spark DataFrame."
@@ -175,15 +174,13 @@
"source": [
"Let's break those calls down.\n",
"\n",
"**GeoParquet**: The Wherobots [Spatial Catalog](https://cloud.wherobots.com/spatial-catalog) hosts datasets stored in S3 buckets. \n",
"**GeoParquet**: The Wherobots [Data Hub](https://cloud.wherobots.com/data-hub) hosts datasets stored in S3 buckets. \n",
"\n",
"- `format(\"geoparquet\")` → Specifies that we are reading a GeoParquet file.\n",
"- `load(\"s3a://...\")` → Loads the dataset directly from S3 without downloading it locally.\n",
"\n",
"**GeoJSON** is often used for web-based mapping applications. GeoJSON data is often hierarchical, so it's often useful to pull those fields from inside a struct and make them columns of their own.\n",
"\n",
"**Shapefiles** consist of multiple files (`.shp`, `.dbf`, `.shx`), so we load the directory containing them.\n",
"\n",
"**CSV** cannot store binary fields like geometries, so spatial data often needs to be converted so we can use WherobotsDB's spatial query functions.\n",
"\n",
"- `option(\"header\", \"true\")` → Reads the first line as column names.\n",
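
The CSV note in the hunk above says that latitude and longitude columns must be converted into a geometry before spatial functions can use them. A minimal sketch of that conversion follows, using only the Python standard library rather than the notebook's Spark session. The sample rows, the column names `lat` and `lon`, and the helper `rows_with_wkt` are all hypothetical and do not come from the notebook. In a Sedona-based engine the same step is typically done with a SQL constructor such as `ST_Point`.

```python
import csv
import io

# Hypothetical sample data; the column names "lat" and "lon" are assumptions.
SAMPLE_CSV = (
    "name,lat,lon\n"
    "Seattle,47.6062,-122.3321\n"
    "Austin,30.2672,-97.7431\n"
)


def rows_with_wkt(csv_text, lat_col="lat", lon_col="lon"):
    """Parse CSV text and add a WKT point string to every row.

    WKT orders coordinates as x then y, so longitude comes first.
    """
    reader = csv.DictReader(io.StringIO(csv_text))  # first line is the header
    rows = []
    for row in reader:
        lat = float(row[lat_col])
        lon = float(row[lon_col])
        row["wkt"] = f"POINT ({lon} {lat})"
        rows.append(row)
    return rows


points = rows_with_wkt(SAMPLE_CSV)
for point in points:
    print(point["name"], point["wkt"])
```

The longitude-first ordering is the detail most often gotten wrong: a point built as (lat, lon) still parses, but lands in the wrong place on the map.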