Commit

Fix up CSV data tutorial: fix typos; link to pandas.DataFrame docs on first usage, rather than second; add transition sentence to mixed data types section.

PiperOrigin-RevId: 572711158
pcoet authored and copybara-github committed Oct 11, 2023
1 parent 2bcfa24 commit c6fe18c
Showing 1 changed file with 7 additions and 5 deletions.
12 changes: 7 additions & 5 deletions site/en/tutorials/load_data/csv.ipynb
@@ -120,7 +120,7 @@
"id": "ny5TEgcmHjVx"
},
"source": [
-"For any small CSV dataset the simplest way to train a TensorFlow model on it is to load it into memory as a pandas Dataframe or a NumPy array.\n"
+"For any small CSV dataset the simplest way to train a TensorFlow model on it is to load it into memory as a [pandas `DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) or a NumPy array.\n"
]
},
{
@@ -132,9 +132,9 @@
"A relatively simple example is the [abalone dataset](https://archive.ics.uci.edu/ml/datasets/abalone).\n",
"\n",
"* The dataset is small.\n",
-"* All the input features are all limited-range floating point values.\n",
+"* All the input features are limited-range floating point values.\n",
"\n",
-"Here is how to download the data into a [pandas `DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html):"
+"Here is how to download the data into a `DataFrame`:"
]
},
{
@@ -355,6 +355,8 @@
"source": [
"## Mixed data types\n",
"\n",
+"In the previous sections, you worked with a dataset where all the features were limited-range floating point values. But not all datasets are limited to a single data type.\n",
+"\n",
"The \"Titanic\" dataset contains information about the passengers on the Titanic. The nominal task on this dataset is to predict who survived.\n",
"\n",
"![The Titanic](images/csv/Titanic.jpg)\n",
@@ -903,7 +905,7 @@
"source": [
"### From a single file\n",
"\n",
-"So far this tutorial has worked with in-memory data. `tf.data` is a highly scalable toolkit for building data pipelines, and provides a few functions for dealing loading CSV files. "
+"So far this tutorial has worked with in-memory data. `tf.data` is a highly scalable toolkit for building data pipelines, and provides a few functions for loading CSV files. "
]
},
{
@@ -1373,7 +1375,7 @@
"id": "3jiGZeUijJNd"
},
"source": [
-"So far this tutorial has focused on the highest-level utilities for reading csv data. There are other two APIs that may be helpful for advanced users if your use-case doesn't fit the basic patterns.\n",
+"So far this tutorial has focused on the highest-level utilities for reading csv data. There are two other APIs that may be helpful for advanced users if your use-case doesn't fit the basic patterns.\n",
"\n",
"* `tf.io.decode_csv`: a function for parsing lines of text into a list of CSV column tensors.\n",
"* `tf.data.experimental.CsvDataset`: a lower-level CSV dataset constructor.\n",