From c6fe18c88c6078bea4f66226f88570e97e0765c6 Mon Sep 17 00:00:00 2001
From: David Huntsperger
Date: Wed, 11 Oct 2023 15:52:02 -0700
Subject: [PATCH] Fix up CSV data tutorial: fix typos; link to `pandas.DataFrame` docs on first usage, rather than second; add transition sentence to mixed data types section.

PiperOrigin-RevId: 572711158
---
 site/en/tutorials/load_data/csv.ipynb | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/site/en/tutorials/load_data/csv.ipynb b/site/en/tutorials/load_data/csv.ipynb
index 0d4af134a01..0d4287a425e 100644
--- a/site/en/tutorials/load_data/csv.ipynb
+++ b/site/en/tutorials/load_data/csv.ipynb
@@ -120,7 +120,7 @@
     "id": "ny5TEgcmHjVx"
    },
    "source": [
-    "For any small CSV dataset the simplest way to train a TensorFlow model on it is to load it into memory as a pandas Dataframe or a NumPy array.\n"
+    "For any small CSV dataset the simplest way to train a TensorFlow model on it is to load it into memory as a [pandas `DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) or a NumPy array.\n"
    ]
   },
   {
@@ -132,9 +132,9 @@
     "A relatively simple example is the [abalone dataset](https://archive.ics.uci.edu/ml/datasets/abalone).\n",
     "\n",
     "* The dataset is small.\n",
-    "* All the input features are all limited-range floating point values.\n",
+    "* All the input features are limited-range floating point values.\n",
     "\n",
-    "Here is how to download the data into a [pandas `DataFrame`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html):"
+    "Here is how to download the data into a `DataFrame`:"
    ]
   },
   {
@@ -355,6 +355,8 @@
    "source": [
     "## Mixed data types\n",
     "\n",
+    "In the previous sections, you worked with a dataset where all the features were limited-range floating point values. But not all datasets are limited to a single data type.\n",
+    "\n",
     "The \"Titanic\" dataset contains information about the passengers on the Titanic. The nominal task on this dataset is to predict who survived.\n",
     "\n",
     "![The Titanic](images/csv/Titanic.jpg)\n",
@@ -903,7 +905,7 @@
    "source": [
     "### From a single file\n",
     "\n",
-    "So far this tutorial has worked with in-memory data. `tf.data` is a highly scalable toolkit for building data pipelines, and provides a few functions for dealing loading CSV files. "
+    "So far this tutorial has worked with in-memory data. `tf.data` is a highly scalable toolkit for building data pipelines, and provides a few functions for loading CSV files. "
    ]
   },
   {
@@ -1373,7 +1375,7 @@
     "id": "3jiGZeUijJNd"
    },
    "source": [
-    "So far this tutorial has focused on the highest-level utilities for reading csv data. There are other two APIs that may be helpful for advanced users if your use-case doesn't fit the basic patterns.\n",
+    "So far this tutorial has focused on the highest-level utilities for reading csv data. There are two other APIs that may be helpful for advanced users if your use-case doesn't fit the basic patterns.\n",
     "\n",
     "* `tf.io.decode_csv`: a function for parsing lines of text into a list of CSV column tensors.\n",
     "* `tf.data.experimental.CsvDataset`: a lower-level CSV dataset constructor.\n",