From 108cebca69bc07b0ebf28982c30f3146947a46c4 Mon Sep 17 00:00:00 2001
From: jrzaurin
Date: Thu, 10 Mar 2022 10:31:56 +0100
Subject: [PATCH] updated README and docs

---
 README.md            | 120 +++++++++++++++++++++----------------------
 docs/index.rst       |  40 +++++++++------
 docs/quick_start.rst |  80 +++++++++++++----------------
 pypi_README.md       |  96 +++++++++++++++------------------
 4 files changed, 159 insertions(+), 177 deletions(-)

diff --git a/README.md b/README.md
index a49517a5..b02dab58 100644
--- a/README.md
+++ b/README.md
@@ -15,14 +15,14 @@

 # pytorch-widedeep

-A flexible package to use Deep Learning with tabular data, text and images
-using wide and deep models.
+A flexible package for multimodal deep learning, combining tabular data with
+text and images using Wide and Deep models in PyTorch.

 **Documentation:** [https://pytorch-widedeep.readthedocs.io](https://pytorch-widedeep.readthedocs.io/en/latest/index.html)

 **Companion posts and tutorials:** [infinitoml](https://jrzaurin.github.io/infinitoml/)

-**Experiments and comparisson with `LightGBM`**: [TabularDL vs LightGBM](https://github.com/jrzaurin/tabulardl-benchmark)
+**Experiments and comparison with `LightGBM`**: [TabularDL vs LightGBM](https://github.com/jrzaurin/tabulardl-benchmark)

 The content of this document is organized as follows:

@@ -33,7 +33,8 @@ The content of this document is organized as follows:

 ### Introduction

-``pytorch-widedeep`` is based on Google's [Wide and Deep Algorithm](https://arxiv.org/abs/1606.07792)
+``pytorch-widedeep`` is based on Google's [Wide and Deep Algorithm](https://arxiv.org/abs/1606.07792),
+adjusted for multi-modal datasets.

 In general terms, `pytorch-widedeep` is a package to use deep learning with
 tabular data. In particular, is intended to facilitate the combination of text

@@ -89,15 +90,11 @@ into:
-I recommend using the ``wide`` and ``deeptabular`` models in
-``pytorch-widedeep``. However it is very likely that users will want to use
-their own models for the ``deeptext`` and ``deepimage`` components. That is
-perfectly possible as long as the the custom models have an attribute called
+It is perfectly possible to use custom models (and not necessarily those in
+the library) as long as the custom models have an attribute called
 ``output_dim`` with the size of the last layer of activations, so that
-``WideDeep`` can be constructed. Again, examples on how to use custom
-components can be found in the Examples folder. Just in case
-``pytorch-widedeep`` includes standard text (stack of LSTMs) and image
-(pre-trained ResNets or stack of CNNs) models.
+``WideDeep`` can be constructed. Examples of how to use custom components can
+be found in the Examples folder.

 ### The ``deeptabular`` component

@@ -110,15 +107,17 @@ its own, i.e. what one might normally refer as Deep Learning for Tabular
 Data. Currently, ``pytorch-widedeep`` offers the following different models
 for that component:

+0. **Wide**: a simple linear model where the nonlinearities are captured via
+cross-product transformations, as explained before.
 1. **TabMlp**: a simple MLP that receives embeddings representing the
-categorical features, concatenated with the continuous features.
+categorical features, concatenated with the continuous features, which can
+also be embedded.
 2. **TabResnet**: similar to the previous model but the embeddings are
 passed through a series of ResNet blocks built with dense layers.
 3. **TabNet**: details on TabNet can be found in [TabNet: Attentive
 Interpretable Tabular Learning](https://arxiv.org/abs/1908.07442)

-And the ``Tabformer`` family, i.e. Transformers for Tabular data:
+The ``Tabformer`` family, i.e. Transformers for Tabular data:

 4. **TabTransformer**: details on the TabTransformer can be found in
 [TabTransformer: Tabular Data Modeling Using Contextual
 Embeddings](https://arxiv.org/pdf/2012.06678.pdf).
@@ -133,12 +132,19 @@ on the Fasformer can be found in
 the Perceiver can be found in [Perceiver: General Perception with Iterative
 Attention](https://arxiv.org/abs/2103.03206)

+And probabilistic DL models for tabular data based on
+[Weight Uncertainty in Neural Networks](https://arxiv.org/abs/1505.05424):
+
+9. **BayesianWide**: Probabilistic adaptation of the `Wide` model.
+10. **BayesianTabMlp**: Probabilistic adaptation of the `TabMlp` model.
+
 Note that while there are scientific publications for the TabTransformer,
 SAINT and FT-Transformer, the TabFasfFormer and TabPerceiver are our own
 adaptation of those algorithms for tabular data.

-For details on these models and their options please see the examples in the
-Examples folder and the documentation.
+For details on these models (and all the other models in the library for the
+different data modes) and their corresponding options please see the examples
+in the Examples folder and the documentation.

 ### Installation

 Install using pip:

 ```bash
 pip install pytorch-widedeep
 ```

 Or install directly from github

 ```bash
 pip install git+https://github.com/jrzaurin/pytorch-widedeep.git
 ```

 **Developer Install**

 ```bash
 # Clone the repo
 git clone https://github.com/jrzaurin/pytorch-widedeep
 cd pytorch-widedeep
 pip install -e .
 ```

-**Important note for Mac users**: Since `python
-3.8`, [the `multiprocessing` library start method changed from `'fork'` to`'spawn'`](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods) which affects the data-loaders.
-For the time being, `pytorch-widedeep` sets the `num_workers` to 0 when using
-Mac and python version 3.8+.
-
-Note that this issue does not affect Linux users.
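To make the custom-component contract described above concrete, here is a
minimal sketch. ``MyDeepText`` is a hypothetical module written for this
illustration, not a model shipped with the library: any ``nn.Module`` can
serve as the ``deeptext`` (or ``deepimage``) component provided it exposes an
``output_dim`` attribute with the size of its last layer of activations.

```python
import torch
from torch import nn


class MyDeepText(nn.Module):
    """Hypothetical custom text component for ``WideDeep``."""

    def __init__(self, vocab_size: int, embed_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        # the only requirement: tell WideDeep the size of the last layer
        self.output_dim = hidden_dim

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X is a (batch, seq_len) tensor of token ids
        _, h = self.rnn(self.embed(X))
        return h[-1]  # (batch, hidden_dim)


deeptext = MyDeepText(vocab_size=2000)
# combined with tabular components built as in the Quick start below, e.g.
# model = WideDeep(wide=wide, deeptabular=tab_mlp, deeptext=deeptext)
```

The corresponding text array would of course need its own preprocessing; see
the Examples folder for full, runnable versions of this pattern.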
-
 ### Quick start

 Binary classification with the [adult
 dataset](https://www.kaggle.com/wenruliu/adult-income-dataset)
 using `Wide` and `DeepDense` and defaults settings.

 Building a wide (linear) and deep model with ``pytorch-widedeep``:

 ```python
-
 import pandas as pd
 import numpy as np
 import torch
 from sklearn.model_selection import train_test_split

 from pytorch_widedeep import Trainer
 from pytorch_widedeep.preprocessing import WidePreprocessor, TabPreprocessor
 from pytorch_widedeep.models import Wide, TabMlp, WideDeep
 from pytorch_widedeep.metrics import Accuracy
+from pytorch_widedeep.datasets import load_adult
+
-# the following 4 lines are not directly related to ``pytorch-widedeep``. I
-# assume you have downloaded the dataset and place it in a dir called
-# data/adult/
-df = pd.read_csv("data/adult/adult.csv.zip")
+df = load_adult(as_frame=True)
 df["income_label"] = (df["income"].apply(lambda x: ">50K" in x)).astype(int)
 df.drop("income", axis=1, inplace=True)
 df_train, df_test = train_test_split(df, test_size=0.2, stratify=df.income_label)

-# prepare wide, crossed, embedding and continuous columns
+# Define the 'column setup'
 wide_cols = [
     "education",
     "relationship",
@@ -209,38 +206,43 @@ wide_cols = [
     "native-country",
     "gender",
 ]
-cross_cols = [("education", "occupation"), ("native-country", "occupation")]
-embed_cols = [
-    ("education", 16),
-    ("workclass", 16),
-    ("occupation", 16),
-    ("native-country", 32),
-]
-cont_cols = ["age", "hours-per-week"]
-target_col = "income_label"
+crossed_cols = [("education", "occupation"), ("native-country", "occupation")]

-# target
-target = df_train[target_col].values
+cat_embed_cols = [
+    "workclass",
+    "education",
+    "marital-status",
+    "occupation",
+    "relationship",
+    "race",
+    "gender",
+    "capital-gain",
+    "capital-loss",
+    "native-country",
+]
+continuous_cols = ["age", "hours-per-week"]
+target = "income_label"
+target = df_train[target].values

-# wide
-wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=cross_cols)
+# prepare the data
+wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)
 X_wide = wide_preprocessor.fit_transform(df_train)
-wide = Wide(wide_dim=np.unique(X_wide).shape[0], pred_dim=1)

-# deeptabular
-tab_preprocessor = TabPreprocessor(cat_embed_cols=embed_cols, continuous_cols=cont_cols)
+tab_preprocessor = TabPreprocessor(
+    cat_embed_cols=cat_embed_cols, continuous_cols=continuous_cols  # type: ignore[arg-type]
+)
 X_tab = tab_preprocessor.fit_transform(df_train)
-deeptabular = TabMlp(
-    mlp_hidden_dims=[64, 32],
+
+# build the model
+wide = Wide(input_dim=np.unique(X_wide).shape[0], pred_dim=1)
+tab_mlp = TabMlp(
     column_idx=tab_preprocessor.column_idx,
-    embed_input=tab_preprocessor.cat_embed_input,
-    continuous_cols=cont_cols,
+    cat_embed_input=tab_preprocessor.cat_embed_input,
+    continuous_cols=continuous_cols,
 )
+model = WideDeep(wide=wide, deeptabular=tab_mlp)

-# wide and deep
-model = WideDeep(wide=wide, deeptabular=deeptabular)
-
-# train the model
+# train and validate
 trainer = Trainer(model, objective="binary", metrics=[Accuracy])
 trainer.fit(
     X_wide=X_wide,
     X_tab=X_tab,
     target=target,
     n_epochs=5,
     batch_size=256,
-    val_split=0.1,
 )

-# predict
+# predict on test
 X_wide_te = wide_preprocessor.transform(df_test)
 X_tab_te = tab_preprocessor.transform(df_test)
 preds = trainer.predict(X_wide=X_wide_te, X_tab=X_tab_te)

 torch.save(model.state_dict(), "model_weights/wd_model.pt")

 # From here in advance, Option 1 or 2 are the same. I assume the user has
 # prepared the data and defined the new model components:
 # 1. Build the model
-model_new = WideDeep(wide=wide, deeptabular=deeptabular)
+model_new = WideDeep(wide=wide, deeptabular=tab_mlp)
 model_new.load_state_dict(torch.load("model_weights/wd_model.pt"))

 # 2. Instantiate the trainer
-trainer_new = Trainer(
-    model_new,
-    objective="binary",
-)
+trainer_new = Trainer(model_new, objective="binary")

 # 3. Either start the fit or directly predict
 preds = trainer_new.predict(X_wide=X_wide, X_tab=X_tab)

diff --git a/docs/index.rst b/docs/index.rst
index 2e573f66..32c3a33a 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -31,7 +31,8 @@ Documentation
 Introduction
 ------------
 ``pytorch-widedeep`` is based on Google's `Wide and Deep Algorithm
-<https://arxiv.org/abs/1606.07792>`_.
+<https://arxiv.org/abs/1606.07792>`_, adjusted for multi-modal datasets.
+

 In general terms, ``pytorch-widedeep`` is a package to use deep learning with
 tabular and multimodal data. In particular, is intended to facilitate the

@@ -97,9 +98,12 @@ own, i.e. what one might normally refer as Deep Learning for Tabular
 Data. Currently, ``pytorch-widedeep`` offers the following different models
 for that component:

+0. **Wide**: a simple linear model where the nonlinearities are captured via
+cross-product transformations, as explained before.
 1. **TabMlp**: a simple MLP that receives embeddings representing the
-categorical features, concatenated with the continuous features.
+categorical features, concatenated with the continuous features, which can
+also be embedded.

 2. **TabResnet**: similar to the previous model but the embeddings are
 passed through a series of ResNet blocks built with dense layers.

@@ -107,7 +111,7 @@ passed through a series of ResNet blocks built with dense layers.
 3. **TabNet**: details on TabNet can be found in `TabNet: Attentive
 Interpretable Tabular Learning <https://arxiv.org/abs/1908.07442>`_

-And the ``Tabformer`` family, i.e. Transformers for Tabular data:
+The ``Tabformer`` family, i.e. Transformers for Tabular data:

 4. **TabTransformer**: details on the TabTransformer can be found in
 `TabTransformer: Tabular Data Modeling Using Contextual Embeddings
 <https://arxiv.org/pdf/2012.06678.pdf>`_

@@ -130,22 +134,24 @@ Models for Natural Language Understanding
 the Perceiver can be found in `Perceiver: General Perception with Iterative
 Attention <https://arxiv.org/abs/2103.03206>`_

+And probabilistic DL models for tabular data based on
+`Weight Uncertainty in Neural Networks <https://arxiv.org/abs/1505.05424>`_:
+
+9. **BayesianWide**: Probabilistic adaptation of the `Wide` model.
+
+10. **BayesianTabMlp**: Probabilistic adaptation of the `TabMlp` model.
+
 Note that while there are scientific publications for the TabTransformer,
 SAINT and FT-Transformer, the TabFasfFormer and TabPerceiver are our own
-adaptation of those algorithms for tabular data.
-
-For details on these models and their options please see the examples in the
-Examples folder and the documentation.
-
-Finally, while I recommend using the ``wide`` and ``deeptabular`` models in
-``pytorch-widedeep`` it is very likely that users will want to use their own
-models for the ``deeptext`` and ``deepimage`` components. That is perfectly
-possible as long as the the custom models have an attribute called
-``output_dim`` with the size of the last layer of activations, so that
-``WideDeep`` can be constructed. Again, examples on how to use custom
-components can be found in the Examples folder. Just in case
-``pytorch-widedeep`` includes standard text (stack of LSTMs or GRUs) and
-image(pre-trained ResNets or stack of CNNs) models.
+adaptation of those algorithms for tabular data. For details on these models
+and their options please see the examples in the Examples folder and the
+documentation.
+
+Finally, it is perfectly possible to use custom models as long as the
+custom models have an attribute called ``output_dim`` with the size of the
+last layer of activations, so that ``WideDeep`` can be constructed. Again,
+examples of how to use custom components can be found in the Examples
+folder.

 Indices and tables
 ==================

diff --git a/docs/quick_start.rst b/docs/quick_start.rst
index 60718364..e21d618e 100644
--- a/docs/quick_start.rst
+++ b/docs/quick_start.rst
@@ -15,8 +15,9 @@ Read and split the dataset
     import pandas as pd
     import numpy as np
     from sklearn.model_selection import train_test_split
+    from pytorch_widedeep.datasets import load_adult

-    df = pd.read_csv("data/adult/adult.csv.zip")
+    df = load_adult(as_frame=True)
     df["income_label"] = (df["income"].apply(lambda x: ">50K" in x)).astype(int)
     df.drop("income", axis=1, inplace=True)
     df_train, df_test = train_test_split(df, test_size=0.2, stratify=df.income_label)

@@ -28,13 +29,12 @@ Prepare the wide and deep columns

 .. code-block:: python

-    import torch
     from pytorch_widedeep import Trainer
     from pytorch_widedeep.preprocessing import WidePreprocessor, TabPreprocessor
     from pytorch_widedeep.models import Wide, TabMlp, WideDeep
     from pytorch_widedeep.metrics import Accuracy

-    # prepare wide, crossed, embedding and continuous columns
+    # Define the 'column setup'
     wide_cols = [
         "education",
         "relationship",
@@ -43,41 +43,45 @@ Prepare the wide and deep columns
         "native-country",
         "gender",
     ]
-    cross_cols = [("education", "occupation"), ("native-country", "occupation")]
-    embed_cols = [
-        ("education", 16),
-        ("workclass", 16),
-        ("occupation", 16),
-        ("native-country", 32),
-    ]
-    cont_cols = ["age", "hours-per-week"]
-    target_col = "income_label"
+    crossed_cols = [("education", "occupation"), ("native-country", "occupation")]

-    # target
-    target = df_train[target_col].values
+    cat_embed_cols = [
+        "workclass",
+        "education",
+        "marital-status",
+        "occupation",
+        "relationship",
+        "race",
+        "gender",
+        "capital-gain",
+        "capital-loss",
+        "native-country",
+    ]
+    continuous_cols = ["age", "hours-per-week"]
+    target = "income_label"
+    target = df_train[target].values


 Preprocessing and model components definition
 ---------------------------------------------

 .. code-block:: python

-    # wide
-    wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=cross_cols)
+    wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)
     X_wide = wide_preprocessor.fit_transform(df_train)
-    wide = Wide(input_dim=np.unique(X_wide).shape[0], pred_dim=1)

-    # deeptabular
-    tab_preprocessor = TabPreprocessor(cat_embed_cols=embed_cols, continuous_cols=cont_cols)
+    tab_preprocessor = TabPreprocessor(
+        cat_embed_cols=cat_embed_cols, continuous_cols=continuous_cols  # type: ignore[arg-type]
+    )
     X_tab = tab_preprocessor.fit_transform(df_train)
-    deeptabular = TabMlp(
+
+    # build the model
+    wide = Wide(input_dim=np.unique(X_wide).shape[0], pred_dim=1)
+    tab_mlp = TabMlp(
         column_idx=tab_preprocessor.column_idx,
         cat_embed_input=tab_preprocessor.cat_embed_input,
-        continuous_cols=cont_cols,
-        mlp_hidden_dims=[64, 32],
+        continuous_cols=continuous_cols,
     )
-
-    # wide and deep
-    model = WideDeep(wide=wide, deeptabular=deeptabular)
+    model = WideDeep(wide=wide, deeptabular=tab_mlp)


 Fit and predict
 ---------------

 .. code-block:: python

-    # train the model
+    # train and validate
     trainer = Trainer(model, objective="binary", metrics=[Accuracy])
     trainer.fit(
         X_wide=X_wide,
         X_tab=X_tab,
         target=target,
         n_epochs=5,
         batch_size=256,
-        val_split=0.1,
     )

-    # predict
+    # predict on test
     X_wide_te = wide_preprocessor.transform(df_test)
     X_tab_te = tab_preprocessor.transform(df_test)
     preds = trainer.predict(X_wide=X_wide_te, X_tab=X_tab_te)

@@ -109,34 +112,23 @@ Save and load

     # Option 1: this will also save training history and lr history if the
     # LRHistory callback is used
-
-    # Day 0, you have trained your model, save it using the trainer.save
-    # method
     trainer.save(path="model_weights", save_state_dict=True)

     # Option 2: save as any other torch model
-
-    # Day 0, you have trained your model, save as any other torch model
     torch.save(model.state_dict(), "model_weights/wd_model.pt")

-    # From here in advance, Option 1 or 2 are the same
-
-    # Few days have passed...I assume the user has prepared the data and
-    # defined the model components:
+    # From here in advance, Option 1 or 2 are the same. I assume the user has
+    # prepared the data and defined the new model components:
     # 1. Build the model
-    model_new = WideDeep(wide=wide, deeptabular=deeptabular)
+    model_new = WideDeep(wide=wide, deeptabular=tab_mlp)
     model_new.load_state_dict(torch.load("model_weights/wd_model.pt"))

     # 2. Instantiate the trainer
-    trainer_new = Trainer(
-        model_new,
-        objective="binary",
-    )
+    trainer_new = Trainer(model_new, objective="binary")

-    # 3. Either fit or directly predict
+    # 3. Either start the fit or directly predict
     preds = trainer_new.predict(X_wide=X_wide, X_tab=X_tab)

-
 Of course, one can do **much more**. See the Examples folder in the repo, this
 documentation or the companion posts for a better understanding of the content
 of the package and its functionalities.

diff --git a/pypi_README.md b/pypi_README.md
index 90af089f..a83248c1 100644
--- a/pypi_README.md
+++ b/pypi_README.md
@@ -11,8 +11,8 @@

 # pytorch-widedeep

-A flexible package to use Deep Learning with tabular data, text and images
-using wide and deep models.
+A flexible package for multimodal deep learning, combining tabular data with
+text and images using Wide and Deep models in PyTorch.

 **Documentation:** [https://pytorch-widedeep.readthedocs.io](https://pytorch-widedeep.readthedocs.io/en/latest/index.html)

@@ -24,7 +24,8 @@

 ### Introduction

-``pytorch-widedeep`` is based on Google's [Wide and Deep Algorithm](https://arxiv.org/abs/1606.07792)
+``pytorch-widedeep`` is based on Google's [Wide and Deep Algorithm](https://arxiv.org/abs/1606.07792),
+adjusted for multi-modal datasets.

 In general terms, `pytorch-widedeep` is a package to use deep learning with
 tabular data. In particular, is intended to facilitate the combination of text
 and images with corresponding tabular data using wide and deep models. With

@@ -35,7 +36,7 @@ architectures please visit the
 [repo](https://github.com/jrzaurin/pytorch-widedeep).

-### Installation
+### Installation

 Install using pip:

 ```bash
 pip install pytorch-widedeep
 ```

 Or install directly from github

 ```bash
 pip install git+https://github.com/jrzaurin/pytorch-widedeep.git
 ```

 **Developer Install**

 ```bash
 # Clone the repo
 git clone https://github.com/jrzaurin/pytorch-widedeep
 cd pytorch-widedeep
 pip install -e .
 ```

-**Important note for Mac users**: Since `python
-3.8`, [the `multiprocessing` library start method changed from `'fork'` to`'spawn'`](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods) which affects the data-loaders.
-For the time being, `pytorch-widedeep` sets the `num_workers` to 0 when using
-Mac and python version 3.8+.
-
-Note that this issue does not affect Linux users.
-
-```bash
-pip install pytorch-widedeep
-pip install torch==1.6.0 torchvision==0.7.0
-```
-
-None of these issues affect Linux users.
-
 ### Quick start

 Binary classification with the [adult
 dataset](https://www.kaggle.com/wenruliu/adult-income-dataset)
 using `Wide` and `DeepDense` and defaults settings.

 Building a wide (linear) and deep model with ``pytorch-widedeep``:

 ```python
-
 import pandas as pd
 import numpy as np
 import torch
 from sklearn.model_selection import train_test_split

 from pytorch_widedeep import Trainer
 from pytorch_widedeep.preprocessing import WidePreprocessor, TabPreprocessor
 from pytorch_widedeep.models import Wide, TabMlp, WideDeep
 from pytorch_widedeep.metrics import Accuracy
+from pytorch_widedeep.datasets import load_adult
+
-# the following 4 lines are not directly related to ``pytorch-widedeep``. I
-# assume you have downloaded the dataset and place it in a dir called
-# data/adult/
-df = pd.read_csv("data/adult/adult.csv.zip")
+df = load_adult(as_frame=True)
 df["income_label"] = (df["income"].apply(lambda x: ">50K" in x)).astype(int)
 df.drop("income", axis=1, inplace=True)
 df_train, df_test = train_test_split(df, test_size=0.2, stratify=df.income_label)

-# prepare wide, crossed, embedding and continuous columns
+# Define the 'column setup'
 wide_cols = [
     "education",
     "relationship",
@@ -111,38 +96,43 @@ wide_cols = [
     "native-country",
     "gender",
 ]
-cross_cols = [("education", "occupation"), ("native-country", "occupation")]
-embed_cols = [
-    ("education", 16),
-    ("workclass", 16),
-    ("occupation", 16),
-    ("native-country", 32),
-]
-cont_cols = ["age", "hours-per-week"]
-target_col = "income_label"
+crossed_cols = [("education", "occupation"), ("native-country", "occupation")]

-# target
-target = df_train[target_col].values
+cat_embed_cols = [
+    "workclass",
+    "education",
+    "marital-status",
+    "occupation",
+    "relationship",
+    "race",
+    "gender",
+    "capital-gain",
+    "capital-loss",
+    "native-country",
+]
+continuous_cols = ["age", "hours-per-week"]
+target = "income_label"
+target = df_train[target].values

-# wide
-wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=cross_cols)
+# prepare the data
+wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)
 X_wide = wide_preprocessor.fit_transform(df_train)
-wide = Wide(wide_dim=np.unique(X_wide).shape[0], pred_dim=1)

-# deeptabular
-tab_preprocessor = TabPreprocessor(cat_embed_cols=embed_cols, continuous_cols=cont_cols)
+tab_preprocessor = TabPreprocessor(
+    cat_embed_cols=cat_embed_cols, continuous_cols=continuous_cols  # type: ignore[arg-type]
+)
 X_tab = tab_preprocessor.fit_transform(df_train)
-deeptabular = TabMlp(
-    mlp_hidden_dims=[64, 32],
+
+# build the model
+wide = Wide(input_dim=np.unique(X_wide).shape[0], pred_dim=1)
+tab_mlp = TabMlp(
     column_idx=tab_preprocessor.column_idx,
-    embed_input=tab_preprocessor.cat_embed_input,
-    continuous_cols=cont_cols,
+    cat_embed_input=tab_preprocessor.cat_embed_input,
+    continuous_cols=continuous_cols,
 )
+model = WideDeep(wide=wide, deeptabular=tab_mlp)

-# wide and deep
-model = WideDeep(wide=wide, deeptabular=deeptabular)
-
-# train the model
+# train and validate
 trainer = Trainer(model, objective="binary", metrics=[Accuracy])
 trainer.fit(
     X_wide=X_wide,
     X_tab=X_tab,
     target=target,
     n_epochs=5,
     batch_size=256,
-    val_split=0.1,
 )

-# predict
+# predict on test
 X_wide_te = wide_preprocessor.transform(df_test)
 X_tab_te = tab_preprocessor.transform(df_test)
 preds = trainer.predict(X_wide=X_wide_te, X_tab=X_tab_te)

 torch.save(model.state_dict(), "model_weights/wd_model.pt")

 # From here in advance, Option 1 or 2 are the same. I assume the user has
 # prepared the data and defined the new model components:
 # 1. Build the model
-model_new = WideDeep(wide=wide, deeptabular=deeptabular)
+model_new = WideDeep(wide=wide, deeptabular=tab_mlp)
 model_new.load_state_dict(torch.load("model_weights/wd_model.pt"))

 # 2. Instantiate the trainer
-trainer_new = Trainer(
-    model_new,
-    objective="binary",
-)
+trainer_new = Trainer(model_new, objective="binary")

 # 3. Either start the fit or directly predict
 preds = trainer_new.predict(X_wide=X_wide, X_tab=X_tab)