Merge pull request #80 from jrzaurin/pmulinka/dir
Pmulinka/dir
jrzaurin committed Mar 10, 2022
2 parents 8b4c3a8 + 108cebc commit 923011c
Showing 189 changed files with 16,189 additions and 161,921 deletions.
2 changes: 2 additions & 0 deletions .coveragerc
@@ -2,8 +2,10 @@
parallel = True
omit =
pytorch_widedeep/optim/*
pytorch_widedeep/bayesian_models/bayesian_nn/modules/*

[report]
omit =
pytorch_widedeep/optim/*
pytorch_widedeep/bayesian_models/bayesian_nn/modules/*
precision = 2
13 changes: 12 additions & 1 deletion .gitignore
@@ -11,6 +11,10 @@ __pycache__*
.ipynb_checkpoints
Untitled*.ipynb

# sublime debugger
*.sublime-project
*.sublime-workspace

# data related dirs
tmp_data/
model_weights/
@@ -27,6 +31,9 @@ htmlcov*/
.cache
.hypothesis/

# vscode
.vscode

# sublime
*.sublime-workspace
sftp*-config.json
@@ -44,4 +51,8 @@ _build
_templates

# test checkpoints
checkpoints
checkpoints

# wnb
wandb/
wandb_api.key
134 changes: 59 additions & 75 deletions README.md
@@ -15,14 +15,14 @@

# pytorch-widedeep

A flexible package to use Deep Learning with tabular data, text and images
using wide and deep models.
A flexible package for multimodal deep learning, combining tabular data with
text and images using Wide and Deep models in PyTorch.

**Documentation:** [https://pytorch-widedeep.readthedocs.io](https://pytorch-widedeep.readthedocs.io/en/latest/index.html)

**Companion posts and tutorials:** [infinitoml](https://jrzaurin.github.io/infinitoml/)

**Experiments and comparisson with `LightGBM`**: [TabularDL vs LightGBM](https://github.com/jrzaurin/tabulardl-benchmark)
**Experiments and comparison with `LightGBM`**: [TabularDL vs LightGBM](https://github.com/jrzaurin/tabulardl-benchmark)

The content of this document is organized as follows:

@@ -33,7 +33,8 @@ The content of this document is organized as follows:

### Introduction

``pytorch-widedeep`` is based on Google's [Wide and Deep Algorithm](https://arxiv.org/abs/1606.07792)
``pytorch-widedeep`` is based on Google's [Wide and Deep Algorithm](https://arxiv.org/abs/1606.07792),
adjusted for multi-modal datasets.

In general terms, `pytorch-widedeep` is a package to use deep learning with
tabular data. In particular, it is intended to facilitate the combination of text
@@ -89,15 +90,11 @@ into:
<img width="300" src="docs/figures/architecture_2_math.png">
</p>

I recommend using the ``wide`` and ``deeptabular`` models in
``pytorch-widedeep``. However it is very likely that users will want to use
their own models for the ``deeptext`` and ``deepimage`` components. That is
perfectly possible as long as the the custom models have an attribute called
It is perfectly possible to use custom models (and not necessarily those in
the library) as long as the custom models have an attribute called
``output_dim`` with the size of the last layer of activations, so that
``WideDeep`` can be constructed. Again, examples on how to use custom
components can be found in the Examples folder. Just in case
``pytorch-widedeep`` includes standard text (stack of LSTMs) and image
(pre-trained ResNets or stack of CNNs) models.
``WideDeep`` can be constructed. Examples on how to use custom components can
be found in the Examples folder.
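
For illustration, here is a minimal sketch of what such a custom component might look like. The class name, layer choices and sizes are made up; the only requirement taken from the text above is the ``output_dim`` attribute:

```python
import torch
from torch import nn


class MyDeepText(nn.Module):
    """Hypothetical custom text component for the ``deeptext`` slot."""

    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden_dim: int = 32):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # the one attribute WideDeep relies on: size of the last layer of activations
        self.output_dim = hidden_dim

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        # X: (batch_size, seq_len) of token indices
        _, (h_n, _) = self.lstm(self.embedding(X))
        return h_n[-1]
```

Such a module could then be passed as the ``deeptext`` argument of ``WideDeep``, alongside the ``wide`` and ``deeptabular`` components shown in the quick start below.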

### The ``deeptabular`` component

@@ -110,15 +107,17 @@ its own, i.e. what one might normally refer to as Deep Learning for Tabular
Data. Currently, ``pytorch-widedeep`` offers the following different models
for that component:


0. **Wide**: a simple linear model where the nonlinearities are captured via
cross-product transformations, as explained before.
1. **TabMlp**: a simple MLP that receives embeddings representing the
categorical features, concatenated with the continuous features.
categorical features, concatenated with the continuous features, which can
also be embedded.
2. **TabResnet**: similar to the previous model but the embeddings are
passed through a series of ResNet blocks built with dense layers.
3. **TabNet**: details on TabNet can be found in
[TabNet: Attentive Interpretable Tabular Learning](https://arxiv.org/abs/1908.07442)

And the ``Tabformer`` family, i.e. Transformers for Tabular data:
The ``Tabformer`` family, i.e. Transformers for Tabular data:

4. **TabTransformer**: details on the TabTransformer can be found in
[TabTransformer: Tabular Data Modeling Using Contextual Embeddings](https://arxiv.org/pdf/2012.06678.pdf).
@@ -133,12 +132,19 @@ on the FastFormer can be found in
the Perceiver can be found in
[Perceiver: General Perception with Iterative Attention](https://arxiv.org/abs/2103.03206)

And probabilistic DL models for tabular data based on
[Weight Uncertainty in Neural Networks](https://arxiv.org/abs/1505.05424):

9. **BayesianWide**: Probabilistic adaptation of the `Wide` model.
10. **BayesianTabMlp**: Probabilistic adaptation of the `TabMlp` model

Note that while there are scientific publications for the TabTransformer,
SAINT and FT-Transformer, the TabFastFormer and TabPerceiver are our own
adaptations of those algorithms for tabular data.

For details on these models and their options please see the examples in the
Examples folder and the documentation.
For details on these models (and all the other models in the library for the
different data modes) and their corresponding options please see the examples
in the Examples folder and the documentation.
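
As a rough sketch of how one tabular model is swapped for another, the snippet below uses ``TabResnet`` in the ``deeptabular`` role. The constructor arguments are assumed to mirror ``TabMlp``'s (as used in the quick start below); check the documentation for the exact signature:

```python
from pytorch_widedeep.preprocessing import TabPreprocessor
from pytorch_widedeep.models import TabResnet, WideDeep

# df_train, cat_embed_cols and continuous_cols as defined in the quick start below
tab_preprocessor = TabPreprocessor(
    cat_embed_cols=cat_embed_cols, continuous_cols=continuous_cols
)
X_tab = tab_preprocessor.fit_transform(df_train)

# any of the tabular models listed above can play the deeptabular role
deeptabular = TabResnet(
    column_idx=tab_preprocessor.column_idx,
    cat_embed_input=tab_preprocessor.cat_embed_input,
    continuous_cols=continuous_cols,
)
model = WideDeep(deeptabular=deeptabular)
```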

### Installation

@@ -165,27 +171,6 @@ cd pytorch-widedeep
pip install -e .
```

**Important note for Mac users**: at the time of writing the latest `torch`
release is `1.9`. Some past [issues](https://stackoverflow.com/questions/64772335/pytorch-w-parallelnative-cpp206)
when running on Mac, present in previous versions, persist on this release
and the data-loaders will not run in parallel. In addition, since `python
3.8`, [the `multiprocessing` library start method changed from `'fork'` to`'spawn'`](https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods).
This also affects the data-loaders (for any `torch` version) and they will
not run in parallel. Therefore, for Mac users I recommend using `python 3.7`
and `torch <= 1.6` (with the corresponding, consistent
version of `torchvision`, e.g. `0.7.0` for `torch 1.6`). I do not want to
force this versioning in the `setup.py` file since I expect that all these
issues are fixed in the future. Therefore, after installing
`pytorch-widedeep` via pip or directly from github, downgrade `torch` and
`torchvision` manually:

```bash
pip install pytorch-widedeep
pip install torch==1.6.0 torchvision==0.7.0
```

None of these issues affect Linux users.

### Quick start

Binary classification with the [adult
Expand All @@ -195,7 +180,6 @@ using `Wide` and `DeepDense` and defaults settings.
Building a wide (linear) and deep model with ``pytorch-widedeep``:

```python

import pandas as pd
import numpy as np
import torch
@@ -205,16 +189,15 @@ from pytorch_widedeep import Trainer
from pytorch_widedeep.preprocessing import WidePreprocessor, TabPreprocessor
from pytorch_widedeep.models import Wide, TabMlp, WideDeep
from pytorch_widedeep.metrics import Accuracy
from pytorch_widedeep.datasets import load_adult


# the following 4 lines are not directly related to ``pytorch-widedeep``. I
# assume you have downloaded the dataset and place it in a dir called
# data/adult/
df = pd.read_csv("data/adult/adult.csv.zip")
df = load_adult(as_frame=True)
df["income_label"] = (df["income"].apply(lambda x: ">50K" in x)).astype(int)
df.drop("income", axis=1, inplace=True)
df_train, df_test = train_test_split(df, test_size=0.2, stratify=df.income_label)

# prepare wide, crossed, embedding and continuous columns
# Define the 'column set up'
wide_cols = [
"education",
"relationship",
@@ -223,49 +206,53 @@ wide_cols = [
"native-country",
"gender",
]
cross_cols = [("education", "occupation"), ("native-country", "occupation")]
embed_cols = [
("education", 16),
("workclass", 16),
("occupation", 16),
("native-country", 32),
]
cont_cols = ["age", "hours-per-week"]
target_col = "income_label"
crossed_cols = [("education", "occupation"), ("native-country", "occupation")]

# target
target = df_train[target_col].values
cat_embed_cols = [
"workclass",
"education",
"marital-status",
"occupation",
"relationship",
"race",
"gender",
"capital-gain",
"capital-loss",
"native-country",
]
continuous_cols = ["age", "hours-per-week"]
target = "income_label"
target = df_train[target].values

# wide
wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=cross_cols)
# prepare the data
wide_preprocessor = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols)
X_wide = wide_preprocessor.fit_transform(df_train)
wide = Wide(wide_dim=np.unique(X_wide).shape[0], pred_dim=1)

# deeptabular
tab_preprocessor = TabPreprocessor(embed_cols=embed_cols, continuous_cols=cont_cols)
tab_preprocessor = TabPreprocessor(
cat_embed_cols=cat_embed_cols, continuous_cols=continuous_cols # type: ignore[arg-type]
)
X_tab = tab_preprocessor.fit_transform(df_train)
deeptabular = TabMlp(
mlp_hidden_dims=[64, 32],

# build the model
wide = Wide(input_dim=np.unique(X_wide).shape[0], pred_dim=1)
tab_mlp = TabMlp(
column_idx=tab_preprocessor.column_idx,
embed_input=tab_preprocessor.embeddings_input,
continuous_cols=cont_cols,
cat_embed_input=tab_preprocessor.cat_embed_input,
continuous_cols=continuous_cols,
)
model = WideDeep(wide=wide, deeptabular=tab_mlp)

# wide and deep
model = WideDeep(wide=wide, deeptabular=deeptabular)

# train the model
# train and validate
trainer = Trainer(model, objective="binary", metrics=[Accuracy])
trainer.fit(
X_wide=X_wide,
X_tab=X_tab,
target=target,
n_epochs=5,
batch_size=256,
val_split=0.1,
)

# predict
# predict on test
X_wide_te = wide_preprocessor.transform(df_test)
X_tab_te = tab_preprocessor.transform(df_test)
preds = trainer.predict(X_wide=X_wide_te, X_tab=X_tab_te)
@@ -282,14 +269,11 @@ torch.save(model.state_dict(), "model_weights/wd_model.pt")
# From here in advance, Option 1 or 2 are the same. I assume the user has
# prepared the data and defined the new model components:
# 1. Build the model
model_new = WideDeep(wide=wide, deeptabular=deeptabular)
model_new = WideDeep(wide=wide, deeptabular=tab_mlp)
model_new.load_state_dict(torch.load("model_weights/wd_model.pt"))

# 2. Instantiate the trainer
trainer_new = Trainer(
model_new,
objective="binary",
)
trainer_new = Trainer(model_new, objective="binary")

# 3. Either start the fit or directly predict
preds = trainer_new.predict(X_wide=X_wide, X_tab=X_tab)
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.0.14
1.1.0
15 changes: 15 additions & 0 deletions docs/bayesian_models.rst
@@ -0,0 +1,15 @@
The ``bayesian models`` module
==============================

This module contains the two Bayesian models available in this library, namely
the Bayesian versions of the ``Wide`` and ``TabMlp`` models, referred to as
``BayesianWide`` and ``BayesianTabMlp``.


.. autoclass:: pytorch_widedeep.bayesian_models.tabular.bayesian_linear.bayesian_wide.BayesianWide
:exclude-members: forward
:members:

.. autoclass:: pytorch_widedeep.bayesian_models.tabular.bayesian_mlp.bayesian_tab_mlp.BayesianTabMlp
:exclude-members: forward
:members:
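
A purely illustrative instantiation sketch follows; the import path and the constructor arguments are assumptions made to mirror the non-Bayesian ``TabMlp``, so consult the class documentation above for the real signature:

```python
from pytorch_widedeep.bayesian_models import BayesianTabMlp  # import path assumed

# tab_preprocessor: a fitted TabPreprocessor, as in the README quick start
bayesian_tab_mlp = BayesianTabMlp(
    column_idx=tab_preprocessor.column_idx,
    cat_embed_input=tab_preprocessor.cat_embed_input,
    continuous_cols=["age", "hours-per-week"],
    mlp_hidden_dims=[64, 32],
)
```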
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -103,7 +103,7 @@
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# This pattern also affects html_static_path and html_extra_path.
exclude_patterns = [u"_build", "Thumbs.db", ".DS_Store"]
exclude_patterns = ["_build", "Thumbs.db", ".DS_Store"]

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"
27 changes: 14 additions & 13 deletions docs/examples.rst
@@ -5,16 +5,17 @@ This section provides links to example notebooks that may be helpful to better
understand the functionalities within ``pytorch-widedeep`` and how to use
them to address different problems.

* `Preprocessors and Utils <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/01_Preprocessors_and_utils.ipynb>`__
* `Model Components <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/02_1_Model_Components.ipynb>`__
* `deeptabular Models <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/02_2_deeptabular_models.ipynb>`__
* `Binary Classification with default parameters <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/03_Binary_Classification_with_Defaults.ipynb>`__
* `Binary Classification with varying parameters <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/04_Binary_Classification_Varying_Parameters.ipynb>`__
* `Regression with Images and Text <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/05_Regression_with_Images_and_Text.ipynb>`__
* `FineTune routines <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/06_FineTune_and_WarmUp_Model_Components.ipynb>`__
* `Custom Components <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/07_Custom_Components.ipynb>`__
* `Save and Load Model and Artifacts <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/08_save_and_load_model_and_artifacts.ipynb>`__
* `Using Custom DataLoaders and Torchmetrics <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/09_Custom_DataLoader_Imbalanced_dataset.ipynb>`__
* `The Transformer Family <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/10_The_Transformer_Family.ipynb>`__
* `Extracting Embeddings <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/11_Extracting_Embeddings.ipynb>`__
* `HyperParameter Tuning With RayTune <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/12_HyperParameter_tuning_w_RayTune.ipynb>`__
* `Preprocessors and Utils <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/01_Preprocessors_and_utils.ipynb>`__
* `Model Components <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/02_model_components.ipynb>`__
* `Binary Classification with default parameters <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/03_Binary_Classification_with_Defaults.ipynb>`__
* `Regression with Images and Text <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/04_regression_with_images_and_text.ipynb>`__
* `Save and Load Model and Artifacts <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/05_save_and_load_model_and_artifacts.ipynb>`__
* `FineTune routines <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/06_fineTune_and_warmup.ipynb>`__
* `Custom Components <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/07_Custom_Components.ipynb>`__
* `Using Custom DataLoaders and Torchmetrics <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/08_custom_dataLoader_imbalanced_dataset.ipynb>`__
* `Extracting Embeddings <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/09_extracting_embeddings.ipynb>`__
* `HyperParameter Tuning With RayTune <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/10_hyperParameter_tuning_w_raytune_n_wnb.ipynb>`__
* `Model Uncertainty Prediction <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/13_Model_Uncertainty_prediction.ipynb>`__
* `Bayesian Models <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/14_bayesian_models.ipynb>`__
* `Deep Imbalanced Regression <https://github.com/jrzaurin/pytorch-widedeep/blob/master/examples/notebooks/15_DIR-LDS_and_FDS.ipynb>`__

Binary file removed docs/figures/01_Preprocessors_and_utils_40_0.png
Binary file not shown.
Binary file removed docs/figures/01_Preprocessors_and_utils_43_0.png
Binary file not shown.
Binary file removed docs/figures/01_Preprocessors_and_utils_46_0.png
Binary file not shown.
Binary file removed docs/figures/ft_transformer_arch.png
Binary file not shown.
Binary file removed docs/figures/resnet_block.png
Binary file not shown.
Binary file removed docs/figures/saint_arch.png
Binary file not shown.
Binary file removed docs/figures/tabmlp_arch.png
Binary file not shown.
Binary file removed docs/figures/tabnet_arch_1.png
Binary file not shown.
Binary file removed docs/figures/tabnet_arch_2.png
Binary file not shown.
Binary file removed docs/figures/tabresnet_arch.png
Binary file not shown.
Binary file removed docs/figures/tabtransformer_arch.png
Binary file not shown.
Binary file removed docs/figures/transformer_block.png
Binary file not shown.
