Merge pull request #20 from jrzaurin/wide_embedding
Wide embedding
jrzaurin committed Aug 9, 2020
2 parents 65465a4 + e40a088 commit 627caf4
Showing 37 changed files with 686 additions and 385 deletions.
38 changes: 19 additions & 19 deletions README.md
@@ -5,6 +5,12 @@

[![Build Status](https://travis-ci.org/jrzaurin/pytorch-widedeep.svg?branch=master)](https://travis-ci.org/jrzaurin/pytorch-widedeep)
[![Documentation Status](https://readthedocs.org/projects/pytorch-widedeep/badge/?version=latest)](https://pytorch-widedeep.readthedocs.io/en/latest/?badge=latest)
[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://github.com/jrzaurin/pytorch-widedeep/graphs/commit-activity)

Platform | Version Support
---------|:---------------
OSX | [![Python 3.6 3.7](https://img.shields.io/badge/python-3.6%20%203.7-blue.svg)](https://www.python.org/)
Linux | [![Python 3.6 3.7 3.8](https://img.shields.io/badge/python-3.6%20%203.7%203.8-blue.svg)](https://www.python.org/)

# pytorch-widedeep

@@ -34,11 +40,11 @@ few lines of code.
<img width="600" src="docs/figures/architecture_1.png">
</p>

Architecture 1 combines the `Wide`, one-hot encoded features with the outputs
from the `DeepDense`, `DeepText` and `DeepImage` components connected to a
final output neuron or neurons, depending on whether we are performing a
binary classification or regression, or a multi-class classification. The
components within the faded-pink rectangles are concatenated.
Architecture 1 combines the `Wide` linear model with the outputs from the
`DeepDense`, `DeepText` and `DeepImage` components connected to a final output
neuron or neurons, depending on whether we are performing a binary
classification or regression, or a multi-class classification. The components
within the faded-pink rectangles are concatenated.
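
As a minimal sketch (not taken verbatim from this diff; the component objects
are assumed to have been created beforehand, e.g. as in the Quick start below),
Architecture 1 corresponds to passing the components to `WideDeep` without a
custom head:

```python
from pytorch_widedeep.models import WideDeep

# Sketch only: without a custom `deephead`, WideDeep concatenates the deep
# components and adds the Wide (linear) output to form the final neuron(s).
# `wide`, `deepdense`, `deeptext` and `deepimage` are assumed to exist already.
model = WideDeep(
    wide=wide,
    deepdense=deepdense,
    deeptext=deeptext,
    deepimage=deepimage,
)
```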

In math terms, and following the notation in the
[paper](https://arxiv.org/abs/1606.07792), Architecture 1 can be formulated
@@ -65,10 +71,10 @@ otherwise".*
<img width="600" src="docs/figures/architecture_2.png">
</p>

Architecture 2 combines the `Wide` one-hot encoded features with the Deep
components of the model connected to the output neuron(s), after the different
Deep components have been themselves combined through a FC-Head (that I refer
as `deephead`).
Architecture 2 combines the `Wide` linear model with the Deep components of
the model connected to the output neuron(s), after the different Deep
components have themselves been combined through an FC-Head (which I refer to
as `deephead`).
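
As a hedged sketch (the exact `WideDeep` signature is not shown in this diff,
and the layer sizes below are made up), Architecture 2 amounts to also passing
a custom FC-Head to `WideDeep` via its `deephead` argument:

```python
import torch.nn as nn

from pytorch_widedeep.models import WideDeep

# Illustrative FC-Head; its input size must match the concatenated output
# dimensions of the deep components (128 is an assumed value).
deephead = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 32))

# With a `deephead`, the deep components are combined through it before
# reaching the output neuron(s), i.e. Architecture 2. The other components
# are assumed to exist already.
model = WideDeep(wide=wide, deepdense=deepdense, deeptext=deeptext, deephead=deephead)
```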

In math terms, and following the notation in the
[paper](https://arxiv.org/abs/1606.07792), Architecture 2 can be formulated
@@ -84,7 +90,8 @@ and `DeepImage` are optional. `pytorch-widedeep` includes standard text (stack
of LSTMs) and image (pre-trained ResNets or stack of CNNs) models. However,
the user can use any custom model as long as it has an attribute called
`output_dim` with the size of the last layer of activations, so that
`WideDeep` can be constructed. See the examples folder for more information.
`WideDeep` can be constructed. See the examples folder or the docs for more
information.
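
For illustration, here is a minimal sketch of a custom text component. Nothing
below comes from the library itself beyond what the paragraph above states:
any `nn.Module` can be used as long as it exposes an `output_dim` attribute.
The class name and sizes are made up:

```python
import torch
import torch.nn as nn


class MyCustomTextModel(nn.Module):
    """Hypothetical custom text component for WideDeep."""

    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden_dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.output_dim = hidden_dim  # size of the last layer of activations

    def forward(self, X: torch.Tensor) -> torch.Tensor:
        embeds = self.embed(X.long())
        _, h = self.rnn(embeds)
        return h[-1]  # (batch_size, output_dim)
```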


### Installation
@@ -112,14 +119,6 @@ cd pytorch-widedeep
pip install -e .
```

### Examples

There are a number of notebooks in the `examples` folder plus some additional
files. These notebooks cover most of the utilities of this package and can
also act as documentation. In the case that github does not render the
notebooks, or it renders them missing some parts, they are saved as markdown
files in the `docs` folder.

### Quick start

Binary classification with the [adult
@@ -128,6 +127,7 @@ using `Wide` and `DeepDense` and default settings.

```python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

from pytorch_widedeep.preprocessing import WidePreprocessor, DensePreprocessor
@@ -166,7 +166,7 @@ target = df_train[target_col].values
# wide
preprocess_wide = WidePreprocessor(wide_cols=wide_cols, crossed_cols=cross_cols)
X_wide = preprocess_wide.fit_transform(df_train)
wide = Wide(wide_dim=X_wide.shape[1], pred_dim=1)
wide = Wide(wide_dim=np.unique(X_wide).shape[0], pred_dim=1)

# deepdense
preprocess_deep = DensePreprocessor(embed_cols=embed_cols, continuous_cols=cont_cols)
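# The rest of this snippet is collapsed in the diff. As a rough, hedged sketch
# of how it continues in this era of the library (argument names are assumed
# from the 0.4.x API and may differ), the deep component and the combined
# model would be built and trained along these lines:
from pytorch_widedeep.models import DeepDense, WideDeep  # actual import is collapsed above

X_deep = preprocess_deep.fit_transform(df_train)
deepdense = DeepDense(
    hidden_layers=[64, 32],
    deep_column_idx=preprocess_deep.deep_column_idx,
    embed_input=preprocess_deep.embeddings_input,
    continuous_cols=cont_cols,
)

# build, compile and fit
model = WideDeep(wide=wide, deepdense=deepdense)
model.compile(method="binary")
model.fit(X_wide=X_wide, X_deep=X_deep, target=target, n_epochs=5, batch_size=256)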
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
0.4.2
0.4.5
File renamed without changes.
Binary file modified docs/figures/architecture_1.png
Binary file modified docs/figures/architecture_2.png
24 changes: 13 additions & 11 deletions docs/index.rst
@@ -18,7 +18,10 @@ Documentation
Utilities <utils/index>
Preprocessing <preprocessing>
Model Components <model_components>
Wide and Deep Models <wide_deep/index>
Metrics <metrics>
Callbacks <callbacks>
Focal Loss <losses>
Wide and Deep Models <wide_deep>
Examples <examples>


@@ -45,12 +48,11 @@ Architectures
:width: 600px
:align: center

Architecture 1 combines the ``Wide``, one-hot encoded features with the
outputs from the ``DeepDense``, ``DeepText`` and ``DeepImage`` components
connected to a final output neuron or neurons, depending on whether we are
performing a binary classification or regression, or a multi-class
classification. The components within the faded-pink rectangles are
concatenated.
Architecture 1 combines the ``Wide`` linear model with the outputs from the
``DeepDense``, ``DeepText`` and ``DeepImage`` components connected to a final
output neuron or neurons, depending on whether we are performing a binary
classification or regression, or a multi-class classification. The components
within the faded-pink rectangles are concatenated.

In math terms, and following the notation in the `paper
<https://arxiv.org/abs/1606.07792>`_, Architecture 1 can be formulated as:
@@ -76,10 +78,10 @@ is the activation function.
:width: 600px
:align: center

Architecture 2 combines the ``Wide`` one-hot encoded features with the Deep
components of the model connected to the output neuron(s), after the different
Deep components have been themselves combined through a FC-Head (referred as
as ``deephead``).
Architecture 2 combines the ``Wide`` linear model with the Deep components of
the model connected to the output neuron(s), after the different Deep
components have themselves been combined through an FC-Head (which I refer to
as ``deephead``).

In math terms, and following the notation in the `paper
<https://arxiv.org/abs/1606.07792>`_, Architecture 2 can be formulated as:
File renamed without changes.
File renamed without changes.
3 changes: 1 addition & 2 deletions docs/model_components.rst
@@ -1,10 +1,9 @@
The ``models`` module
=====================
======================

This module contains the four main Wide and Deep model components. These are:
``Wide``, ``DeepDense``, ``DeepText`` and ``DeepImage``.


.. autoclass:: pytorch_widedeep.models.wide.Wide
:members:
:undoc-members:
4 changes: 3 additions & 1 deletion docs/quick_start.rst
@@ -15,6 +15,7 @@ The following code snippet is not directly related to ``pytorch-widedeep``.
.. code-block:: python
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
df = pd.read_csv("data/adult/adult.csv.zip")
@@ -23,6 +24,7 @@ The following code snippet is not directly related to ``pytorch-widedeep``.
df_train, df_test = train_test_split(df, test_size=0.2, stratify=df.income_label)
Prepare the wide and deep columns
---------------------------------

@@ -63,7 +65,7 @@ Preprocessing and model components definition
# wide
preprocess_wide = WidePreprocessor(wide_cols=wide_cols, crossed_cols=cross_cols)
X_wide = preprocess_wide.fit_transform(df_train)
wide = Wide(wide_dim=X_wide.shape[1], pred_dim=1)
wide = Wide(wide_dim=np.unique(X_wide).shape[0], pred_dim=1)
# deepdense
preprocess_deep = DensePreprocessor(embed_cols=embed_cols, continuous_cols=cont_cols)
3 changes: 3 additions & 0 deletions docs/wide_deep/wide_deep.rst → docs/wide_deep.rst
@@ -1,6 +1,9 @@
Building Wide and Deep Models
=============================

Here is the documentation on how to build the two architectures, and the
different options available in ``pytorch-widedeep`` as one builds the model.

:class:`pytorch_widedeep.models.wide_deep.WideDeep` is the main class. It will
collect all model components and build one of the two possible architectures
with a series of optional parameters.
15 changes: 0 additions & 15 deletions docs/wide_deep/index.rst

This file was deleted.

94 changes: 77 additions & 17 deletions examples/01_Preprocessors_and_utils.ipynb
@@ -50,7 +50,9 @@
"source": [
"## 1. WidePreprocessor\n",
"\n",
"This class simply takes a dataset and one-hot encodes it, with a few additional rings and bells. "
"The Wide component of the model is a linear model that in principle, could be implemented as a linear layer receiving the result of on one-hot encoding categorical columns. However, this is not memory efficient. Therefore, we implement a liner layer as an Embedding layer plus a bias. I will explain in a bit more detail later. \n",
"\n",
"With that in mind, `WidePreprocessor` simply encodes the categories numerically so that they are the indexes of the lookup table that is an Embedding layer."
]
},
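(Editorial aside, not part of the notebook: the equivalence the cell above describes can be sketched in plain PyTorch. The snippet below illustrates the general idea rather than the library's actual `Wide` code: a linear layer applied to one-hot encoded categories computes the same result as an embedding lookup of per-category weights plus a bias.)

```python
import torch
import torch.nn as nn

n_categories, pred_dim = 5, 1

# One-hot route: a Linear layer over a (batch, n_categories) one-hot matrix.
linear = nn.Linear(n_categories, pred_dim)

# Embedding route: one weight row per category plus a shared bias.
embedding = nn.Embedding(n_categories, pred_dim)
bias = nn.Parameter(torch.zeros(pred_dim))

# Tie the parameters so both routes use identical weights for the comparison.
with torch.no_grad():
    embedding.weight.copy_(linear.weight.t())
    bias.copy_(linear.bias)

idx = torch.tensor([0, 3, 4])            # integer-encoded categories
one_hot = torch.eye(n_categories)[idx]   # their one-hot representation

out_linear = linear(one_hot)             # (3, pred_dim)
out_embed = embedding(idx) + bias        # same result, no one-hot matrix needed

assert torch.allclose(out_linear, out_embed)
```

With several wide columns, the lookups for each column would be summed before adding the bias, which is why the integer encoding produced by `WidePreprocessor` is enough.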
{
@@ -284,13 +286,13 @@
{
"data": {
"text/plain": [
"array([[0., 1., 0., ..., 0., 0., 0.],\n",
" [0., 0., 0., ..., 0., 0., 0.],\n",
" [0., 0., 0., ..., 0., 0., 0.],\n",
"array([[ 1, 17, 23, ..., 89, 91, 316],\n",
" [ 2, 18, 23, ..., 89, 92, 317],\n",
" [ 3, 18, 24, ..., 89, 93, 318],\n",
" ...,\n",
" [0., 0., 0., ..., 0., 0., 0.],\n",
" [0., 0., 0., ..., 0., 0., 0.],\n",
" [0., 0., 0., ..., 0., 0., 0.]])"
" [ 2, 20, 23, ..., 90, 103, 323],\n",
" [ 2, 17, 23, ..., 89, 103, 323],\n",
" [ 2, 21, 29, ..., 90, 115, 324]])"
]
},
"execution_count": 6,
@@ -306,45 +308,103 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"or sparse"
"Let's take from example the first entry"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"array([ 1, 17, 23, 32, 47, 89, 91, 316])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"wide_preprocessor_sparse = WidePreprocessor(wide_cols=wide_cols, crossed_cols=crossed_cols, sparse=True)\n",
"X_wide_sparse = wide_preprocessor_sparse.fit_transform(df)"
"X_wide[0]"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>education</th>\n",
" <th>relationship</th>\n",
" <th>workclass</th>\n",
" <th>occupation</th>\n",
" <th>native-country</th>\n",
" <th>gender</th>\n",
" <th>education_occupation</th>\n",
" <th>native-country_occupation</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>11th</td>\n",
" <td>Own-child</td>\n",
" <td>Private</td>\n",
" <td>Machine-op-inspct</td>\n",
" <td>United-States</td>\n",
" <td>Male</td>\n",
" <td>11th-Machine-op-inspct</td>\n",
" <td>United-States-Machine-op-inspct</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
"<48842x796 sparse matrix of type '<class 'numpy.float64'>'\n",
"\twith 390736 stored elements in Compressed Sparse Row format>"
" education relationship workclass occupation native-country gender \\\n",
"0 11th Own-child Private Machine-op-inspct United-States Male \n",
"\n",
" education_occupation native-country_occupation \n",
"0 11th-Machine-op-inspct United-States-Machine-op-inspct "
]
},
"execution_count": 8,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"X_wide_sparse"
"wide_preprocessor.inverse_transform(X_wide[:1])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that while this will save memory on disk, due to the batch generation process for `WideDeep` the running time will be notably slow. See [here](https://github.com/jrzaurin/pytorch-widedeep/blob/bfbe6e5d2309857db0dcc5cf3282dfa60504aa52/pytorch_widedeep/models/_wd_dataset.py#L47) for more details."
"As we can see, `wide_preprocessor` numerically encodes the `wide_cols` and the `crossed_cols`, which can be recovered using the method `inverse_transform`."
]
},
{