Update examples
Gustavo Viera López committed Jun 8, 2023
1 parent 45385b5 commit 8eb5c83
Showing 6 changed files with 423 additions and 13 deletions.
2 changes: 1 addition & 1 deletion docs/source/examples/example1.rst
@@ -52,7 +52,7 @@ To load the original GeoLife dataset we can simply do:
dataset = Dataset.geolife()
Then, we can process it to keep only the desired classes, combine similar classes
- and create a tran/test split as proposed on [1]:
+ and create a train/test split as proposed on [1]:

.. code-block:: python
105 changes: 105 additions & 0 deletions docs/source/examples/example2.rst
@@ -1,2 +1,107 @@
Example 2
=========

We illustrate how to evaluate a Transformer Network for classifying the trajectories
of the MNIST stroke dataset. This example seeks to partially reproduce the results
reported in [1].

The example is structured as follows:

| :ref:`Setup dependencies 2`
| :ref:`Definition of parameters 2`
| :ref:`Loading Data 2`
| :ref:`Loading the model 2`
| :ref:`Training and evaluation 2`
| :ref:`References 2`

.. note::
    You can access `the script of this example <https://github.com/yupidevs/pactus/blob/master/examples/example_02.py>`_.

.. _Setup dependencies 2:

1. Setup dependencies
---------------------

Import all the dependencies:

.. code-block:: python

    from tensorflow import keras
    from pactus import Dataset
    from pactus.models import TransformerModel

.. _Definition of parameters 2:

2. Definition of parameters
---------------------------

We define a random seed for reproducibility:

.. code-block:: python

    SEED = 0

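To illustrate why a fixed seed helps, the following minimal sketch uses only Python's standard ``random`` module. It does not show how pactus consumes ``SEED`` (the example does not say), and ``shuffled_indices`` is a hypothetical helper introduced only for this illustration.

```python
import random

SEED = 0


def shuffled_indices(n, seed):
    """Return the indices 0..n-1 shuffled with a private, seeded generator."""
    rng = random.Random(seed)  # independent generator; global state is untouched
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


# Two runs with the same seed produce exactly the same ordering.
first_run = shuffled_indices(10, SEED)
second_run = shuffled_indices(10, SEED)
print(first_run == second_run)
```

Using a private ``random.Random`` instance rather than the module-level functions keeps the result independent of any other code that draws random numbers.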
.. _Loading Data 2:

3. Loading Data
---------------

To load the MNIST stroke dataset we can simply do:

.. code-block:: python

    dataset = Dataset.mnist_stroke()

Then, we can create a train/test split as proposed in [1]:

.. code-block:: python

    train, test = dataset.cut(60_000)

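The idea behind cutting at a fixed position can be sketched in plain Python. This assumes that ``cut`` keeps the original order and places the first 60,000 items in the training part, which should be checked against the pactus documentation. ``cut_at`` is a hypothetical helper, and the 70,000 stand-in items match the size of the original MNIST image set.

```python
def cut_at(items, size):
    """Split a sequence into its first `size` items and the remainder."""
    if not 0 <= size <= len(items):
        raise ValueError("size must be between 0 and len(items)")
    return items[:size], items[size:]


# Stand-in for a dataset of 70,000 samples, each identified by its index.
samples = list(range(70_000))
train_part, test_part = cut_at(samples, 60_000)
print(len(train_part), len(test_part))
```

Because no shuffling is involved, such a split is deterministic and reproduces the same partition on every run.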
.. _Loading the model 2:

4. Loading the model
--------------------

Since transformers are able to deal with data of arbitrary length, there is no need
to create a featurizer for this model, and we can directly use it:

.. code-block:: python

    model = TransformerModel(
        head_size=512,
        num_heads=4,
        num_transformer_blocks=4,
        optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    )

.. _Training and evaluation 2:

5. Training and evaluation
--------------------------

Training and evaluation can be conducted as follows:

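.. code-block:: python

    # Checkpoint callback (assumed to mirror the one defined in Example 3)
    checkpoint = keras.callbacks.ModelCheckpoint(
        "partially_trained_model_transformer_mnist_stroke.h5",
        monitor="loss",
        save_best_only=True,
    )
    # Train the model on the train dataset
    model.train(train, dataset, epochs=150, batch_size=64, checkpoint=checkpoint)
    # Evaluate the model on a test dataset
    evaluation = model.evaluate(test)
    # Print the evaluation
    evaluation.show()

Evaluation results should look like: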

.. code-block:: text

    [Coming soon]

.. _References 2:

6. References
-------------
| [1] BAE, Keywoong; LEE, Suan; LEE, Wookey. Transformer Networks for Trajectory Classification. In 2022 IEEE International Conference on Big Data and Smart Computing (BigComp). IEEE, 2022. p. 331-333.
178 changes: 178 additions & 0 deletions docs/source/examples/example3.rst
@@ -1,2 +1,180 @@
Example 3
=========

In this example we illustrate how to evaluate several models available in pactus
on a single dataset.

The example is structured as follows:

| :ref:`Setup dependencies 3`
| :ref:`Definition of parameters 3`
| :ref:`Loading Data 3`
| :ref:`Loading the model 3`
| :ref:`Training and evaluation 3`
| :ref:`References 3`

.. note::
    You can access `the script of this example <https://github.com/yupidevs/pactus/blob/master/examples/example_03.py>`_.

.. _Setup dependencies 3:

1. Setup dependencies
---------------------

Import all the dependencies:

.. code-block:: python

    from tensorflow import keras
    from pactus import Dataset, featurizers
    from pactus.models import (
        DecisionTreeModel,
        KNeighborsModel,
        LSTMModel,
        RandomForestModel,
        SVMModel,
        TransformerModel,
    )

.. _Definition of parameters 3:

2. Definition of parameters
---------------------------

We define a random seed for reproducibility:

.. code-block:: python

    SEED = 0

.. _Loading Data 3:

3. Loading Data
---------------

To load the MNIST stroke dataset we can simply do:

.. code-block:: python

    dataset = Dataset.mnist_stroke()

Then, we can create a train/test split as proposed in [1]:

.. code-block:: python

    train, test = dataset.cut(60_000)

.. _Loading the model 3:

4. Loading the models
---------------------

Several of the models we are going to use cannot deal with data of arbitrary
length, so we need to create an object that converts every trajectory into a
fixed-size feature vector. In this case, we are going to use the
UniversalFeaturizer for all those models. This featurizer includes all
available features.

.. code-block:: python

    featurizer = featurizers.UniversalFeaturizer()

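To make the idea of a featurizer concrete, here is a minimal sketch in plain Python. It is not the UniversalFeaturizer and does not use the pactus API: ``featurize`` and the four features it computes are hypothetical, chosen only to show how trajectories of different lengths can be mapped to vectors of the same size.

```python
import math


def featurize(trajectory):
    """Map a non-empty list of (x, y) points to a fixed-size feature vector.

    Features (hypothetical, for illustration only):
    number of points, path length, net displacement, mean step length.
    """
    steps = [math.dist(p, q) for p, q in zip(trajectory, trajectory[1:])]
    path_length = sum(steps)
    displacement = math.dist(trajectory[0], trajectory[-1])
    mean_step = path_length / len(steps) if steps else 0.0
    return [float(len(trajectory)), path_length, displacement, mean_step]


# Two trajectories with different numbers of points.
short = [(0, 0), (3, 4)]
longer = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
vectors = [featurize(short), featurize(longer)]
print(vectors)
```

Both trajectories end up as vectors of length four, which is the property that models such as random forests or SVMs rely on.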
We can start by creating all the models requiring the featurizer and storing them
in a list:

.. code-block:: python

    vectorized_models = [
        RandomForestModel(
            featurizer=featurizer,
            max_features=16,
            n_estimators=200,
            bootstrap=False,
            warm_start=True,
            n_jobs=6,
        ),
        KNeighborsModel(
            featurizer=featurizer,
            n_neighbors=7,
        ),
        DecisionTreeModel(
            featurizer=featurizer,
            max_depth=7,
        ),
        SVMModel(
            featurizer=featurizer,
        ),
    ]

Then, we proceed to create the LSTM and Transformer models without the featurizer
since both of them can handle trajectories directly:

.. code-block:: python

    lstm = LSTMModel(
        loss="sparse_categorical_crossentropy",
        optimizer="rmsprop",
        metrics=["accuracy"],
    )
    transformer = TransformerModel(
        head_size=512,
        num_heads=4,
        num_transformer_blocks=4,
        optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    )

.. _Training and evaluation 3:

5. Training and evaluation
--------------------------

Training and evaluation of the models requiring the featurizer can be achieved by:

.. code-block:: python

    for model in vectorized_models:
        print(f"\nModel: {model.name}\n")
        model.train(train, cross_validation=5)
        evaluation = model.evaluate(test)
        evaluation.show()

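The ``cross_validation=5`` argument suggests k-fold cross-validation with five folds, although the example does not describe the exact procedure. The following is a generic sketch of how k-fold index splitting works, with a hypothetical ``k_fold_indices`` helper. It is not the pactus implementation.

```python
def k_fold_indices(n_samples, k):
    """Yield (training_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        stop = start + base + (1 if fold < extra else 0)
        validation = list(range(start, stop))
        training = list(range(0, start)) + list(range(stop, n_samples))
        yield training, validation
        start = stop


folds = list(k_fold_indices(12, 5))
for training, validation in folds:
    print(len(training), validation)
```

Every sample is used for validation exactly once and for training in the remaining folds, so the averaged score depends less on one particular split.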
LSTM training and evaluation can be conducted by:

.. code-block:: python

    checkpoint = keras.callbacks.ModelCheckpoint(
        "partially_trained_model_lstm_mnist_stroke.h5",
        monitor="loss",
        verbose=1,
        save_best_only=True,
        mode="min",
    )
    lstm.train(train, dataset, epochs=20, checkpoint=checkpoint)
    evaluation = lstm.evaluate(test)
    evaluation.show()

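With ``save_best_only=True``, ``monitor="loss"`` and ``mode="min"``, the checkpoint is written only when the monitored loss improves on the best value seen so far. The plain-Python sketch below mimics that rule without Keras. The helper ``epochs_that_save`` and the loss values are hypothetical.

```python
def epochs_that_save(losses):
    """Return the 1-based epochs at which a 'save best only' rule would save.

    A save happens only when the monitored loss is strictly lower than the
    best value seen so far, which is what mode="min" means.
    """
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            saved.append(epoch)
    return saved


# Hypothetical per-epoch training losses.
history = [0.90, 0.70, 0.75, 0.60, 0.60, 0.55]
print(epochs_that_save(history))
```

Epochs 3 and 5 do not trigger a save because the loss did not drop below the best earlier value, so the file on disk always holds the best weights found so far.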
Similarly, Transformer training and evaluation can be performed by:

.. code-block:: python

    checkpoint = keras.callbacks.ModelCheckpoint(
        "partially_trained_model_transformer_mnist_stroke.h5",
        monitor="loss",
        verbose=1,
        save_best_only=True,
        mode="min",
    )
    transformer.train(train, dataset, epochs=150, checkpoint=checkpoint)
    evaluation = transformer.evaluate(test)
    evaluation.show()

Each model reports its performance using several metrics. Because every model is
trained and evaluated on the same data, the results can be compared fairly.
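As an illustration of comparing models on identical test data, the sketch below computes two common classification metrics in plain Python. The metrics actually reported by pactus may differ, and the labels, predictions and model names here are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical labels and predictions from two models on the same test set.
y_true = [0, 0, 1, 1, 2, 2]
predictions = {
    "model_a": [0, 0, 1, 1, 2, 1],
    "model_b": [0, 1, 1, 0, 2, 2],
}
for name, y_pred in predictions.items():
    print(name, round(accuracy(y_true, y_pred), 3), round(macro_f1(y_true, y_pred), 3))
```

Since both sets of predictions are scored against the same ``y_true``, any difference in the metrics reflects the models rather than the data.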

.. _References 3:

6. References
-------------
| [1] BAE, Keywoong; LEE, Suan; LEE, Wookey. Transformer Networks for Trajectory Classification. In 2022 IEEE International Conference on Big Data and Smart Computing (BigComp). IEEE, 2022. p. 331-333.