Update examples

yupidevs · Jun 8, 2023 · 8eb5c83
1 parent 45385b5
commit 8eb5c83
Showing 6 changed files with 423 additions and 13 deletions.
diff --git a/docs/source/examples/example1.rst b/docs/source/examples/example1.rst
@@ -52,7 +52,7 @@ To load the original GeoLife dataset we can simply do:
    dataset = Dataset.geolife()
 
 Then, we can process it to keep only the desired classes, combine similar classes
-and create a tran/test split as proposed on [1]:
+and create a train/test split as proposed on [1]:
 
 .. code-block:: python
 

diff --git a/docs/source/examples/example2.rst b/docs/source/examples/example2.rst
@@ -1,2 +1,107 @@
 Example 2
 =========
+
+We illustrate how to evaluate a Transformer Network for classifying the trajectories
+of the MNIST stroke dataset. This example seeks to partially reproduce the results
+reported in [1].
+
+The example is structured as follows:
+  | :ref:`Setup dependencies 2`
+  | :ref:`Definition of parameters 2`
+  | :ref:`Loading Data 2`
+  | :ref:`Loading the model 2`
+  | :ref:`Training and evaluation 2`
+  | :ref:`References 2`
+
+.. note::
+   You can access `the script of this example <https://github.com/yupidevs/pactus/blob/master/examples/example_02.py>`_.
+
+.. _Setup dependencies 2:
+
+1. Setup dependencies
+---------------------
+
+Import all the dependencies:
+
+.. code-block:: python
+
+   from tensorflow import keras
+   from pactus import Dataset
+   from pactus.models import TransformerModel
+
+
+.. _Definition of parameters 2:
+
+2. Definition of parameters
+---------------------------
+
+We define a random seed for reproducibility:
+
+.. code-block:: python
+
+   SEED = 0
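
As a minimal illustration of why a fixed seed helps, the sketch below uses only Python's standard library and is not part of the pactus example script. The helper `shuffled_indices` is hypothetical. How `SEED` is consumed by pactus or TensorFlow is not shown in this example.

```python
import random

SEED = 0


def shuffled_indices(n, seed):
    """Return the indices 0..n-1 in an order determined only by `seed`."""
    rng = random.Random(seed)  # local generator, leaves global state untouched
    indices = list(range(n))
    rng.shuffle(indices)
    return indices


first = shuffled_indices(10, SEED)
second = shuffled_indices(10, SEED)
print(first == second)  # same seed, same order
```

Using a local `random.Random` instance rather than the module-level functions keeps the shuffle independent of any other code that draws random numbers.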
+
+.. _Loading Data 2:
+
+3. Loading Data
+---------------
+
+To load the MNIST stroke dataset we can simply do:
+
+.. code-block:: python
+
+   dataset = Dataset.mnist_stroke()
+
+Then, we can create a train/test split as proposed in [1]:
+
+.. code-block:: python
+
+   train, test = dataset.cut(60_000)
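
The sketch below illustrates the idea of cutting a dataset at a fixed index. It assumes that `cut(60_000)` places the first 60,000 items in the first part and the remainder in the second, which this example does not confirm. `cut_at` is a hypothetical helper, not the pactus implementation.

```python
def cut_at(items, size):
    """Split a sequence into its first `size` items and the rest."""
    if not 0 <= size <= len(items):
        raise ValueError("size must be between 0 and len(items)")
    return items[:size], items[size:]


# Toy stand-in for a dataset of 10 trajectories.
toy_dataset = [f"traj_{i}" for i in range(10)]
toy_train, toy_test = cut_at(toy_dataset, 7)
print(len(toy_train), len(toy_test))
```

A cut at a fixed index is deterministic, so repeated runs train and test on exactly the same items.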
+
+.. _Loading the model 2:
+
+4. Loading the model
+--------------------
+
+Since transformers are able to deal with data of arbitrary length, there is no need
+to create a featurizer for this model, and we can directly use it:
+
+.. code-block:: python
+
+   model = TransformerModel(
+       head_size=512,
+       num_heads=4,
+       num_transformer_blocks=4,
+       optimizer=keras.optimizers.Adam(learning_rate=1e-4),
+   )
+
+.. _Training and evaluation 2:
+
+5. Training and evaluation
+--------------------------
+
+Training and evaluation can be conducted as follows:
+
+.. code-block:: python
+
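+   # Keep the weights with the lowest training loss seen so far
+   checkpoint = keras.callbacks.ModelCheckpoint(
+       "partially_trained_model_transformer_mnist_stroke.h5",
+       monitor="loss",
+       verbose=1,
+       save_best_only=True,
+       mode="min",
+   )
+   # Train the model on the train dataset
+   model.train(train, dataset, epochs=150, batch_size=64, checkpoint=checkpoint)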
+
+   # Evaluate the model on a test dataset
+   evaluation = model.evaluate(test)
+
+   # Print the evaluation
+   evaluation.show()
+
+Evaluation results should look like:
+
+.. code-block:: text
+
+   [Coming soon] 
+
+
+.. _References 2:
+
+6. References
+-------------
+| [1] BAE, Keywoong; LEE, Suan; LEE, Wookey. Transformer Networks for Trajectory Classification. In 2022 IEEE International Conference on Big Data and Smart Computing (BigComp). IEEE, 2022. p. 331-333.
diff --git a/docs/source/examples/example3.rst b/docs/source/examples/example3.rst
@@ -1,2 +1,180 @@
 Example 3
 =========
+
+In this example we illustrate how to evaluate several models available in pactus
+on a single dataset.
+
+The example is structured as follows:
+  | :ref:`Setup dependencies 3`
+  | :ref:`Definition of parameters 3`
+  | :ref:`Loading Data 3`
+  | :ref:`Loading the model 3`
+  | :ref:`Training and evaluation 3`
+  | :ref:`References 3`
+
+.. note::
+   You can access `the script of this example <https://github.com/yupidevs/pactus/blob/master/examples/example_03.py>`_.
+
+.. _Setup dependencies 3:
+
+1. Setup dependencies
+---------------------
+
+Import all the dependencies:
+
+.. code-block:: python
+
+   from tensorflow import keras
+
+   from pactus import Dataset, featurizers
+   from pactus.models import (
+       DecisionTreeModel,
+       KNeighborsModel,
+       LSTMModel,
+       RandomForestModel,
+       SVMModel,
+       TransformerModel,
+   )
+
+.. _Definition of parameters 3:
+
+2. Definition of parameters
+---------------------------
+
+We define a random seed for reproducibility:
+
+.. code-block:: python
+
+   SEED = 0
+
+.. _Loading Data 3:
+
+3. Loading Data
+---------------
+
+To load the MNIST stroke dataset we can simply do:
+
+.. code-block:: python
+
+   dataset = Dataset.mnist_stroke()
+
+Then, we can create a train/test split as proposed in [1]:
+
+.. code-block:: python
+
+   train, test = dataset.cut(60_000)
+
+.. _Loading the model 3:
+
+4. Loading the models
+---------------------
+
+Since several of the models we are going to use cannot deal with
+data of arbitrary length, we need to create an object
+that converts every trajectory into a fixed-size feature vector. In this case,
+we are going to use the UniversalFeaturizer for all those models. This featurizer
+includes all available features.
+
+.. code-block:: python
+   
+   featurizer = featurizers.UniversalFeaturizer()
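
To make the idea of a fixed-size feature vector concrete, the following sketch maps trajectories of different lengths to vectors of the same length. The three features (path length, mean step, maximum step) and the `toy_featurize` helper are illustrative assumptions. They are not the features computed by `UniversalFeaturizer`.

```python
import math


def toy_featurize(points):
    """Map a list of (x, y) points to a 3-element feature vector."""
    # Distance between each pair of consecutive points.
    steps = [math.dist(p, q) for p, q in zip(points, points[1:])]
    if not steps:  # a single point has no displacement
        return [0.0, 0.0, 0.0]
    return [sum(steps), sum(steps) / len(steps), max(steps)]


short_traj = [(0, 0), (3, 4)]
long_traj = [(0, 0), (1, 0), (1, 1), (4, 5)]

# Both vectors have length 3, regardless of trajectory length.
print(toy_featurize(short_traj))
print(toy_featurize(long_traj))
```

Models such as random forests or SVMs expect every sample to have the same number of features, which is why a step of this kind precedes them.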
+
+We can start by creating all the models requiring the featurizer and storing them
+in a list:
+
+.. code-block:: python
+
+   vectorized_models = [
+       RandomForestModel(
+           featurizer=featurizer,
+           max_features=16,
+           n_estimators=200,
+           bootstrap=False,
+           warm_start=True,
+           n_jobs=6,
+       ),
+       KNeighborsModel(
+           featurizer=featurizer,
+           n_neighbors=7,
+       ),
+       DecisionTreeModel(
+           featurizer=featurizer,
+           max_depth=7,
+       ),
+       SVMModel(
+           featurizer=featurizer,
+       ),
+   ]
+
+Then, we proceed to create the LSTM and Transformer models without the featurizer
+since both of them can handle trajectories directly:
+
+.. code-block:: python
+   
+   lstm = LSTMModel(
+       loss="sparse_categorical_crossentropy",
+       optimizer="rmsprop",
+       metrics=["accuracy"],
+   )
+
+   transformer = TransformerModel(
+       head_size=512,
+       num_heads=4,
+       num_transformer_blocks=4,
+       optimizer=keras.optimizers.Adam(learning_rate=1e-4),
+   )
+
+.. _Training and evaluation 3:
+
+5. Training and evaluation
+--------------------------
+
+Training and evaluation of the models requiring the featurizer can be achieved by:
+
+.. code-block:: python
+
+   for model in vectorized_models:
+       print(f"\nModel: {model.name}\n")
+       model.train(train, cross_validation=5)
+       evaluation = model.evaluate(test)
+       evaluation.show()
+
+LSTM training and evaluation can be conducted by:
+
+.. code-block:: python
+
+   checkpoint = keras.callbacks.ModelCheckpoint(
+       "partially_trained_model_lstm_mnist_stroke.h5",
+       monitor="loss",
+       verbose=1,
+       save_best_only=True,
+       mode="min",
+   )
+   lstm.train(train, dataset, epochs=20, checkpoint=checkpoint)
+   evaluation = lstm.evaluate(test)
+   evaluation.show()
+
+Similarly, Transformer training and evaluation can be performed by:
+
+.. code-block:: python
+
+   checkpoint = keras.callbacks.ModelCheckpoint(
+       "partially_trained_model_transformer_mnist_stroke.h5",
+       monitor="loss",
+       verbose=1,
+       save_best_only=True,
+       mode="min",
+   )
+   transformer.train(train, dataset, epochs=150, checkpoint=checkpoint)
+   evaluation = transformer.evaluate(test)
+   evaluation.show()
+
+Each model should report its performance using several metrics. Because every model
+was trained and evaluated on the same data, the results can be compared fairly
+with one another.
+
+.. _References 3:
+
+6. References
+-------------
+| [1] BAE, Keywoong; LEE, Suan; LEE, Wookey. Transformer Networks for Trajectory Classification. In 2022 IEEE International Conference on Big Data and Smart Computing (BigComp). IEEE, 2022. p. 331-333.