
Commit

docs: add eval documentation (#428)
Signed-off-by: ashors1 <[email protected]>
ashors1 authored Dec 10, 2024
1 parent 7a2d427 commit 4830a07
Showing 3 changed files with 43 additions and 2 deletions.
4 changes: 3 additions & 1 deletion docs/user-guide/aligner-algo-header.rst
@@ -1,4 +1,6 @@
.. important::
Before starting this tutorial, be sure to review the :ref:`introduction <nemo-aligner-getting-started>` for tips on setting up your NeMo-Aligner environment.

If you run into any problems, refer to NeMo's `Known Issues page <https://docs.nvidia.com/nemo-framework/user-guide/latest/knownissues.html>`__. The page enumerates known issues and provides suggested workarounds where appropriate.

After completing this tutorial, refer to the :ref:`evaluation documentation <nemo-aligner-eval>` for tips on evaluating a trained model.
39 changes: 39 additions & 0 deletions docs/user-guide/evaluation.rst
@@ -0,0 +1,39 @@
.. include:: /content/nemo.rsts

.. _nemo-aligner-eval:

Evaluate a Trained Model
@@@@@@@@@@@@@@@@@@@@@@@@

After training a model, you may want to run evaluation to understand how the model performs on unseen tasks. You can use Eleuther AI's `Language Model Evaluation Harness <https://github.com/EleutherAI/lm-evaluation-harness>`_
to quickly run a variety of popular benchmarks, including MMLU, SuperGLUE, HellaSwag, and WinoGrande.
A full list of supported tasks can be found `here <https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/README.md>`_.
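Benchmarks such as HellaSwag and WinoGrande are multiple-choice: the harness scores every candidate completion under the model and takes the highest-scoring one as the prediction. The following is a minimal sketch of that selection logic only, with a toy word-overlap scorer standing in for a real model (``toy_loglikelihood`` and ``pick_ending`` are illustrative names, not harness APIs):

```python
# Toy sketch of multiple-choice scoring: each candidate ending gets a
# score and the argmax is the prediction. The scorer below is a
# deliberately crude stand-in for a model's log-likelihood, NOT the
# lm-evaluation-harness API.

def toy_loglikelihood(context: str, continuation: str) -> float:
    """Stand-in scorer: favors continuations that reuse context words."""
    ctx_words = set(context.lower().split())
    cont_words = continuation.lower().split()
    overlap = sum(1 for w in cont_words if w in ctx_words)
    return overlap - 0.1 * len(cont_words)  # mild length penalty

def pick_ending(context: str, endings: list[str]) -> int:
    """Return the index of the highest-scoring candidate ending."""
    scores = [toy_loglikelihood(context, e) for e in endings]
    return max(range(len(endings)), key=scores.__getitem__)

context = "The chef cracked the eggs into the bowl and"
endings = [
    "cracked more eggs into the bowl",
    "drove the car to the airport",
]
print(pick_ending(context, endings))  # → 0
```

A real harness run replaces the toy scorer with per-token log-likelihoods from the model, but the argmax-over-candidates structure is the same.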

Install the LM Evaluation Harness
#################################

Run the following commands inside a NeMo container to install the LM Evaluation Harness:

.. code-block:: bash

   git clone --depth 1 https://github.com/EleutherAI/lm-evaluation-harness
   cd lm-evaluation-harness
   pip install -e .

Run Evaluations
###############

A detailed description of running evaluation with ``.nemo`` models can be found in Eleuther AI's `documentation <https://github.com/EleutherAI/lm-evaluation-harness?tab=readme-ov-file#nvidia-nemo-models>`_.
Single- and multi-GPU evaluation is supported. The following is an example of running evaluation using 8 GPUs on the ``hellaswag``, ``super_glue``, and ``winogrande`` tasks using a ``.nemo`` file from NeMo-Aligner.
Note that unzipping your ``.nemo`` file before running evaluations is recommended, but not required.

.. code-block:: bash

   mkdir unzipped_checkpoint
   tar -xvf /path/to/model.nemo -C unzipped_checkpoint
   torchrun --nproc-per-node=8 --no-python lm_eval --model nemo_lm \
       --model_args path='unzipped_checkpoint',devices=8,tensor_model_parallel_size=8 \
       --tasks hellaswag,super-glue-lm-eval-v1,winogrande \
       --batch_size 8
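The harness can also write per-task metrics to disk as JSON (via its ``--output_path`` option). The exact schema varies between harness versions, so the snippet below is only a hedged sketch over an illustrative payload: the ``results`` layout, the ``acc*`` metric keys, and the ``summarize`` helper are assumptions for demonstration, not the harness's documented format.

```python
# Hedged sketch: pull one "best accuracy-style" number per task out of a
# results payload for quick comparison across runs. The "results" layout
# and "acc*" key names are illustrative assumptions; inspect the JSON
# your harness version actually produces before relying on them.
sample = {
    "results": {
        "hellaswag": {"acc,none": 0.62, "acc_norm,none": 0.79},
        "winogrande": {"acc,none": 0.71},
    }
}

def summarize(payload: dict) -> dict:
    """Return {task: best accuracy-like metric} from a results payload."""
    best = {}
    for task, metrics in payload["results"].items():
        accs = [v for key, v in metrics.items() if key.startswith("acc")]
        best[task] = max(accs)
    return best

print(summarize(sample))  # → {'hellaswag': 0.79, 'winogrande': 0.71}
```

A helper like this makes it easy to diff checkpoints trained with different NeMo-Aligner configurations on the same task suite.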
2 changes: 1 addition & 1 deletion examples/nlp/data/sft/remove_long_dialogues.py
@@ -25,7 +25,7 @@
Usage:
python3 remove_long_dialogues.py \
--tokenizer_path <PATH TO TOKENIZER MODEL> \
--tokenizer_type sentencepiece
--tokenizer_type sentencepiece \
--dataset_file <PATH TO DATASET TO PREPROCESS> \
--output_file <WHERE TO SAVE PREPROCESSED DATASET> \
--seq_len <MAX_SEQ_LEN TO USE DURING TRAINING>
