[WIP] Training: Initial Documentation for Kubeflow Trainer V2 #3958

Open · wants to merge 19 commits into base: master
Changes from 13 commits
4 changes: 2 additions & 2 deletions content/en/_index.html
@@ -111,7 +111,7 @@ <h5 class="card-title text-white section-head">AutoML</h5>
</div>
</div>
<div class="card border-primary-dark">
<a href="/docs/components/training/overview/" target="_blank" rel="noopener" >
<a href="/docs/components/trainer/overview/" target="_blank" rel="noopener" >
<img
src="/docs/images/logos/tensorflow-pytorch.png"
class="card-img-top"
@@ -123,7 +123,7 @@ <h5 class="card-title text-white section-head">AutoML</h5>
<div class="card-body bg-primary-dark">
<h5 class="card-title text-white section-head">Model Training</h5>
<p class="card-text text-white">
<a href="/docs/components/training/overview/" target="_blank" rel="noopener" >Kubeflow Training Operator</a> is a unified interface for model training and fine-tuning on Kubernetes.
<a href="/docs/components/trainer/overview/" target="_blank" rel="noopener" >Kubeflow Trainer</a> is a unified interface for model training and LLM fine-tuning on Kubernetes.
It runs scalable and distributed training jobs for popular frameworks including PyTorch, TensorFlow, MPI, MXNet, PaddlePaddle, and XGBoost.
</p>
</div>
20 changes: 19 additions & 1 deletion content/en/_redirects
@@ -337,4 +337,22 @@ docs/started/requirements/ /docs/started/getting-started/
/docs/components/pipelines/v2/reference/api/kubeflow-pipeline-api-spec/ /docs/components/pipelines/reference/api/kubeflow-pipeline-api-spec/
/docs/components/pipelines/v2/reference/sdk/ /docs/components/pipelines/reference/sdk/
/docs/components/pipelines/v2/run-a-pipeline/ /docs/components/pipelines/user-guides/core-functions/run-a-pipeline/
/docs/components/pipelines/v2/version-compatibility/ /docs/components/pipelines/reference/version-compatibility/

# Kubeflow Trainer V2 (https://github.com/kubeflow/training-operator/issues/2214)
/docs/components/training/installation/ /docs/components/trainer/legacy-v1/installation/
/docs/components/training/explanation/ /docs/components/trainer/legacy-v1/explanation/
/docs/components/training/explanation/fine-tuning/ /docs/components/trainer/legacy-v1/explanation/fine-tuning/
/docs/components/training/reference/ /docs/components/trainer/legacy-v1/reference/
/docs/components/training/reference/architecture/ /docs/components/trainer/legacy-v1/reference/architecture/
/docs/components/training/reference/distributed-training/ /docs/components/trainer/legacy-v1/reference/distributed-training/
/docs/components/training/reference/fine-tuning/ /docs/components/trainer/legacy-v1/reference/fine-tuning/
/docs/components/training/user-guides/ /docs/components/trainer/legacy-v1/user-guides/
/docs/components/training/user-guides/fine-tuning/ /docs/components/trainer/legacy-v1/user-guides/fine-tuning/
/docs/components/training/user-guides/jax/ /docs/components/trainer/legacy-v1/user-guides/jax/
/docs/components/training/user-guides/job-scheduling/ /docs/components/trainer/legacy-v1/user-guides/job-scheduling/
/docs/components/training/user-guides/mpi/ /docs/components/trainer/legacy-v1/user-guides/mpi/
/docs/components/training/user-guides/paddle/ /docs/components/trainer/legacy-v1/user-guides/paddle/
/docs/components/training/user-guides/prometheus/ /docs/components/trainer/legacy-v1/user-guides/prometheus/
/docs/components/training/user-guides/tensorflow/ /docs/components/trainer/legacy-v1/user-guides/tensorflow/
/docs/components/training/user-guides/xgboost/ /docs/components/trainer/legacy-v1/user-guides/xgboost/
@@ -121,8 +121,8 @@ trialSpec:
"sidecar.istio.io/inject": "false"
```

If you use `PyTorchJob` or other Training Operator jobs in your Trial template check
[here](/docs/components/training/user-guides/tensorflow/#what-is-tfjob) how to set the annotation.
If you use `PyTorchJob` or other Training Operator jobs in your Trial template, check
[here](/docs/components/trainer/legacy-v1/user-guides/tensorflow/#what-is-tfjob) how to set the annotation.

## Running the Experiment

@@ -16,13 +16,13 @@ In Katib examples, you can find the following examples for Trial's Workers:

- [Kubernetes `Job`](https://kubernetes.io/docs/concepts/workloads/controllers/job/)

- [Kubeflow `TFJob`](/docs/components/training/user-guides/tensorflow)
- [Kubeflow `TFJob`](/docs/components/trainer/legacy-v1/user-guides/tensorflow)

- [Kubeflow `PyTorchJob`](/docs/components/training/user-guides/pytorch/)
- [Kubeflow `PyTorchJob`](/docs/components/trainer/legacy-v1/user-guides/pytorch/)

- [Kubeflow `XGBoostJob`](/docs/components/training/user-guides/xgboost)
- [Kubeflow `XGBoostJob`](/docs/components/trainer/legacy-v1/user-guides/xgboost)

- [Kubeflow `MPIJob`](/docs/components/training/user-guides/mpi)
- [Kubeflow `MPIJob`](/docs/components/trainer/legacy-v1/user-guides/mpi)

- [Tekton `Pipelines`](https://github.com/kubeflow/katib/tree/master/examples/v1beta1/tekton)

5 changes: 5 additions & 0 deletions content/en/docs/components/trainer/_index.md
@@ -0,0 +1,5 @@
+++
title = "Kubeflow Trainer"
description = "Documentation for Kubeflow Trainer"
weight = 20
+++
@@ -0,0 +1,7 @@
+++
title = "Contributor Guides"
description = "Documentation for Kubeflow Trainer contributors"
weight = 60
+++

This doc is in progress...
@@ -0,0 +1,5 @@
+++
title = "Community Guide"
Contributor:

This is under discussion, given that other components do not have this content on the website. We want to ensure this is consistent across the website.

Contributor:

I created an issue (#3971) to reflect the conversation we had and made a few updates; feel free to make any suggestions. The main idea is to not have individual pages for each project on the website, but to keep one centralized place on the website with links to the Git repos.

description = "How to get involved to Kubeflow Trainer community"
weight = 20
+++
@@ -0,0 +1,7 @@
+++
Contributor:

This is under discussion, given that other components do not have this content on the website. We want to ensure this is consistent across the website.

title = "Contributing Guide"
description = "How to contribute to Kubeflow Trainer project"
weight = 10
+++

This doc is in progress...
29 changes: 29 additions & 0 deletions content/en/docs/components/trainer/getting-started.md
@@ -0,0 +1,29 @@
+++
title = "Getting Started"
description = "Get Started with Kubeflow Trainer"
weight = 30
+++

This guide describes how to get started with Kubeflow Trainer and run distributed training
with PyTorch.

## Prerequisites

Ensure that you have access to a Kubernetes cluster with Kubeflow Trainer
control plane installed. If it is not set up yet, followÍ
Contributor:

Suggested change:
- control plane installed. If it is not set up yet, followÍ
+ control plane installed. If it is not set up yet, follow

[the installation guide](/docs/components/trainer/operator-guides/installation) to quickly deploy
Kubeflow Trainer on your local Kind cluster.
Contributor @astefanutti (Jan 21, 2025):

Suggested change:
- Kubeflow Trainer on your local Kind cluster.
+ Kubeflow Trainer.

It may be better to just say "quickly deploy Kubeflow Trainer". The provided link gives the specifics.


### Installing the Kubeflow Python SDK

Install the latest Kubeflow Python SDK version directly from the source repository:

```bash
pip install git+https://github.com/kubeflow/training-operator.git@master#subdirectory=sdk_v2
```

TODO (andreyvelich): Add command once we release SDK to PyPI: https://pypi.org/project/kubeflow

## Getting Started with PyTorch

TODO (andreyvelich): Add example from the Notebook
Contributor:

What about "This doc is in progress", or just remove the section until it is fully ready?

Member Author:
I will add the getting started example once we finish this PR with @astefanutti: kubeflow/training-operator#2387

12 changes: 12 additions & 0 deletions content/en/docs/components/trainer/legacy-v1/_index.md
@@ -0,0 +1,12 @@
+++
title = "Legacy Kubeflow Training Operator (v1)"
description = "Kubeflow Training Operator V1 Documentation"
weight = 999
+++

{{% alert title="Old Version" color="warning" %}}
This page is about **Kubeflow Training Operator V1**. For the latest information, check
[the Kubeflow Trainer V2 documentation](/docs/components/trainer).

Follow [this guide for migrating to Kubeflow Trainer V2](/docs/components/trainer/operator-guides/migration)
Contributor:

My two cents: given that the component's name changed, should we say V2, or just Kubeflow Trainer?

Member Author:

Maybe we could just say: "Follow this guide for migrating to the new Kubeflow Trainer project."
WDYT @varodrig @kubeflow/wg-training-leads?

{{% /alert %}}
@@ -10,7 +10,7 @@ share your experience using the [#kubeflow-training Slack channel](/docs/about/c
or [Kubeflow Training Operator GitHub](https://github.com/kubeflow/training-operator/issues/new).
{{% /alert %}}

This page explains how the [Training Operator fine-tuning API](/docs/components/training/user-guides/fine-tuning)
This page explains how the [Training Operator fine-tuning API](/docs/components/trainer/legacy-v1/user-guides/fine-tuning)
fits into the Kubeflow ecosystem.

In the rapidly evolving landscape of machine learning (ML) and artificial intelligence (AI),
@@ -60,4 +60,4 @@ Different user personas can benefit from this feature:

## Next Steps

- Understand [the architecture behind `train` API](/docs/components/training/reference/fine-tuning).
- Understand [the architecture behind `train` API](/docs/components/trainer/legacy-v1/reference/fine-tuning).
@@ -10,8 +10,8 @@ This guide describes how to get started with the Training Operator and run a few

You need to install the following components to run examples:

- The Training Operator control plane [installed](/docs/components/training/installation/#installing-the-control-plane).
- The Training Python SDK [installed](/docs/components/training/installation/#installing-the-python-sdk).
- The Training Operator control plane [installed](/docs/components/trainer/legacy-v1/installation/#installing-the-control-plane).
- The Training Python SDK [installed](/docs/components/trainer/legacy-v1/installation/#installing-the-python-sdk).

## Getting Started with PyTorchJob

@@ -153,6 +153,6 @@ TrainingClient().get_job_logs(
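
For orientation while the full example is collapsed above, here is a minimal sketch of the flow this page builds up to — names and parameters are assumed from the V1 `kubeflow-training` SDK, not taken verbatim from this diff:

```python
from kubeflow.training import TrainingClient

def train_func():
    # Placeholder training function; the real guide trains a PyTorch model here.
    import torch
    print(f"PyTorch version: {torch.__version__}")

client = TrainingClient()  # assumes kubeconfig access to the cluster

# Package the function as a PyTorchJob and scale it across two workers.
client.create_job(name="pytorch-demo", train_func=train_func, num_workers=2)

# Fetch the logs from the job's pods (the call collapsed in the hunk above).
client.get_job_logs(name="pytorch-demo")
```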

## Next steps

- Run the [FashionMNIST example](https://github.com/kubeflow/training-operator/blob/7345e33b333ba5084127efe027774dd7bed8f6e6/examples/pytorch/image-classification/Train-CNN-with-FashionMNIST.ipynb) using the Training Operator Python SDK.
- Run the [FashionMNIST example](https://github.com/kubeflow/training-operator/blob/release-1.9/examples/pytorch/image-classification/Train-CNN-with-FashionMNIST.ipynb) using the Training Operator Python SDK.

- Learn more about [the PyTorchJob APIs](/docs/components/training/user-guides/pytorch/).
- Learn more about [the PyTorchJob APIs](/docs/components/trainer/legacy-v1/user-guides/pytorch/).
Contributor:

In "Installing the control plane", where it says "if you have already installed Kubeflow platform", the link goes to the latest version of Kubeflow. Should we replace the link with the 1.9 release?
https://v1-9-branch.kubeflow.org/docs/started/installing-kubeflow/

Member Author:
Actually, the latest version of Kubeflow Platform 1.10 will also include Training Operator v1.

Contributor:

In the Next Steps section, the link "Run your first Training Operator Job by following the Getting Started guide." points to the latest Getting Started guide instead of v1.

@@ -12,8 +12,8 @@ appropriate Kubernetes workloads to perform distributed ML training and fine-tuning.

These are the minimal requirements to install the Training Operator:

- Kubernetes >= 1.27
- `kubectl` >= 1.27
- Kubernetes >= 1.28
- `kubectl` >= 1.28
- Python >= 3.7

## Installing the Training Operator
@@ -65,7 +65,7 @@ xgboostjobs.kubeflow.org 2023-06-09T00:31:04Z
### Installing the Python SDK

The Training Operator [implements a Python SDK](https://pypi.org/project/kubeflow-training/)
to simplify creation of distributed training and fine-tuning jobs for Data Scientists.
to simplify creation of distributed training and fine-tuning jobs.

Run the following command to install the latest stable release of the Training SDK:

@@ -96,4 +96,4 @@ pip install -U "kubeflow-training[huggingface]"

## Next steps

Run your first Training Operator Job by following the [Getting Started guide](/docs/components/training/getting-started/).
Run your first Training Operator Job by following the [Getting Started guide](/docs/components/trainer/legacy-v1/getting-started/).
@@ -24,9 +24,9 @@ The Training Operator implements a centralized Kubernetes controller to orchestrate distributed training jobs.
You can run high-performance computing (HPC) tasks with the Training Operator and MPIJob since it
supports running Message Passing Interface (MPI) on Kubernetes, which is heavily used for HPC.
The Training Operator implements the V1 API version of MPI Operator. For the MPI Operator V2 version,
please follow [this guide](/docs/components/training/user-guides/mpi/) to install MPI Operator V2.
please follow [this guide](/docs/components/trainer/legacy-v1/user-guides/mpi/) to install MPI Operator V2.

<img src="/docs/components/training/images/training-operator-overview.drawio.svg"
<img src="/docs/components/trainer/legacy-v1/images/training-operator-overview.drawio.svg"
alt="Training Operator Overview"
class="mt-3 mb-3">

@@ -38,7 +38,7 @@ various distributed training strategies for different ML frameworks.
The Training Operator addresses the Model Training and Model Fine-Tuning steps in the AI/ML
lifecycle, as shown in the diagram below:

<img src="/docs/components/training/images/ml-lifecycle-training-operator.drawio.svg"
<img src="/docs/components/trainer/legacy-v1/images/ml-lifecycle-training-operator.drawio.svg"
alt="AI/ML Lifecycle Training Operator"
class="mt-3 mb-3">

@@ -49,7 +49,7 @@ Kubernetes cluster using APIs and interfaces provided by Training Operator.

- **The Training Operator is extensible and portable.**

You can deploy Training Operator on any cloud where you have Kubernetes cluster and you can
You can deploy the Training Operator on any cloud where you have Kubernetes cluster and you can
integrate your own ML frameworks written in any programming language with the Training Operator.

- **The Training Operator is integrated with the Kubernetes ecosystem.**
@@ -63,17 +63,17 @@ To perform distributed training, the Training Operator implements the following
[Custom Resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
for each ML framework:

| ML Framework | Custom Resource |
| ------------ | ------------------------------------------------------------ |
| PyTorch | [PyTorchJob](/docs/components/training/user-guides/pytorch/) |
| TensorFlow | [TFJob](/docs/components/training/user-guides/tensorflow/) |
| XGBoost | [XGBoostJob](/docs/components/training/user-guides/xgboost/) |
| MPI | [MPIJob](/docs/components/training/user-guides/mpi/) |
| PaddlePaddle | [PaddleJob](/docs/components/training/user-guides/paddle/) |
| JAX | [JAXJob](/docs/components/training/user-guides/jax/) |
| ML Framework | Custom Resource |
| ------------ | --------------------------------------------------------------------- |
| PyTorch | [PyTorchJob](/docs/components/trainer/legacy-v1/user-guides/pytorch/) |
| TensorFlow | [TFJob](/docs/components/trainer/legacy-v1/user-guides/tensorflow/) |
| XGBoost | [XGBoostJob](/docs/components/trainer/legacy-v1/user-guides/xgboost/) |
| MPI | [MPIJob](/docs/components/trainer/legacy-v1/user-guides/mpi/) |
| PaddlePaddle | [PaddleJob](/docs/components/trainer/legacy-v1/user-guides/paddle/) |
| JAX | [JAXJob](/docs/components/trainer/legacy-v1/user-guides/jax/) |
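
As a rough sketch of how these resources are reached from the Python SDK — parameter names assumed from the V1 `kubeflow-training` SDK, not stated on this page — one client covers all of the job kinds listed above:

```python
from kubeflow.training import TrainingClient

# Sketch: `job_kind` selects which custom resource the client manages;
# PyTorchJob is the SDK's default.
client = TrainingClient(job_kind="TFJob")
print(client.list_jobs())  # TFJobs in the current namespace
```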

## Next steps

- Follow [the installation guide](/docs/components/training/installation/) to deploy the Training Operator.
- Follow [the installation guide](/docs/components/trainer/legacy-v1/installation/) to deploy the Training Operator.

- Run examples from [getting started guide](/docs/components/training/getting-started/).
- Run examples from [getting started guide](/docs/components/trainer/legacy-v1/getting-started/).
@@ -18,14 +18,15 @@ The dedicated "Backend" operator was not implemented and was instead
consolidated into the "Frontend" operator.

The benefits of this approach were:

1. Shared testing and release infrastructure
2. Unlocked production grade features like manifests and metadata support
3. Simpler Kubeflow releases
4. A Single Source of Truth (SSOT) for other Kubeflow components to interact with

The V1 Training Operator architecture diagram can be seen in the diagram below:

<img src="/docs/components/training/images/training-operator-v1-architecture.drawio.svg"
<img src="/docs/components/trainer/legacy-v1/images/training-operator-v1-architecture.drawio.svg"
alt="Training Operator V1 Architecture"
class="mt-3 mb-3">

@@ -11,7 +11,7 @@ This page shows different distributed strategies that can be used by the Training Operator.
This diagram shows how the Training Operator creates PyTorch workers for the
[ring all-reduce algorithm](https://tech.preferred.jp/en/blog/technologies-behind-distributed-deep-learning-allreduce/).

<img src="/docs/components/training/images/distributed-pytorchjob.drawio.svg"
<img src="/docs/components/trainer/legacy-v1/images/distributed-pytorchjob.drawio.svg"
alt="Distributed PyTorchJob"
class="mt-3 mb-3">

@@ -34,7 +34,7 @@ the appropriate environment variables for `torchrun`.
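
For reference, each PyTorchJob replica receives the standard PyTorch distributed environment, which `torchrun` or plain training code reads roughly as in this sketch (variable names follow upstream PyTorch conventions; the snippet is illustrative, not from the page):

```python
import os

# Injected into every PyTorchJob replica so the workers can form the ring.
rank = int(os.environ["RANK"])
world_size = int(os.environ["WORLD_SIZE"])
master_addr = os.environ["MASTER_ADDR"]
master_port = os.environ["MASTER_PORT"]
print(f"worker {rank}/{world_size} rendezvous at {master_addr}:{master_port}")
```
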
This diagram shows how the Training Operator creates the TensorFlow parameter server (PS) and workers for
[PS distributed training](https://www.tensorflow.org/tutorials/distribute/parameter_server_training).

<img src="/docs/components/training/images/distributed-tfjob.drawio.svg"
<img src="/docs/components/trainer/legacy-v1/images/distributed-tfjob.drawio.svg"
alt="Distributed TFJob"
class="mt-3 mb-3">

@@ -5,13 +5,13 @@ weight = 10
+++

This page shows how the Training Operator implements the
[API to fine-tune LLMs](/docs/components/training/user-guides/fine-tuning).
[API to fine-tune LLMs](/docs/components/trainer/legacy-v1/user-guides/fine-tuning).

## Architecture

In the following diagram you can see how the `train` Python API works:

<img src="/docs/components/training/images/fine-tune-llm-api.drawio.svg"
<img src="/docs/components/trainer/legacy-v1/images/fine-tune-llm-api.drawio.svg"
Contributor:

It'd be great if we can update the links to the Kubernetes documentation to match the Kubernetes version supported by the legacy V1, e.g. https://v1-28.docs.kubernetes.io/docs/concepts/storage/persistent-volumes/ for the ReadOnlyMany access mode.

Member Author:

Do we really need to do it here? I think Access Mode has been a stable feature in Kubernetes since v1.18, so we don't expect any changes to this section. And since future releases of the Training Operator (e.g. v1.9.2) may support newer versions of Kubernetes, it would be hard to keep these links updated.

alt="Fine-Tune API for LLMs"
class="mt-3 mb-3">

@@ -10,15 +10,15 @@ share your experience using the [#kubeflow-training Slack channel](https://cloud
or the [Kubeflow Training Operator GitHub](https://github.com/kubeflow/training-operator/issues/new).
{{% /alert %}}

This page describes how to use a [`train` API from the Training Python SDK](https://github.com/kubeflow/training-operator/blob/6ce4d57d699a76c3d043917bd0902c931f14080f/sdk/python/kubeflow/training/api/training_client.py#L112)
This page describes how to use a [`train` API from the Training Python SDK](https://github.com/kubeflow/training-operator/blob/release-1.9/sdk/python/kubeflow/training/api/training_client.py#L95)
that simplifies the ability to fine-tune LLMs with distributed PyTorchJob workers.

If you want to learn more about how the fine-tuning API fits in the Kubeflow ecosystem, head to
the [explanation guide](/docs/components/training/explanation/fine-tuning).
the [explanation guide](/docs/components/trainer/legacy-v1/explanation/fine-tuning).

## Prerequisites

You need to install the Training Python SDK [with fine-tuning support](/docs/components/training/installation/#install-the-python-sdk-with-fine-tuning-capabilities)
You need to install the Training Python SDK [with fine-tuning support](/docs/components/trainer/legacy-v1/installation/#install-the-python-sdk-with-fine-tuning-capabilities)
to run this API.

## How to use the Fine-Tuning API?
Expand Down Expand Up @@ -92,6 +92,7 @@ to fine-tune the LLM.
Platform engineers can customize the storage initializer and trainer images by setting the `STORAGE_INITIALIZER_IMAGE` and `TRAINER_TRANSFORMER_IMAGE` environment variables before executing the `train` command.

For example, in your Python code, set the env vars before executing `train`:

```python
...
os.environ['STORAGE_INITIALIZER_IMAGE'] = 'docker.io/<username>/<custom-storage-initializer_image>'
os.environ['TRAINER_TRANSFORMER_IMAGE'] = 'docker.io/<username>/<custom-trainer_transformer_image>'
...
```

@@ -102,9 +102,9 @@ TrainingClient().train(...)
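
For context, the collapsed `train(...)` call typically looks something like the sketch below; the HuggingFace provider classes and parameter names are assumptions drawn from the V1 SDK's fine-tuning extras, not from this diff:

```python
import transformers
from kubeflow.training import TrainingClient
from kubeflow.storage_initializer.hugging_face import (
    HuggingFaceModelParams,
    HuggingFaceDatasetParams,
    HuggingFaceTrainerParams,
)

# Sketch: fine-tune a small HuggingFace model across two PyTorchJob workers.
TrainingClient().train(
    name="fine-tune-demo",
    num_workers=2,
    num_procs_per_worker=1,
    model_provider_parameters=HuggingFaceModelParams(
        model_uri="hf://TinyLlama/TinyLlama-1.1B-Chat-v1.0",
        transformer_type=transformers.AutoModelForCausalLM,
    ),
    dataset_provider_parameters=HuggingFaceDatasetParams(
        repo_id="imdb",  # any HuggingFace dataset repo id
    ),
    trainer_parameters=HuggingFaceTrainerParams(
        training_parameters=transformers.TrainingArguments(output_dir="/mnt/output"),
    ),
    resources_per_worker={"gpu": 1, "cpu": 4, "memory": "16Gi"},
)
```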

## Next Steps

- Run the example to [fine-tune the TinyLlama LLM](https://github.com/kubeflow/training-operator/blob/6ce4d57d699a76c3d043917bd0902c931f14080f/examples/pytorch/language-modeling/train_api_hf_dataset.ipynb)
- Run the example to [fine-tune the TinyLlama LLM](https://github.com/kubeflow/training-operator/blob/release-1.9/examples/pytorch/language-modeling/train_api_hf_dataset.ipynb)

- Check this example to compare the `create_job` and the `train` Python API for
[fine-tuning BERT LLM](https://github.com/kubeflow/training-operator/blob/6ce4d57d699a76c3d043917bd0902c931f14080f/examples/pytorch/text-classification/Fine-Tune-BERT-LLM.ipynb).
[fine-tuning BERT LLM](https://github.com/kubeflow/training-operator/blob/release-1.9/examples/pytorch/text-classification/Fine-Tune-BERT-LLM.ipynb).

- Understand [the architecture behind `train` API](/docs/components/training/reference/fine-tuning).
- Understand [the architecture behind `train` API](/docs/components/trainer/legacy-v1/reference/fine-tuning).