
GA contracts for Batch Endpoints (#2779)
* fix: updating deployments schemas

* fix: GA contracts

* black

* undo

* metadata
santiagxf authored Nov 14, 2023
1 parent 75ce9f4 commit 0ca0ed8
Showing 7 changed files with 113 additions and 106 deletions.

3 files were deleted in this commit (contents not shown).

@@ -395,6 +395,15 @@
"We create the associated deployment. Take a look at how the `environment_variables` section is created."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"openai_api_base = \"https://<deployment>.openai.azure.com/\""
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -437,8 +446,8 @@
" error_threshold=-1,\n",
" environment_variables={\n",
" \"OPENAI_API_TYPE\": \"azure_ad\",\n",
" \"OPENAI_API_BASE\": \"https://<deployment>.openai.azure.com/\",\n",
" \"OPENAI_API_VERSION\": \"2023-03-15-preview\",\n",
" \"OPENAI_API_BASE\": openai_api_base,\n",
" },\n",
" ),\n",
")"
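As a companion sketch (plain Python, not part of the notebook): the `environment_variables` mapping above is surfaced to the scoring script as ordinary process environment variables, so the script can read the values back with `os.environ`. The fallback values below simply mirror the deployment cell; `<deployment>` is the notebook's placeholder, not a real host.

```python
import os

# Read back the variables the batch deployment injects at scoring time.
# The defaults mirror the environment_variables mapping defined above.
openai_api_type = os.environ.get("OPENAI_API_TYPE", "azure_ad")
openai_api_base = os.environ.get(
    "OPENAI_API_BASE", "https://<deployment>.openai.azure.com/"
)

print(openai_api_type, openai_api_base)
```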
@@ -734,6 +743,25 @@
"ml_client.jobs.download(name=scoring_job.name, download_path=\".\", output_name=\"score\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The output predictions will look like the following:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"embeddings = pd.read_json(\"named-outputs/score/embeddings.jsonl\", lines=True)\n",
"embeddings"
]
},
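For context, `embeddings.jsonl` is a JSON Lines file: one JSON object per line, which is why `lines=True` is passed to `pd.read_json`. A self-contained sketch with a hypothetical two-row sample (the field names are illustrative, not taken from the actual output):

```python
import io

import pandas as pd

# Hypothetical sample mimicking a JSONL layout: one JSON object per line.
sample = io.StringIO(
    '{"file": "doc0.txt", "embedding": [0.1, 0.2]}\n'
    '{"file": "doc1.txt", "embedding": [0.3, 0.4]}\n'
)
embeddings = pd.read_json(sample, lines=True)
print(embeddings.shape)  # (2, 2)
```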
{
"cell_type": "markdown",
"metadata": {},
@@ -542,7 +542,7 @@
"source": [
"### 4.1 Creating the component\n",
"\n",
"Our pipeline is defined in a function. To transform it to a component, you'll use the `build()` method. Pipeline components are reusable compute graphs that can be included in batch deployments or used to compose more complex pipelines."
"Our pipeline is defined in a function. We are going to create a component out of it. Pipeline components are reusable compute graphs that can be included in batch deployments or used to compose more complex pipelines."
]
},
{
@@ -562,7 +562,9 @@
},
"outputs": [],
"source": [
"pipeline_component = uci_heart_classifier_scorer._pipeline_builder.build()"
"pipeline_component = ml_client.components.create_or_update(\n",
" uci_heart_classifier_scorer().component\n",
")"
]
},
{
@@ -805,49 +807,53 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"In pipelines, each step is executed as a child job. You can use `parent_job_name` to find all the child jobs of a given job:"
"Pipelines can define multiple outputs. When defined, use the name of the output to access its results. In our case, the pipeline contains an output called \"scores\":"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"name": "download_outputs"
},
"outputs": [],
"source": [
"pipeline_job_steps = {\n",
" step.properties[\"azureml.moduleName\"]: step\n",
" for step in ml_client.jobs.list(parent_job_name=job.name)\n",
"}"
"ml_client.jobs.download(name=job.name, download_path=\".\", output_name=\"scores\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This dictionary maps each module name to its job, making the steps easier to work with:"
"Read the scored data:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"metadata": {
"name": "read_outputs"
},
"outputs": [],
"source": [
"preprocessing_job = pipeline_job_steps[\"uci_heart_prepare\"]\n",
"score_job = pipeline_job_steps[\"uci_heart_score\"]"
"import pandas as pd\n",
"import glob\n",
"\n",
"output_files = glob.glob(\"named-outputs/scores/*.csv\")\n",
"score = pd.concat((pd.read_csv(f) for f in output_files))\n",
"score"
]
},
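Batch scoring typically writes one output file per processed mini-batch, which is why the cell above globs for several CSV files and concatenates them. The same pattern, sketched with in-memory files and a hypothetical column name:

```python
import io

import pandas as pd

# Two hypothetical partial outputs, as consecutive mini-batches produce.
parts = [
    io.StringIO("prediction\n0\n1\n"),
    io.StringIO("prediction\n1\n"),
]

# Same concat-over-a-generator pattern as the notebook cell.
score = pd.concat((pd.read_csv(p) for p in parts), ignore_index=True)
print(len(score))  # 3
```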
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Confirm the jobs' statuses using the following:"
"Notice that the outputs of the pipeline may differ from the outputs of intermediate steps. In pipelines, each step is executed as a child job. You can use `parent_job_name` to find all the child jobs of a given job:"
]
},
{
@@ -856,43 +862,53 @@
"metadata": {},
"outputs": [],
"source": [
"print(f\"Preprocessing job: {preprocessing_job.status}\")\n",
"print(f\"Scoring job: {score_job.status}\")"
"pipeline_job_steps = {\n",
" step.properties[\"azureml.moduleName\"]: step\n",
" for step in ml_client.jobs.list(parent_job_name=job.name)\n",
"}"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"This dictionary maps each module name to its job, making the steps easier to work with:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "download_outputs"
},
"metadata": {},
"outputs": [],
"source": [
"ml_client.jobs.download(name=score_job.name, download_path=\".\", output_name=\"scores\")"
"preprocessing_job = pipeline_job_steps[\"uci_heart_prepare\"]\n",
"score_job = pipeline_job_steps[\"uci_heart_score\"]"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Read the scored data:"
"Confirm the jobs' statuses using the following:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"name": "read_outputs"
},
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import glob\n",
"\n",
"output_files = glob.glob(\"named-outputs/scores/*.csv\")\n",
"score = pd.concat((pd.read_csv(f) for f in output_files))\n",
"score"
"print(f\"Preprocessing job: {preprocessing_job.status}\")\n",
"print(f\"Scoring job: {score_job.status}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can also access the outputs of each of those intermediate steps as we did for the pipeline job."
]
},
{
@@ -534,7 +534,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Confirm the job status by accessing the child scoring job."
"Confirm the job status by accessing the child job."
]
},
{
@@ -544,7 +544,7 @@
"outputs": [],
"source": [
"scoring_job = list(ml_client.jobs.list(parent_job_name=job.name))[0]\n",
"print(f\"Scoring job: {scoring_job.status}\")"
"print(f\"Child job: {scoring_job.status}\")"
]
},
{
@@ -574,9 +574,9 @@
"name": "amlv2"
},
"kernelspec": {
"display_name": "previews",
"display_name": "Python 3.10 - SDK v2",
"language": "python",
"name": "previews"
"name": "python310-sdkv2"
},
"language_info": {
"codemirror_mode": {
@@ -588,7 +588,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
"version": "3.10.11"
},
"nteract": {
"version": "nteract-front-end@1.0.0"
