diff --git a/notebooks/multi_modal/labs/image_captioning.ipynb b/notebooks/multi_modal/labs/image_captioning.ipynb index f4200b52..0b26ec89 100644 --- a/notebooks/multi_modal/labs/image_captioning.ipynb +++ b/notebooks/multi_modal/labs/image_captioning.ipynb @@ -13,7 +13,7 @@ "Image captioning models take an image as input and output text. Ideally, we want the model's output to accurately describe the objects and events in the image, similar to a caption a human might provide.
\n", "For example, given an image like the example below, the model is expected to generate a caption such as *\"some people are playing baseball.\"*.\n", "\n", - "
\n", + "
\n", "\n", "In order to generate text, we will build an encoder-decoder model, where the encoder output embedding of an input image, and the decoder output text from the image embedding
\n", "\n", @@ -512,7 +512,6 @@ "* $h_s$ is the sequence of encoder outputs being attended to (the attention \"key\" and \"value\" in transformer terminology).\n", "* $h_t$ is the decoder state attending to the sequence (the attention \"query\" in transformer terminology).\n", "* $c_t$ is the resulting context vector.\n", - "* $a_t$ is the final output combining the \"context\" and \"query\".\n", "\n", "The equations:\n", "\n", diff --git a/notebooks/multi_modal/solutions/image_captioning.ipynb b/notebooks/multi_modal/solutions/image_captioning.ipynb index 3ead05b7..6836088f 100644 --- a/notebooks/multi_modal/solutions/image_captioning.ipynb +++ b/notebooks/multi_modal/solutions/image_captioning.ipynb @@ -668,7 +668,6 @@ "* $h_s$ is the sequence of encoder outputs being attended to (the attention \"key\" and \"value\" in transformer terminology).\n", "* $h_t$ is the decoder state attending to the sequence (the attention \"query\" in transformer terminology).\n", "* $c_t$ is the resulting context vector.\n", - "* $a_t$ is the final output combining the \"context\" and \"query\".\n", "\n", "The equations:\n", "\n",