diff --git a/examples/notebooks/beam-ml/run_inference_vllm.ipynb b/examples/notebooks/beam-ml/run_inference_vllm.ipynb
index e9f1e53a452b..bf56b3b922e1 100644
--- a/examples/notebooks/beam-ml/run_inference_vllm.ipynb
+++ b/examples/notebooks/beam-ml/run_inference_vllm.ipynb
@@ -117,10 +117,35 @@
       "source": [
         "!pip install openai>=1.52.2\n",
         "!pip install vllm>=0.6.3\n",
-        "!pip install apache-beam[gcp]==2.60.0\n",
+        "!pip install apache-beam[gcp]==2.61.0\n",
+        "!pip install nest_asyncio # only needed in Colab\n",
         "!pip check"
       ]
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "## Colab only: allow nested asyncio\n",
+        "\n",
+        "The vLLM model handler logic below uses asyncio to feed records to vLLM. This works only if we are not already inside an asyncio event loop. Most of the time that is the case, but Colab itself already runs inside an event loop. To work around this, apply nest_asyncio so that the nested event loop works in Colab. Do not include this step outside of Colab."
+      ],
+      "metadata": {
+        "id": "3xz8zuA7vcS3"
+      }
+    },
+    {
+      "cell_type": "code",
+      "source": [
+        "# This should not be necessary outside of Colab.\n",
+        "import nest_asyncio\n",
+        "nest_asyncio.apply()\n"
+      ],
+      "metadata": {
+        "id": "sUqjOzw3wpI3"
+      },
+      "execution_count": null,
+      "outputs": []
+    },
     {
       "cell_type": "markdown",
       "source": [
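
For context outside the diff, here is a minimal sketch of the failure mode the new cell works around. It is not part of the patch; the coroutine name `produce_record` is purely illustrative. Without `nest_asyncio.apply()`, the nested `asyncio.run()` call below raises `RuntimeError: asyncio.run() cannot be called from a running event loop`, which is exactly the situation the model handler hits inside Colab's already-running loop.

```python
import asyncio

import nest_asyncio


async def produce_record():
    # Stand-in for the model handler feeding a record to vLLM.
    await asyncio.sleep(0)
    return "record"


async def main():
    # We are now inside a running event loop, as notebook code is in Colab.
    # nest_asyncio patches asyncio so the nested asyncio.run() below
    # succeeds instead of raising a RuntimeError.
    nest_asyncio.apply()
    result = asyncio.run(produce_record())
    print(result)


asyncio.run(main())
```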