diff --git a/libs/ai-endpoints/docs/chat/nvidia_ai_endpoints.ipynb b/libs/ai-endpoints/docs/chat/nvidia_ai_endpoints.ipynb
index 666c80b9..dc2236a9 100644
--- a/libs/ai-endpoints/docs/chat/nvidia_ai_endpoints.ipynb
+++ b/libs/ai-endpoints/docs/chat/nvidia_ai_endpoints.ipynb
@@ -444,92 +444,6 @@
     "llm.invoke(f'What\\'s in this image?\\n<img src=\"{base64_with_mime_type}\" />')"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "3e61d868",
-   "metadata": {},
-   "source": [
-    "#### **Advanced Use Case:** Forcing Payload \n",
-    "\n",
-    "Some newer models have strict parameter expectations that the LangChain connector may not support by default. For example, at the time of this notebook's latest release, the [Kosmos](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/kosmos-2) model cannot be invoked directly because its server rejects the `stream` argument that the connector sends:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "d143e0d6",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
-    "\n",
-    "kosmos = ChatNVIDIA(model=\"microsoft/kosmos-2\")\n",
-    "\n",
-    "from langchain_core.messages import HumanMessage\n",
-    "\n",
-    "# kosmos.invoke(\n",
-    "#     [\n",
-    "#         HumanMessage(\n",
-    "#             content=[\n",
-    "#                 {\"type\": \"text\", \"text\": \"Describe this image:\"},\n",
-    "#                 {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n",
-    "#             ]\n",
-    "#         )\n",
-    "#     ]\n",
-    "# )\n",
-    "\n",
-    "# Exception: [422] Unprocessable Entity\n",
-    "# body -> stream\n",
-    "#   Extra inputs are not permitted (type=extra_forbidden)\n",
-    "# RequestID: 35538c9a-4b45-4616-8b75-7ef816fccf38"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "1e230b70",
-   "metadata": {},
-   "source": [
-    "For a simple case like this, we can modify the payload sent by the underlying client by assigning a function to its `payload_fn` attribute:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "0925b2b1",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def drop_streaming_key(d):\n",
-    "    \"\"\"Take the payload dictionary and return it without the `stream` key.\"\"\"\n",
-    "    # pop with a default is a no-op when the key is absent\n",
-    "    d.pop(\"stream\", None)\n",
-    "    return d\n",
-    "\n",
-    "\n",
-    "# Override the payload passthrough; by default the payload is sent unchanged.\n",
-    "kosmos = ChatNVIDIA(model=\"microsoft/kosmos-2\")\n",
-    "kosmos.client.payload_fn = drop_streaming_key\n",
-    "\n",
-    "kosmos.invoke(\n",
-    "    [\n",
-    "        HumanMessage(\n",
-    "            content=[\n",
-    "                {\"type\": \"text\", \"text\": \"Describe this image:\"},\n",
-    "                {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n",
-    "            ]\n",
-    "        )\n",
-    "    ]\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "fe6e1758",
-   "metadata": {},
-   "source": [
-    "For more advanced or custom use cases (e.g., supporting diffusion models), you may want to use the `NVEModel` client as a requests backbone. The `NVIDIAEmbeddings` class is a good source of inspiration for this."
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "137662a6",