diff --git a/libs/ai-endpoints/docs/chat/nvidia_ai_endpoints.ipynb b/libs/ai-endpoints/docs/chat/nvidia_ai_endpoints.ipynb
index 666c80b9..dc2236a9 100644
--- a/libs/ai-endpoints/docs/chat/nvidia_ai_endpoints.ipynb
+++ b/libs/ai-endpoints/docs/chat/nvidia_ai_endpoints.ipynb
@@ -444,92 +444,6 @@
     "llm.invoke(f'What\\'s in this image?\\n')"
    ]
   },
-  {
-   "cell_type": "markdown",
-   "id": "3e61d868",
-   "metadata": {},
-   "source": [
-    "#### **Advanced Use Case:** Forcing Payload \n",
-    "\n",
-    "You may notice that some newer models may have strong parameter expectations that the LangChain connector may not support by default. For example, we cannot invoke the [Kosmos](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/kosmos-2) model at the time of this notebook's latest release due to the lack of a streaming argument on the server side: "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "d143e0d6",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from langchain_nvidia_ai_endpoints import ChatNVIDIA\n",
-    "\n",
-    "kosmos = ChatNVIDIA(model=\"microsoft/kosmos-2\")\n",
-    "\n",
-    "from langchain_core.messages import HumanMessage\n",
-    "\n",
-    "# kosmos.invoke(\n",
-    "#     [\n",
-    "#         HumanMessage(\n",
-    "#             content=[\n",
-    "#                 {\"type\": \"text\", \"text\": \"Describe this image:\"},\n",
-    "#                 {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n",
-    "#             ]\n",
-    "#         )\n",
-    "#     ]\n",
-    "# )\n",
-    "\n",
-    "# Exception: [422] Unprocessable Entity\n",
-    "# body -> stream\n",
-    "# Extra inputs are not permitted (type=extra_forbidden)\n",
-    "# RequestID: 35538c9a-4b45-4616-8b75-7ef816fccf38"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "1e230b70",
-   "metadata": {},
-   "source": [
-    "For a simple use case like this, we can actually try to force the payload argument of our underlying client by specifying the `payload_fn` function as follows: "
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "id": "0925b2b1",
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "def drop_streaming_key(d):\n",
-    "    \"\"\"Takes in payload dictionary, outputs new payload dictionary\"\"\"\n",
-    "    if \"stream\" in d:\n",
-    "        d.pop(\"stream\")\n",
-    "    return d\n",
-    "\n",
-    "\n",
-    "## Override the payload passthrough. Default is to pass through the payload as is.\n",
-    "kosmos = ChatNVIDIA(model=\"microsoft/kosmos-2\")\n",
-    "kosmos.client.payload_fn = drop_streaming_key\n",
-    "\n",
-    "kosmos.invoke(\n",
-    "    [\n",
-    "        HumanMessage(\n",
-    "            content=[\n",
-    "                {\"type\": \"text\", \"text\": \"Describe this image:\"},\n",
-    "                {\"type\": \"image_url\", \"image_url\": {\"url\": image_url}},\n",
-    "            ]\n",
-    "        )\n",
-    "    ]\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "id": "fe6e1758",
-   "metadata": {},
-   "source": [
-    "For more advanced or custom use-cases (i.e. supporting the diffusion models), you may be interested in leveraging the `NVEModel` client as a requests backbone. The `NVIDIAEmbeddings` class is a good source of inspiration for this. "
-   ]
-  },
   {
    "cell_type": "markdown",
    "id": "137662a6",