Merge branch 'main' into raspawar/base_url_validation
Showing 23 changed files with 2,351 additions and 552 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,352 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "77d79657", | ||
"metadata": {}, | ||
"source": [ | ||
"\n", | ||
"# NVIDIA NIMs with Tool Calling for Agents" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "f8fdde48", | ||
"metadata": { | ||
"jp-MarkdownHeadingCollapsed": true | ||
}, | ||
"source": [ | ||
"This notebook uses an [NVIDIA Llama 3.1 NIM](https://developer.nvidia.com/blog/supercharging-llama-3-1-across-nvidia-platforms/) to demonstrate tool calling, an agent capability used in generative AI solutions. As mentioned in this [Introductory Blog on LLM Agents](https://developer.nvidia.com/blog/introduction-to-llm-agents/), agents can be described as AI systems that use LLMs to reason through a problem, create a plan to solve the problem, execute the plan with the help of a set of tools, and use memory to store meaningful context of the system state. \n", | ||
"\n", | ||
"The notebook introduces just one of the capabilities of agent systems: **tool calling**. \n", | ||
"\n", | ||
"**Tools** are interfaces that accept input, execute an action, and then return the result of that action as structured output according to a pre-defined schema. They often wrap external API calls that let the agent perform tasks beyond the capabilities of the LLM, although a tool does not have to call an external API. For example, a weather tool could return the current weather in San Diego, and a generic web search tool or an ESPN tool could return the current score of the 49ers game. \n", | ||
"\n", | ||
"## What are NVIDIA NIMs and How Do They Support Tool Calling for Agents?\n", | ||
"### What is a NIM?\n", | ||
"NIM supports models across domains such as chat, embedding, and re-ranking, \n", | ||
"from the community as well as NVIDIA. These models are optimized by NVIDIA to deliver the best performance on NVIDIA \n", | ||
"accelerated infrastructure and are deployed as NIMs: easy-to-use, prebuilt containers that deploy anywhere using a single \n", | ||
"command on NVIDIA accelerated infrastructure. If you're new to NIMs with LangChain, check out the [documentation](https://python.langchain.com/v0.2/docs/integrations/providers/nvidia/).\n", | ||
"\n", | ||
"NIMs now support tool calling, also known as \"function calling\", for models that have this capability. \n", | ||
"\n", | ||
"This notebook demonstrates one such model, [Llama 3.1 8b-instruct](https://build.nvidia.com/meta/llama-3_1-8b-instruct). \n", | ||
"\n", | ||
"### What does it mean for NIM to support tool usage?\n", | ||
"To support tool usage in an agent workflow, an LLM must first be trained to detect when a function should be called and to output a structured response, such as JSON, that contains the function to be called and its arguments. \n", | ||
"\n", | ||
"Next, the model is packaged as a NIM, meaning it is optimized to deliver the best performance on NVIDIA accelerated infrastructure and is easy to deploy and use. This microservice packaging also exposes OpenAI-compatible APIs, so developers can build generative AI agents with ease.\n", | ||
"\n", | ||
"Let's see how to use tools in a couple of examples." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "120455e4", | ||
"metadata": {}, | ||
"source": [ | ||
"## 🔨 Tool Usage -- Web Search\n", | ||
"\n", | ||
"Since an LLM does not have access to the most up-to-date information on the Internet, [Tavily Search](https://docs.tavily.com/docs/tavily-api/introduction) acts as a tool to provide a generative AI application with real-time online information. Tavily is a search engine that is optimized for AI developers and AI agents. A single API call abstracts searching, scraping, filtering, and extracting relevant information from online sources. \n", | ||
"\n", | ||
"We'll enhance our NIM, [Llama 3.1-8b-instruct](https://build.nvidia.com/meta/llama-3_1-8b-instruct), with Tavily search. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "1b8f8b6f", | ||
"metadata": {}, | ||
"source": [ | ||
"Install the prerequisites. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "fe4ec61f", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%pip install -U langchain langgraph langchain-nvidia-ai-endpoints langchain-community langchain-openai tavily-python geocoder" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "6c65b376", | ||
"metadata": {}, | ||
"source": [ | ||
"If you're using NVIDIA-hosted NIMs, you'll need an API key, which you can set up below. Follow the [NVIDIA NIMs LangChain documentation](https://python.langchain.com/v0.2/docs/integrations/chat/nvidia_ai_endpoints/) for more information on accessing and using NIMs. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "aaeb35a9", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import getpass\n", | ||
"import os\n", | ||
"\n", | ||
"os.environ[\"NVIDIA_API_KEY\"] = getpass.getpass(\"Enter your NVIDIA API key: \")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e190dc5e", | ||
"metadata": {}, | ||
"source": [ | ||
"Declare your model that supports tool calling. In this example, we use [Llama 3.1-8b-instruct](https://build.nvidia.com/meta/llama-3_1-8b-instruct). " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "579881ca", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_nvidia_ai_endpoints import ChatNVIDIA\n", | ||
"\n", | ||
"llm = ChatNVIDIA(model=\"meta/llama-3.1-8b-instruct\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "9ce17567", | ||
"metadata": {}, | ||
"source": [ | ||
"Initialize the [Tavily tool](https://python.langchain.com/v0.2/docs/integrations/tools/tavily_search/).\n", | ||
"\n", | ||
"Note that this requires an API key. Tavily has a free tier, but if you don't have a key or don't want to create one, you can skip this step or use a different tool. \n", | ||
"\n", | ||
"Once you create your API key, you will need to set it in the environment." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "c8832545-d3c1-404f-afdb-6a00891f84c9", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import getpass\n", | ||
"import os\n", | ||
"\n", | ||
"os.environ[\"TAVILY_API_KEY\"] = getpass.getpass(\"Enter your Tavily API key: \")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "e1d1511d", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_community.tools.tavily_search import TavilySearchResults\n", | ||
"\n", | ||
"# Declare a single tool, Tavily search\n", | ||
"tools = [TavilySearchResults(max_results=1)]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "cd230847", | ||
"metadata": {}, | ||
"source": [ | ||
"Create [ReAct agent](https://python.langchain.com/v0.2/docs/concepts/#react-agents), prebuilt in [LangGraph](https://langchain-ai.github.io/langgraph/#overview). " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "da73ae35", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from langgraph.prebuilt import create_react_agent\n", | ||
"from langchain.callbacks.tracers import ConsoleCallbackHandler\n", | ||
"\n", | ||
"app = create_react_agent(llm, tools)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "be70d7ee", | ||
"metadata": {}, | ||
"source": [ | ||
"Run agent; a callback is passed to provide more verbose output." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "02a109cc", | ||
"metadata": { | ||
"scrolled": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"query = \"What is LangChain?\"\n", | ||
"messages = app.invoke({\"messages\": [(\"human\", query)]}, config={'callbacks': [ConsoleCallbackHandler()]})\n", | ||
"{\n", | ||
" \"input\": query,\n", | ||
" \"output\": messages[\"messages\"][-1].content,\n", | ||
"}" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "b5e9bbb9", | ||
"metadata": {}, | ||
"source": [ | ||
"## 🔨 Tool Usage -- Adding on a Custom Tool\n", | ||
"\n", | ||
"Let's see how to [define a custom tool](https://python.langchain.com/v0.2/docs/how_to/custom_tools/) for your NIM agent and how the agent handles multiple tools. \n", | ||
"\n", | ||
"We'll extend the Tavily-enabled NIM agent with a custom tool that determines the user's current location (based on IP address) and returns a latitude and longitude. The agent will then use Tavily to look up the weather at that location." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "46052285-7331-44c2-a7dc-34ebbe4d6b8c", | ||
"metadata": {}, | ||
"source": [ | ||
"First, let's create a custom tool to determine a user's location based on their IP address. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "e9d8ed5f-b6e9-495f-85ff-e431d39475c4", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import geocoder\n", | ||
"from langchain.tools import tool\n", | ||
"\n", | ||
"@tool\n", | ||
"def get_current_location() -> list:\n", | ||
" \"\"\"Return the current location of the user as [latitude, longitude], based on IP address\"\"\"\n", | ||
" # Look up the location of the machine running this notebook by its public IP\n", | ||
" loc = geocoder.ip('me')\n", | ||
" return loc.latlng" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "089e3223-50f3-4e8e-9043-24c792ca7daf", | ||
"metadata": {}, | ||
"source": [ | ||
"Let's update the tools to use the Tavily tool declared earlier and also add the `get_current_location` tool." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "b71d7d05-d3ec-4005-911c-3e44df8102b4", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Declare two tools: Tavily and custom get_current_location tool.\n", | ||
"tools = [TavilySearchResults(max_results=1), get_current_location]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "cd04f130-3f9b-4a0d-a018-d954dc41ad4b", | ||
"metadata": {}, | ||
"source": [ | ||
"We already declared our LLM, so we don't need to redeclare it. However, we do want to update the agent to have the updated tools." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "64a0eead-ee86-4b0b-8ae3-fb194ea69186", | ||
"metadata": { | ||
"scrolled": true | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langgraph.prebuilt import create_react_agent\n", | ||
"from langchain.globals import set_verbose\n", | ||
"from langchain.callbacks.tracers import ConsoleCallbackHandler\n", | ||
"\n", | ||
"set_verbose(True) # verbose output to follow function calling\n", | ||
"\n", | ||
"query = \"What is the current weather where I am?\"\n", | ||
"app = create_react_agent(llm, tools)\n", | ||
"\n", | ||
"\n", | ||
"messages = app.invoke({\"messages\": [(\"human\", query)]}, config={'callbacks': [ConsoleCallbackHandler()]})\n", | ||
"{\n", | ||
" \"input\": query,\n", | ||
" \"output\": messages[\"messages\"][-1].content,\n", | ||
"}" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "147c4727", | ||
"metadata": {}, | ||
"source": [ | ||
"In order to execute this query, first a tool to get the current location needs to be called. Then a tool to get the current weather at that location needs to be called. \n", | ||
"Finally, the result is returned to the user." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "8ace95bd-f2f7-469e-9d9e-ea7b4c57e8f4", | ||
"metadata": {}, | ||
"source": [ | ||
"Below, you can see a diagram of the application's graph. The agent continues to use tools until the query is resolved." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "128b55cf-5ee3-42d2-897b-173a6d696921", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from IPython.display import Image, display\n", | ||
"\n", | ||
"display(Image(app.get_graph(xray=True).draw_mermaid_png()))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "42ce0ec8-d5bb-4ba8-b2d6-6fe3a0c0aeec", | ||
"metadata": {}, | ||
"source": [ | ||
"## Conclusion\n", | ||
"You've now seen how to use NIMs to do tool calling, an important capability of agents. As mentioned earlier, tools are just one part of agent capabilities, so check out the other notebooks to see how tools can be used with other techniques to create agent workflows.\n", | ||
"\n", | ||
"If you're ready to explore more complicated agent workflows, check out [this blog](https://developer.nvidia.com/blog/build-an-agentic-rag-pipeline-with-llama-3-1-and-nvidia-nemo-retriever-nims/) on how to improve your RAG pipeline with agents using Llama 3.1 and NVIDIA NeMo Retriever NIMs." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.13" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
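
The notebook describes a model that emits a structured (JSON) tool call and an agent that keeps calling tools until the query is resolved. The following is a minimal, offline sketch of that loop, not the LangGraph implementation. Everything in it is hypothetical: `scripted_model` stands in for the tool-calling LLM, `search_weather` stands in for a web search tool, the coordinates are hard-coded, and the message format is simplified for illustration.

```python
import json
from typing import Callable, Dict, List


# Hypothetical stand-ins for the notebook's tools; no network access is needed.
def get_current_location() -> List[float]:
    """Pretend IP lookup: returns a fixed [latitude, longitude] pair."""
    return [32.7157, -117.1611]


def search_weather(lat: float, lon: float) -> str:
    """Pretend weather search for the given coordinates."""
    return f"Sunny at ({lat}, {lon})"


# Registry that maps the tool names the model may emit to Python callables.
TOOLS: Dict[str, Callable] = {
    "get_current_location": get_current_location,
    "search_weather": search_weather,
}


def scripted_model(messages: List[dict]) -> dict:
    """Stand-in for a tool-calling LLM: chooses the next step from the history."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if len(tool_results) == 0:
        # Step 1: ask for the user's location.
        call = {"name": "get_current_location", "arguments": {}}
        return {"tool_call": json.dumps(call)}
    if len(tool_results) == 1:
        # Step 2: use the location to ask for the weather.
        lat, lon = json.loads(tool_results[0]["content"])
        call = {"name": "search_weather", "arguments": {"lat": lat, "lon": lon}}
        return {"tool_call": json.dumps(call)}
    # Step 3: enough information has been gathered; answer in plain text.
    weather = json.loads(tool_results[1]["content"])
    return {"content": f"The weather where you are: {weather}"}


def run_agent(query: str, max_steps: int = 5) -> str:
    """Call the model repeatedly, executing each requested tool, until it answers."""
    messages = [{"role": "human", "content": query}]
    for _ in range(max_steps):
        reply = scripted_model(messages)
        if "tool_call" not in reply:
            return reply["content"]
        # Parse the structured call, dispatch it, and feed the result back.
        call = json.loads(reply["tool_call"])
        result = TOOLS[call["name"]](**call["arguments"])
        messages.append(
            {"role": "tool", "name": call["name"], "content": json.dumps(result)}
        )
    raise RuntimeError("agent did not finish within max_steps")


answer = run_agent("What is the current weather where I am?")
print(answer)
```

The `max_steps` cap is a design choice of this sketch: it keeps a misbehaving model from looping over tool calls indefinitely.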