beef up agent docs (#10866)

langchain-ai · Sep 21, 2023 · 808caca
1 parent 4b558c9
commit 808caca
Showing 28 changed files with 1,964 additions and 619 deletions.
diff --git a/docs/api_reference/create_api_rst.py b/docs/api_reference/create_api_rst.py
@@ -284,9 +284,12 @@ def _construct_doc(pkg: str, members_by_namespace: Dict[str, ModuleMembers]) ->
 def main() -> None:
     """Generate the reference.rst file for each package."""
     lc_members = _load_package_modules(PKG_DIR)
-    # Put tools.render at the top level
+    # Put some packages at top level
     tools = _load_package_modules(PKG_DIR, "tools")
     lc_members['tools.render'] = tools['render']
+    agents = _load_package_modules(PKG_DIR, "agents")
+    lc_members['agents.output_parsers'] = agents['output_parsers']
+    lc_members['agents.format_scratchpad'] = agents['format_scratchpad']
     lc_doc = ".. _api_reference:\n\n" + _construct_doc("langchain", lc_members)
     with open(WRITE_FILE, "w") as f:
         f.write(lc_doc)
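
Note on the hunk above: the change promotes two `agents` submodules into the top-level members mapping under dotted keys, the same way `tools.render` was already handled. A minimal sketch of that pattern follows, assuming the loader returns a plain dict keyed by module name. The `promote` helper and all sample data are hypothetical and are not part of `create_api_rst.py`.

```python
from typing import Dict, List

# Hypothetical shape: module name -> names documented under it.
Members = Dict[str, List[str]]


def promote(members: Members, submodules: Members, parent: str, names: List[str]) -> Members:
    """Return a copy of `members` with selected submodules added under dotted keys."""
    promoted = dict(members)
    for name in names:
        promoted[f"{parent}.{name}"] = submodules[name]
    return promoted


# Placeholder data standing in for what a package loader might return.
top_level: Members = {"agents": ["SomeAgent"], "tools": ["SomeTool"]}
agents_sub: Members = {
    "output_parsers": ["ParserA"],
    "format_scratchpad": ["formatter_a"],
    "other": ["ignored"],
}

lc_members = promote(top_level, agents_sub, "agents", ["output_parsers", "format_scratchpad"])
print(sorted(lc_members))
```

Only the listed submodules get their own top-level entry; everything else stays nested under its parent package.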

diff --git a/docs/docs_skeleton/docs/modules/agents/agent_types/chat_conversation_agent.mdx b/docs/docs_skeleton/docs/modules/agents/agent_types/chat_conversation_agent.mdx
diff --git a/docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx b/docs/docs_skeleton/docs/modules/agents/agent_types/index.mdx
@@ -2,56 +2,51 @@
 sidebar_position: 0
 ---
 
-# Agent types
-
-## Action agents
+# Agent Types
 
 Agents use an LLM to determine which actions to take and in what order.
 An action can either be using a tool and observing its output, or returning a response to the user.
 Here are the agents available in LangChain.
 
-### [Zero-shot ReAct](/docs/modules/agents/agent_types/react.html)
+## [Zero-shot ReAct](/docs/modules/agents/agent_types/react.html)
 
 This agent uses the [ReAct](https://arxiv.org/pdf/2210.03629) framework to determine which tool to use
 based solely on the tool's description. Any number of tools can be provided.
 This agent requires that a description is provided for each tool.
 
 **Note**: This is the most general purpose action agent.
 
-### [Structured input ReAct](/docs/modules/agents/agent_types/structured_chat.html)
+## [Structured input ReAct](/docs/modules/agents/agent_types/structured_chat.html)
 
 The structured tool chat agent is capable of using multi-input tools.
 Older agents are configured to specify an action input as a single string, but this agent can use a tool's argument
 schema to create a structured action input. This is useful for more complex tool usage, like precisely
 navigating around a browser.
 
-### [OpenAI Functions](/docs/modules/agents/agent_types/openai_functions_agent.html)
+## [OpenAI Functions](/docs/modules/agents/agent_types/openai_functions_agent.html)
 
 Certain OpenAI models (like gpt-3.5-turbo-0613 and gpt-4-0613) have been explicitly fine-tuned to detect when a
 function should be called and respond with the inputs that should be passed to the function.
 The OpenAI Functions Agent is designed to work with these models.
 
-### [Conversational](/docs/modules/agents/agent_types/chat_conversation_agent.html)
+## [Conversational](/docs/modules/agents/agent_types/chat_conversation_agent.html)
 
 This agent is designed to be used in conversational settings.
 The prompt is designed to make the agent helpful and conversational.
 It uses the ReAct framework to decide which tool to use, and uses memory to remember the previous conversation interactions.
 
-### [Self-ask with search](/docs/modules/agents/agent_types/self_ask_with_search.html)
+## [Self-ask with search](/docs/modules/agents/agent_types/self_ask_with_search.html)
 
 This agent utilizes a single tool that should be named `Intermediate Answer`.
 This tool should be able to look up factual answers to questions. This agent
 is equivalent to the original [self-ask with search paper](https://ofir.io/self-ask.pdf),
 where a Google search API was provided as the tool.
 
-### [ReAct document store](/docs/modules/agents/agent_types/react_docstore.html)
+## [ReAct document store](/docs/modules/agents/agent_types/react_docstore.html)
 
 This agent uses the ReAct framework to interact with a docstore. Two tools must
 be provided: a `Search` tool and a `Lookup` tool (they must be named exactly so).
 The `Search` tool should search for a document, while the `Lookup` tool should look up
 a term in the most recently found document.
 This agent is equivalent to the
 original [ReAct paper](https://arxiv.org/pdf/2210.03629.pdf), specifically the Wikipedia example.
-
-## [Plan-and-execute agents](/docs/modules/agents/agent_types/plan_and_execute.html)
-Plan-and-execute agents accomplish an objective by first planning what to do, then executing the sub tasks. This idea is largely inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) and then the ["Plan-and-Solve" paper](https://arxiv.org/abs/2305.04091).
diff --git a/docs/docs_skeleton/docs/modules/agents/agent_types/openai_functions_agent.mdx b/docs/docs_skeleton/docs/modules/agents/agent_types/openai_functions_agent.mdx
diff --git a/docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx b/docs/docs_skeleton/docs/modules/agents/agent_types/plan_and_execute.mdx
diff --git a/docs/docs_skeleton/docs/modules/agents/agent_types/react.mdx b/docs/docs_skeleton/docs/modules/agents/agent_types/react.mdx
diff --git a/docs/docs_skeleton/docs/modules/agents/index.mdx b/docs/docs_skeleton/docs/modules/agents/index.mdx
@@ -7,20 +7,27 @@ The core idea of agents is to use an LLM to choose a sequence of actions to take
 In chains, a sequence of actions is hardcoded (in code).
 In agents, a language model is used as a reasoning engine to determine which actions to take and in which order.
 
+Some important terminology (and schema) to know:
+
+1. `AgentAction`: This is a dataclass that represents the action an agent should take. It has a `tool` property (the name of the tool that should be invoked) and a `tool_input` property (the input to that tool).
+2. `AgentFinish`: This is a dataclass that signifies that the agent has finished and should return to the user. It has a `return_values` property, which is a dictionary to return. It often has only one key, `output`, whose value is a string, so often only that value is returned.
+3. `intermediate_steps`: These represent previous agent actions and their corresponding outputs. They must be passed to future iterations so the agent knows what work it has already done. This is typed as a `List[Tuple[AgentAction, Any]]`. Note that the observation is currently left as type `Any` to be maximally flexible. In practice, it is often a string.
+
 There are several key components here:
 
 ## Agent
 
-This is the class responsible for deciding what step to take next.
+This is the chain responsible for deciding what step to take next.
 This is powered by a language model and a prompt.
-This prompt can include things like:
+The inputs to this chain are:
+
+1. List of available tools
+2. User input
+3. Any previously executed steps (`intermediate_steps`)
 
-1. The personality of the agent (useful for having it respond in a certain way)
-2. Background context for the agent (useful for giving it more context on the types of tasks it's being asked to do)
-3. Prompting strategies to invoke better reasoning (the most famous/widely used being [ReAct](https://arxiv.org/abs/2210.03629))
+This chain then returns either the next action to take or the final response to send to the user (`AgentAction` or `AgentFinish`).
 
-LangChain provides a few different types of agents to get started.
-Even then, you will likely want to customize those agents with parts (1) and (2).
+Different agents have different prompting styles for reasoning, different ways of encoding input, and different ways of parsing the output.
 For a full list of agent types, see [agent types](/docs/modules/agents/agent_types/).
 
 ## Tools
@@ -74,12 +81,22 @@ The `AgentExecutor` class is the main agent runtime supported by LangChain.
 However, there are other, more experimental runtimes we also support.
 These include:
 
-- [Plan-and-execute Agent](/docs/modules/agents/agent_types/plan_and_execute.html)
-- [Baby AGI](/docs/use_cases/autonomous_agents/baby_agi.html)
-- [Auto GPT](/docs/use_cases/autonomous_agents/autogpt.html)
+- [Plan-and-execute Agent](/docs/use_cases/more/agents/autonomous_agents/plan_and_execute)
+- [Baby AGI](/docs/use_cases/more/agents/autonomous_agents/baby_agi)
+- [Auto GPT](/docs/use_cases/more/agents/autonomous_agents/autogpt)
 
 ## Get started
 
 import GetStarted from "@snippets/modules/agents/get_started.mdx"
 
 <GetStarted/>
+
+## Next Steps
+
+Awesome! You've now run your first end-to-end agent.
+To dive deeper, you can:
+
+- Check out all the different [agent types](/docs/modules/agents/agent_types/) supported
+- Learn all the controls for [AgentExecutor](/docs/modules/agents/how_to/)
+- See a full list of all the off-the-shelf [toolkits](/docs/modules/agents/toolkits/) we provide
+- Explore all the individual [tools](/docs/modules/agents/tools/) supported
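
Note on the terminology added to `agents/index.mdx`: the relationship between `AgentAction`, `AgentFinish`, and `intermediate_steps` can be illustrated with a small loop. The sketch below is self-contained and does not import LangChain. The two dataclasses are simplified stand-ins for the schema described in the docs (the real classes may carry additional fields), and `toy_agent`, `run`, and the `echo` tool are hypothetical names invented for illustration. For brevity, the toy agent does not receive the tool list as an input.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple, Union


@dataclass
class AgentAction:
    """Stand-in: which tool to invoke and the input to give it."""
    tool: str
    tool_input: Any


@dataclass
class AgentFinish:
    """Stand-in: signals the agent is done and carries the values to return."""
    return_values: Dict[str, Any]


IntermediateSteps = List[Tuple[AgentAction, Any]]


def toy_agent(user_input: str, intermediate_steps: IntermediateSteps) -> Union[AgentAction, AgentFinish]:
    """Hypothetical agent: call the `echo` tool once, then finish with its observation."""
    if not intermediate_steps:
        return AgentAction(tool="echo", tool_input=user_input)
    _, observation = intermediate_steps[-1]
    return AgentFinish(return_values={"output": str(observation)})


def run(
    agent: Callable[[str, IntermediateSteps], Union[AgentAction, AgentFinish]],
    tools: Dict[str, Callable[[Any], Any]],
    user_input: str,
    max_steps: int = 5,
) -> Dict[str, Any]:
    """Simplified runtime loop: ask the agent, run the chosen tool, record the step."""
    intermediate_steps: IntermediateSteps = []
    for _ in range(max_steps):
        step = agent(user_input, intermediate_steps)
        if isinstance(step, AgentFinish):
            return step.return_values
        observation = tools[step.tool](step.tool_input)
        intermediate_steps.append((step, observation))
    raise RuntimeError("agent did not finish within max_steps")


tools = {"echo": lambda text: text.upper()}
result = run(toy_agent, tools, "hello")
print(result["output"])  # prints "HELLO"
```

Each pass through the loop appends an `(AgentAction, observation)` pair, which is why the docs type `intermediate_steps` as `List[Tuple[AgentAction, Any]]`. The `max_steps` guard is an assumption of this sketch, not a documented default.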