From 42421860bccced9d5087ea30703bf236125746c8 Mon Sep 17 00:00:00 2001 From: Lance Martin <122662504+rlancemartin@users.noreply.github.com> Date: Fri, 15 Dec 2023 16:00:55 -0800 Subject: [PATCH 1/5] Add image support for Ollama (#14713) Support [LLaVA](https://ollama.ai/library/llava): * Upgrade Ollama * `ollama pull llava` Ensure compatibility with [image prompt template](https://github.com/langchain-ai/langchain/pull/14263) --------- Co-authored-by: jacoblee93 --- docs/docs/integrations/chat/ollama.ipynb | 354 +++++------------ docs/docs/integrations/llms/ollama.ipynb | 362 ++++-------------- .../langchain_community/chat_models/ollama.py | 150 +++++++- .../langchain_community/llms/ollama.py | 54 ++- 4 files changed, 358 insertions(+), 562 deletions(-) diff --git a/docs/docs/integrations/chat/ollama.ipynb b/docs/docs/integrations/chat/ollama.ipynb index 911f1f30f0739..99b6fba3a0ff1 100644 --- a/docs/docs/integrations/chat/ollama.ipynb +++ b/docs/docs/integrations/chat/ollama.ipynb @@ -66,7 +66,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -76,7 +76,6 @@ "\n", "chat_model = ChatOllama(\n", " model=\"llama2:7b-chat\",\n", - " callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n", ")" ] }, @@ -84,41 +83,28 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "With `StreamingStdOutCallbackHandler`, you will see tokens streamed." + "Optionally, pass `StreamingStdOutCallbackHandler` to stream tokens:\n", + "\n", + "```\n", + "chat_model = ChatOllama(\n", + " model=\"llama2:7b-chat\",\n", + " callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n", + ")\n", + "```" ] }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 2, "metadata": {}, "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Artificial intelligence (AI) has a rich and varied history that spans several decades. Hinweis: The following is a brief overview of the major milestones in the history of AI, but it is by no means exhaustive.\n", - "\n", - "1. Early Beginnings (1950s-1960s): The term \"Artificial Intelligence\" was coined in 1956 by computer scientist John McCarthy. However, the concept of creating machines that can think and learn like humans dates back to ancient times. In the 1950s and 1960s, researchers began exploring the possibilities of AI using simple algorithms and machine learning techniques.\n", - "2. Rule-Based Systems (1970s-1980s): In the 1970s and 1980s, AI research focused on developing rule-based systems, which use predefined rules to reason and make decisions. This led to the development of expert systems, which were designed to mimic the decision-making abilities of human experts in specific domains.\n", - "3. Machine Learning (1980s-1990s): The 1980s saw a shift towards machine learning, which enables machines to learn from data without being explicitly programmed. This led to the development of algorithms such as decision trees, neural networks, and support vector machines.\n", - "4. Deep Learning (2000s-present): In the early 2000s, deep learning emerged as a subfield of machine learning, focusing on neural networks with multiple layers. These networks can learn complex representations of data, leading to breakthroughs in image and speech recognition, natural language processing, and other areas.\n", - "5. 
Natural Language Processing (NLP) (1980s-present): NLP has been an active area of research since the 1980s, with a focus on developing algorithms that can understand and generate human language. This has led to applications such as chatbots, voice assistants, and language translation systems.\n", - "6. Robotics (1970s-present): The development of robotics has been closely tied to AI research, with a focus on creating machines that can perform tasks that typically require human intelligence, such as manipulation and locomotion.\n", - "7. Computer Vision (1980s-present): Computer vision has been an active area of research since the 1980s, with a focus on enabling machines to interpret and understand visual data from the world around us. This has led to applications such as image recognition, object detection, and autonomous driving.\n", - "8. Ethics and Society (1990s-present): As AI technology has become more advanced and integrated into various aspects of society, there has been a growing concern about the ethical implications of AI. This includes issues related to privacy, bias, and job displacement.\n", - "9. Reinforcement Learning (2000s-present): Reinforcement learning is a subfield of machine learning that involves training machines to make decisions based on feedback from their environment. This has led to breakthroughs in areas such as game playing, robotics, and autonomous driving.\n", - "10. Generative Models (2010s-present): Generative models are a class of AI algorithms that can generate new data that is similar to a given dataset. This has led to applications such as image synthesis, music generation, and language creation.\n", - "\n", - "These are just a few of the many developments in the history of AI. As the field continues to evolve, we can expect even more exciting breakthroughs and innovations in the years to come." - ] - }, { "data": { "text/plain": [ - "AIMessage(content=' Artificial intelligence (AI) has a rich and varied history that spans several decades. Hinweis: The following is a brief overview of the major milestones in the history of AI, but it is by no means exhaustive.\\n\\n1. Early Beginnings (1950s-1960s): The term \"Artificial Intelligence\" was coined in 1956 by computer scientist John McCarthy. However, the concept of creating machines that can think and learn like humans dates back to ancient times. In the 1950s and 1960s, researchers began exploring the possibilities of AI using simple algorithms and machine learning techniques.\\n2. Rule-Based Systems (1970s-1980s): In the 1970s and 1980s, AI research focused on developing rule-based systems, which use predefined rules to reason and make decisions. This led to the development of expert systems, which were designed to mimic the decision-making abilities of human experts in specific domains.\\n3. Machine Learning (1980s-1990s): The 1980s saw a shift towards machine learning, which enables machines to learn from data without being explicitly programmed. This led to the development of algorithms such as decision trees, neural networks, and support vector machines.\\n4. Deep Learning (2000s-present): In the early 2000s, deep learning emerged as a subfield of machine learning, focusing on neural networks with multiple layers. These networks can learn complex representations of data, leading to breakthroughs in image and speech recognition, natural language processing, and other areas.\\n5. 
Natural Language Processing (NLP) (1980s-present): NLP has been an active area of research since the 1980s, with a focus on developing algorithms that can understand and generate human language. This has led to applications such as chatbots, voice assistants, and language translation systems.\\n6. Robotics (1970s-present): The development of robotics has been closely tied to AI research, with a focus on creating machines that can perform tasks that typically require human intelligence, such as manipulation and locomotion.\\n7. Computer Vision (1980s-present): Computer vision has been an active area of research since the 1980s, with a focus on enabling machines to interpret and understand visual data from the world around us. This has led to applications such as image recognition, object detection, and autonomous driving.\\n8. Ethics and Society (1990s-present): As AI technology has become more advanced and integrated into various aspects of society, there has been a growing concern about the ethical implications of AI. This includes issues related to privacy, bias, and job displacement.\\n9. Reinforcement Learning (2000s-present): Reinforcement learning is a subfield of machine learning that involves training machines to make decisions based on feedback from their environment. This has led to breakthroughs in areas such as game playing, robotics, and autonomous driving.\\n10. Generative Models (2010s-present): Generative models are a class of AI algorithms that can generate new data that is similar to a given dataset. This has led to applications such as image synthesis, music generation, and language creation.\\n\\nThese are just a few of the many developments in the history of AI. As the field continues to evolve, we can expect even more exciting breakthroughs and innovations in the years to come.', additional_kwargs={}, example=False)" + "AIMessage(content='\\nArtificial intelligence (AI) has a rich and diverse history that spans several decades. Here is a brief overview of the major milestones and events in the development of AI:\\n\\n1. 1950s: The Dartmouth Conference: The field of AI was officially launched at a conference held at Dartmouth College in 1956. Attendees included computer scientists, mathematicians, and cognitive scientists who were interested in exploring the possibilities of creating machines that could simulate human intelligence.\\n2. 1951: The Turing Test: British mathematician Alan Turing proposed a test to measure a machine\\'s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. The Turing Test has since become a benchmark for measuring the success of AI systems.\\n3. 1956: The First AI Program: Computer scientist John McCarthy created the first AI program, called the Logical Theorist, which was designed to reason and solve problems using logical deduction.\\n4. 1960s: Rule-Based Expert Systems: The development of rule-based expert systems, which used a set of rules to reason and make decisions, marked a significant milestone in the history of AI. These systems were widely used in industries such as banking, healthcare, and transportation.\\n5. 1970s: Machine Learning: Machine learning, which enables machines to learn from data without being explicitly programmed, emerged as a major area of research in AI. This led to the development of algorithms such as decision trees and neural networks.\\n6. 
1980s: Expert Systems: The development of expert systems, which were designed to mimic the decision-making abilities of human experts, reached its peak in the 1980s. These systems were widely used in industries such as banking and healthcare.\\n7. 1990s: AI Winter: Despite the progress that had been made in AI research, the field experienced a decline in funding and interest in the 1990s, which became known as the \"AI winter.\"\\n8. 2000s: Machine Learning Resurgence: The resurgence of machine learning, driven by advances in computational power and data storage, led to a new wave of AI research and applications.\\n9. 2010s: Deep Learning: The development of deep learning algorithms, which are capable of learning complex patterns in large datasets, marked a significant breakthrough in AI research. These algorithms have been used in applications such as image and speech recognition, natural language processing, and autonomous vehicles.\\n10. Present Day: AI is now being applied to a wide range of industries and domains, including healthcare, finance, transportation, and education. The field is continuing to evolve, with new technologies and applications emerging all the time.\\n\\nOverall, the history of AI reflects a long-standing interest in creating machines that can simulate human intelligence. While the field has experienced periods of progress and setbacks, it continues to evolve and expand into new areas of research and application.')" ] }, - "execution_count": 3, + "execution_count": 2, "metadata": {}, "output_type": "execute_result" } @@ -145,7 +131,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -162,49 +148,16 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - " Sure! Here's a JSON response with the colors of the sky at different times of the day:\n", - " Begriffe und Abkürzungen:\n", - "\n", - "* `time`: The time of day (in 24-hour format)\n", - "* `sky_color`: The color of the sky at that time (as a hex code)\n", - "\n", - "Here are the colors of the sky at different times of the day:\n", - "```json\n", - "[\n", - " {\n", - " \"time\": \"6am\",\n", - " \"sky_color\": \"#0080c0\"\n", - " },\n", - " {\n", - " \"time\": \"9am\",\n", - " \"sky_color\": \"#3498db\"\n", - " },\n", - " {\n", - " \"time\": \"12pm\",\n", - " \"sky_color\": \"#ef7c00\"\n", - " },\n", - " {\n", - " \"time\": \"3pm\",\n", - " \"sky_color\": \"#9564b6\"\n", - " },\n", - " {\n", - " \"time\": \"6pm\",\n", - " \"sky_color\": \"#e78ac3\"\n", - " },\n", - " {\n", - " \"time\": \"9pm\",\n", - " \"sky_color\": \"#5f006a\"\n", - " }\n", - "]\n", - "```\n", - "In this response, the `time` property is a string in 24-hour format, representing the time of day. The `sky_color` property is a hex code representing the color of the sky at that time. For example, at 6am, the sky is blue (#0080c0), while at 9pm, it's dark blue (#5f006a)." + "{\"morning\": {\"sky\": \"pink\", \"sun\": \"rise\"}, \"daytime\": {\"sky\": \"blue\", \"sun\": \"high\"}, \"afternoon\": {\"sky\": \"gray\", \"sun\": \"peak\"}, \"evening\": {\"sky\": \"orange\", \"sun\": \"set\"}}\n", + " \t\n", + "\n" ] } ], @@ -222,30 +175,32 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ - " Sure! 
Based on the JSON schema you provided, here's the information we can gather about a person named John who is 35 years old and loves pizza:\n", - "\n", - "**Name:** John\n", - "\n", - "**Age:** 35 (integer)\n", - "\n", - "**Favorite food:** Pizza (string)\n", - "\n", - "So, the JSON object for John would look like this:\n", - "```json\n", "{\n", " \"name\": \"John\",\n", " \"age\": 35,\n", " \"fav_food\": \"pizza\"\n", "}\n", - "```\n", - "Note that we cannot provide additional information about John beyond what is specified in the schema. For example, we do not have any information about his gender, occupation, or address, as those fields are not included in the schema." + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n", + "\n" ] } ], @@ -287,235 +242,126 @@ "cell_type": "markdown", "metadata": {}, "source": [ - "## RAG\n", + "## Multi-modal\n", "\n", - "We can use Olama with RAG, [just as shown here](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).\n", + "Ollama has support for multi-modal LLMs, such as [bakllava](https://ollama.ai/library/bakllava) and [llava](https://ollama.ai/library/llava).\n", "\n", - "Let's use the 13b model:\n", + "Browse the full set of versions for models with `tags`, such as [here](https://ollama.ai/library/llava/tags).\n", "\n", + "Download the desired LLM:\n", "```\n", - "ollama pull llama2:13b\n", + "ollama pull bakllava\n", "```\n", "\n", - "Let's also use local embeddings from `OllamaEmbeddings` and `Chroma`." + "Be sure to update Ollama so that you have the most recent version to support multi-modal." ] }, { "cell_type": "code", "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "! pip install chromadb" - ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [], - "source": [ - "from langchain.document_loaders import WebBaseLoader\n", - "\n", - "loader = WebBaseLoader(\"https://lilianweng.github.io/posts/2023-06-23-agent/\")\n", - "data = loader.load()\n", - "\n", - "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", - "\n", - "text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n", - "all_splits = text_splitter.split_documents(data)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, + "metadata": { + "scrolled": true + }, "outputs": [], "source": [ - "from langchain.embeddings import OllamaEmbeddings\n", - "from langchain.vectorstores import Chroma\n", - "\n", - "vectorstore = Chroma.from_documents(documents=all_splits, embedding=OllamaEmbeddings())" + "%pip install pillow" ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 1, "metadata": {}, "outputs": [ { "data": { + "text/html": [ + "" + ], "text/plain": [ - "4" + "" ] }, - "execution_count": 7, "metadata": {}, - "output_type": "execute_result" + "output_type": "display_data" } ], "source": [ - "question = \"What are the approaches to Task Decomposition?\"\n", - "docs = vectorstore.similarity_search(question)\n", - "len(docs)" - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "metadata": {}, - "outputs": [], - "source": [ - "from langchain.prompts import PromptTemplate\n", - "\n", - "# Prompt\n", - "template = \"\"\"[INST] <> Use the following pieces of context to answer the question at the end. \n", - "If you don't know the answer, just say that you don't know, don't try to make up an answer. 
\n", - "Use three sentences maximum and keep the answer as concise as possible. <>\n", - "{context}\n", - "Question: {question}\n", - "Helpful Answer:[/INST]\"\"\"\n", - "QA_CHAIN_PROMPT = PromptTemplate(\n", - " input_variables=[\"context\", \"question\"],\n", - " template=template,\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "metadata": {}, - "outputs": [], - "source": [ - "# Chat model\n", - "from langchain.callbacks.manager import CallbackManager\n", - "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n", - "from langchain.chat_models import ChatOllama\n", + "import base64\n", + "from io import BytesIO\n", "\n", - "chat_model = ChatOllama(\n", - " model=\"llama2:13b\",\n", - " verbose=True,\n", - " callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "# QA chain\n", - "from langchain.chains import RetrievalQA\n", + "from IPython.display import HTML, display\n", + "from PIL import Image\n", "\n", - "qa_chain = RetrievalQA.from_chain_type(\n", - " chat_model,\n", - " retriever=vectorstore.as_retriever(),\n", - " chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT},\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Based on the provided context, there are three approaches to task decomposition for AI agents:\n", - "\n", - "1. LLM with simple prompting, such as \"Steps for XYZ.\" or \"What are the subgoals for achieving XYZ?\"\n", - "2. Task-specific instructions, such as \"Write a story outline\" for writing a novel.\n", - "3. Human inputs." - ] - } - ], - "source": [ - "question = \"What are the various approaches to Task Decomposition for AI Agents?\"\n", - "result = qa_chain({\"query\": question})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can also get logging for tokens." - ] - }, - { - "cell_type": "code", - "execution_count": 16, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - " Based on the given context, here is the answer to the question \"What are the approaches to Task Decomposition?\"\n", - "\n", - "There are three approaches to task decomposition:\n", - "\n", - "1. LLM with simple prompting, such as \"Steps for XYZ.\" or \"What are the subgoals for achieving XYZ?\"\n", - "2. Using task-specific instructions, like \"Write a story outline\" for writing a novel.\n", - "3. 
With human inputs.{'model': 'llama2:13b-chat', 'created_at': '2023-08-23T15:37:51.469127Z', 'done': True, 'context': [1, 29871, 1, 29961, 25580, 29962, 518, 25580, 29962, 518, 25580, 29962, 3532, 14816, 29903, 6778, 4803, 278, 1494, 12785, 310, 3030, 304, 1234, 278, 1139, 472, 278, 1095, 29889, 29871, 13, 3644, 366, 1016, 29915, 29873, 1073, 278, 1234, 29892, 925, 1827, 393, 366, 1016, 29915, 29873, 1073, 29892, 1016, 29915, 29873, 1018, 304, 1207, 701, 385, 1234, 29889, 29871, 13, 11403, 2211, 25260, 7472, 322, 3013, 278, 1234, 408, 3022, 895, 408, 1950, 29889, 529, 829, 14816, 29903, 6778, 13, 5398, 26227, 508, 367, 2309, 313, 29896, 29897, 491, 365, 26369, 411, 2560, 9508, 292, 763, 376, 7789, 567, 363, 1060, 29979, 29999, 7790, 29876, 29896, 19602, 376, 5618, 526, 278, 1014, 1484, 1338, 363, 3657, 15387, 1060, 29979, 29999, 29973, 613, 313, 29906, 29897, 491, 773, 3414, 29899, 14940, 11994, 29936, 321, 29889, 29887, 29889, 376, 6113, 263, 5828, 27887, 1213, 363, 5007, 263, 9554, 29892, 470, 313, 29941, 29897, 411, 5199, 10970, 29889, 13, 13, 5398, 26227, 508, 367, 2309, 313, 29896, 29897, 491, 365, 26369, 411, 2560, 9508, 292, 763, 376, 7789, 567, 363, 1060, 29979, 29999, 7790, 29876, 29896, 19602, 376, 5618, 526, 278, 1014, 1484, 1338, 363, 3657, 15387, 1060, 29979, 29999, 29973, 613, 313, 29906, 29897, 491, 773, 3414, 29899, 14940, 11994, 29936, 321, 29889, 29887, 29889, 376, 6113, 263, 5828, 27887, 1213, 363, 5007, 263, 9554, 29892, 470, 313, 29941, 29897, 411, 5199, 10970, 29889, 13, 13, 1451, 16047, 267, 297, 1472, 29899, 8489, 18987, 322, 3414, 26227, 29901, 1858, 9450, 975, 263, 3309, 29891, 4955, 322, 17583, 3902, 8253, 278, 1650, 2913, 3933, 18066, 292, 29889, 365, 26369, 29879, 21117, 304, 10365, 13900, 746, 20050, 411, 15668, 4436, 29892, 3907, 963, 3109, 16424, 9401, 304, 25618, 1058, 5110, 515, 14260, 322, 1059, 29889, 13, 13, 1451, 16047, 267, 297, 1472, 29899, 8489, 18987, 322, 3414, 26227, 29901, 1858, 9450, 975, 263, 3309, 29891, 4955, 322, 17583, 3902, 8253, 278, 1650, 2913, 3933, 18066, 292, 29889, 365, 26369, 29879, 21117, 304, 10365, 13900, 746, 20050, 411, 15668, 4436, 29892, 3907, 963, 3109, 16424, 9401, 304, 25618, 1058, 5110, 515, 14260, 322, 1059, 29889, 13, 16492, 29901, 1724, 526, 278, 13501, 304, 9330, 897, 510, 3283, 29973, 13, 29648, 1319, 673, 10834, 29914, 25580, 29962, 518, 29914, 25580, 29962, 518, 29914, 25580, 29962, 29871, 16564, 373, 278, 2183, 3030, 29892, 1244, 338, 278, 1234, 304, 278, 1139, 376, 5618, 526, 278, 13501, 304, 9330, 897, 510, 3283, 3026, 13, 13, 8439, 526, 2211, 13501, 304, 3414, 26227, 29901, 13, 13, 29896, 29889, 365, 26369, 411, 2560, 9508, 292, 29892, 1316, 408, 376, 7789, 567, 363, 1060, 29979, 29999, 1213, 470, 376, 5618, 526, 278, 1014, 1484, 1338, 363, 3657, 15387, 1060, 29979, 29999, 3026, 13, 29906, 29889, 5293, 3414, 29899, 14940, 11994, 29892, 763, 376, 6113, 263, 5828, 27887, 29908, 363, 5007, 263, 9554, 29889, 13, 29941, 29889, 2973, 5199, 10970, 29889, 2], 'total_duration': 9514823750, 'load_duration': 795542, 'sample_count': 99, 'sample_duration': 68732000, 'prompt_eval_count': 146, 'prompt_eval_duration': 6206275000, 'eval_count': 98, 'eval_duration': 3229641000}\n" - ] - } - ], - "source": [ - "from langchain.callbacks.base import BaseCallbackHandler\n", - "from langchain.schema import LLMResult\n", "\n", + "def convert_to_base64(pil_image):\n", + " \"\"\"\n", + " Convert PIL images to Base64 encoded strings\n", "\n", - "class GenerationStatisticsCallback(BaseCallbackHandler):\n", - " def on_llm_end(self, 
response: LLMResult, **kwargs) -> None:\n", - " print(response.generations[0][0].generation_info)\n", + " :param pil_image: PIL image\n", + " :return: Re-sized Base64 string\n", + " \"\"\"\n", "\n", + " buffered = BytesIO()\n", + " pil_image.save(buffered, format=\"JPEG\") # You can change the format if needed\n", + " img_str = base64.b64encode(buffered.getvalue()).decode(\"utf-8\")\n", + " return img_str\n", "\n", - "callback_manager = CallbackManager(\n", - " [StreamingStdOutCallbackHandler(), GenerationStatisticsCallback()]\n", - ")\n", "\n", - "chat_model = ChatOllama(\n", - " model=\"llama2:13b-chat\", verbose=True, callback_manager=callback_manager\n", - ")\n", + "def plt_img_base64(img_base64):\n", + " \"\"\"\n", + " Disply base64 encoded string as image\n", "\n", - "qa_chain = RetrievalQA.from_chain_type(\n", - " chat_model,\n", - " retriever=vectorstore.as_retriever(),\n", - " chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT},\n", - ")\n", + " :param img_base64: Base64 string\n", + " \"\"\"\n", + " # Create an HTML img tag with the base64 string as the source\n", + " image_html = f''\n", + " # Display the image by rendering the HTML\n", + " display(HTML(image_html))\n", "\n", - "question = \"What are the approaches to Task Decomposition?\"\n", - "result = qa_chain({\"query\": question})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "`eval_count` / (`eval_duration`/10e9) gets `tok / s`" + "\n", + "file_path = \"/Users/rlm/Desktop/Eval_Sets/multi_modal_presentations/DDOG/img_23.jpg\"\n", + "pil_image = Image.open(file_path)\n", + "\n", + "image_b64 = convert_to_base64(pil_image)\n", + "plt_img_base64(image_b64)" ] }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "30.343929867127645" + "AIMessage(content='90%')" ] }, - "execution_count": 17, + "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "98 / (3229641000 / 1000 / 1000 / 1000)" + "from langchain.chat_models import ChatOllama\n", + "from langchain_core.messages import HumanMessage\n", + "\n", + "chat_model = ChatOllama(\n", + " model=\"bakllava\",\n", + ")\n", + "\n", + "# Call the chat model with both messages and images\n", + "content_parts = []\n", + "image_part = {\n", + " \"type\": \"image_url\",\n", + " \"image_url\": f\"data:image/jpeg;base64,{image_b64}\",\n", + "}\n", + "text_part = {\"type\": \"text\", \"text\": \"What is the Daollar-based gross retention rate?\"}\n", + "\n", + "content_parts.append(image_part)\n", + "content_parts.append(text_part)\n", + "prompt = [HumanMessage(content=content_parts)]\n", + "chat_model(prompt)" ] } ], @@ -535,7 +381,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.12" + "version": "3.9.16" } }, "nbformat": 4, diff --git a/docs/docs/integrations/llms/ollama.ipynb b/docs/docs/integrations/llms/ollama.ipynb index e6bd21944883f..adbf4eccac8ea 100644 --- a/docs/docs/integrations/llms/ollama.ipynb +++ b/docs/docs/integrations/llms/ollama.ipynb @@ -20,8 +20,8 @@ "\n", "* [Download](https://ollama.ai/download)\n", "* Fetch a model via `ollama pull `\n", - "* e.g., for `Llama-7b`: `ollama pull llama2` (see full list [here](https://github.com/jmorganca/ollama))\n", - "* This will download the most basic version of the model typically (e.g., smallest # parameters and `q4_0`)\n", + "* e.g., for `Llama-7b`: `ollama pull llama2` (see full list [here](https://ollama.ai/library)\n", + "* This will 
download the most basic version of the model typically (e.g., smallest # parameters)\n", "* On Mac, it will download to \n", "\n", "`~/.ollama/models/manifests/registry.ollama.ai/library//latest`\n", @@ -61,369 +61,147 @@ "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n", "from langchain.llms import Ollama\n", "\n", - "llm = Ollama(\n", - " model=\"llama2\", callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "With `StreamingStdOutCallbackHandler`, you will see tokens streamed." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "llm(\"Tell me about the history of AI\")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Ollama supports embeddings via `OllamaEmbeddings`:\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from langchain.embeddings import OllamaEmbeddings\n", - "\n", - "oembed = OllamaEmbeddings(base_url=\"http://localhost:11434\", model=\"llama2\")\n", - "oembed.embed_query(\"Llamas are social animals and live with others as a herd.\")" + "llm = Ollama(model=\"llama2\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ - "## RAG\n", - "\n", - "We can use Olama with RAG, [just as shown here](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa).\n", + "Optionally, pass `StreamingStdOutCallbackHandler` to stream tokens:\n", "\n", - "Let's use the 13b model:\n", - "\n", - "```\n", - "ollama pull llama2:13b\n", "```\n", - "\n", - "Let's also use local embeddings from `OllamaEmbeddings` and `Chroma`." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "! pip install chromadb" - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "# Load web page\n", - "from langchain.document_loaders import WebBaseLoader\n", - "\n", - "loader = WebBaseLoader(\"https://lilianweng.github.io/posts/2023-06-23-agent/\")\n", - "data = loader.load()" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "# Split into chunks\n", - "from langchain.text_splitter import RecursiveCharacterTextSplitter\n", - "\n", - "text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=100)\n", - "all_splits = text_splitter.split_documents(data)" - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Found model file at /Users/rlm/.cache/gpt4all/ggml-all-MiniLM-L6-v2-f16.bin\n" - ] - }, - { - "name": "stderr", - "output_type": "stream", - "text": [ - "objc[77472]: Class GGMLMetalClass is implemented in both /Users/rlm/miniforge3/envs/llama2/lib/python3.9/site-packages/gpt4all/llmodel_DO_NOT_MODIFY/build/libreplit-mainline-metal.dylib (0x17f754208) and /Users/rlm/miniforge3/envs/llama2/lib/python3.9/site-packages/gpt4all/llmodel_DO_NOT_MODIFY/build/libllamamodel-mainline-metal.dylib (0x17fb80208). One of the two will be used. 
Which one is undefined.\n" - ] - } - ], - "source": [ - "# Embed and store\n", - "from langchain.embeddings import (\n", - " GPT4AllEmbeddings,\n", - " OllamaEmbeddings, # We can also try Ollama embeddings\n", + "llm = Ollama(\n", + " model=\"llama2\"\n", + " callback_manager=CallbackManager([StreamingStdOutCallbackHandler()\n", ")\n", - "from langchain.vectorstores import Chroma\n", - "\n", - "vectorstore = Chroma.from_documents(documents=all_splits, embedding=GPT4AllEmbeddings())" + "```" ] }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "4" + "' Artificial intelligence (AI) has a rich and varied history that spans several decades. październik 1950s and has evolved significantly over time. Here is a brief overview of the major milestones in the history of AI:\\n\\n1. 1950s: The Dartmouth Conference - Considered the birthplace of AI, this conference brought together computer scientists, mathematicians, and cognitive scientists to discuss the possibilities of creating machines that could simulate human intelligence. Attendees included John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon.\\n2. 1951: The Turing Test - Alan Turing proposed a test to measure a machine\\'s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. The Turing Test has since become a benchmark for measuring the success of AI systems.\\n3. 1956: The First AI Program - John McCarthy created the first AI program, called the Logical Theorist, which was designed to reason and solve problems using logical deduction.\\n4. 1960s: Rule-Based Expert Systems - Researchers developed rule-based expert systems, which used a set of rules to reason and make decisions. These systems were widely used in industries such as banking and healthcare.\\n5. 1970s: Machine Learning -Machine learning, a subfield of AI, emerged as a way for machines to learn from data without being explicitly programmed. This led to the development of algorithms such as decision trees and neural networks.\\n6. 1980s: Expert Systems - The development of expert systems, which were designed to mimic the decision-making abilities of human experts, reached its peak in the 1980s. These systems were widely used in industries such as banking and healthcare.\\n7. 1990s: AI Winter - Despite the progress made in AI research, the field experienced a decline in funding and interest in the 1990s, known as the \"AI winter.\"\\n8. 2000s: AI Resurgence - The resurgence of AI began in the early 2000s with the development of new algorithms and techniques, such as support vector machines and deep learning. This led to a renewed interest in AI research and applications.\\n9. 2010s: Rise of Deep Learning - The development of deep learning algorithms, which are capable of learning and improving on their own by analyzing large amounts of data, has been a major factor in the recent progress made in AI. These algorithms have been used in applications such as image recognition, natural language processing, and autonomous vehicles.\\n10. Present Day: AI Continues to Advance - AI is continuing to advance at a rapid pace, with new techniques and applications emerging all the time. Areas of research include natural language processing, computer vision, robotics, and more.\\n\\nSome notable people who have made significant contributions to the field of AI include:\\n\\n1. 
Alan Turing - Considered one of the pioneers of AI, Turing proposed the Turing Test and developed the concept of a universal machine.\\n2. John McCarthy - McCarthy is known as the \"father of AI\" for his work in developing the field of AI. He coined the term \"Artificial Intelligence\" and was instrumental in organizing the Dartmouth Conference.\\n3. Marvin Minsky - Minsky was a pioneer in the field of neural networks and co-founder of the MIT AI Laboratory.\\n4. Nathaniel Rochester - Rochester was a computer scientist and cognitive scientist who worked on early AI projects, including the development of the Logical Theorist.\\n5. Claude Shannon - Shannon was a mathematician and electrical engineer who is known for his work on information theory, which has had a significant impact on the field of AI.\\n6. Yann LeCun - LeCun is a computer scientist and the director of AI Research at Facebook. He is also the Silver Professor of Computer Science at New York University, and a professor at the Courant Institute of Mathematical Sciences.\\n7. Geoffrey Hinton - Hinton is a computer scientist and cognitive psychologist who is known for his work on artificial neural networks. He is a pioneer in the field of deep learning and has made significant contributions to the development of convolutional neural networks (CNNs).\\n8. Yoshua Bengio - Bengio is a computer scientist and a pioneer in the field of deep learning. He is known for his work on recurrent neural networks (RNNs) and has made significant contributions to the development of CNNs and RNNs.\\n9. Andrew Ng - Ng is a computer scientist and entrepreneur who has made significant contributions to the field of AI. He is known for his work on deep learning and has worked at Google, where he founded the Google Brain deep learning project, and at Baidu, where he led the company\\'s AI group.\\n10. Demis Hassabis - Hassabis is a computer scientist and entrepreneur who is known for his work on deep learning and artificial intelligence. He is the co-founder of DeepMind, which was acquired by Alphabet in 2014, and has made significant contributions to the field of AI.\\n\\nThese are just a few examples of notable people who have made significant contributions to the field of AI. There are many other researchers and scientists who have also made important advancements in the field.'" ] }, - "execution_count": 7, + "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "# Retrieve\n", - "question = \"How can Task Decomposition be done?\"\n", - "docs = vectorstore.similarity_search(question)\n", - "len(docs)" + "llm(\"Tell me about the history of AI\")" ] }, { - "cell_type": "code", - "execution_count": 9, + "cell_type": "markdown", "metadata": {}, - "outputs": [], "source": [ - "# RAG prompt\n", - "from langchain import hub\n", + "## Multi-modal\n", + "\n", + "Ollama has support for multi-modal LLMs, such as [bakllava](https://ollama.ai/library/bakllava) and [llava](https://ollama.ai/library/llava).\n", + "\n", + "```\n", + "ollama pull bakllava\n", + "```\n", "\n", - "QA_CHAIN_PROMPT = hub.pull(\"rlm/rag-prompt-llama\")" + "Be sure to update Ollama so that you have the most recent version to support multi-modal." 
] }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ - "# LLM\n", - "from langchain.callbacks.manager import CallbackManager\n", - "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n", "from langchain.llms import Ollama\n", "\n", - "llm = Ollama(\n", - " model=\"llama2\",\n", - " verbose=True,\n", - " callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": {}, - "outputs": [], - "source": [ - "# QA chain\n", - "from langchain.chains import RetrievalQA\n", - "\n", - "qa_chain = RetrievalQA.from_chain_type(\n", - " llm,\n", - " retriever=vectorstore.as_retriever(),\n", - " chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT},\n", - ")" + "bakllava = Ollama(model=\"bakllava\")" ] }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 2, "metadata": {}, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - " There are several approaches to task decomposition for AI agents, including:\n", - "\n", - "1. Chain of thought (CoT): This involves instructing the model to \"think step by step\" and use more test-time computation to decompose hard tasks into smaller and simpler steps.\n", - "2. Tree of thoughts (ToT): This extends CoT by exploring multiple reasoning possibilities at each step, creating a tree structure. The search process can be BFS or DFS with each state evaluated by a classifier or majority vote.\n", - "3. Using task-specific instructions: For example, \"Write a story outline.\" for writing a novel.\n", - "4. Human inputs: The agent can receive input from a human operator to perform tasks that require creativity and domain expertise.\n", - "\n", - "These approaches allow the agent to break down complex tasks into manageable subgoals, enabling efficient handling of tasks and improving the quality of final results through self-reflection and refinement." - ] + "data": { + "text/html": [ + "" + ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" } ], "source": [ - "question = \"What are the various approaches to Task Decomposition for AI Agents?\"\n", - "result = qa_chain({\"query\": question})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can also get logging for tokens." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from langchain.callbacks.base import BaseCallbackHandler\n", - "from langchain.schema import LLMResult\n", + "import base64\n", + "from io import BytesIO\n", "\n", + "from IPython.display import HTML, display\n", + "from PIL import Image\n", "\n", - "class GenerationStatisticsCallback(BaseCallbackHandler):\n", - " def on_llm_end(self, response: LLMResult, **kwargs) -> None:\n", - " print(response.generations[0][0].generation_info)\n", "\n", + "def convert_to_base64(pil_image):\n", + " \"\"\"\n", + " Convert PIL images to Base64 encoded strings\n", "\n", - "callback_manager = CallbackManager(\n", - " [StreamingStdOutCallbackHandler(), GenerationStatisticsCallback()]\n", - ")\n", + " :param pil_image: PIL image\n", + " :return: Re-sized Base64 string\n", + " \"\"\"\n", "\n", - "llm = Ollama(\n", - " base_url=\"http://localhost:11434\",\n", - " model=\"llama2\",\n", - " verbose=True,\n", - " callback_manager=callback_manager,\n", - ")\n", + " buffered = BytesIO()\n", + " pil_image.save(buffered, format=\"JPEG\") # You can change the format if needed\n", + " img_str = base64.b64encode(buffered.getvalue()).decode(\"utf-8\")\n", + " return img_str\n", "\n", - "qa_chain = RetrievalQA.from_chain_type(\n", - " llm,\n", - " retriever=vectorstore.as_retriever(),\n", - " chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT},\n", - ")\n", "\n", - "question = \"What are the approaches to Task Decomposition?\"\n", - "result = qa_chain({\"query\": question})" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "`eval_count` / (`eval_duration`/10e9) gets `tok / s`" + "def plt_img_base64(img_base64):\n", + " \"\"\"\n", + " Disply base64 encoded string as image\n", + "\n", + " :param img_base64: Base64 string\n", + " \"\"\"\n", + " # Create an HTML img tag with the base64 string as the source\n", + " image_html = f''\n", + " # Display the image by rendering the HTML\n", + " display(HTML(image_html))\n", + "\n", + "\n", + "file_path = \"/Users/rlm/Desktop/Eval_Sets/multi_modal_presentations/DDOG/img_23.jpg\"\n", + "pil_image = Image.open(file_path)\n", + "image_b64 = convert_to_base64(pil_image)\n", + "plt_img_base64(image_b64)" ] }, { "cell_type": "code", - "execution_count": 57, + "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "47.22003469910937" + "'90%'" ] }, - "execution_count": 57, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ - "62 / (1313002000 / 1000 / 1000 / 1000)" + "llm_with_image_context = bakllava.bind(images=[image_b64])\n", + "llm_with_image_context.invoke(\"What is the dollar based gross retention rate:\")" ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Using the Hub for prompt management\n", - " \n", - "Open-source models often benefit from specific prompts. 
\n", - "\n", - "For example, [Mistral 7b](https://mistral.ai/news/announcing-mistral-7b/) was fine-tuned for chat using the prompt format shown [here](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).\n", - "\n", - "Get the model: `ollama pull mistral:7b-instruct`" - ] - }, - { - "cell_type": "code", - "execution_count": 14, - "metadata": {}, - "outputs": [], - "source": [ - "# LLM\n", - "from langchain.callbacks.manager import CallbackManager\n", - "from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n", - "from langchain.llms import Ollama\n", - "\n", - "llm = Ollama(\n", - " model=\"mistral:7b-instruct\",\n", - " verbose=True,\n", - " callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 15, - "metadata": {}, - "outputs": [], - "source": [ - "from langchain import hub\n", - "\n", - "QA_CHAIN_PROMPT = hub.pull(\"rlm/rag-prompt-mistral\")\n", - "\n", - "# QA chain\n", - "from langchain.chains import RetrievalQA\n", - "\n", - "qa_chain = RetrievalQA.from_chain_type(\n", - " llm,\n", - " retriever=vectorstore.as_retriever(),\n", - " chain_type_kwargs={\"prompt\": QA_CHAIN_PROMPT},\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 17, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n", - "There are different approaches to Task Decomposition for AI Agents such as Chain of thought (CoT) and Tree of Thoughts (ToT). CoT breaks down big tasks into multiple manageable tasks and generates multiple thoughts per step, while ToT explores multiple reasoning possibilities at each step. Task decomposition can be done by LLM with simple prompting or using task-specific instructions or human inputs." 
- ] - } - ], - "source": [ - "question = \"What are the various approaches to Task Decomposition for AI Agents?\"\n", - "result = qa_chain({\"query\": question})" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] } ], "metadata": { @@ -442,7 +220,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.10.12" + "version": "3.9.16" } }, "nbformat": 4, diff --git a/libs/community/langchain_community/chat_models/ollama.py b/libs/community/langchain_community/chat_models/ollama.py index 91dda64e45e43..54aa8a8c8cf2d 100644 --- a/libs/community/langchain_community/chat_models/ollama.py +++ b/libs/community/langchain_community/chat_models/ollama.py @@ -1,6 +1,7 @@ import json -from typing import Any, Iterator, List, Optional +from typing import Any, Dict, Iterator, List, Optional, Union +from langchain_core._api import deprecated from langchain_core.callbacks import ( CallbackManagerForLLMRun, ) @@ -15,9 +16,10 @@ ) from langchain_core.outputs import ChatGeneration, ChatGenerationChunk, ChatResult -from langchain_community.llms.ollama import _OllamaCommon +from langchain_community.llms.ollama import OllamaEndpointNotFoundError, _OllamaCommon +@deprecated("0.0.3", alternative="_chat_stream_response_to_chat_generation_chunk") def _stream_response_to_chat_generation_chunk( stream_response: str, ) -> ChatGenerationChunk: @@ -30,6 +32,20 @@ def _stream_response_to_chat_generation_chunk( ) +def _chat_stream_response_to_chat_generation_chunk( + stream_response: str, +) -> ChatGenerationChunk: + """Convert a stream response to a generation chunk.""" + parsed_response = json.loads(stream_response) + generation_info = parsed_response if parsed_response.get("done") is True else None + return ChatGenerationChunk( + message=AIMessageChunk( + content=parsed_response.get("message", {}).get("content", "") + ), + generation_info=generation_info, + ) + + class ChatOllama(BaseChatModel, _OllamaCommon): """Ollama locally runs large language models. 
@@ -52,11 +68,15 @@ def is_lc_serializable(cls) -> bool: """Return whether this model can be serialized by Langchain.""" return False + @deprecated("0.0.3", alternative="_convert_messages_to_ollama_messages") def _format_message_as_text(self, message: BaseMessage) -> str: if isinstance(message, ChatMessage): message_text = f"\n\n{message.role.capitalize()}: {message.content}" elif isinstance(message, HumanMessage): - message_text = f"[INST] {message.content} [/INST]" + if message.content[0].get("type") == "text": + message_text = f"[INST] {message.content[0]['text']} [/INST]" + elif message.content[0].get("type") == "image_url": + message_text = message.content[0]["image_url"]["url"] elif isinstance(message, AIMessage): message_text = f"{message.content}" elif isinstance(message, SystemMessage): @@ -70,6 +90,98 @@ def _format_messages_as_text(self, messages: List[BaseMessage]) -> str: [self._format_message_as_text(message) for message in messages] ) + def _convert_messages_to_ollama_messages( + self, messages: List[BaseMessage] + ) -> List[Dict[str, Union[str, List[str]]]]: + ollama_messages = [] + for message in messages: + role = "" + if isinstance(message, HumanMessage): + role = "user" + elif isinstance(message, AIMessage): + role = "assistant" + elif isinstance(message, SystemMessage): + role = "system" + else: + raise ValueError("Received unsupported message type for Ollama.") + + content = "" + images = [] + if isinstance(message.content, str): + content = message.content + else: + for content_part in message.content: + if content_part.get("type") == "text": + content += f"\n{content_part['text']}" + elif content_part.get("type") == "image_url": + if isinstance(content_part.get("image_url"), str): + image_url_components = content_part["image_url"].split(",") + # Support data:image/jpeg;base64, format + # and base64 strings + if len(image_url_components) > 1: + images.append(image_url_components[1]) + else: + images.append(image_url_components[0]) + else: + raise ValueError( + "Only string image_url " "content parts are supported." + ) + else: + raise ValueError( + "Unsupported message content type. " + "Must either have type 'text' or type 'image_url' " + "with a string 'image_url' field." 
+ ) + + ollama_messages.append( + { + "role": role, + "content": content, + "images": images, + } + ) + + return ollama_messages + + def _create_chat_stream( + self, + messages: List[BaseMessage], + stop: Optional[List[str]] = None, + **kwargs: Any, + ) -> Iterator[str]: + payload = { + "messages": self._convert_messages_to_ollama_messages(messages), + } + yield from self._create_stream( + payload=payload, stop=stop, api_url=f"{self.base_url}/api/chat/", **kwargs + ) + + def _chat_stream_with_aggregation( + self, + messages: List[BaseMessage], + stop: Optional[List[str]] = None, + run_manager: Optional[CallbackManagerForLLMRun] = None, + verbose: bool = False, + **kwargs: Any, + ) -> ChatGenerationChunk: + final_chunk: Optional[ChatGenerationChunk] = None + for stream_resp in self._create_chat_stream(messages, stop, **kwargs): + if stream_resp: + chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp) + if final_chunk is None: + final_chunk = chunk + else: + final_chunk += chunk + if run_manager: + run_manager.on_llm_new_token( + chunk.text, + verbose=verbose, + ) + if final_chunk is None: + raise ValueError("No data received from Ollama stream.") + + return final_chunk + def _generate( self, messages: List[BaseMessage], @@ -94,9 +206,12 @@ def _generate( ]) """ - prompt = self._format_messages_as_text(messages) - final_chunk = super()._stream_with_aggregation( - prompt, stop=stop, run_manager=run_manager, verbose=self.verbose, **kwargs + final_chunk = self._chat_stream_with_aggregation( + messages, + stop=stop, + run_manager=run_manager, + verbose=self.verbose, + **kwargs, ) chat_generation = ChatGeneration( message=AIMessage(content=final_chunk.text), @@ -110,9 +225,30 @@ def _stream( stop: Optional[List[str]] = None, run_manager: Optional[CallbackManagerForLLMRun] = None, **kwargs: Any, + ) -> Iterator[ChatGenerationChunk]: + try: + for stream_resp in self._create_chat_stream(messages, stop, **kwargs): + if stream_resp: + chunk = _stream_response_to_chat_generation_chunk(stream_resp) + yield chunk + if run_manager: + run_manager.on_llm_new_token( + chunk.text, + verbose=self.verbose, + ) + except OllamaEndpointNotFoundError: + yield from self._legacy_stream(messages, stop, **kwargs) + + @deprecated("0.0.3", alternative="_stream") + def _legacy_stream( + self, + messages: List[BaseMessage], + stop: Optional[List[str]] = None, + run_manager: Optional[CallbackManagerForLLMRun] = None, + **kwargs: Any, ) -> Iterator[ChatGenerationChunk]: prompt = self._format_messages_as_text(messages) - for stream_resp in self._create_stream(prompt, stop, **kwargs): + for stream_resp in self._create_generate_stream(prompt, stop, **kwargs): if stream_resp: chunk = _stream_response_to_chat_generation_chunk(stream_resp) yield chunk diff --git a/libs/community/langchain_community/llms/ollama.py b/libs/community/langchain_community/llms/ollama.py index 3551ba446ef36..64ddf82c801cb 100644 --- a/libs/community/langchain_community/llms/ollama.py +++ b/libs/community/langchain_community/llms/ollama.py @@ -20,6 +20,10 @@ def _stream_response_to_generation_chunk( ) +class OllamaEndpointNotFoundError(Exception): + """Raised when the Ollama endpoint is not found.""" + + class _OllamaCommon(BaseLanguageModel): base_url: str = "http://localhost:11434" """Base url the model is hosted under.""" @@ -129,10 +133,26 @@ def _identifying_params(self) -> Mapping[str, Any]: """Get the identifying parameters.""" return {**{"model": self.model, "format": self.format}, **self._default_params} - def _create_stream( + def 
_create_generate_stream( self, prompt: str, stop: Optional[List[str]] = None, + images: Optional[List[str]] = None, + **kwargs: Any, + ) -> Iterator[str]: + payload = {"prompt": prompt, "images": images} + yield from self._create_stream( + payload=payload, + stop=stop, + api_url=f"{self.base_url}/api/generate/", + **kwargs, + ) + + def _create_stream( + self, + api_url: str, + payload: Any, + stop: Optional[List[str]] = None, **kwargs: Any, ) -> Iterator[str]: if self.stop is not None and stop is not None: @@ -156,20 +176,34 @@ def _create_stream( **kwargs, } + if payload.get("messages"): + request_payload = {"messages": payload.get("messages", []), **params} + else: + request_payload = { + "prompt": payload.get("prompt"), + "images": payload.get("images", []), + **params, + } + response = requests.post( - url=f"{self.base_url}/api/generate/", + url=api_url, headers={"Content-Type": "application/json"}, - json={"prompt": prompt, **params}, + json=request_payload, stream=True, timeout=self.timeout, ) response.encoding = "utf-8" if response.status_code != 200: - optional_detail = response.json().get("error") - raise ValueError( - f"Ollama call failed with status code {response.status_code}." - f" Details: {optional_detail}" - ) + if response.status_code == 404: + raise OllamaEndpointNotFoundError( + "Ollama call failed with status code 404." + ) + else: + optional_detail = response.json().get("error") + raise ValueError( + f"Ollama call failed with status code {response.status_code}." + f" Details: {optional_detail}" + ) return response.iter_lines(decode_unicode=True) def _stream_with_aggregation( @@ -181,7 +215,7 @@ def _stream_with_aggregation( **kwargs: Any, ) -> GenerationChunk: final_chunk: Optional[GenerationChunk] = None - for stream_resp in self._create_stream(prompt, stop, **kwargs): + for stream_resp in self._create_generate_stream(prompt, stop, **kwargs): if stream_resp: chunk = _stream_response_to_generation_chunk(stream_resp) if final_chunk is None: @@ -225,6 +259,7 @@ def _generate( self, prompts: List[str], stop: Optional[List[str]] = None, + images: Optional[List[str]] = None, run_manager: Optional[CallbackManagerForLLMRun] = None, **kwargs: Any, ) -> LLMResult: @@ -248,6 +283,7 @@ def _generate( final_chunk = super()._stream_with_aggregation( prompt, stop=stop, + images=images, run_manager=run_manager, verbose=self.verbose, **kwargs, From eca89f87d80f5d88f82a637fc5d9573e1e3a51ed Mon Sep 17 00:00:00 2001 From: Leonid Ganeline Date: Fri, 15 Dec 2023 16:03:59 -0800 Subject: [PATCH 2/5] docs: `google drive` update (#14781) The [Google Drive toolkit](https://python.langchain.com/docs/integrations/toolkits/google_drive) page is a duplicate of the [Google Drive tool](https://python.langchain.com/docs/integrations/tools/google_drive) page. - Removed the `Google Drive toolkit` page (it shouldn't be a toolkit but tool) - Removed the correspondent reference in the Google platform page - Redirected the removed page to the tool page. 
--- docs/docs/integrations/platforms/google.mdx | 17 -- .../integrations/toolkits/google_drive.ipynb | 218 ------------------ docs/vercel.json | 4 + 3 files changed, 4 insertions(+), 235 deletions(-) delete mode 100644 docs/docs/integrations/toolkits/google_drive.ipynb diff --git a/docs/docs/integrations/platforms/google.mdx b/docs/docs/integrations/platforms/google.mdx index 03a8b4a10fb74..350ff563baed8 100644 --- a/docs/docs/integrations/platforms/google.mdx +++ b/docs/docs/integrations/platforms/google.mdx @@ -486,23 +486,6 @@ from langchain.agents.agent_toolkits import GmailToolkit ``` -### Google Drive - -This toolkit uses the `Google Drive API`. - -We need to install several python packages. - -```bash -pip install google-api-python-client google-auth-httplib2 google-auth-oauthlib -``` - -See a [usage example and authorization instructions](/docs/integrations/toolkits/google_drive). - -```python -from langchain_googledrive.utilities.google_drive import GoogleDriveAPIWrapper -from langchain_googledrive.tools.google_drive.tool import GoogleDriveSearchTool -``` - ## Chat Loaders ### GMail diff --git a/docs/docs/integrations/toolkits/google_drive.ipynb b/docs/docs/integrations/toolkits/google_drive.ipynb deleted file mode 100644 index 9832af659c385..0000000000000 --- a/docs/docs/integrations/toolkits/google_drive.ipynb +++ /dev/null @@ -1,218 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Google Drive tool\n", - "\n", - "This notebook walks through connecting a LangChain to the Google Drive API.\n", - "\n", - "## Prerequisites\n", - "\n", - "1. Create a Google Cloud project or use an existing project\n", - "1. Enable the [Google Drive API](https://console.cloud.google.com/flows/enableapi?apiid=drive.googleapis.com)\n", - "1. [Authorize credentials for desktop app](https://developers.google.com/drive/api/quickstart/python#authorize_credentials_for_a_desktop_application)\n", - "1. `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`\n", - "\n", - "## Instructions for retrieving your Google Docs data\n", - "By default, the `GoogleDriveTools` and `GoogleDriveWrapper` expects the `credentials.json` file to be `~/.credentials/credentials.json`, but this is configurable using the `GOOGLE_ACCOUNT_FILE` environment variable. \n", - "The location of `token.json` use the same directory (or use the parameter `token_path`). Note that `token.json` will be created automatically the first time you use the tool.\n", - "\n", - "`GoogleDriveSearchTool` can retrieve a selection of files with some requests. \n", - "\n", - "By default, If you use a `folder_id`, all the files inside this folder can be retrieved to `Document`, if the name match the query.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#!pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "You can obtain your folder and document id from the URL:\n", - "\n", - "* Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `\"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5\"`\n", - "* Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `\"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw\"`\n", - "\n", - "The special value `root` is for your personal home." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "folder_id = \"root\"\n", - "# folder_id='1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5'" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "By default, all files with these mime-type can be converted to `Document`.\n", - "- text/text\n", - "- text/plain\n", - "- text/html\n", - "- text/csv\n", - "- text/markdown\n", - "- image/png\n", - "- image/jpeg\n", - "- application/epub+zip\n", - "- application/pdf\n", - "- application/rtf\n", - "- application/vnd.google-apps.document (GDoc)\n", - "- application/vnd.google-apps.presentation (GSlide)\n", - "- application/vnd.google-apps.spreadsheet (GSheet)\n", - "- application/vnd.google.colaboratory (Notebook colab)\n", - "- application/vnd.openxmlformats-officedocument.presentationml.presentation (PPTX)\n", - "- application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)\n", - "\n", - "It's possible to update or customize this. See the documentation of `GoogleDriveAPIWrapper`.\n", - "\n", - "But, the corresponding packages must installed." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "#!pip install unstructured" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "from langchain_googledrive.tools.google_drive.tool import GoogleDriveSearchTool\n", - "from langchain_googledrive.utilities.google_drive import GoogleDriveAPIWrapper\n", - "\n", - "# By default, search only in the filename.\n", - "tool = GoogleDriveSearchTool(\n", - " api_wrapper=GoogleDriveAPIWrapper(\n", - " folder_id=folder_id,\n", - " num_results=2,\n", - " template=\"gdrive-query-in-folder\", # Search in the body of documents\n", - " )\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "import logging\n", - "\n", - "logging.basicConfig(level=logging.INFO)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "tool.run(\"machine learning\")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "tool.description" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [ - "from langchain.agents import load_tools\n", - "\n", - "tools = load_tools(\n", - " [\"google-drive-search\"],\n", - " folder_id=folder_id,\n", - " template=\"gdrive-query-in-folder\",\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Use within an Agent" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "from langchain.agents import AgentType, initialize_agent\n", - "from langchain.llms import OpenAI\n", - "\n", - "llm = OpenAI(temperature=0)\n", - "agent = initialize_agent(\n", - " tools=tools,\n", - " llm=llm,\n", - " agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "tags": [] - }, - "outputs": [], - "source": [ - "agent.run(\"Search in google drive, who is 'Yann LeCun' ?\")" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": 
"ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.9" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} diff --git a/docs/vercel.json b/docs/vercel.json index ee9f02ad1ad4b..a12ad16193fa9 100644 --- a/docs/vercel.json +++ b/docs/vercel.json @@ -1,5 +1,9 @@ { "redirects": [ + { + "source": "/docs/integrations/toolkits/google_drive", + "destination": "/docs/integrations/tools/google_drive" + }, { "source": "/docs/use_cases/question_answering/analyze_document", "destination": "/cookbook" From dcead816df66dfb36e5baffd6cd6cb6281d32591 Mon Sep 17 00:00:00 2001 From: Dmitry Tyumentsev <56769451+tyumentsev4@users.noreply.github.com> Date: Sat, 16 Dec 2023 03:25:09 +0300 Subject: [PATCH 3/5] community[patch]: Update YandexGPT API (#14773) Update LLMand Chat model to use new api version --------- Co-authored-by: Dmitry Tyumentsev --- docs/docs/integrations/chat/yandex.ipynb | 21 ++-- docs/docs/integrations/llms/yandex.ipynb | 25 ++-- .../langchain_community/chat_models/yandex.py | 114 ++++++++++++++---- .../langchain_community/llms/yandex.py | 94 ++++++++++----- 4 files changed, 185 insertions(+), 69 deletions(-) diff --git a/docs/docs/integrations/chat/yandex.ipynb b/docs/docs/integrations/chat/yandex.ipynb index 0e1ced9b6397a..6d3a14b4eee46 100644 --- a/docs/docs/integrations/chat/yandex.ipynb +++ b/docs/docs/integrations/chat/yandex.ipynb @@ -42,13 +42,20 @@ "Next, you have two authentication options:\n", "- [IAM token](https://cloud.yandex.com/en/docs/iam/operations/iam-token/create-for-sa).\n", " You can specify the token in a constructor parameter `iam_token` or in an environment variable `YC_IAM_TOKEN`.\n", + "\n", "- [API key](https://cloud.yandex.com/en/docs/iam/operations/api-key/create)\n", - " You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`." + " You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`.\n", + "\n", + "In the `model_uri` parameter, specify the model used, see [the documentation](https://cloud.yandex.com/en/docs/yandexgpt/concepts/models#yandexgpt-generation) for more details.\n", + "\n", + "To specify the model you can use `model_uri` parameter, see [the documentation](https://cloud.yandex.com/en/docs/yandexgpt/concepts/models#yandexgpt-generation) for more details.\n", + "\n", + "By default, the latest version of `yandexgpt-lite` is used from the folder specified in the parameter `folder_id` or `YC_FOLDER_ID` environment variable." 
] }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 1, "id": "eba2d63b-f871-4f61-b55f-f6092bdc297a", "metadata": {}, "outputs": [], @@ -59,7 +66,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 2, "id": "75905d9a-dfae-43aa-95b9-a160280e43f7", "metadata": {}, "outputs": [], @@ -69,17 +76,17 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 3, "id": "40844fe7-7fe5-4679-b6c9-1b3238807bdc", "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "AIMessage(content=\"Je t'aime programmer.\")" + "AIMessage(content='Je adore le programmement.')" ] }, - "execution_count": 8, + "execution_count": 3, "metadata": {}, "output_type": "execute_result" } @@ -113,7 +120,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.18" + "version": "3.10.13" } }, "nbformat": 4, diff --git a/docs/docs/integrations/llms/yandex.ipynb b/docs/docs/integrations/llms/yandex.ipynb index de36dd4409228..019cb608a5d4f 100644 --- a/docs/docs/integrations/llms/yandex.ipynb +++ b/docs/docs/integrations/llms/yandex.ipynb @@ -29,13 +29,20 @@ "Next, you have two authentication options:\n", "- [IAM token](https://cloud.yandex.com/en/docs/iam/operations/iam-token/create-for-sa).\n", " You can specify the token in a constructor parameter `iam_token` or in an environment variable `YC_IAM_TOKEN`.\n", + "\n", "- [API key](https://cloud.yandex.com/en/docs/iam/operations/api-key/create)\n", - " You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`." + " You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`.\n", + "\n", + "In the `model_uri` parameter, specify the model used, see [the documentation](https://cloud.yandex.com/en/docs/yandexgpt/concepts/models#yandexgpt-generation) for more details.\n", + "\n", + "To specify the model you can use `model_uri` parameter, see [the documentation](https://cloud.yandex.com/en/docs/yandexgpt/concepts/models#yandexgpt-generation) for more details.\n", + "\n", + "By default, the latest version of `yandexgpt-lite` is used from the folder specified in the parameter `folder_id` or `YC_FOLDER_ID` environment variable." 
] }, { "cell_type": "code", - "execution_count": 246, + "execution_count": 1, "metadata": {}, "outputs": [], "source": [ @@ -46,7 +53,7 @@ }, { "cell_type": "code", - "execution_count": 247, + "execution_count": 2, "metadata": {}, "outputs": [], "source": [ @@ -56,7 +63,7 @@ }, { "cell_type": "code", - "execution_count": 248, + "execution_count": 3, "metadata": {}, "outputs": [], "source": [ @@ -65,7 +72,7 @@ }, { "cell_type": "code", - "execution_count": 249, + "execution_count": 4, "metadata": {}, "outputs": [], "source": [ @@ -74,16 +81,16 @@ }, { "cell_type": "code", - "execution_count": 250, + "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ - "'Moscow'" + "'The capital of Russia is Moscow.'" ] }, - "execution_count": 250, + "execution_count": 5, "metadata": {}, "output_type": "execute_result" } @@ -111,7 +118,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.9.18" + "version": "3.10.13" } }, "nbformat": 4, diff --git a/libs/community/langchain_community/chat_models/yandex.py b/libs/community/langchain_community/chat_models/yandex.py index f94be83a89976..9dc6ede9e11e8 100644 --- a/libs/community/langchain_community/chat_models/yandex.py +++ b/libs/community/langchain_community/chat_models/yandex.py @@ -1,6 +1,6 @@ """Wrapper around YandexGPT chat models.""" import logging -from typing import Any, Dict, List, Optional, Tuple, cast +from typing import Any, Dict, List, Optional, cast from langchain_core.callbacks import ( AsyncCallbackManagerForLLMRun, @@ -25,14 +25,13 @@ def _parse_message(role: str, text: str) -> Dict: return {"role": role, "text": text} -def _parse_chat_history(history: List[BaseMessage]) -> Tuple[List[Dict[str, str]], str]: +def _parse_chat_history(history: List[BaseMessage]) -> List[Dict[str, str]]: """Parse a sequence of messages into history. Returns: - A tuple of a list of parsed messages and an instruction message for the model. + A list of parsed messages. """ chat_history = [] - instruction = "" for message in history: content = cast(str, message.content) if isinstance(message, HumanMessage): @@ -40,8 +39,8 @@ def _parse_chat_history(history: List[BaseMessage]) -> Tuple[List[Dict[str, str] if isinstance(message, AIMessage): chat_history.append(_parse_message("assistant", content)) if isinstance(message, SystemMessage): - instruction = content - return chat_history, instruction + chat_history.append(_parse_message("system", content)) + return chat_history class ChatYandexGPT(_BaseYandexGPT, BaseChatModel): @@ -84,9 +83,14 @@ def _generate( try: import grpc from google.protobuf.wrappers_pb2 import DoubleValue, Int64Value - from yandex.cloud.ai.llm.v1alpha.llm_pb2 import GenerationOptions, Message - from yandex.cloud.ai.llm.v1alpha.llm_service_pb2 import ChatRequest - from yandex.cloud.ai.llm.v1alpha.llm_service_pb2_grpc import ( + from yandex.cloud.ai.foundation_models.v1.foundation_models_pb2 import ( + CompletionOptions, + Message, + ) + from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2 import ( # noqa: E501 + CompletionRequest, + ) + from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2_grpc import ( # noqa: E501 TextGenerationServiceStub, ) except ImportError as e: @@ -97,25 +101,20 @@ def _generate( raise ValueError( "You should provide at least one message to start the chat!" 
) - message_history, instruction = _parse_chat_history(messages) + message_history = _parse_chat_history(messages) channel_credentials = grpc.ssl_channel_credentials() channel = grpc.secure_channel(self.url, channel_credentials) - request = ChatRequest( - model=self.model_name, - generation_options=GenerationOptions( + request = CompletionRequest( + model_uri=self.model_uri, + completion_options=CompletionOptions( temperature=DoubleValue(value=self.temperature), max_tokens=Int64Value(value=self.max_tokens), ), - instruction_text=instruction, messages=[Message(**message) for message in message_history], ) stub = TextGenerationServiceStub(channel) - if self.iam_token: - metadata = (("authorization", f"Bearer {self.iam_token}"),) - else: - metadata = (("authorization", f"Api-Key {self.api_key}"),) - res = stub.Chat(request, metadata=metadata) - text = list(res)[0].message.text + res = stub.Completion(request, metadata=self._grpc_metadata) + text = list(res)[0].alternatives[0].message.text text = text if stop is None else enforce_stop_tokens(text, stop) message = AIMessage(content=text) return ChatResult(generations=[ChatGeneration(message=message)]) @@ -127,6 +126,75 @@ async def _agenerate( run_manager: Optional[AsyncCallbackManagerForLLMRun] = None, **kwargs: Any, ) -> ChatResult: - raise NotImplementedError( - """YandexGPT doesn't support async requests at the moment.""" - ) + """Async method to generate next turn in the conversation. + + Args: + messages: The history of the conversation as a list of messages. + stop: The list of stop words (optional). + run_manager: The CallbackManager for LLM run, it's not used at the moment. + + Returns: + The ChatResult that contains outputs generated by the model. + + Raises: + ValueError: if the last message in the list is not from human. + """ + try: + import asyncio + + import grpc + from google.protobuf.wrappers_pb2 import DoubleValue, Int64Value + from yandex.cloud.ai.foundation_models.v1.foundation_models_pb2 import ( + CompletionOptions, + Message, + ) + from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2 import ( # noqa: E501 + CompletionRequest, + CompletionResponse, + ) + from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2_grpc import ( # noqa: E501 + TextGenerationAsyncServiceStub, + ) + from yandex.cloud.operation.operation_service_pb2 import GetOperationRequest + from yandex.cloud.operation.operation_service_pb2_grpc import ( + OperationServiceStub, + ) + except ImportError as e: + raise ImportError( + "Please install YandexCloud SDK" " with `pip install yandexcloud`." + ) from e + if not messages: + raise ValueError( + "You should provide at least one message to start the chat!" 
+ ) + message_history = _parse_chat_history(messages) + operation_api_url = "operation.api.cloud.yandex.net:443" + channel_credentials = grpc.ssl_channel_credentials() + async with grpc.aio.secure_channel(self.url, channel_credentials) as channel: + request = CompletionRequest( + model_uri=self.model_uri, + completion_options=CompletionOptions( + temperature=DoubleValue(value=self.temperature), + max_tokens=Int64Value(value=self.max_tokens), + ), + messages=[Message(**message) for message in message_history], + ) + stub = TextGenerationAsyncServiceStub(channel) + operation = await stub.Completion(request, metadata=self._grpc_metadata) + async with grpc.aio.secure_channel( + operation_api_url, channel_credentials + ) as operation_channel: + operation_stub = OperationServiceStub(operation_channel) + while not operation.done: + await asyncio.sleep(1) + operation_request = GetOperationRequest(operation_id=operation.id) + operation = await operation_stub.Get( + operation_request, metadata=self._grpc_metadata + ) + + instruct_response = CompletionResponse() + operation.response.Unpack(instruct_response) + text = instruct_response.alternatives[0].message.text + if stop is not None: + text = enforce_stop_tokens(text, stop) + return text diff --git a/libs/community/langchain_community/llms/yandex.py b/libs/community/langchain_community/llms/yandex.py index d82daeba55cd6..3f6d59c770f1d 100644 --- a/libs/community/langchain_community/llms/yandex.py +++ b/libs/community/langchain_community/llms/yandex.py @@ -14,13 +14,19 @@ class _BaseYandexGPT(Serializable): iam_token: str = "" - """Yandex Cloud IAM token for service account + """Yandex Cloud IAM token for service or user account with the `ai.languageModels.user` role""" api_key: str = "" """Yandex Cloud Api Key for service account with the `ai.languageModels.user` role""" - model_name: str = "general" + folder_id: str = "" + """Yandex Cloud folder ID""" + model_uri: str = "" + """Model uri to use.""" + model_name: str = "yandexgpt-lite" """Model name to use.""" + model_version: str = "latest" + """Model version to use.""" temperature: float = 0.6 """What sampling temperature to use. Should be a double number between 0 (inclusive) and 1 (inclusive).""" @@ -45,8 +51,27 @@ def validate_environment(cls, values: Dict) -> Dict: values["iam_token"] = iam_token api_key = get_from_dict_or_env(values, "api_key", "YC_API_KEY", "") values["api_key"] = api_key + folder_id = get_from_dict_or_env(values, "folder_id", "YC_FOLDER_ID", "") + values["folder_id"] = folder_id if api_key == "" and iam_token == "": raise ValueError("Either 'YC_API_KEY' or 'YC_IAM_TOKEN' must be provided.") + + if values["iam_token"]: + values["_grpc_metadata"] = [ + ("authorization", f"Bearer {values['iam_token']}") + ] + if values["folder_id"]: + values["_grpc_metadata"].append(("x-folder-id", values["folder_id"])) + else: + values["_grpc_metadata"] = ( + ("authorization", f"Api-Key {values['api_key']}"), + ) + if values["model_uri"] == "" and values["folder_id"] == "": + raise ValueError("Either 'model_uri' or 'folder_id' must be provided.") + if not values["model_uri"]: + values[ + "model_uri" + ] = f"gpt://{values['folder_id']}/{values['model_name']}/{values['model_version']}" return values @@ -62,18 +87,23 @@ class YandexGPT(_BaseYandexGPT, LLM): - You can specify the key in a constructor parameter `api_key` or in an environment variable `YC_API_KEY`. + To use the default model specify the folder ID in a parameter `folder_id` + or in an environment variable `YC_FOLDER_ID`. 
+ + Or specify the model URI in a constructor parameter `model_uri` + Example: .. code-block:: python from langchain_community.llms import YandexGPT - yandex_gpt = YandexGPT(iam_token="t1.9eu...") + yandex_gpt = YandexGPT(iam_token="t1.9eu...", folder_id="b1g...") """ @property def _identifying_params(self) -> Mapping[str, Any]: """Get the identifying parameters.""" return { - "model_name": self.model_name, + "model_uri": self.model_uri, "temperature": self.temperature, "max_tokens": self.max_tokens, "stop": self.stop, @@ -103,9 +133,14 @@ def _call( try: import grpc from google.protobuf.wrappers_pb2 import DoubleValue, Int64Value - from yandex.cloud.ai.llm.v1alpha.llm_pb2 import GenerationOptions - from yandex.cloud.ai.llm.v1alpha.llm_service_pb2 import InstructRequest - from yandex.cloud.ai.llm.v1alpha.llm_service_pb2_grpc import ( + from yandex.cloud.ai.foundation_models.v1.foundation_models_pb2 import ( + CompletionOptions, + Message, + ) + from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2 import ( # noqa: E501 + CompletionRequest, + ) + from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2_grpc import ( # noqa: E501 TextGenerationServiceStub, ) except ImportError as e: @@ -114,21 +149,21 @@ def _call( ) from e channel_credentials = grpc.ssl_channel_credentials() channel = grpc.secure_channel(self.url, channel_credentials) - request = InstructRequest( - model=self.model_name, - request_text=prompt, - generation_options=GenerationOptions( + request = CompletionRequest( + model_uri=self.model_uri, + completion_options=CompletionOptions( temperature=DoubleValue(value=self.temperature), max_tokens=Int64Value(value=self.max_tokens), ), + messages=[Message(role="user", text=prompt)], ) stub = TextGenerationServiceStub(channel) if self.iam_token: metadata = (("authorization", f"Bearer {self.iam_token}"),) else: metadata = (("authorization", f"Api-Key {self.api_key}"),) - res = stub.Instruct(request, metadata=metadata) - text = list(res)[0].alternatives[0].text + res = stub.Completion(request, metadata=metadata) + text = list(res)[0].alternatives[0].message.text if stop is not None: text = enforce_stop_tokens(text, stop) return text @@ -154,12 +189,15 @@ async def _acall( import grpc from google.protobuf.wrappers_pb2 import DoubleValue, Int64Value - from yandex.cloud.ai.llm.v1alpha.llm_pb2 import GenerationOptions - from yandex.cloud.ai.llm.v1alpha.llm_service_pb2 import ( - InstructRequest, - InstructResponse, + from yandex.cloud.ai.foundation_models.v1.foundation_models_pb2 import ( + CompletionOptions, + Message, + ) + from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2 import ( # noqa: E501 + CompletionRequest, + CompletionResponse, ) - from yandex.cloud.ai.llm.v1alpha.llm_service_pb2_grpc import ( + from yandex.cloud.ai.foundation_models.v1.foundation_models_service_pb2_grpc import ( # noqa: E501 TextGenerationAsyncServiceStub, ) from yandex.cloud.operation.operation_service_pb2 import GetOperationRequest @@ -173,20 +211,16 @@ async def _acall( operation_api_url = "operation.api.cloud.yandex.net:443" channel_credentials = grpc.ssl_channel_credentials() async with grpc.aio.secure_channel(self.url, channel_credentials) as channel: - request = InstructRequest( - model=self.model_name, - request_text=prompt, - generation_options=GenerationOptions( + request = CompletionRequest( + model_uri=self.model_uri, + completion_options=CompletionOptions( temperature=DoubleValue(value=self.temperature), 
max_tokens=Int64Value(value=self.max_tokens), ), + messages=[Message(role="user", text=prompt)], ) stub = TextGenerationAsyncServiceStub(channel) - if self.iam_token: - metadata = (("authorization", f"Bearer {self.iam_token}"),) - else: - metadata = (("authorization", f"Api-Key {self.api_key}"),) - operation = await stub.Instruct(request, metadata=metadata) + operation = await stub.Completion(request, metadata=self._grpc_metadata) async with grpc.aio.secure_channel( operation_api_url, channel_credentials ) as operation_channel: @@ -195,12 +229,12 @@ async def _acall( await asyncio.sleep(1) operation_request = GetOperationRequest(operation_id=operation.id) operation = await operation_stub.Get( - operation_request, metadata=metadata + operation_request, metadata=self._grpc_metadata ) - instruct_response = InstructResponse() + instruct_response = CompletionResponse() operation.response.Unpack(instruct_response) - text = instruct_response.alternatives[0].text + text = instruct_response.alternatives[0].message.text if stop is not None: text = enforce_stop_tokens(text, stop) return text From 34e6f3ff72067af3265341bcea7983c106f15a74 Mon Sep 17 00:00:00 2001 From: Noah Stapp Date: Fri, 15 Dec 2023 16:49:21 -0800 Subject: [PATCH 4/5] community[patch]: Implement similarity_score_threshold for MongoDB Vector Store (#14740) Adds the option for `similarity_score_threshold` when using `MongoDBAtlasVectorSearch` as a vector store retriever. Example use: ``` vector_search = MongoDBAtlasVectorSearch.from_documents(...) qa_retriever = vector_search.as_retriever( search_type="similarity_score_threshold", search_kwargs={ "score_threshold": 0.5, } ) qa = RetrievalQA.from_chain_type( llm=OpenAI(), chain_type="stuff", retriever=qa_retriever, ) docs = qa({"query": "..."}) ``` I've tested this feature locally, using a MongoDB Atlas Cluster with a vector search index. --- .../vectorstores/mongodb_atlas.py | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/libs/community/langchain_community/vectorstores/mongodb_atlas.py b/libs/community/langchain_community/vectorstores/mongodb_atlas.py index 61c901940fa95..87fa45e711cdc 100644 --- a/libs/community/langchain_community/vectorstores/mongodb_atlas.py +++ b/libs/community/langchain_community/vectorstores/mongodb_atlas.py @@ -4,6 +4,7 @@ from typing import ( TYPE_CHECKING, Any, + Callable, Dict, Generator, Iterable, @@ -60,6 +61,7 @@ def __init__( index_name: str = "default", text_key: str = "text", embedding_key: str = "embedding", + relevance_score_fn: str = "cosine", ): """ Args: @@ -70,17 +72,32 @@ def __init__( embedding_key: MongoDB field that will contain the embedding for each document. index_name: Name of the Atlas Search index. + relevance_score_fn: The similarity score used for the index. + Currently supported: Euclidean, cosine, and dot product. 
""" self._collection = collection self._embedding = embedding self._index_name = index_name self._text_key = text_key self._embedding_key = embedding_key + self._relevance_score_fn = relevance_score_fn @property def embeddings(self) -> Embeddings: return self._embedding + def _select_relevance_score_fn(self) -> Callable[[float], float]: + if self._relevance_score_fn == "euclidean": + return self._euclidean_relevance_score_fn + elif self._relevance_score_fn == "dotProduct": + return self._max_inner_product_relevance_score_fn + elif self._relevance_score_fn == "cosine": + return self._cosine_relevance_score_fn + else: + raise NotImplementedError( + f"No relevance score function for ${self._relevance_score_fn}" + ) + @classmethod def from_connection_string( cls, @@ -198,7 +215,6 @@ def _similarity_search_with_score( def similarity_search_with_score( self, query: str, - *, k: int = 4, pre_filter: Optional[Dict] = None, post_filter_pipeline: Optional[List[Dict]] = None, From 133971053a0b84a034fb0bc78cd1150cdb7f5dbf Mon Sep 17 00:00:00 2001 From: Erick Friis Date: Fri, 15 Dec 2023 17:46:12 -0800 Subject: [PATCH 5/5] docs[patch]: fix zoom (#14786) not sure why quarto is removing divs --- docs/docs/expression_language/why.ipynb | 81 ++++++------------------- docs/src/theme/Columns.js | 2 +- 2 files changed, 20 insertions(+), 63 deletions(-) diff --git a/docs/docs/expression_language/why.ipynb b/docs/docs/expression_language/why.ipynb index 66c78087aded1..2cdc5d63e0c1b 100644 --- a/docs/docs/expression_language/why.ipynb +++ b/docs/docs/expression_language/why.ipynb @@ -66,9 +66,7 @@ "\n", "\n", "\n", - "#### Without LCEL\n", - "\n", - "
" + "#### Without LCEL\n" ] }, { @@ -78,7 +76,6 @@ "metadata": {}, "outputs": [], "source": [ - "\n", "from typing import List\n", "\n", "import openai\n", @@ -107,14 +104,12 @@ "id": "cdc3b527-c09e-4c77-9711-c3cc4506cd95", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", "\n", "#### LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -147,7 +142,6 @@ "id": "3c0b0513-77b8-4371-a20e-3e487cec7e7f", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", @@ -158,8 +152,7 @@ "\n", "\n", "#### Without LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -197,14 +190,12 @@ "id": "f8e36b0e-c7dc-4130-a51b-189d4b756c7f", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", "\n", "#### LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -223,7 +214,6 @@ "id": "b9b41e78-ddeb-44d0-a58b-a0ea0c99a761", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", @@ -235,8 +225,7 @@ "\n", "\n", "#### Without LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -261,14 +250,12 @@ "id": "9b3e9d34-6775-43c1-93d8-684b58e341ab", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", "\n", "#### LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -286,7 +273,6 @@ "id": "cc5ba36f-eec1-4fc1-8cfe-fa242a7f7809", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", @@ -298,8 +284,7 @@ "\n", "\n", "#### Without LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -333,15 +318,12 @@ "await ainvoke_chain(\"ice cream\")\n", "```\n", "\n", - "
\n", "
\n", "\n", "\n", "\n", "#### LCEL\n", "\n", - "
\n", - "\n", "```python\n", "chain.ainvoke(\"ice cream\")\n", "```" @@ -352,7 +334,6 @@ "id": "f6888245-1ebe-4768-a53b-e1fef6a8b379", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", @@ -364,8 +345,7 @@ "\n", "\n", "#### Without LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -394,14 +374,12 @@ "id": "45342cd6-58c2-4543-9392-773e05ef06e7", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", "\n", "#### LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -429,7 +407,6 @@ "id": "ca115eaf-59ef-45c1-aac1-e8b0ce7db250", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", @@ -441,8 +418,7 @@ "\n", "\n", "#### Without LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -477,14 +453,12 @@ "id": "52a0c9f8-e316-42e1-af85-cabeba4b7059", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", "\n", "#### LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -512,7 +486,6 @@ "id": "d7a91eee-d017-420d-b215-f663dcbf8ed2", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", @@ -524,8 +497,7 @@ "\n", "\n", "#### Without LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -603,14 +575,12 @@ "id": "d1530c5c-6635-4599-9483-6df357ca2d64", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", "\n", "#### With LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -665,7 +635,6 @@ "id": "370dd4d7-b825-40c4-ae3c-2693cba2f22a", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", @@ -679,8 +648,7 @@ "#### Without LCEL\n", "\n", "We'll `print` intermediate steps for illustrative purposes\n", - "\n", - "
" + "\n" ] }, { @@ -706,15 +674,13 @@ "id": "16bd20fd-43cd-4aaf-866f-a53d1f20312d", "metadata": {}, "source": [ - "
\n", "\n", "\n", "\n", "\n", "#### LCEL\n", "Every component has built-in integrations with LangSmith. If we set the following two environment variables, all chain traces are logged to LangSmith.\n", - "\n", - "
" + "\n" ] }, { @@ -745,7 +711,6 @@ "id": "e25ce3c5-27a7-4954-9f0e-b94313597135", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", @@ -759,8 +724,7 @@ "\n", "#### Without LCEL\n", "\n", - "\n", - "
" + "\n" ] }, { @@ -800,14 +764,12 @@ "id": "f7ef59b5-2ce3-479e-a7ac-79e1e2f30e9c", "metadata": {}, "source": [ - "
\n", "\n", "\n", "\n", "\n", "#### LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -829,7 +791,6 @@ "id": "3af52d36-37c6-4d89-b515-95d7270bb96a", "metadata": {}, "source": [ - "
\n", "
\n", "" ] @@ -847,8 +808,7 @@ "\n", "\n", "#### Without LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -1025,14 +985,12 @@ "id": "9fb3d71d-8c69-4dc4-81b7-95cd46b271c2", "metadata": {}, "source": [ - "
\n", "
\n", "\n", "\n", "\n", "#### LCEL\n", - "\n", - "
" + "\n" ] }, { @@ -1083,7 +1041,6 @@ "id": "e3637d39", "metadata": {}, "source": [ - "
\n", "
\n", "" ] diff --git a/docs/src/theme/Columns.js b/docs/src/theme/Columns.js index 5bc5c1caf88ab..a113973dec0fd 100644 --- a/docs/src/theme/Columns.js +++ b/docs/src/theme/Columns.js @@ -10,7 +10,7 @@ export function ColumnContainer({children}) { export function Column({children}) { return ( -
+
{children}
)