diff --git a/bootcamp/tutorials/integration/milvus_rag_with_dynamiq.ipynb b/bootcamp/tutorials/integration/milvus_rag_with_dynamiq.ipynb
new file mode 100644
index 000000000..5fcfb0b6a
--- /dev/null
+++ b/bootcamp/tutorials/integration/milvus_rag_with_dynamiq.ipynb
@@ -0,0 +1,735 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "a2e712bd7d3cfee8",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ " \n",
+ " "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a7efdd76efcc0e4e",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "# Getting Started with Dynamiq and Milvus\n",
+ "\n",
+ "\n",
+ "\n",
+ "[Dynamiq](https://dynamiq.ai/) is a powerful Gen AI framework that streamlines the development of AI-powered applications. With robust support for retrieval-augmented generation (RAG) and large language model (LLM) agents, Dynamiq empowers developers to create intelligent, dynamic systems with ease and efficiency.\n",
+ "\n",
+ "In this tutorial, we’ll explore how to seamlessly use Dynamiq with [Milvus](https://milvus.io/), a high-performance, open-source vector database purpose-built for RAG workflows. Milvus excels at efficient storage, indexing, and retrieval of vector embeddings, making it an indispensable component for AI systems that demand fast and precise contextual data access.\n",
+ "\n",
+ "This step-by-step guide will cover two core RAG workflows:\n",
+ "\n",
+ "- **Document Indexing Flow**: Learn how to process input files (e.g., PDFs), transform their content into vector embeddings, and store them in Milvus. Leveraging Milvus’s high-performance indexing capabilities ensures your data is ready for rapid retrieval.\n",
+ "\n",
+ "- **Document Retrieval Flow**: Discover how to query Milvus for relevant document embeddings and use them to generate insightful, context-aware responses with Dynamiq’s LLM agents, creating a seamless AI-powered user experience.\n",
+ "\n",
+ "By the end of this tutorial, you’ll gain a solid understanding of how Milvus and Dynamiq work together to build scalable, context-aware AI systems tailored to your needs."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "53d16d1dd5f753bf",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "## Preparation\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b03bd9af09661ed7",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Download required libraries\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "82bdf922095fd55b",
+ "metadata": {
+ "collapsed": false
+ },
+ "outputs": [],
+ "source": [
+ "! pip install dynamiq pymilvus"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "59c95741251895fe",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "> If you are using Google Colab, to enable dependencies just installed, you may need to **restart the runtime** (click on the \"Runtime\" menu at the top of the screen, and select \"Restart session\" from the dropdown menu)."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "38e2801395f2273d",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Configure the LLM agent\n",
+ "\n",
+ "We will use OpenAI as the LLM in this example. You should prepare the [api key](https://platform.openai.com/docs/quickstart) `OPENAI_API_KEY` as an environment variable.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "2daf83743889cb73",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:13:53.013034Z",
+ "start_time": "2024-11-20T03:13:53.009842Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "\n",
+ "os.environ[\"OPENAI_API_KEY\"] = \"sk-***********\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "dc932e415c8f00a7",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "## RAG - Document Indexing Flow\n",
+ "\n",
+ "This tutorial demonstrates a Retrieval-Augmented Generation (RAG) workflow for indexing documents with Milvus as the vector database. The workflow takes input PDF files, processes them into smaller chunks, generates vector embeddings using OpenAI's embedding model, and stores the embeddings in a Milvus collection for efficient retrieval.\n",
+ "\n",
+ "By the end of this workflow, you will have a scalable and efficient document indexing system that supports future RAG tasks like semantic search and question answering."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "34fbf9e90c227bcb",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Import Required Libraries and Initialize Workflow"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "initial_id",
+ "metadata": {
+ "collapsed": true,
+ "jupyter": {
+ "outputs_hidden": true
+ },
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:13:56.478488Z",
+ "start_time": "2024-11-20T03:13:54.716225Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# Importing necessary libraries for the workflow\n",
+ "from io import BytesIO\n",
+ "from dynamiq import Workflow\n",
+ "from dynamiq.nodes import InputTransformer\n",
+ "from dynamiq.connections import (\n",
+ " OpenAI as OpenAIConnection,\n",
+ " Milvus as MilvusConnection,\n",
+ " MilvusDeploymentType,\n",
+ ")\n",
+ "from dynamiq.nodes.converters import PyPDFConverter\n",
+ "from dynamiq.nodes.splitters.document import DocumentSplitter\n",
+ "from dynamiq.nodes.embedders import OpenAIDocumentEmbedder\n",
+ "from dynamiq.nodes.writers import MilvusDocumentWriter\n",
+ "\n",
+ "# Initialize the workflow\n",
+ "rag_wf = Workflow()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "808a4ba11a5c65e7",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define PDF Converter Node"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "c7406f09f125695c",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:13:57.175380Z",
+ "start_time": "2024-11-20T03:13:57.172991Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "converter = PyPDFConverter(document_creation_mode=\"one-doc-per-page\")\n",
+ "converter_added = rag_wf.flow.add_nodes(\n",
+ " converter\n",
+ ") # Add node to the DAG (Directed Acyclic Graph)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "13b5959b723534b8",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define Document Splitter Node"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "id": "13085207a0f67a9b",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:13:58.155849Z",
+ "start_time": "2024-11-20T03:13:58.152563Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "document_splitter = DocumentSplitter(\n",
+ " split_by=\"sentence\", # Splits documents into sentences\n",
+ " split_length=10,\n",
+ " split_overlap=1,\n",
+ " input_transformer=InputTransformer(\n",
+ " selector={\n",
+ " \"documents\": f\"${[converter.id]}.output.documents\",\n",
+ " },\n",
+ " ),\n",
+ ").depends_on(\n",
+ " converter\n",
+ ") # Set dependency on the PDF converter\n",
+ "splitter_added = rag_wf.flow.add_nodes(document_splitter) # Add to the DAG"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "cde92302e3e22c77",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define Embedding Node"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "id": "441d5a0746540497",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:02.040469Z",
+ "start_time": "2024-11-20T03:14:02.012285Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "embedder = OpenAIDocumentEmbedder(\n",
+ " connection=OpenAIConnection(api_key=os.environ[\"OPENAI_API_KEY\"]),\n",
+ " input_transformer=InputTransformer(\n",
+ " selector={\n",
+ " \"documents\": f\"${[document_splitter.id]}.output.documents\",\n",
+ " },\n",
+ " ),\n",
+ ").depends_on(\n",
+ " document_splitter\n",
+ ") # Set dependency on the splitter\n",
+ "document_embedder_added = rag_wf.flow.add_nodes(embedder) # Add to the DAG"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3c2d4ad79cfa7254",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define Milvus Vector Store Node"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "49d19a8f959e81f1",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:05.588159Z",
+ "start_time": "2024-11-20T03:14:03.387110Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "2024-11-19 22:14:03 - WARNING - Environment variable 'MILVUS_API_TOKEN' not found\n",
+ "2024-11-19 22:14:03 - INFO - Pass in the local path ./milvus.db, and run it using milvus-lite\n",
+ "2024-11-19 22:14:04 - DEBUG - Created new connection using: 0bef2849fdb1458a85df8bb9dd27f51d\n",
+ "2024-11-19 22:14:04 - INFO - Collection my_milvus_collection does not exist. Creating a new collection.\n",
+ "2024-11-19 22:14:04 - DEBUG - Successfully created collection: my_milvus_collection\n",
+ "2024-11-19 22:14:05 - DEBUG - Successfully created an index on collection: my_milvus_collection\n",
+ "2024-11-19 22:14:05 - DEBUG - Successfully created an index on collection: my_milvus_collection\n"
+ ]
+ }
+ ],
+ "source": [
+ "vector_store = (\n",
+ " MilvusDocumentWriter(\n",
+ " connection=MilvusConnection(\n",
+ " deployment_type=MilvusDeploymentType.FILE, uri=\"./milvus.db\"\n",
+ " ),\n",
+ " index_name=\"my_milvus_collection\",\n",
+ " dimension=1536,\n",
+ " create_if_not_exist=True,\n",
+ " metric_type=\"COSINE\",\n",
+ " )\n",
+ " .inputs(documents=embedder.outputs.documents) # Connect to embedder output\n",
+ " .depends_on(embedder) # Set dependency on the embedder\n",
+ ")\n",
+ "milvus_writer_added = rag_wf.flow.add_nodes(vector_store) # Add to the DAG"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2e00a00b250c7f02",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "> #### Milvus Deployment Types\n",
+ ">\n",
+ "> Milvus offers two deployment types, catering to different use cases:\n",
+ ">\n",
+ ">\n",
+ "> ##### 1. **MilvusDeploymentType.FILE**\n",
+ ">\n",
+ "> - Ideal for **local prototyping** or **small-scale data** storage.\n",
+ "> - Set the `uri` to a local file path (e.g., `./milvus.db`) to leverage [Milvus Lite](https://milvus.io/docs/milvus_lite.md), which automatically stores all data in the specified file.\n",
+ "> - This is a convenient option for **quick setup** and **experimentation**.\n",
+ ">\n",
+ ">\n",
+ "> ##### 2. **MilvusDeploymentType.HOST**\n",
+ ">\n",
+ "> - Designed for **large-scale data** scenarios, such as managing over a million vectors.\n",
+ ">\n",
+ "> ##### **Self-Hosted Server**\n",
+ "> - Deploy a high-performance Milvus server using [Docker or Kubernetes](https://milvus.io/docs/quickstart.md).\n",
+ "> - Configure the server’s address and port as the `uri` (e.g., `http://localhost:19530`).\n",
+ "> - If authentication is enabled:\n",
+ "> - Provide `:` as the `token`.\n",
+ "> - If authentication is disabled:\n",
+ "> - Leave the `token` unset.\n",
+ ">\n",
+ "> ##### **Zilliz Cloud (Managed Service)**\n",
+ "> - For a fully managed, cloud-based Milvus experience, use [Zilliz Cloud](https://zilliz.com/cloud).\n",
+ "> - Set the `uri` and `token` according to the [Public Endpoint and API key](https://docs.zilliz.com/docs/on-zilliz-cloud-console#cluster-details) provided in the Zilliz Cloud console."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ad82067568805962",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define Input Data and Run the Workflow"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "99e53f536d7b5274",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:10.638574Z",
+ "start_time": "2024-11-20T03:14:09.373090Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/var/folders/09/d0hx80nj35sb5hxb5cpc1q180000gn/T/ipykernel_31319/3145804345.py:4: ResourceWarning: unclosed file <_io.BufferedReader name='./pdf_files/WhatisMilvus.pdf'>\n",
+ " BytesIO(open(path, \"rb\").read()) for path in file_paths\n",
+ "ResourceWarning: Enable tracemalloc to get the object allocation traceback\n",
+ "2024-11-19 22:14:09 - INFO - Workflow 87878444-6a3d-43f3-ae32-0127564a959f: execution started.\n",
+ "2024-11-19 22:14:09 - INFO - Flow b30b48ec-d5d2-4e4c-8e25-d6976c8a9c17: execution started.\n",
+ "2024-11-19 22:14:09 - INFO - Node PyPDF File Converter - 6eb42b1f-7637-407b-a3ac-4167bcf3b5c4: execution started.\n",
+ "2024-11-19 22:14:09 - INFO - Node PyPDF File Converter - 6eb42b1f-7637-407b-a3ac-4167bcf3b5c4: execution succeeded in 58ms.\n",
+ "2024-11-19 22:14:09 - INFO - Node DocumentSplitter - 5baed580-6de0-4dcd-bace-d7d947ab6c7f: execution started.\n",
+ "/Users/jinhonglin/anaconda3/envs/myenv/lib/python3.11/site-packages/websockets/legacy/__init__.py:6: DeprecationWarning: websockets.legacy is deprecated; see https://websockets.readthedocs.io/en/stable/howto/upgrade.html for upgrade instructions\n",
+ " warnings.warn( # deprecated in 14.0 - 2024-11-09\n",
+ "/Users/jinhonglin/anaconda3/envs/myenv/lib/python3.11/site-packages/pydantic/fields.py:804: PydanticDeprecatedSince20: Using extra keyword arguments on `Field` is deprecated and will be removed. Use `json_schema_extra` instead. (Extra keys: 'is_accessible_to_agent'). Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.7/migration/\n",
+ " warn(\n",
+ "2024-11-19 22:14:09 - INFO - Node DocumentSplitter - 5baed580-6de0-4dcd-bace-d7d947ab6c7f: execution succeeded in 104ms.\n",
+ "2024-11-19 22:14:09 - INFO - Node OpenAIDocumentEmbedder - 91928f67-a00f-48f6-a864-f6e21672ec7e: execution started.\n",
+ "2024-11-19 22:14:09 - INFO - Node OpenAIDocumentEmbedder - d30a4cdc-0fab-4aff-b2e5-6161a62cb6fd: execution started.\n",
+ "2024-11-19 22:14:10 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
+ "2024-11-19 22:14:10 - INFO - Node OpenAIDocumentEmbedder - d30a4cdc-0fab-4aff-b2e5-6161a62cb6fd: execution succeeded in 724ms.\n",
+ "2024-11-19 22:14:10 - INFO - Node MilvusDocumentWriter - dddab4cc-1dae-4e7e-9101-1ec353f530da: execution started.\n",
+ "2024-11-19 22:14:10 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
+ "2024-11-19 22:14:10 - INFO - Node MilvusDocumentWriter - dddab4cc-1dae-4e7e-9101-1ec353f530da: execution succeeded in 66ms.\n",
+ "2024-11-19 22:14:10 - INFO - Node OpenAIDocumentEmbedder - 91928f67-a00f-48f6-a864-f6e21672ec7e: execution succeeded in 961ms.\n",
+ "2024-11-19 22:14:10 - INFO - Flow b30b48ec-d5d2-4e4c-8e25-d6976c8a9c17: execution succeeded in 1.3s.\n",
+ "2024-11-19 22:14:10 - INFO - Workflow 87878444-6a3d-43f3-ae32-0127564a959f: execution succeeded in 1.3s.\n"
+ ]
+ }
+ ],
+ "source": [
+ "file_paths = [\"./pdf_files/WhatisMilvus.pdf\"]\n",
+ "input_data = {\n",
+ " \"files\": [BytesIO(open(path, \"rb\").read()) for path in file_paths],\n",
+ " \"metadata\": [{\"filename\": path} for path in file_paths],\n",
+ "}\n",
+ "\n",
+ "# Run the workflow with the prepared input data\n",
+ "inserted_data = rag_wf.run(input_data=input_data)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b3fa6c0cdf9906ab",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "Through this workflow, we have successfully implemented a document indexing pipeline using Milvus as the vector database and OpenAI's embedding model for semantic representation. This setup enables fast and accurate vector-based retrieval, forming the foundation for RAG workflows like semantic search, document retrieval, and contextual AI-driven interactions.\n",
+ "\n",
+ "With Milvus's scalable storage capabilities and Dynamiq's orchestration, this solution is ready for both prototyping and large-scale production deployments. You can now extend this pipeline to include additional tasks like retrieval-based question answering or AI-driven content generation."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "945e58305b89d696",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "## RAG Document Retrieval Flow\n",
+ "\n",
+ "In this tutorial, we implement a Retrieval-Augmented Generation (RAG) document retrieval workflow. This workflow takes a user query, generates a vector embedding for it, retrieves the most relevant documents from a Milvus vector database, and uses a large language model (LLM) to generate a detailed and context-aware answer based on the retrieved documents.\n",
+ "\n",
+ "By following this workflow, you will create an end-to-end solution for semantic search and question answering, combining the power of vector-based document retrieval with the capabilities of OpenAI’s advanced LLMs. This approach enables efficient and intelligent responses to user queries by leveraging the stored knowledge in your document database."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1083f8e11a0b6ce7",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Import Required Libraries and Initialize Workflow"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "id": "baa49fec5d55f96d",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:17.397269Z",
+ "start_time": "2024-11-20T03:14:17.283322Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "from dynamiq import Workflow\n",
+ "from dynamiq.connections import (\n",
+ " OpenAI as OpenAIConnection,\n",
+ " Milvus as MilvusConnection,\n",
+ " MilvusDeploymentType,\n",
+ ")\n",
+ "from dynamiq.nodes.embedders import OpenAITextEmbedder\n",
+ "from dynamiq.nodes.retrievers import MilvusDocumentRetriever\n",
+ "from dynamiq.nodes.llms import OpenAI\n",
+ "from dynamiq.prompts import Message, Prompt\n",
+ "\n",
+ "# Initialize the workflow\n",
+ "retrieval_wf = Workflow()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f1cd8035d05e713f",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define OpenAI Connection and Text Embedder"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "id": "ec1ca65ca47b05ed",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:18.567693Z",
+ "start_time": "2024-11-20T03:14:18.542199Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# Establish OpenAI connection\n",
+ "openai_connection = OpenAIConnection(api_key=os.environ[\"OPENAI_API_KEY\"])\n",
+ "\n",
+ "# Define the text embedder node\n",
+ "embedder = OpenAITextEmbedder(\n",
+ " connection=openai_connection,\n",
+ " model=\"text-embedding-3-small\",\n",
+ ")\n",
+ "\n",
+ "# Add the embedder node to the workflow\n",
+ "embedder_added = retrieval_wf.flow.add_nodes(embedder)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "86a8dcb5193d0c25",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define Milvus Document Retriever"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "8678b57edb214fba",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:19.488137Z",
+ "start_time": "2024-11-20T03:14:19.478935Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "2024-11-19 22:14:19 - WARNING - Environment variable 'MILVUS_API_TOKEN' not found\n",
+ "2024-11-19 22:14:19 - INFO - Pass in the local path ./milvus.db, and run it using milvus-lite\n",
+ "2024-11-19 22:14:19 - DEBUG - Created new connection using: 98d1132773af4298a894ad5925845fd2\n",
+ "2024-11-19 22:14:19 - INFO - Collection my_milvus_collection already exists. Skipping creation.\n"
+ ]
+ }
+ ],
+ "source": [
+ "document_retriever = (\n",
+ " MilvusDocumentRetriever(\n",
+ " connection=MilvusConnection(\n",
+ " deployment_type=MilvusDeploymentType.FILE, uri=\"./milvus.db\"\n",
+ " ),\n",
+ " index_name=\"my_milvus_collection\",\n",
+ " dimension=1536,\n",
+ " top_k=5,\n",
+ " )\n",
+ " .inputs(embedding=embedder.outputs.embedding) # Connect to embedder output\n",
+ " .depends_on(embedder) # Dependency on the embedder node\n",
+ ")\n",
+ "\n",
+ "# Add the retriever node to the workflow\n",
+ "milvus_retriever_added = retrieval_wf.flow.add_nodes(document_retriever)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "72d314d217c9879e",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define the Prompt Template"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "d945959c4b8a9eb2",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:20.747799Z",
+ "start_time": "2024-11-20T03:14:20.743238Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "# Define the prompt template for the LLM\n",
+ "prompt_template = \"\"\"\n",
+ "Please answer the question based on the provided context.\n",
+ "\n",
+ "Question: {{ query }}\n",
+ "\n",
+ "Context:\n",
+ "{% for document in documents %}\n",
+ "- {{ document.content }}\n",
+ "{% endfor %}\n",
+ "\"\"\"\n",
+ "\n",
+ "# Create the prompt object\n",
+ "prompt = Prompt(messages=[Message(content=prompt_template, role=\"user\")])"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "a55e4885ab22e31d",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Define the Answer Generator\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "7bf82309937abf31",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:21.709200Z",
+ "start_time": "2024-11-20T03:14:21.689787Z"
+ }
+ },
+ "outputs": [],
+ "source": [
+ "answer_generator = (\n",
+ " OpenAI(\n",
+ " connection=openai_connection,\n",
+ " model=\"gpt-4o\",\n",
+ " prompt=prompt,\n",
+ " )\n",
+ " .inputs(\n",
+ " documents=document_retriever.outputs.documents,\n",
+ " query=embedder.outputs.query,\n",
+ " )\n",
+ " .depends_on(\n",
+ " [document_retriever, embedder]\n",
+ " ) # Dependencies on retriever and embedder\n",
+ ")\n",
+ "\n",
+ "# Add the answer generator node to the workflow\n",
+ "answer_generator_added = retrieval_wf.flow.add_nodes(answer_generator)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "81495d9e624282ac",
+ "metadata": {
+ "collapsed": false
+ },
+ "source": [
+ "### Run the Workflow"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 15,
+ "id": "2b0de8620a3079a",
+ "metadata": {
+ "collapsed": false,
+ "ExecuteTime": {
+ "end_time": "2024-11-20T03:14:25.014389Z",
+ "start_time": "2024-11-20T03:14:22.658375Z"
+ }
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "2024-11-19 22:14:22 - INFO - Workflow f4a073fb-dfb6-499c-8cac-5710a7ad6d47: execution started.\n",
+ "2024-11-19 22:14:22 - INFO - Flow b30b48ec-d5d2-4e4c-8e25-d6976c8a9c17: execution started.\n",
+ "2024-11-19 22:14:22 - INFO - Node OpenAITextEmbedder - 47afb0bc-cf96-429d-b58f-11b6c935fec3: execution started.\n",
+ "2024-11-19 22:14:23 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings \"HTTP/1.1 200 OK\"\n",
+ "2024-11-19 22:14:23 - INFO - Node OpenAITextEmbedder - 47afb0bc-cf96-429d-b58f-11b6c935fec3: execution succeeded in 474ms.\n",
+ "2024-11-19 22:14:23 - INFO - Node MilvusDocumentRetriever - 51c8311b-4837-411f-ba42-21e28239a2ee: execution started.\n",
+ "2024-11-19 22:14:23 - INFO - Node MilvusDocumentRetriever - 51c8311b-4837-411f-ba42-21e28239a2ee: execution succeeded in 23ms.\n",
+ "2024-11-19 22:14:23 - INFO - Node LLM - ac722325-bece-453f-a2ed-135b0749ee7a: execution started.\n",
+ "2024-11-19 22:14:24 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions \"HTTP/1.1 200 OK\"\n",
+ "2024-11-19 22:14:24 - INFO - Node LLM - ac722325-bece-453f-a2ed-135b0749ee7a: execution succeeded in 1.8s.\n",
+ "2024-11-19 22:14:25 - INFO - Flow b30b48ec-d5d2-4e4c-8e25-d6976c8a9c17: execution succeeded in 2.4s.\n",
+ "2024-11-19 22:14:25 - INFO - Workflow f4a073fb-dfb6-499c-8cac-5710a7ad6d47: execution succeeded in 2.4s.\n"
+ ]
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "The advanced search algorithms in Milvus include a variety of in-memory and on-disk indexing/search algorithms such as IVF (Inverted File), HNSW (Hierarchical Navigable Small World), and DiskANN. These algorithms have been deeply optimized to enhance performance, delivering 30%-70% better performance compared to popular implementations like FAISS and HNSWLib. These optimizations are part of Milvus's design to ensure high efficiency and scalability in handling vector data.\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Run the workflow with a sample query\n",
+ "sample_query = \"What is the Advanced Search Algorithms in Milvus?\"\n",
+ "\n",
+ "result = retrieval_wf.run(input_data={\"query\": sample_query})\n",
+ "\n",
+ "answer = result.output.get(answer_generator.id).get(\"output\", {}).get(\"content\")\n",
+ "print(answer)"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.10"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}