Commit
vtempest committed Sep 19, 2024
1 parent 20c45b4, commit a2ba031
Showing 72 changed files with 1,967 additions and 1,843 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
<!DOCTYPE html><html class="default" lang="en"><head><meta charset="utf-8"/><meta http-equiv="x-ua-compatible" content="IE=edge"/><title>addEmbeddingVectorsToIndex | ai-research-agent</title><meta name="description" content="Documentation for ai-research-agent"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="stylesheet" href="../assets/style.css"/><link rel="stylesheet" href="../assets/highlight.css"/><script defer src="../assets/main.js"></script><script async src="../assets/icons.js" id="tsd-icons-script"></script><script async src="../assets/search.js" id="tsd-search-script"></script><script async src="../assets/navigation.js" id="tsd-nav-script"></script></head><body><script>console.log(`Loaded ${location.href}`)</script><script>document.documentElement.dataset.theme = localStorage.getItem("tsd-theme") || "os";document.body.style.display="none";setTimeout(() => app?app.showPage():document.body.style.removeProperty("display"),500)</script><header class="tsd-page-toolbar"><div class="tsd-toolbar-contents container"><div class="table-cell" id="tsd-search" data-base=".."><div class="field"><label for="tsd-search-field" class="tsd-widget tsd-toolbar-icon search no-caption"><svg width="16" height="16" viewBox="0 0 16 16" fill="none"><use href="../assets/icons.svg#icon-search"></use></svg></label><input type="text" id="tsd-search-field" aria-label="Search"/></div><div class="field"><div id="tsd-toolbar-links"><a href="https://github.com/vtempest/ai-research-agent">Source Code</a><a href="https://qwksearch.com">Live Demo</a><a href="https://discord.gg/SJdBqBz3tV">Discord Chat</a></div></div><ul class="results"><li class="state loading">Preparing search index...</li><li class="state failure">The search index is not available</li></ul><a href="../index.html" class="title">ai-research-agent</a></div><div class="table-cell" id="tsd-widgets"><a href="#" class="tsd-widget tsd-toolbar-icon menu no-caption" data-toggle="menu" 
aria-label="Menu"><svg width="16" height="16" viewBox="0 0 16 16" fill="none"><use href="../assets/icons.svg#icon-menu"></use></svg></a></div></div></header><div class="container container-main"><div class="col-content"><div class="tsd-page-title"><ul class="tsd-breadcrumb"><li><a href="../modules.html">ai-research-agent</a></li><li><a href="addEmbeddingVectorsToIndex.html">addEmbeddingVectorsToIndex</a></li></ul><h1>Function addEmbeddingVectorsToIndex</h1></div><section class="tsd-panel"><ul class="tsd-signatures"><li class="tsd-signature tsd-anchor-link"><a id="addEmbeddingVectorsToIndex" class="tsd-anchor"></a><span class="tsd-kind-call-signature">add<wbr/>Embedding<wbr/>Vectors<wbr/>To<wbr/>Index</span><span class="tsd-signature-symbol">(</span><span class="tsd-kind-parameter">documentVectors</span>, <span class="tsd-kind-parameter">options</span><span class="tsd-signature-symbol">?</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">Promise</span><span class="tsd-signature-symbol"><</span><span class="tsd-signature-type">HierarchicalNSW</span><span class="tsd-signature-symbol">></span><a href="#addEmbeddingVectorsToIndex" aria-label="Permalink" class="tsd-anchor-icon"><svg viewBox="0 0 24 24"><use href="../assets/icons.svg#icon-anchor"></use></svg></a></li><li class="tsd-description"><div class="tsd-comment tsd-typography"><a id="md:vsearch-vector-similarity-embedding-approximation-in-ram-limited-cluster-heirarchy" class="tsd-anchor"></a><h3 class="tsd-anchor-link">VSEARCH: Vector Similarity Embedding Approximation in RAM-Limited Cluster Heirarchy<a href="#md:vsearch-vector-similarity-embedding-approximation-in-ram-limited-cluster-heirarchy" aria-label="Permalink" class="tsd-anchor-icon"><svg viewBox="0 0 24 24"><use href="../assets/icons.svg#icon-anchor"></use></svg></a></h3><img src="https://i.imgur.com/nvJ7fzO.png" width="350px"> | ||
<ol> | ||
<li>Compile the C++ source of hnswlib-node or the NGT algorithm to WASM with a JS wrapper for efficient similarity search.</li> | ||
<li>The vector index is split by K-means into regional clusters, each sized | ||
to fit in RAM. This avoids the cost of popular vector engines, which load all | ||
vectors at once and therefore require costly servers with 100 GB of RAM.</li> | ||
<li>The centroid vector of each cluster is stored in a list in SQL, and each | ||
cluster's binary-quantized data is exported as a base64 string to SQL, S3, etc.</li> | ||
<li>Search: embed the query, compare it to each cluster centroid to pick the top clusters, | ||
download the base64 strings for those clusters, load each into WASM, find the top neighbors | ||
per cluster, and merge the results sorted by distance.</li> | ||
</ol> | ||
<p><a href="https://github.com/yahoojapan/NGT/wiki" target="_blank" class="external">NGT Algorithm</a> | ||
<a href="https://github.com/yahoojapan/NGT/blob/main/lib/NGT/Clustering.h#L82" target="_blank" class="external">NGT Cluster</a></p> | ||
<p><a href="https://vald.vdaas.org/docs/overview/about-vald/" target="_blank" class="external">Vald Vector Engine Docs</a> | ||
<a href="https://ann-benchmarks.com" target="_blank" class="external">ANN Benchmarks</a></p> | ||
</div><div class="tsd-parameters"><h4 class="tsd-parameters-title">Parameters</h4><ul class="tsd-parameter-list"><li><span><span class="tsd-kind-parameter">documentVectors</span>: <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">[]</span></span><div class="tsd-comment tsd-typography"><p>An array of document texts to be vectorized.</p> | ||
</div><div class="tsd-comment tsd-typography"></div></li><li><span><code class="tsd-tag">Optional</code><span class="tsd-kind-parameter">options</span>: <span class="tsd-signature-symbol">{ </span><br/><span> </span><span class="tsd-kind-property">numDimensions</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">; </span><br/><span> </span><span class="tsd-kind-property">maxElements</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">; </span><br/><span class="tsd-signature-symbol">}</span><span class="tsd-signature-symbol"> = {}</span></span><div class="tsd-comment tsd-typography"><p>Optional parameters for vector generation and indexing.</p> | ||
</div><div class="tsd-comment tsd-typography"></div><ul class="tsd-parameters"><li class="tsd-parameter"><h5><span class="tsd-kind-property">num<wbr/>Dimensions</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span></h5><div class="tsd-comment tsd-typography"><p>The length of each data point vector to be indexed.</p> | ||
</div><div class="tsd-comment tsd-typography"></div></li><li class="tsd-parameter"><h5><span class="tsd-kind-property">max<wbr/>Elements</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span></h5><div class="tsd-comment tsd-typography"><p>The maximum number of data points.</p> | ||
</div><div class="tsd-comment tsd-typography"></div></li></ul></li></ul></div><h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">Promise</span><span class="tsd-signature-symbol"><</span><span class="tsd-signature-type">HierarchicalNSW</span><span class="tsd-signature-symbol">></span></h4><p>The created HNSW index.</p> | ||
<div class="tsd-comment tsd-typography"><h4 class="tsd-anchor-link"><a id="Author" class="tsd-anchor"></a>Author<a href="#Author" aria-label="Permalink" class="tsd-anchor-icon"><svg viewBox="0 0 24 24"><use href="../assets/icons.svg#icon-anchor"></use></svg></a></h4><p><a href="https://arxiv.org/abs/1603.09320" target="_blank" class="external">Malkov et al. (2016)</a></p> | ||
</div><aside class="tsd-sources"><ul><li>Defined in <a href="https://github.com/vtempest/ai-research-agent/tree/master/src/similarity/similarity-vector.js#L141">similarity/similarity-vector.js:141</a></li></ul></aside></li></ul></section></div><div class="col-sidebar"><div class="page-menu"><div class="tsd-navigation settings"><details class="tsd-accordion"><summary class="tsd-accordion-summary"><h3><svg width="20" height="20" viewBox="0 0 24 24" fill="none"><use href="../assets/icons.svg#icon-chevronDown"></use></svg>Settings</h3></summary><div class="tsd-accordion-details"><div class="tsd-filter-visibility"><span class="settings-label">Member Visibility</span><ul id="tsd-filter-options"><li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-protected" name="protected"/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>Protected</span></label></li><li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-inherited" name="inherited" checked/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>Inherited</span></label></li><li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-external" name="external"/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path 
class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>External</span></label></li></ul></div><div class="tsd-theme-toggle"><label class="settings-label" for="tsd-theme">Theme</label><select id="tsd-theme"><option value="os">OS</option><option value="light">Light</option><option value="dark">Dark</option></select></div></div></details></div><details open class="tsd-accordion tsd-page-navigation"><summary class="tsd-accordion-summary"><h3><svg width="20" height="20" viewBox="0 0 24 24" fill="none"><use href="../assets/icons.svg#icon-chevronDown"></use></svg>On This Page</h3></summary><div class="tsd-accordion-details"><a href="#md:vsearch-vector-similarity-embedding-approximation-in-ram-limited-cluster-heirarchy"><span>VSEARCH: <wbr/>Vector <wbr/>Similarity <wbr/>Embedding <wbr/>Approximation in RAM-<wbr/>Limited <wbr/>Cluster <wbr/>Heirarchy</span></a></div></details></div><div class="site-menu"><nav id="tsd-sidebar-links" class="tsd-navigation"><a href="https://github.com/vtempest/ai-research-agent">Source Code</a><a href="https://qwksearch.com">Live Demo</a><a href="https://discord.gg/SJdBqBz3tV">Discord Chat</a><a href="https://github.com/vtempest/ai-research-agent" class="tsd-nav-link">Source Code</a><a href="https://qwksearch.com" class="tsd-nav-link">Live Demo</a><a href="https://discord.gg/SJdBqBz3tV" class="tsd-nav-link">Discord Chat</a></nav><nav class="tsd-navigation"><a href="../modules.html"><svg class="tsd-kind-icon" viewBox="0 0 24 24"><use href="../assets/icons.svg#icon-1"></use></svg><span>ai-research-agent</span></a><ul class="tsd-small-nested-navigation" id="tsd-nav-container" data-base=".."><li>Loading...</li></ul></nav></div></div></div><footer></footer><div class="overlay"></div><script async src="https://www.googletagmanager.com/gtag/js?id=G-E5TZ32BZDF"></script><script>window.dataLayer = window.dataLayer || []; | ||
function gtag(){dataLayer.push(arguments);} | ||
gtag('js', new Date()); | ||
gtag('config', 'G-E5TZ32BZDF');</script></body></html> |
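
Steps 3 and 4 of the VSEARCH description in the file above (binary quantization, base64 export, centroid routing, and merging per-cluster results) might be sketched roughly as follows. This is a minimal illustration, not the code from `similarity-vector.js`: every function name and the `clusters` record shape are hypothetical, sign-based one-bit quantization is an assumed scheme, and a brute-force Hamming scan stands in for the per-cluster hnswlib/NGT WASM index.

```javascript
// Illustrative sketch only. All names are hypothetical, and a brute-force
// Hamming scan stands in for the per-cluster hnswlib/NGT WASM index.

/** Binary-quantize a float vector: one bit per dimension (1 if value > 0). */
function quantize(vector) {
  const bytes = new Uint8Array(Math.ceil(vector.length / 8));
  vector.forEach((value, i) => {
    if (value > 0) bytes[i >> 3] |= 1 << (i & 7);
  });
  return bytes;
}

/** Serialize one cluster's quantized vectors as a base64 string (for SQL, S3, etc.). */
function exportCluster(vectors) {
  const rows = vectors.map((v) => Buffer.from(quantize(v)));
  return Buffer.concat(rows).toString("base64");
}

/** Decode a base64 cluster back into fixed-width rows of packed bits. */
function importCluster(base64, numDimensions) {
  const flat = Buffer.from(base64, "base64");
  const width = Math.ceil(numDimensions / 8);
  const rows = [];
  for (let offset = 0; offset < flat.length; offset += width) {
    rows.push(flat.subarray(offset, offset + width));
  }
  return rows;
}

/** Squared Euclidean distance, used only to rank cluster centroids. */
function squaredDistance(a, b) {
  return a.reduce((sum, value, i) => sum + (value - b[i]) ** 2, 0);
}

/** Number of differing bits between two packed bit rows. */
function hammingDistance(a, b) {
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i];
    while (x) {
      distance += x & 1;
      x >>= 1;
    }
  }
  return distance;
}

/**
 * Route the query to the nearest centroids, search each selected cluster,
 * then merge the per-cluster hits sorted by distance.
 * clusters: [{ id, centroid: number[], base64: string }] (assumed shape)
 */
function searchClusters(query, clusters, { topClusters = 2, topK = 3 } = {}) {
  const queryBits = quantize(query);
  const nearest = [...clusters]
    .sort((a, b) => squaredDistance(query, a.centroid) - squaredDistance(query, b.centroid))
    .slice(0, topClusters);

  const merged = [];
  for (const cluster of nearest) {
    const hits = importCluster(cluster.base64, query.length)
      .map((row, index) => ({
        clusterId: cluster.id,
        index,
        distance: hammingDistance(queryBits, row),
      }))
      .sort((a, b) => a.distance - b.distance)
      .slice(0, topK); // top neighbors within this cluster
    merged.push(...hits);
  }
  return merged.sort((a, b) => a.distance - b.distance).slice(0, topK);
}

// Usage with two tiny 4-dimensional clusters.
const demoClusters = [
  { id: "a", centroid: [1, 1, 1, 1], base64: exportCluster([[1, 1, 1, 1], [1, 1, 1, -1]]) },
  { id: "b", centroid: [-1, -1, -1, -1], base64: exportCluster([[-1, -1, -1, -1]]) },
];
console.log(searchClusters([1, 1, 1, 0.5], demoClusters, { topClusters: 1, topK: 1 }));
```

Only the clusters whose centroids rank nearest to the query are decoded, so memory use is bounded by `topClusters` times the cluster size rather than by the whole index. In a real deployment the Hamming scan would presumably be replaced by loading each decoded cluster into the WASM index.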