addEmbeddingVectorsToIndex vsearch
vtempest committed Sep 19, 2024
1 parent 20c45b4 commit a2ba031
Showing 72 changed files with 1,967 additions and 1,843 deletions.
12 changes: 0 additions & 12 deletions .github/workflows/static.yml
@@ -41,15 +41,3 @@ jobs:
       - name: Deploy Docs to GitHub Pages
         id: deployment
         uses: actions/deploy-pages@v4
-
-  test:
-    name: Run Unit Test
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v2 # checkout the repo
-      - name: Download dependencies # execute tests in tests/
-        run: |
-          npm install
-      - name: Execute tests # execute tests in tests/
-        run: |
-          npm run test
28 changes: 14 additions & 14 deletions docs/assets/highlight.css
@@ -3,20 +3,20 @@
     --dark-hl-0: #569CD6;
     --light-hl-1: #000000;
     --dark-hl-1: #D4D4D4;
-    --light-hl-2: #001080;
-    --dark-hl-2: #9CDCFE;
-    --light-hl-3: #795E26;
-    --dark-hl-3: #DCDCAA;
-    --light-hl-4: #A31515;
-    --dark-hl-4: #CE9178;
-    --light-hl-5: #008000;
-    --dark-hl-5: #6A9955;
-    --light-hl-6: #0070C1;
-    --dark-hl-6: #4FC1FF;
-    --light-hl-7: #EE0000;
-    --dark-hl-7: #D7BA7D;
-    --light-hl-8: #AF00DB;
-    --dark-hl-8: #C586C0;
+    --light-hl-2: #0070C1;
+    --dark-hl-2: #4FC1FF;
+    --light-hl-3: #AF00DB;
+    --dark-hl-3: #C586C0;
+    --light-hl-4: #795E26;
+    --dark-hl-4: #DCDCAA;
+    --light-hl-5: #A31515;
+    --dark-hl-5: #CE9178;
+    --light-hl-6: #001080;
+    --dark-hl-6: #9CDCFE;
+    --light-hl-7: #008000;
+    --dark-hl-7: #6A9955;
+    --light-hl-8: #EE0000;
+    --dark-hl-8: #D7BA7D;
     --light-hl-9: #098658;
     --dark-hl-9: #B5CEA8;
     --light-code-background: #FFFFFF;
2 changes: 1 addition & 1 deletion docs/assets/navigation.js

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion docs/assets/search.js

Large diffs are not rendered by default.

94 changes: 49 additions & 45 deletions docs/classes/torch.html

Large diffs are not rendered by default.

27 changes: 27 additions & 0 deletions docs/functions/addEmbeddingVectorsToIndex.html
@@ -0,0 +1,27 @@
<!DOCTYPE html><html class="default" lang="en"><head><meta charset="utf-8"/><meta http-equiv="x-ua-compatible" content="IE=edge"/><title>addEmbeddingVectorsToIndex | ai-research-agent</title><meta name="description" content="Documentation for ai-research-agent"/><meta name="viewport" content="width=device-width, initial-scale=1"/><link rel="stylesheet" href="../assets/style.css"/><link rel="stylesheet" href="../assets/highlight.css"/><script defer src="../assets/main.js"></script><script async src="../assets/icons.js" id="tsd-icons-script"></script><script async src="../assets/search.js" id="tsd-search-script"></script><script async src="../assets/navigation.js" id="tsd-nav-script"></script></head><body><script>console.log(`Loaded ${location.href}`)</script><script>document.documentElement.dataset.theme = localStorage.getItem("tsd-theme") || "os";document.body.style.display="none";setTimeout(() => app?app.showPage():document.body.style.removeProperty("display"),500)</script><header class="tsd-page-toolbar"><div class="tsd-toolbar-contents container"><div class="table-cell" id="tsd-search" data-base=".."><div class="field"><label for="tsd-search-field" class="tsd-widget tsd-toolbar-icon search no-caption"><svg width="16" height="16" viewBox="0 0 16 16" fill="none"><use href="../assets/icons.svg#icon-search"></use></svg></label><input type="text" id="tsd-search-field" aria-label="Search"/></div><div class="field"><div id="tsd-toolbar-links"><a href="https://github.com/vtempest/ai-research-agent">Source Code</a><a href="https://qwksearch.com">Live Demo</a><a href="https://discord.gg/SJdBqBz3tV">Discord Chat</a></div></div><ul class="results"><li class="state loading">Preparing search index...</li><li class="state failure">The search index is not available</li></ul><a href="../index.html" class="title">ai-research-agent</a></div><div class="table-cell" id="tsd-widgets"><a href="#" class="tsd-widget tsd-toolbar-icon menu no-caption" data-toggle="menu" 
aria-label="Menu"><svg width="16" height="16" viewBox="0 0 16 16" fill="none"><use href="../assets/icons.svg#icon-menu"></use></svg></a></div></div></header><div class="container container-main"><div class="col-content"><div class="tsd-page-title"><ul class="tsd-breadcrumb"><li><a href="../modules.html">ai-research-agent</a></li><li><a href="addEmbeddingVectorsToIndex.html">addEmbeddingVectorsToIndex</a></li></ul><h1>Function addEmbeddingVectorsToIndex</h1></div><section class="tsd-panel"><ul class="tsd-signatures"><li class="tsd-signature tsd-anchor-link"><a id="addEmbeddingVectorsToIndex" class="tsd-anchor"></a><span class="tsd-kind-call-signature">add<wbr/>Embedding<wbr/>Vectors<wbr/>To<wbr/>Index</span><span class="tsd-signature-symbol">(</span><span class="tsd-kind-parameter">documentVectors</span>, <span class="tsd-kind-parameter">options</span><span class="tsd-signature-symbol">?</span><span class="tsd-signature-symbol">)</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">Promise</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">HierarchicalNSW</span><span class="tsd-signature-symbol">&gt;</span><a href="#addEmbeddingVectorsToIndex" aria-label="Permalink" class="tsd-anchor-icon"><svg viewBox="0 0 24 24"><use href="../assets/icons.svg#icon-anchor"></use></svg></a></li><li class="tsd-description"><div class="tsd-comment tsd-typography"><a id="md:vsearch-vector-similarity-embedding-approximation-in-ram-limited-cluster-heirarchy" class="tsd-anchor"></a><h3 class="tsd-anchor-link">VSEARCH: Vector Similarity Embedding Approximation in RAM-Limited Cluster Hierarchy<a href="#md:vsearch-vector-similarity-embedding-approximation-in-ram-limited-cluster-heirarchy" aria-label="Permalink" class="tsd-anchor-icon"><svg viewBox="0 0 24 24"><use href="../assets/icons.svg#icon-anchor"></use></svg></a></h3><img src="https://i.imgur.com/nvJ7fzO.png" width="350px">
<ol>
<li>Compile the C++ of hnswlib-node or the NGT algorithm to WASM for efficient similarity search in JavaScript.</li>
<li>Split the vector index with K-means into regional clusters, each sized
to fit in RAM. This avoids the costly servers with 100 GB of RAM that popular
vector engines require because they load all the vectors at once.</li>
<li>Store the centroid vector of each cluster in a list in SQL, and export each
cluster's binary-quantized data as a base64 string to SQL, S3, etc.</li>
<li>Search: embed the query, compare it to each cluster centroid to pick the top clusters,
download the base64 strings for those clusters, load each into WASM, find the top neighbors
per cluster, and merge the results sorted by distance.</li>
</ol>
<p><a href="https://github.com/yahoojapan/NGT/wiki" target="_blank" class="external">NGT Algorithm</a>
<a href="https://github.com/yahoojapan/NGT/blob/main/lib/NGT/Clustering.h#L82" target="_blank" class="external">NGT Cluster</a></p>
<p><a href="https://vald.vdaas.org/docs/overview/about-vald/" target="_blank" class="external">Vald Vector Engine Docs</a>
<a href="https://ann-benchmarks.com" target="_blank" class="external">ANN Benchmarks</a></p>
</div><div class="tsd-parameters"><h4 class="tsd-parameters-title">Parameters</h4><ul class="tsd-parameter-list"><li><span><span class="tsd-kind-parameter">documentVectors</span>: <span class="tsd-signature-type">string</span><span class="tsd-signature-symbol">[]</span></span><div class="tsd-comment tsd-typography"><p>An array of document texts to be vectorized.</p>
</div><div class="tsd-comment tsd-typography"></div></li><li><span><code class="tsd-tag">Optional</code><span class="tsd-kind-parameter">options</span>: <span class="tsd-signature-symbol">{ </span><br/><span>    </span><span class="tsd-kind-property">numDimensions</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">; </span><br/><span>    </span><span class="tsd-kind-property">maxElements</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span><span class="tsd-signature-symbol">; </span><br/><span class="tsd-signature-symbol">}</span><span class="tsd-signature-symbol"> = {}</span></span><div class="tsd-comment tsd-typography"><p>Optional parameters for vector generation and indexing.</p>
</div><div class="tsd-comment tsd-typography"></div><ul class="tsd-parameters"><li class="tsd-parameter"><h5><span class="tsd-kind-property">num<wbr/>Dimensions</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span></h5><div class="tsd-comment tsd-typography"><p>The length of each data point vector to be indexed.</p>
</div><div class="tsd-comment tsd-typography"></div></li><li class="tsd-parameter"><h5><span class="tsd-kind-property">max<wbr/>Elements</span><span class="tsd-signature-symbol">: </span><span class="tsd-signature-type">number</span></h5><div class="tsd-comment tsd-typography"><p>The maximum number of data points.</p>
</div><div class="tsd-comment tsd-typography"></div></li></ul></li></ul></div><h4 class="tsd-returns-title">Returns <span class="tsd-signature-type">Promise</span><span class="tsd-signature-symbol">&lt;</span><span class="tsd-signature-type">HierarchicalNSW</span><span class="tsd-signature-symbol">&gt;</span></h4><p>The created HNSW index.</p>
<div class="tsd-comment tsd-typography"><h4 class="tsd-anchor-link"><a id="Author" class="tsd-anchor"></a>Author<a href="#Author" aria-label="Permalink" class="tsd-anchor-icon"><svg viewBox="0 0 24 24"><use href="../assets/icons.svg#icon-anchor"></use></svg></a></h4><p><a href="https://arxiv.org/abs/1603.09320" target="_blank" class="external">Malkov et al. (2016)</a></p>
</div><aside class="tsd-sources"><ul><li>Defined in <a href="https://github.com/vtempest/ai-research-agent/tree/master/src/similarity/similarity-vector.js#L141">similarity/similarity-vector.js:141</a></li></ul></aside></li></ul></section></div><div class="col-sidebar"><div class="page-menu"><div class="tsd-navigation settings"><details class="tsd-accordion"><summary class="tsd-accordion-summary"><h3><svg width="20" height="20" viewBox="0 0 24 24" fill="none"><use href="../assets/icons.svg#icon-chevronDown"></use></svg>Settings</h3></summary><div class="tsd-accordion-details"><div class="tsd-filter-visibility"><span class="settings-label">Member Visibility</span><ul id="tsd-filter-options"><li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-protected" name="protected"/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>Protected</span></label></li><li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-inherited" name="inherited" checked/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>Inherited</span></label></li><li class="tsd-filter-item"><label class="tsd-filter-input"><input type="checkbox" id="tsd-filter-external" name="external"/><svg width="32" height="32" viewBox="0 0 32 32" aria-hidden="true"><rect class="tsd-checkbox-background" width="30" height="30" x="1" y="1" rx="6" fill="none"></rect><path 
class="tsd-checkbox-checkmark" d="M8.35422 16.8214L13.2143 21.75L24.6458 10.25" stroke="none" stroke-width="3.5" stroke-linejoin="round" fill="none"></path></svg><span>External</span></label></li></ul></div><div class="tsd-theme-toggle"><label class="settings-label" for="tsd-theme">Theme</label><select id="tsd-theme"><option value="os">OS</option><option value="light">Light</option><option value="dark">Dark</option></select></div></div></details></div><details open class="tsd-accordion tsd-page-navigation"><summary class="tsd-accordion-summary"><h3><svg width="20" height="20" viewBox="0 0 24 24" fill="none"><use href="../assets/icons.svg#icon-chevronDown"></use></svg>On This Page</h3></summary><div class="tsd-accordion-details"><a href="#md:vsearch-vector-similarity-embedding-approximation-in-ram-limited-cluster-heirarchy"><span>VSEARCH: <wbr/>Vector <wbr/>Similarity <wbr/>Embedding <wbr/>Approximation in RAM-<wbr/>Limited <wbr/>Cluster <wbr/>Hierarchy</span></a></div></details></div><div class="site-menu"><nav id="tsd-sidebar-links" class="tsd-navigation"><a href="https://github.com/vtempest/ai-research-agent">Source Code</a><a href="https://qwksearch.com">Live Demo</a><a href="https://discord.gg/SJdBqBz3tV">Discord Chat</a><a href="https://github.com/vtempest/ai-research-agent" class="tsd-nav-link">Source Code</a><a href="https://qwksearch.com" class="tsd-nav-link">Live Demo</a><a href="https://discord.gg/SJdBqBz3tV" class="tsd-nav-link">Discord Chat</a></nav><nav class="tsd-navigation"><a href="../modules.html"><svg class="tsd-kind-icon" viewBox="0 0 24 24"><use href="../assets/icons.svg#icon-1"></use></svg><span>ai-research-agent</span></a><ul class="tsd-small-nested-navigation" id="tsd-nav-container" data-base=".."><li>Loading...</li></ul></nav></div></div></div><footer></footer><div class="overlay"></div><script async src="https://www.googletagmanager.com/gtag/js?id=G-E5TZ32BZDF"></script><script>window.dataLayer = window.dataLayer || [];
function gtag(){dataLayer.push(arguments);}
gtag('js', new Date());
gtag('config', 'G-E5TZ32BZDF');</script></body></html>
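
The search flow described in the VSEARCH notes of the added docs page (route the query to the nearest cluster centroids, decode each selected cluster's base64 payload, rank neighbors inside each cluster, merge by distance) can be sketched in plain JavaScript for Node. This is only an illustration under stated assumptions, not the code in similarity-vector.js: the helper names `exportCluster`, `hammingDistance` and `searchClusters` are hypothetical, centroids are assumed to be precomputed by K-means, sign-based binary quantization is one assumed scheme, and a brute-force Hamming scan stands in for the WASM HNSW/NGT lookup.

```javascript
// Hypothetical sketch: none of these helpers exist in ai-research-agent.

// Squared Euclidean distance, used only to rank cluster centroids.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return sum;
}

// Assumed binary quantization: 1 bit per dimension (set when the value is > 0),
// rows packed back to back, then exported as one base64 string per cluster.
function exportCluster(vectors) {
  const stride = Math.ceil(vectors[0].length / 8); // bytes per vector
  const bytes = new Uint8Array(stride * vectors.length);
  vectors.forEach((vector, row) =>
    vector.forEach((value, i) => {
      if (value > 0) bytes[row * stride + (i >> 3)] |= 1 << (i & 7);
    })
  );
  return Buffer.from(bytes).toString("base64");
}

// Number of differing bits between the query and the row starting at `offset`.
function hammingDistance(bytes, offset, queryBytes) {
  let count = 0;
  for (let i = 0; i < queryBytes.length; i++) {
    for (let x = bytes[offset + i] ^ queryBytes[i]; x; x >>= 1) count += x & 1;
  }
  return count;
}

function searchClusters(query, clusters, { topClusters = 2, topK = 3 } = {}) {
  const queryBytes = Buffer.from(exportCluster([query]), "base64");
  const byDistance = (a, b) => a.distance - b.distance;
  // Compare the query to every centroid and keep the closest clusters.
  const nearest = clusters
    .map((cluster) => ({ cluster, distance: squaredDistance(query, cluster.centroid) }))
    .sort(byDistance)
    .slice(0, topClusters);
  const hits = [];
  for (const { cluster } of nearest) {
    // Stand-in for "download the base64 string and load it into WASM".
    const bytes = Buffer.from(cluster.base64, "base64");
    const local = cluster.ids
      .map((id, row) => ({ id, distance: hammingDistance(bytes, row * queryBytes.length, queryBytes) }))
      .sort(byDistance)
      .slice(0, topK);
    hits.push(...local);
  }
  // Merge the per-cluster results, sorted by distance.
  return hits.sort(byDistance).slice(0, topK);
}

// Toy data: two clusters of 4-dimensional vectors with made-up centroids.
const clusters = [
  { centroid: [1, 1, -1, -1], ids: [0, 1], base64: exportCluster([[1, 1, -1, -1], [1, 1, 1, -1]]) },
  { centroid: [-1, -1, 1, 1], ids: [2, 3], base64: exportCluster([[-1, -1, 1, 1], [-1, 1, 1, 1]]) },
];
console.log(searchClusters([0.9, 0.8, -0.7, -0.6], clusters, { topClusters: 1, topK: 2 }));
```

Only the centroid list and the payloads of the selected clusters need to be in memory at once, which is the point of splitting the index. Raising `topClusters` trades memory and download cost for recall.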