VecDB

A very simple vector embedding database. You can think of it as a hash table that lets you find items similar to the one you're searching for.

Why!

I'm a database enthusiast, and this is a for-fun learning project that could be used in production ;).

P.S: I like to re-invent the wheel in my free time, because it is my free time!

Data Model

VecDB uses the {key => value} model:

key is a unique value that identifies the item.

value is the vector itself (a list of floats).
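To make the model concrete, here is a minimal in-memory sketch. It is not vecdb's actual storage code: the `VectorStore` class and its methods are hypothetical names used only for illustration, the bucket grouping is borrowed from the request examples further down, and the sketch assumes that writing an existing key replaces its vector.

```python
from typing import Dict, List


class VectorStore:
    """Toy in-memory stand-in for the {key => value} model (not vecdb's real storage)."""

    def __init__(self) -> None:
        # bucket name -> {key -> vector}
        self.buckets: Dict[str, Dict[str, List[float]]] = {}

    def write(self, bucket: str, key: str, vector: List[float]) -> None:
        if not key:
            raise ValueError("key must be a non-empty unique identifier")
        if not vector:
            raise ValueError("vector must contain at least one float")
        # Assumption: writing an existing key replaces its vector.
        self.buckets.setdefault(bucket, {})[key] = [float(x) for x in vector]

    def read(self, bucket: str, key: str) -> List[float]:
        return self.buckets[bucket][key]


store = VectorStore()
store.write("products", "product-id-1", [1.0, 0.5, -2.0])
store.write("products", "product-id-1", [0.0, 1.0, 0.0])  # same key, new vector
print(store.read("products", "product-id-1"))
```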

Configurations

By default, vecdb looks for config.yml in the current working directory. You can override this by passing your own file path with the --config /path/to/config.yml flag.

# http server related configs
server:
  # the address to listen on in the form of '[host]:port'
  listen: "0.0.0.0:3000"

# storage related configs
store:
  # the driver you want to use
  # currently vecdb supports "bolt", which is based on boltdb, an in-process embedded database
  driver: "bolt"
  # the arguments required by the driver
  # for bolt, it requires a key called `database` that points to the path where you want to store the data.
  args:
    database: "./vec.db"

# embeddings related configs
embedder:
  # whether to enable the embedder and all endpoints using it or not
  enabled: true
  # the driver you want to use, currently vecdb supports gemini
  driver: gemini
  # the arguments required by the driver
  # currently gemini driver requires `api_key` and `text_embedding_model`
  args:
    # by default vecdb will replace anything between ${..} with the actual value from the ENV var
    api_key: "${GEMINI_API_KEY}"
    text_embedding_model: "text-embedding-004"
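The ${..} substitution mentioned in the config comments can be pictured with the sketch below. It only illustrates the idea and is not vecdb's own code: `expand_env` is a hypothetical helper, and the choice to expand an unset variable to an empty string is an assumption.

```python
import os
import re

# Matches ${NAME} and captures NAME.
_PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def expand_env(text: str, env=None) -> str:
    """Replace each ${NAME} in text with the value of the NAME variable."""
    env = os.environ if env is None else env
    # Assumption: an unset variable expands to an empty string.
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), ""), text)


line = 'api_key: "${GEMINI_API_KEY}"'
print(expand_env(line, {"GEMINI_API_KEY": "secret"}))
```

In practice this means you can keep the API key out of config.yml and export GEMINI_API_KEY in the environment that starts vecdb.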

Components

Raw Vectors Layer (low-level)

- Send a VectorWriteRequest to POST /v1/vectors/write when you already have a vector and want to store it.
- Send a VectorSearchRequest to POST /v1/vectors/search when you already have a vector and want the keys/ids of all similar vectors, ordered by cosine similarity in descending order.

Embedding Layer (optional)

- Send a TextEmbeddingWriteRequest to POST /v1/embeddings/text/write when you have text and want vecdb to build and store the vector for you using the configured embedder (gemini for now).
- Send a TextEmbeddingSearchRequest to POST /v1/embeddings/text/search when you have text and want vecdb to build a vector from it and return the keys of similar vectors, ordered by cosine similarity in descending order.
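The four endpoints above can be called from any HTTP client. The sketch below builds such a request with Python's standard library without sending it. `BASE_URL`, `ENDPOINTS`, and `build_request` are hypothetical names; the sketch assumes vecdb is reachable on localhost at the port from the sample config and that it accepts a JSON body with a `Content-Type: application/json` header. The response format is not described here, so it is left unparsed.

```python
import json
import urllib.request

# Assumption: vecdb runs locally on the port from the sample config.
BASE_URL = "http://localhost:3000"

# Request type -> endpoint path, as listed in the Components section.
ENDPOINTS = {
    "VectorWriteRequest": "/v1/vectors/write",
    "VectorSearchRequest": "/v1/vectors/search",
    "TextEmbeddingWriteRequest": "/v1/embeddings/text/write",
    "TextEmbeddingSearchRequest": "/v1/embeddings/text/search",
}


def build_request(kind: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the given request type."""
    return urllib.request.Request(
        BASE_URL + ENDPOINTS[kind],
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the server expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request(
    "VectorWriteRequest",
    {"bucket": "products", "key": "product-id-1", "vector": [1.929292, 0.3848484]},
)
print(req.get_method(), req.full_url)
# To actually send it against a running server: urllib.request.urlopen(req)
```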

Requests

VectorWriteRequest

{
  "bucket": "BUCKET_NAME", // consider it a collection or a table
  "key": "product-id-1", // should be unique and represents a valid value in your main data store (example: the row id in your mysql/postgres ... etc)
  "vector": [1.929292, 0.3848484, -1.9383838383, ... ] // the vector you want to store 
}

VectorSearchRequest

{
  "bucket": "BUCKET_NAME", // consider it a collection or a table
  "vector": [1.929292, 0.3848484, -1.9383838383, ... ], // you will get a list ordered by cosine-similarity in descending order
  "min_cosine_similarity": 0.0, // the higher the threshold, the fewer results you will get
  "max_result_count": 10 // max vectors to return (vecdb will first order by cosine similarity then apply the limit)
}
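The way `min_cosine_similarity` and `max_result_count` interact can be sketched as follows. This is a brute-force illustration, not vecdb's implementation: the `search` function is hypothetical, scanning every vector is an assumption about how candidates are found, and treating the threshold as inclusive is an assumption too. The order of operations (sort by similarity, then apply the limit) follows the comment in the request above.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """dot(a, b) / (|a| * |b|); returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    bucket: Dict[str, List[float]],
    query: List[float],
    min_cosine_similarity: float = 0.0,
    max_result_count: int = 10,
) -> List[Tuple[str, float]]:
    # Score every stored vector against the query (brute-force scan).
    scored = [(key, cosine_similarity(query, vec)) for key, vec in bucket.items()]
    # Assumption: the threshold is inclusive.
    scored = [item for item in scored if item[1] >= min_cosine_similarity]
    # Order by similarity, highest first, then apply the limit.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:max_result_count]


bucket = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = search(bucket, [1.0, 0.0], min_cosine_similarity=0.5)
print([key for key, _ in results])  # ['a', 'c']
```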

TextEmbeddingWriteRequest

Available only when embedder.enabled is set to true.

{
  "bucket": "BUCKET_NAME", // consider it a collection or a table
  "key": "product-id-1", // should be unique and represents a valid value in your main data store (example: the row id in your mysql/postgres ... etc)
  "content": "This is some text representing the product" // this will be converted to a vector using the configured embedder 
}

TextEmbeddingSearchRequest

Available only when embedder.enabled is set to true.

{
  "bucket": "BUCKET_NAME", // consider it a collection or a table
  "content": "A Product Text", // you will get a list ordered by cosine-similarity in descending order
  "min_cosine_similarity": 0.0, // the higher the threshold, the fewer results you will get
  "max_result_count": 10 // max vectors to return (vecdb will first order by cosine similarity then apply the limit)
}
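Conceptually, the text requests are the vector requests with an embedding step in front: the `content` field is turned into a `vector`, and everything else stays the same. The sketch below shows that relationship under stated assumptions. `toy_embedder` is a hypothetical stand-in that has nothing to do with the gemini embedder, and `text_to_vector_search_request` is an illustrative helper, not a function from vecdb.

```python
from typing import Callable, List

Embedder = Callable[[str], List[float]]


def toy_embedder(text: str) -> List[float]:
    """Hypothetical stand-in for a real embedder: counts of 'a', 'b' and 'c'."""
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in "abc"]


def text_to_vector_search_request(request: dict, embed: Embedder) -> dict:
    """Turn a TextEmbeddingSearchRequest-shaped dict into a VectorSearchRequest-shaped one."""
    # Copy every field except the text, then add the embedded vector.
    vector_request = {k: v for k, v in request.items() if k != "content"}
    vector_request["vector"] = embed(request["content"])
    return vector_request


text_request = {
    "bucket": "products",
    "content": "A cab",
    "min_cosine_similarity": 0.0,
    "max_result_count": 10,
}
print(text_to_vector_search_request(text_request, toy_embedder))
```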

Download/Install

Binary
Docker Image
