Question about index parameters #121

xcu · 2024-08-15T18:06:08Z

I am managing a vector DB and I'm considering switching to pgvectorscale. However, I'm a bit lost regarding which index configuration params I could use.
The table in question contains more than 50M embeddings of 512 dimensions, but it is partitioned with partman into tables of 100k embeddings each. So we could actually regard it as 500 small tables of 100k embeddings, with 512 dimensions each.

Would the default build/query params for the diskANN index be suitable? Or do you think there are some build/query parameters that could be tweaked for better recall or search speed?
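For readers who want to sanity-check the sizing described above, here is a minimal sketch of the arithmetic. The row counts and dimensions come from the question; the 4 bytes per dimension (float32 storage) is an assumption, and the function names are made up for illustration. The figures cover raw vector payload only, not per-row or index overhead.

```python
# Figures from the question; BYTES_PER_VALUE (float32) is an assumption.
TOTAL_EMBEDDINGS = 50_000_000
ROWS_PER_PARTITION = 100_000
DIMENSIONS = 512
BYTES_PER_VALUE = 4


def partition_count(total_rows: int, rows_per_partition: int) -> int:
    """Number of partitions needed, rounding up for a partial last partition."""
    return -(-total_rows // rows_per_partition)


def raw_vector_bytes(rows: int, dims: int, bytes_per_value: int = BYTES_PER_VALUE) -> int:
    """Raw vector payload in bytes, ignoring row headers and index overhead."""
    return rows * dims * bytes_per_value


partitions = partition_count(TOTAL_EMBEDDINGS, ROWS_PER_PARTITION)
per_partition_mb = raw_vector_bytes(ROWS_PER_PARTITION, DIMENSIONS) / 1e6
total_gb = raw_vector_bytes(TOTAL_EMBEDDINGS, DIMENSIONS) / 1e9

print(f"partitions: {partitions}")
print(f"raw vectors per partition: {per_partition_mb:.1f} MB")
print(f"raw vectors in total: {total_gb:.1f} GB")
```

Under these assumptions each partition holds roughly 200 MB of raw vectors, so every per-partition index is built over a fairly small dataset.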

jonatas · 2024-08-19T17:17:47Z

Hey @xcu, thanks for asking! @cevian can probably help answer this, but I also see this question as a great topic for our Discord! Join us and see what other devs are using too: https://discord.gg/KRdHVXAmkp

cevian · 2024-08-22T00:30:02Z

@xcu I think the defaults should suffice here.
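As a rough illustration of what "the defaults" could look like in practice, the sketch below builds a `CREATE INDEX ... USING diskann` statement with no `WITH (...)` clause, so no build parameters are overridden. The helper name and the table, column, and index names are hypothetical and not from this thread; whether to create the index on the parent table or on each partition is left open here.

```python
def default_diskann_index_sql(table: str, column: str, index_name: str) -> str:
    """Build a CREATE INDEX statement that relies on the diskann defaults.

    Illustrative helper only: identifiers are interpolated as-is, so pass
    trusted names. No WITH (...) clause means no build parameter is overridden.
    """
    return f"CREATE INDEX {index_name} ON {table} USING diskann ({column});"


# Hypothetical names, chosen for illustration.
sql = default_diskann_index_sql("embeddings", "embedding", "embeddings_diskann_idx")
print(sql)
```

If recall or latency later turns out to be unsatisfactory, this statement is the place where build parameters would be added, after checking the pgvectorscale documentation for the installed version.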

cevian added the "question" label · 2024-08-22
