About us
We are the Vector Database Research Group at Nanyang Technological University. Our interests lie in high-dimensional vector data management and its applications in large models, such as retrieval-augmented generative AI.
Large-scale high-dimensional vector data is now ubiquitous. For instance, unstructured data such as images, videos, text, and speech is typically transformed into vectors using deep learning techniques, and these vectors are then used in downstream analytical tasks. Nearest neighbor (NN) search in high-dimensional vector space is a fundamental problem with a wide range of applications in information retrieval, recommendation, and retrieval-based large language models. We have developed several techniques for approximate NN (ANN) search, including:
- SymphonyQG for combining a graph-based index with quantization (SIGMOD'25)
- Extended RaBitQ for more flexible quantization with varying compression rates (arXiv)
- iRangeGraph for attribute-filtered ANN (SIGMOD'25)
- RaBitQ for quantizing high-dimensional vectors (SIGMOD'24)
- ADSampling for efficient and reliable distance comparisons (SIGMOD'23)
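
To make the underlying problem concrete, here is a minimal sketch of exact NN search by linear scan, the baseline that ANN methods trade a little accuracy to speed up. It is not one of the techniques listed above; the function name `exact_knn` and the toy data are illustrative assumptions.

```python
import heapq
import math


def exact_knn(query, vectors, k=1):
    """Return the indices of the k vectors closest to `query` (Euclidean distance)."""
    # Linear scan: compute the distance from the query to every stored vector.
    distances = ((math.dist(query, v), i) for i, v in enumerate(vectors))
    # Keep the k smallest (distance, index) pairs, closest first.
    return [i for _, i in heapq.nsmallest(k, distances)]


# Toy 3-dimensional data set (hypothetical values, for illustration only).
data = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.9, 1.1, 1.0), (5.0, 5.0, 5.0)]
nearest = exact_knn((1.0, 1.0, 0.9), data, k=2)
```

Every query compares against all stored vectors, so the cost grows linearly with both the number of vectors and their dimensionality. That cost is what motivates approximate methods on large, high-dimensional collections.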