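Vector databases store embeddings (vector representations) and search them efficiently by similarity, which makes semantic search, RAG, and recommendation systems possible. They are a key infrastructure component for modern AI applications that work with embeddings.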
What vector databases do
VECTOR DATABASE → stores EMBEDDINGS (vectors) and searches them by SIMILARITY:
→ store millions of vectors (representing documents, images, etc.)
→ given a query vector, efficiently find the most SIMILAR vectors (nearest neighbors)
→ optimized for fast, high-dimensional similarity search over large embedding collections
→ the result is semantic matching (by meaning) rather than exact keyword matching
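
The store-then-search behavior above can be sketched in a few lines. This is a minimal illustration, not how any particular product is implemented: the class name `TinyVectorStore`, the document ids, and the 3-dimensional vectors are all hypothetical, and the search is an exact linear scan using cosine similarity (one common similarity measure; real systems may also offer dot product or Euclidean distance).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: keeps (id, vector) pairs, searches by similarity."""

    def __init__(self):
        self._items = []

    def add(self, item_id, vector):
        self._items.append((item_id, list(vector)))

    def search(self, query, k=3):
        # Exact search: score every stored vector against the query.
        scored = [(cosine_similarity(query, vec), item_id)
                  for item_id, vec in self._items]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(item_id, score) for score, item_id in scored[:k]]


# Toy embeddings (made up for illustration; real ones have hundreds of dimensions).
store = TinyVectorStore()
store.add("doc-cats", [0.9, 0.1, 0.0])
store.add("doc-dogs", [0.8, 0.3, 0.1])
store.add("doc-tax", [0.0, 0.2, 0.9])

results = store.search([1.0, 0.0, 0.0], k=2)
for item_id, score in results:
    print(item_id, round(score, 3))
```

The query vector points in nearly the same direction as the two pet documents, so those rank first while the unrelated document is left out of the top 2.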
Why they are needed
→ semantic search/RAG need to find the most relevant items by EMBEDDING SIMILARITY
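→ comparing a query against every one of millions of vectors (exact, brute-force search) costs time proportional to collection size and is SLOW → vector DBs use approximate nearest neighbor (ANN) algorithms, trading a small amount of accuracy for much FASTER search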
→ purpose-built for the vector similarity search that AI applications need at scale
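
To make the ANN idea concrete, here is a rough sketch using random-hyperplane locality-sensitive hashing (LSH). This technique is chosen only because it fits in a few lines; production vector databases more commonly rely on graph-based or inverted-file indexes, and nothing here reflects a specific product. The class name `HyperplaneLSHIndex`, its parameters, and the random test data are assumptions made for illustration.

```python
import math
import random


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class HyperplaneLSHIndex:
    """Hypothetical ANN index: vectors on the same side of every random
    hyperplane share a bucket, and a query scans only its own bucket."""

    def __init__(self, dim, num_planes=8, seed=0):
        rng = random.Random(seed)
        self._planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                        for _ in range(num_planes)]
        self._buckets = {}

    def _signature(self, vector):
        # One bit per hyperplane: which side of the plane the vector falls on.
        return tuple(sum(p * x for p, x in zip(plane, vector)) >= 0.0
                     for plane in self._planes)

    def add(self, item_id, vector):
        bucket = self._buckets.setdefault(self._signature(vector), [])
        bucket.append((item_id, list(vector)))

    def candidates(self, query):
        return self._buckets.get(self._signature(query), [])

    def search(self, query, k=3):
        # Approximate: true neighbors that landed in other buckets are missed.
        scored = [(cosine_similarity(query, vec), item_id)
                  for item_id, vec in self.candidates(query)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(item_id, score) for score, item_id in scored[:k]]


# Random stand-in "embeddings" (not real data).
data_rng = random.Random(42)
vectors = {i: [data_rng.gauss(0.0, 1.0) for _ in range(16)] for i in range(1000)}

index = HyperplaneLSHIndex(dim=16, num_planes=8, seed=0)
for item_id, vec in vectors.items():
    index.add(item_id, vec)

hits = index.search(vectors[7], k=3)
candidates_scanned = len(index.candidates(vectors[7]))
print("top hit:", hits[0][0], "scanned", candidates_scanned, "of", len(vectors))
```

The design trade-off is visible in `search`: only one bucket is scored instead of the whole collection, which is where the speedup comes from, and also why results are approximate. Adding more hyperplanes shrinks the buckets (faster, lower recall); practical LSH setups typically query several hash tables to recover recall.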
