Pinecone Systems
Accurate Metadata Filtering in Pinecone Vector Database
Pages
7
Time to read
19 mins
Publication
Language
English
This technical report presents the design and implementation of metadata filtering in Pinecone’s serverless vector database. It addresses the challenge of keeping filtering accurate while handling continuous data mutations and independent metadata updates. The report describes how the architecture integrates filtering into the vector retrieval path, using immutable vector slabs organized in an LSM-tree structure. It explains the filtering model, in which metadata is expressed as (key, value) pairs, and the complexities of applying filters during approximate nearest neighbor (ANN) search. It also evaluates filtered ANN search on a public dataset and on production data, showing that the system maintains exact filtering accuracy while achieving scalable performance. Finally, it outlines the interaction patterns between filters and ANN algorithms and identifies key challenges and future research directions for vector databases.
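
To make the filtering model concrete, here is a minimal sketch of what exact filtering over (key, value) metadata means for a nearest neighbor query. It is an illustration only, not Pinecone's implementation: it uses a brute-force scan rather than an ANN index or LSM-organized slabs, and the names `RECORDS`, `matches`, and `filtered_search` are hypothetical.

```python
from math import sqrt

# Hypothetical toy records: (id, vector, metadata), where metadata is a
# dict of (key, value) pairs attached to each vector.
RECORDS = [
    ("a", (1.0, 0.0), {"genre": "news", "year": 2023}),
    ("b", (0.9, 0.1), {"genre": "sports", "year": 2023}),
    ("c", (0.0, 1.0), {"genre": "news", "year": 2021}),
    ("d", (0.8, 0.2), {"genre": "news", "year": 2023}),
]


def matches(metadata, flt):
    """Return True only if every (key, value) pair in the filter is present."""
    return all(metadata.get(key) == value for key, value in flt.items())


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def filtered_search(query, flt, k):
    """Brute-force top-k search restricted to records that satisfy the filter.

    Exact filtering here means no returned id ever violates the filter,
    even if fewer than k records qualify.
    """
    candidates = [
        (distance(query, vector), record_id)
        for record_id, vector, metadata in RECORDS
        if matches(metadata, flt)
    ]
    return [record_id for _, record_id in sorted(candidates)[:k]]


result = filtered_search((1.0, 0.0), {"genre": "news", "year": 2023}, k=2)
print(result)
```

In this sketch the filter is applied before ranking, so record `b` is excluded even though it is closer to the query than `d`. Doing the same inside an approximate index, while vectors and metadata keep changing, is the harder problem the report addresses.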