
Product Quantization

Product Quantization (PQ) is a lossy vector compression technique used in machine learning and information retrieval to store and search high-dimensional vectors, such as embeddings or feature descriptors, efficiently. It works by splitting each vector into a fixed number of equal-length subvectors, quantizing each subvector independently against its own small codebook (typically learned with k-means), and representing the original vector as the sequence of codebook indices, one per subvector. Because only these compact indices are stored, memory usage drops sharply, and similarity search operations such as nearest neighbor retrieval become faster: distances can be approximated from small precomputed lookup tables instead of from the full vectors.
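The steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the function names (kmeans, pq_train, pq_encode, pq_decode, pq_search), the parameters m (number of subvectors) and k (centroids per codebook), the plain Lloyd's k-means used for training, and the random demo data are all choices made for this example.

```python
import numpy as np

def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumption of this sketch); returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Squared distance from every point to every centroid: shape (n, k)
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # an empty cluster keeps its previous centroid
                centroids[j] = members.mean(axis=0)
    return centroids

def pq_train(x, m, k):
    """Learn one codebook per subspace; returns an array of shape (m, k, d // m)."""
    n, d = x.shape
    assert d % m == 0, "dimension must be divisible by the number of subvectors"
    assert k <= 256, "this sketch stores each index in one byte"
    sub = x.reshape(n, m, d // m)
    return np.stack([kmeans(sub[:, i, :], k, seed=i) for i in range(m)])

def pq_encode(x, codebooks):
    """Replace each subvector by the index of its nearest centroid."""
    m, k, ds = codebooks.shape
    sub = x.reshape(len(x), m, ds)
    codes = np.empty((len(x), m), dtype=np.uint8)
    for i in range(m):
        d2 = ((sub[:, i, None, :] - codebooks[i][None, :, :]) ** 2).sum(axis=2)
        codes[:, i] = d2.argmin(axis=1)
    return codes

def pq_decode(codes, codebooks):
    """Approximate reconstruction: concatenate the selected centroids."""
    m = codebooks.shape[0]
    return np.concatenate([codebooks[i][codes[:, i]] for i in range(m)], axis=1)

def pq_search(query, codes, codebooks):
    """Approximate squared distances from an uncompressed query to all encoded vectors."""
    m, k, ds = codebooks.shape
    q = query.reshape(m, ds)
    # Lookup table: distance from each query subvector to each centroid, shape (m, k)
    table = ((codebooks - q[:, None, :]) ** 2).sum(axis=2)
    # Sum the table entries selected by each vector's codes: shape (n,)
    return table[np.arange(m), codes].sum(axis=1)

# Demo on random data (hypothetical sizes, chosen to run quickly)
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 32)).astype(np.float32)
codebooks = pq_train(data, m=4, k=16)
codes = pq_encode(data, codebooks)       # 4 bytes per vector instead of 128
approx = pq_decode(codes, codebooks)     # lossy reconstruction
dists = pq_search(data[0], codes, codebooks)
nearest = np.argsort(dists)[:5]          # indices of 5 approximate nearest neighbors
```

In this sketch the search never reconstructs the stored vectors: it builds one small table per query and then sums m table lookups per database vector, which is where the speedup over full-vector distance computation comes from.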

Also known as: PQ, Product Quantisation, Subvector Quantization

Related terms: Vector Quantization (the more general technique that PQ builds on), Quantized Embeddings

🧊 Why learn Product Quantization?

Developers should learn Product Quantization when building large-scale similarity search systems, such as recommendation engines, image retrieval, or natural language processing applications, where high-dimensional vectors are common. It is particularly useful when billions of vectors must be stored compactly and queried quickly: it enables approximate nearest neighbor search at a much lower computational and memory cost than exact methods, in exchange for a small loss in accuracy.
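To make the memory savings concrete, here is a back-of-the-envelope calculation. The figures are illustrative assumptions, not measurements: one billion 128-dimensional float32 vectors, compressed with 16 subvectors and 256 centroids per codebook (so each index fits in one byte). The helper names are hypothetical.

```python
def raw_bytes(n, d, bytes_per_value=4):
    """Storage for n uncompressed d-dimensional vectors (float32 by default)."""
    return n * d * bytes_per_value

def pq_code_bytes(n, m, bits_per_index=8):
    """Storage for n PQ codes with m indices each."""
    return n * m * bits_per_index // 8

def pq_codebook_bytes(d, k, bytes_per_value=4):
    """Storage for all codebooks: m codebooks of k centroids of d/m values = k * d values."""
    return k * d * bytes_per_value

# Assumed sizes for illustration only
n, d, m, k = 1_000_000_000, 128, 16, 256

raw = raw_bytes(n, d)                # 512,000,000,000 bytes (512 GB)
codes = pq_code_bytes(n, m)          # 16,000,000,000 bytes (16 GB)
codebooks = pq_codebook_bytes(d, k)  # 131,072 bytes, negligible at this scale
ratio = raw / (codes + codebooks)    # roughly 32x smaller
```

Under these assumptions the compressed index is about 32 times smaller than the raw vectors; using fewer subvectors compresses further but increases the approximation error.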