Class Lucene104ScalarQuantizedVectorsFormat

java.lang.Object
org.apache.lucene.codecs.KnnVectorsFormat
org.apache.lucene.codecs.hnsw.FlatVectorsFormat
org.apache.lucene.codecs.lucene104.Lucene104ScalarQuantizedVectorsFormat
All Implemented Interfaces:
NamedSPILoader.NamedSPI

public class Lucene104ScalarQuantizedVectorsFormat extends FlatVectorsFormat
The quantization format used here is a per-vector optimized scalar quantization. These ideas are evolutions of LVQ, proposed in "Similarity search in the blink of an eye with compressed indices" by Cecilia Aguerrebere et al., of the previous work on globally optimized scalar quantization in Apache Lucene, and of "Accelerating Large-Scale Inference with Anisotropic Vector Quantization" by Ruiqi Guo et al. Also see OptimizedScalarQuantizer. Some of the key features are:
  • Estimating the distance between two vectors using their centroid-centered distance. This requires storing some additional corrective factors, but makes centroid centering possible.
  • Optimized scalar quantization of centroid-centered vectors down to single-bit precision.
  • Asymmetric quantization of vectors, where query vectors are quantized to half-byte (4 bits) precision (normalized to the centroid) and then compared directly against the single-bit quantized vectors in the index.
  • Transforming the half-byte quantized query vectors in such a way that the comparison with single-bit vectors can be done with bitwise arithmetic.
Earlier related work on improving regular LVQ is "Practical and Asymptotically Optimal Quantization of High-Dimensional Vectors in Euclidean Space for Approximate Nearest Neighbor Search" by Jianyang Gao et al.
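
The asymmetric comparison listed among the key features can be illustrated with a small, self-contained sketch. It is not Lucene's implementation: the class and method names (AsymmetricBitDotSketch, transposeQuery, packBits, dot) are hypothetical, and the bit-plane layout is an assumption chosen for clarity. The idea shown is that a 4-bit query can be split into four bit planes, so that the dot product with a single-bit vector reduces to AND and popcount operations.

```java
/** Hypothetical sketch of a 4-bit query vs. 1-bit document dot product; not Lucene's code. */
final class AsymmetricBitDotSketch {

  /** Splits 4-bit query values (one per dimension, each 0..15) into 4 packed bit planes. */
  static long[][] transposeQuery(int[] q4) {
    int words = (q4.length + 63) / 64;
    long[][] planes = new long[4][words];
    for (int d = 0; d < q4.length; d++) {
      for (int b = 0; b < 4; b++) {
        if (((q4[d] >>> b) & 1) != 0) {
          planes[b][d >>> 6] |= 1L << (d & 63); // bit b of dimension d goes to plane b
        }
      }
    }
    return planes;
  }

  /** Packs single-bit document values (0 or 1 per dimension) into 64-bit words. */
  static long[] packBits(int[] bits) {
    long[] packed = new long[(bits.length + 63) / 64];
    for (int d = 0; d < bits.length; d++) {
      if (bits[d] != 0) {
        packed[d >>> 6] |= 1L << (d & 63);
      }
    }
    return packed;
  }

  /** Dot product using only AND, popcount and shifts: sum over planes of popcount << plane. */
  static long dot(long[][] queryPlanes, long[] docBits) {
    long sum = 0;
    for (int b = 0; b < 4; b++) {
      long count = 0;
      for (int w = 0; w < docBits.length; w++) {
        count += Long.bitCount(queryPlanes[b][w] & docBits[w]);
      }
      sum += count << b; // plane b carries weight 2^b
    }
    return sum;
  }

  /** Reference implementation used to check the bitwise version. */
  static long naiveDot(int[] q4, int[] bits) {
    long sum = 0;
    for (int d = 0; d < q4.length; d++) {
      sum += (long) q4[d] * bits[d];
    }
    return sum;
  }
}
```

For the query values {3, 15, 0, 7, 8} and the document bits {1, 0, 1, 1, 1}, both dot and naiveDot return 18. The corrective factors described above would still have to be applied to turn such an integer result into a distance estimate; that step is omitted here.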

The format is stored within two files:

.veq (vector data) file

Stores the quantized vectors in a flat format. Additionally, it stores each vector's corrective factors. At the end of the file, additional information is stored for vector ordinal to centroid ordinal mapping and sparse vector information.

  • For each vector:
    • [byte] the quantized values. Each dimension may be up to 8 bits, and multiple dimensions may be packed into a single byte.
    • [float] the optimized quantiles and an additional similarity dependent corrective factor.
    • [int] the sum of the quantized components
  • After the vectors, sparse vector information keeping track of monotonic blocks.
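
The per-vector record above can be made concrete with a minimal sketch. Several things here are assumptions rather than facts about the format: the names (PerVectorQuantizationSketch, Quantized, quantize, recordBytes) are hypothetical, the interval is taken as the plain minimum and maximum of the centered vector as a stand-in for the optimized quantiles, dimensions are assumed to be packed tightly with no padding, and the float section is assumed to hold two quantiles plus one corrective factor.

```java
/** Hypothetical sketch of per-vector scalar quantization; not Lucene's OptimizedScalarQuantizer. */
final class PerVectorQuantizationSketch {

  /** Quantized values in [0, 2^bits - 1] plus the per-vector terms stored next to them. */
  static final class Quantized {
    final int[] values;
    final float lower; // lower end of the quantization interval
    final float upper; // upper end of the quantization interval
    final int componentSum; // sum of the quantized components

    Quantized(int[] values, float lower, float upper, int componentSum) {
      this.values = values;
      this.lower = lower;
      this.upper = upper;
      this.componentSum = componentSum;
    }
  }

  /** Centers the vector on the centroid, then quantizes onto a min/max interval (assumption). */
  static Quantized quantize(float[] vector, float[] centroid, int bits) {
    int levels = (1 << bits) - 1;
    float[] centered = new float[vector.length];
    float lower = Float.POSITIVE_INFINITY;
    float upper = Float.NEGATIVE_INFINITY;
    for (int i = 0; i < vector.length; i++) {
      centered[i] = vector[i] - centroid[i];
      lower = Math.min(lower, centered[i]);
      upper = Math.max(upper, centered[i]);
    }
    float step = (upper - lower) / levels;
    int[] values = new int[vector.length];
    int sum = 0;
    for (int i = 0; i < vector.length; i++) {
      int v = step == 0f ? 0 : Math.round((centered[i] - lower) / step);
      values[i] = Math.max(0, Math.min(levels, v)); // clamp to the representable range
      sum += values[i];
    }
    return new Quantized(values, lower, upper, sum);
  }

  /** Bytes per record, assuming tight bit packing, 3 floats and 1 int (all assumptions). */
  static int recordBytes(int dims, int bits) {
    int packedValues = (dims * bits + 7) / 8;
    return packedValues + 3 * Float.BYTES + Integer.BYTES;
  }
}
```

Under these assumptions, a 1024-dimensional vector at 1 bit per dimension would occupy 128 bytes of packed values plus 16 bytes of corrective terms, that is, recordBytes(1024, 1) returns 144.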

.vemq (vector metadata) file

Stores the metadata for the vectors. This includes the number of vectors, the number of dimensions, and file offset information.

  • int the field number
  • int the vector encoding ordinal
  • int the vector similarity ordinal
  • vint the vector dimensions
  • vlong the offset to the vector data in the .veq file
  • vlong the length of the vector data in the .veq file
  • vint the number of vectors
  • vint the wire number for ScalarEncoding
  • [float] the centroid
  • float the centroid square magnitude
  • The sparse vector information, if required, mapping vector ordinal to doc ID
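
Several of the metadata entries above are written as vint or vlong values. As a reminder of what that encoding looks like, here is a minimal standalone sketch of the variable-length integer scheme: seven payload bits per byte, least significant group first, with the high bit of each byte signalling that more bytes follow. The class VIntSketch and its methods are hypothetical helpers for illustration only; real code would read these values through Lucene's own input and output classes.

```java
/** Hypothetical standalone sketch of variable-length int encoding; not a Lucene class. */
final class VIntSketch {

  /** Encodes a non-negative int using 7 bits per byte, low-order groups first. */
  static byte[] encode(int value) {
    byte[] buf = new byte[5]; // an int needs at most 5 groups of 7 bits
    int n = 0;
    while ((value & ~0x7F) != 0) {
      buf[n++] = (byte) ((value & 0x7F) | 0x80); // high bit set: more bytes follow
      value >>>= 7;
    }
    buf[n++] = (byte) value; // final byte has the high bit clear
    return java.util.Arrays.copyOf(buf, n);
  }

  /** Decodes a value produced by encode, stopping at the first byte without the high bit. */
  static int decode(byte[] bytes) {
    int value = 0;
    int shift = 0;
    for (byte b : bytes) {
      value |= (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        break;
      }
      shift += 7;
    }
    return value;
  }
}
```

With this scheme, the value 300 is encoded as the two bytes 0xAC 0x02, while any value up to 127 fits in a single byte, which is why small counts such as the vector dimensions are stored compactly.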