You are here: Symbol Reference > MtxForLoop Namespace > Functions > MtxForLoop.ClusteredKNN Function
MtxVec VCL
ContentsIndex
PreviousUpNext
MtxForLoop.ClusteredKNN Function

Exact brute-force K-NN without multi-threading.

Pascal
procedure ClusteredKNN(const srcData: TMtx; const srcCompareData: TMtx; const K: integer; const dstNearIdx: TMtxInt; const dstNearDistance: TMtx); overload;
Parameters 
Description 
srcData 
A matrix where each row is one vector of the original data.  
srcCompareData 
A matrix where each row is one vector. We want to find K vectors from srcData for each srcCompareData vector.  
The number of nearest neighbours/rows from SrcData to find for each row in SrcCompareData.  
dstNearIdx 
Contains K indexes of rows from srcData for each row in SrcCompareData on output. The actual distances are stored in dstNearDistance parameter.  
dstNearDistance 
Contains K distance of rows from srcData for each row in SrcCompareData on output. The distances are not sorted.  

K-nearest-neighbours will be computing the square of L2-Norm (distance) between rows in srcData and rows in srcCompareData to find K nearest rows from srcData for each row in the srcCompareData matrix. The SrcData.Cols and SrcCompareData.Cols need to match in size or an exception will be raised. The sizes of dstNearIdx, dstNearDistance and dstNearLargestIdx are set automatically. 

Performance notes:

  • This variant does not use threads and is suitable for user-side multi-threading.
  • The algorithm assumes existence of 32KB L1 cache. This is true for all Intel CPUs since Sandy Bridge and AMD Zen. The performance will drop by roughly 10x, if not present.
  • The algorithms efficiency grows with the size of srcData and especially of the srcCompareData matrices.
uses MtxExpr, MtxVec, MtxExprInt; var Data, cmpData, nearDistance: Matrix; nearIdx: MatrixInt; begin data := [[1,2,3], [2,3,1], [3,2,1]]; cmpData := [[2,2,3], [2,3,4]]; ClusteredKNN(computeThreads, Data, cmpData, 2, nearIdx, nearDistance); // Results: // nearIdx = [[0, 1], // data rows 0 and 1 are closest vectors to cmpData row 0. // [0, 1]] // data rows 0 and 1 are closest vectors to cmpData row 1. // nearDistance = [[1 5], // [0,0] = sqr(2-1) + sqr(2-2) + sqr(3-3), distance from cmpData row 0 to Data row 0 is 1 // [3, 9]] // [1,0] = sqr(2-1) + sqr(3-2) + sqr(4-3), distance from cmpData row 1 to Data row 0 is 3
Examples on GitHub
Copyright (c) 1999-2025 by Dew Research. All rights reserved.
What do you think about this topic? Send feedback!