[Matrix] Matched() has a lower complexity
authorSimon Chabot <simon.chabot@logilab.fr>
Fri, 19 Oct 2012 10:46:14 +0200
changeset 29 5b1f828f36c6
parent 28 d9c1ef721418
child 30 6b2b4a8d5ff7
[Matrix] Matched() has a lower complexity Instead of reading the whole matrix and get indexes where the value is under the cutoff (O(N²)), the idea is : - Get all indexes where the value is not null (ie not exact matched) O(N) - Append all indexes that are not in the previous list (exact matched) O(N) - If cutoff > 0, for all indexes where not null, test if the O(N) çvalue < cutoff, and add or not
matrix.py
--- a/matrix.py	Fri Oct 19 16:58:27 2012 +0200
+++ b/matrix.py	Fri Oct 19 10:46:14 2012 +0200
@@ -59,8 +59,19 @@
         match = defaultdict(list)
         if normalized:
             cutoff *= self._maxdist
-        for i in xrange(len(self.input1)):
-            for j in xrange(i, len(self.input2)):
+
+        row, col = self._matrix.nonzero()
+        rowcol = zip(row, col)
+
+        #Get those that exactly matched
+        size = self._matrix.get_shape()
+        allindexes = (zip(xrange(size[0]), xrange(size[1])))
+        zeros = [index for index in allindexes if index not in rowcol]
+        for (i, j) in zeros:
+            match[i].append(j)
+
+        if cutoff > 0: #If more is wanted, return it too
+            for (i, j) in rowcol:
                 if self._matrix[i, j] <= cutoff:
                     match[i].append(j)
         return match