Yumi's Blog

Calculate Mahalanobis distance with tensorflow 2.0

I noticed that tensorflow does not have functions to compute Mahalanobis distance between two groups of samples. So here I go and provide the code with explanation. The blog is organized and explain the following topics

  • Euclidean distance with Scipy
  • Euclidean distance with Tensorflow v2
  • Mahalanobis distance with Scipy
  • Mahalanobis distance with Tensorflow v2
In [1]:
import tensorflow as tf
print("tensorflow:",tf.__version__)
tensorflow: 2.1.0

Euclidean distance with Spicy

Here is Scipy version of calculating the Euclidean distance between two group of samples:

$$ \boldsymbol{a}, R^{\textrm{M1 x n_feat}} \boldsymbol{b} \in R^{\textrm{M2 x n_feat}} $$

At the end we want a distance matrix of size

$$ npeuc \in R^{M1 x M2} $$

In the following code, there is a double loop and distance is calculated for each sample pair one by one. This is not very efficient!

In [2]:
import numpy as np
from scipy.spatial.distance import cdist
from scipy.spatial import distance

M1, M2, n_feat = 5, 4, 2

# Scipy calculation
a = np.random.rand(M1, n_feat).astype(np.float32)
b = np.random.rand(M2, n_feat).astype(np.float32)

print("a")
print(a)
print("b")
print(b)
npeuc = [[0 for _ in range(b.shape[0])]
          for _ in range(a.shape[0])]
for ib in range(b.shape[0]):
    for ia in range(a.shape[0]):
        npeuc[ia][ib] = distance.euclidean(a[ia],b[ib])
npeuc = np.array(npeuc)
print("Euclidean distances with scipy")
print(npeuc.shape)
a
[[0.3898384  0.78091735]
 [0.18484533 0.16991404]
 [0.04142497 0.0612804 ]
 [0.76031846 0.3387485 ]
 [0.86538064 0.8510765 ]]
b
[[0.9657545  0.78346694]
 [0.167351   0.7013792 ]
 [0.5052127  0.01969893]
 [0.7475627  0.29226208]]
Euclidean distances with scipy
(5, 4)

Euclidean distance with Tenworflow v2

In [3]:
def Euclidean(A,B):
        v = tf.expand_dims(tf.reduce_sum(tf.square(A), 1), 1)
        p1 = tf.reshape(tf.reduce_sum(v,axis=1),(-1,1))
        v = tf.reshape(tf.reduce_sum(tf.square(B), 1), shape=[-1, 1])
        p2 = tf.transpose(tf.reshape(tf.reduce_sum(v,axis=1),(-1,1)))
        res = tf.sqrt(tf.add(p1, p2) - 2 * tf.matmul(A, B, transpose_b=True))
        return(res)


A = tf.Variable(a)
B = tf.Variable(b)
print("A")
print(A)
print("B")
print(B)

res = Euclidean(A,B)
tfeuc = res.numpy()
print("Euclidean distances with tensorflow")
tfeuc
#with tf.Session() as sess: tensorflow v2!!
#    print(sess.run(res))
A

B

Euclidean distances with tensorflow
Out[3]:
array([[0.5759218 , 0.23627718, 0.7699122 , 0.60559934],
       [0.9931094 , 0.531753  , 0.35383588, 0.57586443],
       [1.173004  , 0.65236783, 0.46564806, 0.74295557],
       [0.48987594, 0.69506216, 0.4084993 , 0.04820526],
       [0.12102088, 0.7139011 , 0.9060406 , 0.57109946]], dtype=float32)

Let's compare the aggreement between the two Euclidean distances:

In [4]:
print("average difference in the np calculation and tf calculations are:",np.mean(np.abs(tfeuc - npeuc)))
average difference in the np calculation and tf calculations are: 7.003545761108399e-08

Maharanobis distance with Scipy

Now we are ready to calculate Mahalanobis distance. As before, let's start with Scipy version.

In [5]:
import tensorflow as tf
import numpy as np
from scipy.spatial.distance import cdist
from scipy.spatial import distance

M1, M2, n_feat = 5, 4, 2

# Scipy calculation
a = np.random.rand(M1, n_feat).astype(np.float32)
b = np.random.rand(M2, n_feat).astype(np.float32)
# inverse of the  covariance matrix
S = np.random.rand(n_feat,n_feat).astype(np.float32)
S = np.matmul(S,np.transpose(S))

print("a")
print(a)
print("b")
print(b)

npmah = [[0 for _ in range(b.shape[0])]
          for _ in range(a.shape[0])]
for ib in range(b.shape[0]):
    for ia in range(a.shape[0]):
        npmah[ia][ib] = distance.mahalanobis(a[ia],b[ib],S)
npmah = np.array(npmah)
print("Maharanobis distances with scipy")
npmah
a
[[0.9467549  0.22613871]
 [0.8387713  0.80988747]
 [0.82303214 0.51048654]
 [0.24918175 0.1219909 ]
 [0.70037013 0.82121724]]
b
[[0.37487715 0.6706169 ]
 [0.13873063 0.11496447]
 [0.7132854  0.5411484 ]
 [0.4853384  0.67873824]]
Maharanobis distances with scipy
Out[5]:
array([[0.3778871 , 0.43882653, 0.2763056 , 0.38677463],
       [0.3112256 , 0.9181496 , 0.3075011 , 0.25844806],
       [0.17797098, 0.6275558 , 0.04191556, 0.15506305],
       [0.5897171 , 0.05437765, 0.5684887 , 0.62985957],
       [0.2643244 , 0.8821087 , 0.28097183, 0.2141314 ]], dtype=float32)

Mahalanobis distance with tensorflow

The mahalanobis distances of two samples $\boldsymbol{x}$ and $\boldsymbol{y}$ $\in R^{Nfeat}$ with covariance matrix $\Sigma \in R^{\textrm{N_feat x N_feat}}$ are defined as:

\begin{array}{rl} (\boldsymbol{x}-\boldsymbol{y})^T \Sigma^{-1}(\boldsymbol{x}-\boldsymbol{y}) \in R^1 \end{array}

My calcualtion of Mahalanobis distance utilizes the calculation of Euclidean distance described above. How? Let me show you!

The key is to use Cholesky decomposition. Cholesky decomposition allows decomposing real symmetric matrix $A$ into two matrix $A=LL^T$. Fortunately covariance matrix $\Sigma$ is symmetric matrix so we can use this decomposition and get a matrix $\boldsymbol{C}$

\begin{array}{rl} \Sigma^{-1} = \boldsymbol{C}^T\boldsymbol{C}, \end{array} where $\boldsymbol{C}\in R^{\textrm{N_feat x N_feat}}$

Then consider standardized vectors:

  • $\boldsymbol{x}^*=\boldsymbol{C} \boldsymbol{x}\;\;\in R^{Nfeat}$
  • $\boldsymbol{y}^*=\boldsymbol{C} \boldsymbol{y}$

Their Euclidean distances are:

\begin{array}{rl} ||\boldsymbol{y}^* - \boldsymbol{x}^* || &= (\boldsymbol{y}^* - \boldsymbol{x}^* )^T(\boldsymbol{y}^* - \boldsymbol{x}^* )\\ &= (\boldsymbol{C}\boldsymbol{y} - \boldsymbol{C}\boldsymbol{x})^T(\boldsymbol{C}\boldsymbol{y}- \boldsymbol{C}\boldsymbol{x} )\\ &= (\boldsymbol{y} - \boldsymbol{x})^T\boldsymbol{C}^T\boldsymbol{C}(\boldsymbol{y}- \boldsymbol{x} ) &= (\boldsymbol{y} - \boldsymbol{x})^T\Sigma^{-1}(\boldsymbol{y}- \boldsymbol{x} ) \end{array}

Notice that the Euclidean distance between $\boldsymbol{x}^*$ and $\boldsymbol{y}^*$ is Mahalanobis distance between $\boldsymbol{x}$ and $\boldsymbol{y}$.

Using this idea, we calculate the Mahalanobis distances.

In [6]:
def EfficientMaharanobis(A,B,invS):
    '''
    
    A : tensor, N sample1 by N feat
    B : tensor, N sample2 by N feat
    S : tensor, N feat by N feat 
    
    Output:
    
    marahanobis distance of each pair (A[i],B[j]) with inv variance S
    
    
    '''
    S_half = tf.linalg.cholesky(invS)
    A_star = tf.matmul(A,S_half)
    B_star = tf.matmul(B,S_half)

    res = Euclidean(A_star,B_star)
    return(res)


# TF calculation
A = tf.Variable(a)
B = tf.Variable(b)
S = tf.Variable(S)


print("A")
print(A)
print("B")
print(B)
print("S")
print(S)

tfmah = EfficientMaharanobis(A,B,S)

print("Maharanobis distances with tensorflow v2")
tfmah.numpy()
A

B

S

Maharanobis distances with tensorflow v2
Out[6]:
array([[0.37788695, 0.43882656, 0.27630562, 0.38677463],
       [0.31122538, 0.9181496 , 0.3075013 , 0.25844818],
       [0.17797098, 0.6275558 , 0.04191547, 0.15506293],
       [0.58971703, 0.05437763, 0.5684887 , 0.6298596 ],
       [0.26432416, 0.8821087 , 0.28097183, 0.2141315 ]], dtype=float32)
In [7]:
print("average difference in the np calculation and tf calculations are:",np.mean(np.abs(tfmah - npmah)))
average difference in the np calculation and tf calculations are: 7.0594254e-08

Comments