Calculate Mahalanobis distance with tensorflow 2.0

I noticed that tensorflow does not have functions to compute Mahalanobis distance between two groups of samples. So here I go and provide the code with explanation. The blog is organized and explain the following topics

• Euclidean distance with Scipy
• Euclidean distance with Tensorflow v2
• Mahalanobis distance with Scipy
• Mahalanobis distance with Tensorflow v2
In [1]:
import tensorflow as tf
print("tensorflow:",tf.__version__)

tensorflow: 2.1.0


Euclidean distance with Spicy¶

Here is Scipy version of calculating the Euclidean distance between two group of samples:

$$\boldsymbol{a}, R^{\textrm{M1 x n_feat}} \boldsymbol{b} \in R^{\textrm{M2 x n_feat}}$$

At the end we want a distance matrix of size

$$npeuc \in R^{M1 x M2}$$

In the following code, there is a double loop and distance is calculated for each sample pair one by one. This is not very efficient!

In [2]:
import numpy as np
from scipy.spatial.distance import cdist
from scipy.spatial import distance

M1, M2, n_feat = 5, 4, 2

# Scipy calculation
a = np.random.rand(M1, n_feat).astype(np.float32)
b = np.random.rand(M2, n_feat).astype(np.float32)

print("a")
print(a)
print("b")
print(b)
npeuc = [[0 for _ in range(b.shape[0])]
for _ in range(a.shape[0])]
for ib in range(b.shape[0]):
for ia in range(a.shape[0]):
npeuc[ia][ib] = distance.euclidean(a[ia],b[ib])
npeuc = np.array(npeuc)
print("Euclidean distances with scipy")
print(npeuc.shape)

a
[[0.3898384  0.78091735]
[0.18484533 0.16991404]
[0.04142497 0.0612804 ]
[0.76031846 0.3387485 ]
[0.86538064 0.8510765 ]]
b
[[0.9657545  0.78346694]
[0.167351   0.7013792 ]
[0.5052127  0.01969893]
[0.7475627  0.29226208]]
Euclidean distances with scipy
(5, 4)


Euclidean distance with Tenworflow v2¶

In [3]:
def Euclidean(A,B):
v = tf.expand_dims(tf.reduce_sum(tf.square(A), 1), 1)
p1 = tf.reshape(tf.reduce_sum(v,axis=1),(-1,1))
v = tf.reshape(tf.reduce_sum(tf.square(B), 1), shape=[-1, 1])
p2 = tf.transpose(tf.reshape(tf.reduce_sum(v,axis=1),(-1,1)))
res = tf.sqrt(tf.add(p1, p2) - 2 * tf.matmul(A, B, transpose_b=True))
return(res)

A = tf.Variable(a)
B = tf.Variable(b)
print("A")
print(A)
print("B")
print(B)

res = Euclidean(A,B)
tfeuc = res.numpy()
print("Euclidean distances with tensorflow")
tfeuc
#with tf.Session() as sess: tensorflow v2!!
#    print(sess.run(res))

A

B

Euclidean distances with tensorflow

Out[3]:
array([[0.5759218 , 0.23627718, 0.7699122 , 0.60559934],
[0.9931094 , 0.531753  , 0.35383588, 0.57586443],
[1.173004  , 0.65236783, 0.46564806, 0.74295557],
[0.48987594, 0.69506216, 0.4084993 , 0.04820526],
[0.12102088, 0.7139011 , 0.9060406 , 0.57109946]], dtype=float32)

Let's compare the aggreement between the two Euclidean distances:

In [4]:
print("average difference in the np calculation and tf calculations are:",np.mean(np.abs(tfeuc - npeuc)))

average difference in the np calculation and tf calculations are: 7.003545761108399e-08


Maharanobis distance with Scipy¶

In [5]:
import tensorflow as tf
import numpy as np
from scipy.spatial.distance import cdist
from scipy.spatial import distance

M1, M2, n_feat = 5, 4, 2

# Scipy calculation
a = np.random.rand(M1, n_feat).astype(np.float32)
b = np.random.rand(M2, n_feat).astype(np.float32)
# inverse of the  covariance matrix
S = np.random.rand(n_feat,n_feat).astype(np.float32)
S = np.matmul(S,np.transpose(S))

print("a")
print(a)
print("b")
print(b)

npmah = [[0 for _ in range(b.shape[0])]
for _ in range(a.shape[0])]
for ib in range(b.shape[0]):
for ia in range(a.shape[0]):
npmah[ia][ib] = distance.mahalanobis(a[ia],b[ib],S)
npmah = np.array(npmah)
print("Maharanobis distances with scipy")
npmah

a
[[0.9467549  0.22613871]
[0.8387713  0.80988747]
[0.82303214 0.51048654]
[0.24918175 0.1219909 ]
[0.70037013 0.82121724]]
b
[[0.37487715 0.6706169 ]
[0.13873063 0.11496447]
[0.7132854  0.5411484 ]
[0.4853384  0.67873824]]
Maharanobis distances with scipy

Out[5]:
array([[0.3778871 , 0.43882653, 0.2763056 , 0.38677463],
[0.3112256 , 0.9181496 , 0.3075011 , 0.25844806],
[0.17797098, 0.6275558 , 0.04191556, 0.15506305],
[0.5897171 , 0.05437765, 0.5684887 , 0.62985957],
[0.2643244 , 0.8821087 , 0.28097183, 0.2141314 ]], dtype=float32)

Mahalanobis distance with tensorflow¶

The mahalanobis distances of two samples $\boldsymbol{x}$ and $\boldsymbol{y}$ $\in R^{Nfeat}$ with covariance matrix $\Sigma \in R^{\textrm{N_feat x N_feat}}$ are defined as:

\begin{array}{rl} (\boldsymbol{x}-\boldsymbol{y})^T \Sigma^{-1}(\boldsymbol{x}-\boldsymbol{y}) \in R^1 \end{array}

My calcualtion of Mahalanobis distance utilizes the calculation of Euclidean distance described above. How? Let me show you!

The key is to use Cholesky decomposition. Cholesky decomposition allows decomposing real symmetric matrix $A$ into two matrix $A=LL^T$. Fortunately covariance matrix $\Sigma$ is symmetric matrix so we can use this decomposition and get a matrix $\boldsymbol{C}$

\begin{array}{rl} \Sigma^{-1} = \boldsymbol{C}^T\boldsymbol{C}, \end{array} where $\boldsymbol{C}\in R^{\textrm{N_feat x N_feat}}$

Then consider standardized vectors:

• $\boldsymbol{x}^*=\boldsymbol{C} \boldsymbol{x}\;\;\in R^{Nfeat}$
• $\boldsymbol{y}^*=\boldsymbol{C} \boldsymbol{y}$

Their Euclidean distances are:

\begin{array}{rl} ||\boldsymbol{y}^* - \boldsymbol{x}^* || &= (\boldsymbol{y}^* - \boldsymbol{x}^* )^T(\boldsymbol{y}^* - \boldsymbol{x}^* )\\ &= (\boldsymbol{C}\boldsymbol{y} - \boldsymbol{C}\boldsymbol{x})^T(\boldsymbol{C}\boldsymbol{y}- \boldsymbol{C}\boldsymbol{x} )\\ &= (\boldsymbol{y} - \boldsymbol{x})^T\boldsymbol{C}^T\boldsymbol{C}(\boldsymbol{y}- \boldsymbol{x} ) &= (\boldsymbol{y} - \boldsymbol{x})^T\Sigma^{-1}(\boldsymbol{y}- \boldsymbol{x} ) \end{array}

Notice that the Euclidean distance between $\boldsymbol{x}^*$ and $\boldsymbol{y}^*$ is Mahalanobis distance between $\boldsymbol{x}$ and $\boldsymbol{y}$.

Using this idea, we calculate the Mahalanobis distances.

In [6]:
def EfficientMaharanobis(A,B,invS):
'''

A : tensor, N sample1 by N feat
B : tensor, N sample2 by N feat
S : tensor, N feat by N feat

Output:

marahanobis distance of each pair (A[i],B[j]) with inv variance S

'''
S_half = tf.linalg.cholesky(invS)
A_star = tf.matmul(A,S_half)
B_star = tf.matmul(B,S_half)

res = Euclidean(A_star,B_star)
return(res)

# TF calculation
A = tf.Variable(a)
B = tf.Variable(b)
S = tf.Variable(S)

print("A")
print(A)
print("B")
print(B)
print("S")
print(S)

tfmah = EfficientMaharanobis(A,B,S)

print("Maharanobis distances with tensorflow v2")
tfmah.numpy()

A

B

S

Maharanobis distances with tensorflow v2

Out[6]:
array([[0.37788695, 0.43882656, 0.27630562, 0.38677463],
[0.31122538, 0.9181496 , 0.3075013 , 0.25844818],
[0.17797098, 0.6275558 , 0.04191547, 0.15506293],
[0.58971703, 0.05437763, 0.5684887 , 0.6298596 ],
[0.26432416, 0.8821087 , 0.28097183, 0.2141315 ]], dtype=float32)
In [7]:
print("average difference in the np calculation and tf calculations are:",np.mean(np.abs(tfmah - npmah)))

average difference in the np calculation and tf calculations are: 7.0594254e-08