Yumi's Blog

Part 3 Object Detection using RCNN on Pascal VOC2012 - Selective Search

This is part of the blog series for Object Detection with R-CNN.

In this blog, we will review the selective search algorithm. Selective search is one of the most successful category-independent region proposal algorithms, and R-CNN uses it to find region proposals.

J.R.R. Uijlings et al. use a hierarchical grouping algorithm as the basis of selective search. They first apply the fast segmentation method of Felzenszwalb and Huttenlocher to create the smallest partition of the image, called "initial regions". Selective search then uses a greedy algorithm to iteratively group these regions together.

So, roughly speaking, the selective search has two steps:

  • Step 1: Create initial regions by Felzenszwalb’s efficient graph based segmentation algorithm
  • Step 2: Group regions based on various criteria (Local Binary Pattern features and )
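To make the greedy grouping of Step 2 concrete before going into details, here is a minimal toy sketch. It is not the similarity measure selective search actually uses: as an assumption for illustration, each region is described by a single hypothetical feature (its mean value) plus its size, and the "most similar" pair is simply the pair of neighbouring regions with the closest means. The function name and data layout are also made up for this example.

```python
def greedy_grouping(features, neighbours):
    """Greedily merge the most similar pair of neighbouring regions.

    features   : dict region_id -> (mean_value, size), a stand-in for real features
    neighbours : set of frozenset({a, b}) pairs of adjacent regions
    Returns the merges as (a, b, new_id) tuples in the order they happen.
    """
    features = dict(features)
    neighbours = set(neighbours)
    next_id = max(features) + 1
    merges = []
    while neighbours:
        # Toy similarity: the adjacent pair whose mean values are closest.
        pair = min(neighbours,
                   key=lambda p: abs(features[min(p)][0] - features[max(p)][0]))
        a, b = sorted(pair)
        (mean_a, size_a), (mean_b, size_b) = features[a], features[b]
        size = size_a + size_b
        # The merged region gets the size-weighted mean of its two parts.
        features[next_id] = ((mean_a * size_a + mean_b * size_b) / size, size)
        # Neighbours of a or b become neighbours of the merged region.
        rewired = set()
        for p in neighbours:
            if p == pair:
                continue
            q = frozenset(next_id if r in pair else r for r in p)
            if len(q) == 2:
                rewired.add(q)
        neighbours = rewired
        merges.append((a, b, next_id))
        next_id += 1
    return merges


# Four initial regions in a row: 0 - 1 - 2 - 3
features = {0: (10.0, 4), 1: (12.0, 4), 2: (100.0, 4), 3: (90.0, 4)}
neighbours = {frozenset({0, 1}), frozenset({1, 2}), frozenset({2, 3})}
merges = greedy_grouping(features, neighbours)
print(merges)
```

In this toy run the two dark regions merge first, then the two bright ones, and finally the two merged regions are joined. In selective search, every region created along the way is kept as a candidate region, which is why the grouping is described as hierarchical.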

I will go over the details of each step one by one. The code here is based heavily on the AlpacaDB/selectivesearch repository on GitHub, so please credit them when you use code from this blog.

Reference

Reference: "Object Detection with R-CNN" series in my blog

Reference: "Object Detection with R-CNN" series in my Github

In [1]:
import os 
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd 
import scipy.misc
import skimage.segmentation
import skimage.util
import skimage.feature
from copy import copy

## This must be the location of the PASCAL VOC data. 
img_dir          = "VOCdevkit/VOC2012/JPEGImages"

Step 1: Create initial regions by Felzenszwalb’s efficient graph based segmentation algorithm

Felzenszwalb and Huttenlocher's graph-based segmentation is an unsupervised method that partitions an image into several regions. It uses a graph representation of the image: each pixel is a vertex, and each pair of adjacent pixels is connected by an edge (see the figure, which I took from Shih-Shinh Huang's YouTube tutorial: quarter DIP Efficient Graph Based Image Segmentation).

The weight of an edge measures how different the two pixels it connects are; the absolute difference of their intensities is used as the weight. Pixels are then combined into groups with the idea that:

  • edges between two vertices in the same group should have lower weights
  • edges between two vertices in different groups should have higher weights
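The idea above can be sketched on a tiny grayscale image. This is a simplified illustration under my own assumptions, not the skimage implementation: it uses a 4-connected grid, a single intensity channel, no Gaussian smoothing and no minimum region size, and the parameter k is a hypothetical stand-in for the role that scale plays. Edges are visited from the lightest to the heaviest, and two components are merged only when the edge between them is light compared with the edges already inside both components.

```python
import numpy as np

def toy_graph_segmentation(gray, k=50.0):
    """Segment a small grayscale image with a Felzenszwalb-style merge rule.

    gray : 2-D array of pixel intensities
    k    : larger values favour larger regions (assumed stand-in for `scale`)
    Returns an integer label image with the same shape as `gray`.
    """
    gray = np.asarray(gray, dtype=float)
    height, width = gray.shape
    idx = np.arange(height * width).reshape(height, width)

    # Edges of a 4-connected grid; weight = absolute intensity difference.
    edges = []
    for a, b in ((idx[:, :-1], idx[:, 1:]), (idx[:-1, :], idx[1:, :])):
        for u, v in zip(a.ravel(), b.ravel()):
            edges.append((abs(gray.flat[u] - gray.flat[v]), int(u), int(v)))
    edges.sort()

    parent = list(range(height * width))   # union-find forest
    size = [1] * (height * width)          # number of pixels per component
    internal = [0.0] * (height * width)    # heaviest edge merged into a component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for w, u, v in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            continue
        # Merge only if the edge is light relative to both components.
        if w <= min(internal[ru] + k / size[ru], internal[rv] + k / size[rv]):
            parent[rv] = ru
            size[ru] += size[rv]
            internal[ru] = w  # edges are sorted, so w is the heaviest so far

    roots = np.array([find(x) for x in range(height * width)])
    labels = np.unique(roots, return_inverse=True)[1]
    return labels.reshape(height, width)


# Toy image: a dark left half and a bright right half.
toy = np.array([[10, 12, 200, 205],
                [11, 13, 198, 202]])
labels = toy_graph_segmentation(toy, k=50.0)
print(labels)
```

With k = 50 the low-weight edges inside each half are merged, while the two heavy edges across the middle (weights 185 and 188) are rejected, so the image splits into a left and a right region. With a very large k, everything ends up in a single region.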

Figure (Credit: Shih-Shinh Huang's YouTube tutorial: quarter DIP Efficient Graph Based Image Segmentation)

Thankfully, skimage.segmentation already implements Felzenszwalb and Huttenlocher's segmentation algorithm (see "Comparison of segmentation and superpixel algorithms" in the skimage documentation), so we will use this implementation.

Let's try it out and visualize the segmented images. The following code randomly selects 5 images from the PASCAL VOC data and shows each original image next to its segmentation.

In [2]:
def image_segmentation(img_8bit, scale = 1.0, sigma = 0.8, min_size = 50):
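    '''
    Felzenszwalb and Huttenlocher's graph-based segmentation (initial regions)
    
    == input ==
    img_8bit : shape = (height, width, 3),
               8-bit digital image (each value ranges between 0 and 255)
    
    == output ==
    img      : shape = (height, width, 4),
               the RGB channels plus the region label of each pixel as the 4th channel
    '''
    # convert the image to range between 0 and 1
    img_float = skimage.util.img_as_float(img_8bit)
    # im_mask has shape (height, width) and holds one integer region label per pixel
    im_mask   = skimage.segmentation.felzenszwalb(
                    img_float, 
                    scale    = scale, 
                    sigma    = sigma,
                    min_size = min_size)
    # append the label mask to the RGB channels as a 4th channel
    img       = np.dstack([img_8bit,im_mask])
    return img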

scale    = 1.0
sigma    = 0.8
# min_size around 50 may give better R-CNN performance, but for the sake of visualization I will stick to min_size = 500
min_size = 500

np.random.seed(4)
listed_path = os.listdir(img_dir)
Nplot = 5
random_img_path = np.random.choice(listed_path,Nplot)
for imgnm in random_img_path:
    # read the 8-bit digital image (each value ranges between 0 and 255)
    # scipy.misc.imread is deprecated since SciPy 1.0.0, so plt.imread is used here
    img_8bit  = plt.imread(os.path.join(img_dir,imgnm))
    img       = image_segmentation(img_8bit, scale, sigma, min_size)
    
    fig = plt.figure(figsize=(15,30))
    ax  = fig.add_subplot(1,2,1)
    ax.imshow(img_8bit)
    ax.set_title("original image")
    ax  = fig.add_subplot(1,2,2)
    ax.imshow(img[:,:,3])
    ax.set_title("skimage.segmentation.felzenszwalb, N unique region = {}".format(len(np.unique(img[:,:,3]))))
    plt.show()