This is the sixth blog post of the Object Detection with YOLO blog series. This blog performs inference using the model trained in Part 5, Object Detection with Yolo using VOC 2012 data - training. I will use PASCAL VOC2012 data. This blog assumes that the readers have read the previous blog posts - Part 1, Part 2, Part 3, Part 4 and Part 5.
Andrew Ng's YOLO lecture¶
- Neural Networks - Bounding Box Predictions
- C4W3L06 Intersection Over Union
- C4W3L07 Nonmax Suppression
- C4W3L08 Anchor Boxes
- C4W3L09 YOLO Algorithm
Reference¶
Reference in my blog¶
- Part 1 Object Detection using YOLOv2 on Pascal VOC2012 - anchor box clustering
- Part 2 Object Detection using YOLOv2 on Pascal VOC2012 - input and output encoding
- Part 3 Object Detection using YOLOv2 on Pascal VOC2012 - model
- Part 4 Object Detection using YOLOv2 on Pascal VOC2012 - loss
- Part 5 Object Detection using YOLOv2 on Pascal VOC2012 - training
- Part 6 Object Detection using YOLOv2 on Pascal VOC 2012 data - inference on image
- Part 7 Object Detection using YOLOv2 on Pascal VOC 2012 data - inference on video
My GitHub repository¶
This repository contains all the ipython notebooks in this blog series and the functions (See backend.py).
import matplotlib.pyplot as plt
import numpy as np
import os, sys
print(sys.version)
%matplotlib inline
Read in the hyperparameters to define the YOLOv2 model used during training
# Paths to the PASCAL VOC2012 images and XML annotations used during training.
train_image_folder = "../ObjectDetectionRCNN/VOCdevkit/VOC2012/JPEGImages/"
train_annot_folder = "../ObjectDetectionRCNN/VOCdevkit/VOC2012/Annotations/"
# The 20 PASCAL VOC object classes; the order fixes the class-index encoding
# used by the network output (channels 5 onwards).
LABELS = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
'bus', 'car', 'cat', 'chair', 'cow',
'diningtable','dog', 'horse', 'motorbike', 'person',
'pottedplant','sheep', 'sofa', 'train', 'tvmonitor']
# Anchor-box shapes from Part 1 (k-means clustering), flattened as
# [w1, h1, w2, h2, ...] in units of grid cells.
ANCHORS = np.array([1.07709888, 1.78171903, # anchor box 1, width , height
2.71054693, 5.12469308, # anchor box 2, width, height
10.47181473, 10.09646365, # anchor box 3, width, height
5.48531347, 8.11011331]) # anchor box 4, width, height
# Number of anchor boxes (ANCHORS holds width/height pairs).
BOX = int(len(ANCHORS)/2)
# Maximum number of ground-truth boxes per image; fixes the shape of the
# model's second ("hack") input used by the training loss.
TRUE_BOX_BUFFER = 50
# Network input resolution and the resulting output grid (416 / 32 = 13).
IMAGE_H, IMAGE_W = 416, 416
GRID_H, GRID_W = 13 , 13
Load the weights trained in Part 5
# Rebuild the YOLOv2 architecture with the same hyperparameters as in Part 5
# (trainable=False: inference only) and load the trained weights from disk.
from backend import define_YOLOv2
CLASS = len(LABELS)
model, _ = define_YOLOv2(IMAGE_H,IMAGE_W,GRID_H,GRID_W,TRUE_BOX_BUFFER,BOX,CLASS,
trainable=False)
model.load_weights("weights_yolo_on_voc2012.h5")
## input encoding
# Read one image, resize it to (IMAGE_H, IMAGE_W) and scale pixels to [0, 1],
# matching the preprocessing used during training.
from backend import ImageReader
imageReader = ImageReader(IMAGE_H,IMAGE_W=IMAGE_W, norm=lambda image : image / 255.)
out = imageReader.fit(train_image_folder + "/2007_005430.jpg")
Predict the bounding box.¶
print(out.shape)
# Add a batch dimension: the model expects (N batch, H, W, 3).
X_test = np.expand_dims(out,0)
print(X_test.shape)
# handle the hack input: the second model input (the true-box buffer) is only
# consumed by the training loss, so an all-zero array of the right shape
# suffices at inference time.
dummy_array = np.zeros((1,1,1,1,TRUE_BOX_BUFFER,4))
y_pred = model.predict([X_test,dummy_array])
print(y_pred.shape)
Rescale the network output¶
Recall that y_pred
takes arbitrary real values.
Therefore, it must be rescaled before it can be interpreted as bounding-box parameters.
class OutputRescaler(object):
    '''
    Rescale the raw YOLOv2 network output of a single image into
    interpretable bounding-box parameters.

    After fit(), for every (grid cell, anchor) the first four channels hold
    (x center, y center, width, height) as fractions of the image size
    (range 0..1), channel 4 holds the objectness confidence in [0, 1], and
    channels 5: hold Pr(object AND class k) = confidence * softmax(logits).
    '''

    def __init__(self, ANCHORS):
        # ANCHORS : flat np.array [w1, h1, w2, h2, ...] in grid-cell units.
        self.ANCHORS = ANCHORS

    def _sigmoid(self, x):
        # Elementwise logistic function, maps any real value into (0, 1).
        return 1. / (1. + np.exp(-x))

    def _softmax(self, x, axis=-1, t=-100.):
        '''
        Numerically stable softmax along `axis`.

        Subtracting the per-axis maximum keeps np.exp from overflowing
        (the original subtracted the single global max, which gives the
        same result mathematically but less underflow protection per row);
        values far below t are rescaled so a row cannot underflow to all
        zeros and divide by zero.
        '''
        x = x - np.max(x, axis=axis, keepdims=True)
        if np.min(x) < t:
            x = x / np.min(x) * t
        e_x = np.exp(x)
        return e_x / e_x.sum(axis, keepdims=True)

    def get_shifting_matrix(self, netout):
        '''
        Return four arrays, each of shape (GRID_H, GRID_W, BOX), holding for
        every (grid cell, anchor): the cell's column index, the cell's row
        index, and the anchor's width and height.  Built by broadcasting
        instead of the original explicit Python loops.
        '''
        GRID_H, GRID_W, BOX = netout.shape[:3]
        shape = (GRID_H, GRID_W, BOX)
        ANCHORSw = np.asarray(self.ANCHORS[::2], dtype=float)
        ANCHORSh = np.asarray(self.ANCHORS[1::2], dtype=float)
        mat_GRID_W = np.broadcast_to(
            np.arange(GRID_W, dtype=float).reshape(1, GRID_W, 1), shape)
        mat_GRID_H = np.broadcast_to(
            np.arange(GRID_H, dtype=float).reshape(GRID_H, 1, 1), shape)
        mat_ANCHOR_W = np.broadcast_to(ANCHORSw.reshape(1, 1, BOX), shape)
        mat_ANCHOR_H = np.broadcast_to(ANCHORSh.reshape(1, 1, BOX), shape)
        return (mat_GRID_W, mat_GRID_H, mat_ANCHOR_W, mat_ANCHOR_H)

    def fit(self, netout):
        '''
        netout : np.array of shape (N grid h, N grid w, N anchor, 4 + 1 + N class)
                 a single image output of model.predict()

        Returns a NEW rescaled array; the caller's array is left untouched.
        (The original implementation modified `netout` in place, so calling
        fit twice on the same array double-applied the rescaling.)
        '''
        netout = netout.copy()  # BUG FIX: do not mutate the caller's array
        GRID_H, GRID_W, BOX = netout.shape[:3]
        (mat_GRID_W,
         mat_GRID_H,
         mat_ANCHOR_W,
         mat_ANCHOR_H) = self.get_shifting_matrix(netout)
        # bounding box parameters
        netout[..., 0] = (self._sigmoid(netout[..., 0]) + mat_GRID_W)/GRID_W # x unit: range between 0 and 1
        netout[..., 1] = (self._sigmoid(netout[..., 1]) + mat_GRID_H)/GRID_H # y unit: range between 0 and 1
        netout[..., 2] = (np.exp(netout[..., 2]) * mat_ANCHOR_W)/GRID_W     # width unit: range between 0 and 1
        netout[..., 3] = (np.exp(netout[..., 3]) * mat_ANCHOR_H)/GRID_H     # height unit: range between 0 and 1
        # rescale the confidence to range 0 and 1
        netout[..., 4] = self._sigmoid(netout[..., 4])
        expand_conf = np.expand_dims(netout[..., 4], -1)  # (N grid h, N grid w, N anchor, 1)
        # rescale the class probability to range between 0 and 1
        # Pr(object class = k) = Pr(object exists) * Pr(object class = k | object exists)
        #                      = Conf * P^c
        netout[..., 5:] = expand_conf * self._softmax(netout[..., 5:])
        return netout
Experiment OutputRescaler
¶
# Rescale the raw output of the first (and only) image in the batch.
netout = y_pred[0]
outputRescaler = OutputRescaler(ANCHORS=ANCHORS)
netout_scale = outputRescaler.fit(netout)
Post processing the YOLOv2 object¶
YOLOv2 can potentially produce GRID_H x GRID_W x BOX bounding boxes. However, only a few of them actually contain objects, and several bounding boxes may contain the same object. I will postprocess the predicted bounding boxes.
from backend import BoundBox
def find_high_class_probability_bbox(netout_scale, obj_threshold):
    '''
    Collect every candidate bounding box whose best class score exceeds
    obj_threshold.

    == Input ==
    netout_scale : y_pred[i] np.array of shape (GRID_H, GRID_W, BOX, 4 + 1 + N class)
                   x, w in units of image width; y, h in units of image height;
                   confidence and class probabilities each between 0 and 1
    obj_threshold : float, minimum get_score() for a box to be kept

    == Output ==
    list of BoundBox objects with Pr(object is in class C) > 0 for at least
    one class C and a top score above obj_threshold
    '''
    n_rows, n_cols, n_anchors = netout_scale.shape[:3]
    kept = []
    # walk every (grid cell, anchor) in row-major order
    for row, col, b in np.ndindex(n_rows, n_cols, n_anchors):
        # channels 5: are the (confidence-weighted) class probabilities
        class_probs = netout_scale[row, col, b, 5:]
        if np.sum(class_probs) <= 0:
            continue
        # channels :4 are x center, y center, width, height
        x, y, w, h = netout_scale[row, col, b, :4]
        confidence = netout_scale[row, col, b, 4]
        candidate = BoundBox(x - w / 2, y - h / 2,
                             x + w / 2, y + h / 2,
                             confidence, class_probs)
        if candidate.get_score() > obj_threshold:
            kept.append(candidate)
    return kept
Experiment find_high_class_probability_bbox
¶
# With a tiny threshold almost every box with a nonzero class score survives.
obj_threshold = 0.015
boxes_tiny_threshold = find_high_class_probability_bbox(netout_scale,obj_threshold)
print("obj_threshold={}".format(obj_threshold))
print("In total, YOLO can produce GRID_H * GRID_W * BOX = {} bounding boxes ".format( GRID_H * GRID_W * BOX))
print("I found {} bounding boxes with top class probability > {}".format(len(boxes_tiny_threshold),obj_threshold))
# A slightly larger threshold already removes most of the candidates.
obj_threshold = 0.03
boxes = find_high_class_probability_bbox(netout_scale,obj_threshold)
print("\nobj_threshold={}".format(obj_threshold))
print("In total, YOLO can produce GRID_H * GRID_W * BOX = {} bounding boxes ".format( GRID_H * GRID_W * BOX))
print("I found {} bounding boxes with top class probability > {}".format(len(boxes),obj_threshold))
Visualize many bounding box by having small obj_threshold value¶
Most of the bounding boxes do not contain objects. This shows that we really need to reduce the number of bounding boxes.
import cv2, copy
import seaborn as sns
def draw_boxes(image, boxes, labels, obj_baseline=0.05, verbose=False):
    '''
    Draw bounding boxes and class labels on a copy of `image`.

    == Input ==
    image        : np.array of shape (N height, N width, 3)
                   (assumed to hold pixel values in [0, 1] as produced by
                   ImageReader -- TODO confirm; the text color (1, 0, 1)
                   only makes sense on a normalized image)
    boxes        : list of BoundBox objects with coordinates in [0, 1] image units
    labels       : list of class names, indexed by BoundBox.label
    obj_baseline : float, score corresponding to a line thickness of ~1 pixel
    verbose      : bool, if True print each drawn box to stdout

    == Output ==
    a new image with the boxes drawn on it (the input is not modified)
    '''
    def adjust_minmax(c, _max):
        # Clip a pixel coordinate into the valid range [0, _max].
        if c < 0:
            c = 0
        if c > _max:
            c = _max
        return c

    image = copy.deepcopy(image)
    image_h, image_w, _ = image.shape
    # Higher-scoring boxes are drawn with thicker rectangles.
    score_rescaled = np.array([box.get_score() for box in boxes])
    score_rescaled /= obj_baseline
    colors = sns.color_palette("husl", 8)
    for iobj, (sr, box) in enumerate(zip(score_rescaled, boxes)):
        # BUG FIX: the original zipped `boxes` directly with the 8-color
        # palette, silently dropping every box after the 8th; cycle instead.
        color = colors[iobj % len(colors)]
        xmin = adjust_minmax(int(box.xmin*image_w), image_w)
        ymin = adjust_minmax(int(box.ymin*image_h), image_h)
        xmax = adjust_minmax(int(box.xmax*image_w), image_w)
        ymax = adjust_minmax(int(box.ymax*image_h), image_h)
        text = "{:10} {:4.3f}".format(labels[box.label], box.get_score())
        if verbose:
            print("{} xmin={:4.0f},ymin={:4.0f},xmax={:4.0f},ymax={:4.0f}".format(text, xmin, ymin, xmax, ymax))
        cv2.rectangle(image,
                      pt1=(xmin, ymin),
                      pt2=(xmax, ymax),
                      color=color,
                      # BUG FIX: cv2 requires a positive integer thickness;
                      # the original passed the raw float score.
                      thickness=max(1, int(round(sr))))
        cv2.putText(img=image,
                    text=text,
                    org=(xmin + 13, ymin + 13),
                    fontFace=cv2.FONT_HERSHEY_SIMPLEX,
                    fontScale=1e-3 * image_h,
                    color=(1, 0, 1),
                    thickness=1)
    return image
# Visual comparison: the low threshold keeps far more (mostly spurious) boxes.
print("Plot with low object threshold")
ima = draw_boxes(X_test[0],boxes_tiny_threshold,LABELS,verbose=True)
figsize = (15,15)
plt.figure(figsize=figsize)
plt.imshow(ima);
plt.title("Plot with low object threshold")
plt.show()
# The higher threshold keeps only the more confident boxes.
print("Plot with high object threshold")
ima = draw_boxes(X_test[0],boxes,LABELS,verbose=True)
figsize = (15,15)
plt.figure(figsize=figsize)
plt.imshow(ima);
plt.title("Plot with high object threshold")
plt.show()
Nonmax suppression¶
Nonmax suppression is a way to detect a single object only once. Andrew Ng has presented the idea of nonmax suppression in his lecture very well: C4W3L07 Nonmax Suppression.
The following code implements the nonmax suppression algorithm. For each object class, the algorithm picks the most promising bounding box, and then removes (or suppresses) the remaining bounding boxes that have high overlap with the most promising bounding box. Whether a box is the most promising or not is determined by the predicted class probability.
from backend import BestAnchorBoxFinder
def nonmax_suppression(boxes, iou_threshold, obj_threshold):
    '''
    Non-maximum suppression: per class, keep only the highest-probability
    bounding box among groups of boxes that overlap strongly.

    == Input ==
    boxes         : list containing "good" BoundBox of a frame
                    [BoundBox(),BoundBox(),...]
    iou_threshold : float; a lower-ranked box whose IOU with a selected box
                    is >= iou_threshold has that class probability zeroed
    obj_threshold : float; surviving boxes need get_score() > obj_threshold

    == Output ==
    list of surviving BoundBox objects, each appearing at most once

    NOTE: the class probabilities of suppressed boxes are modified in place.
    '''
    if not boxes:
        # nothing to suppress (the original raised IndexError on boxes[0])
        return []
    bestAnchorBoxFinder = BestAnchorBoxFinder([])
    CLASS = len(boxes[0].classes)
    index_boxes = []
    # suppress non-maximal boxes, one class at a time
    for c in range(CLASS):
        # extract class probabilities of the c^th class from multiple bbox
        class_probability_from_bbxs = [box.classes[c] for box in boxes]
        # sorted_indices[i] contains the index of the i^th largest class probability
        sorted_indices = list(reversed(np.argsort(class_probability_from_bbxs)))
        for i in range(len(sorted_indices)):
            index_i = sorted_indices[i]
            # zero probability: never predicted this class, or already suppressed
            if boxes[index_i].classes[c] == 0:
                continue
            index_boxes.append(index_i)
            for j in range(i + 1, len(sorted_indices)):
                index_j = sorted_indices[j]
                # if a lower-ranked box overlaps the selected box too much,
                # zero its probability for this class (suppress it)
                bbox_iou = bestAnchorBoxFinder.bbox_iou(boxes[index_i], boxes[index_j])
                if bbox_iou >= iou_threshold:
                    classes = boxes[index_j].classes
                    classes[c] = 0
                    boxes[index_j].set_class(classes)
    # BUG FIX: a box index could be appended once per class it scored in, so
    # the original could return the same box several times; keep the first
    # occurrence of each index only (dict.fromkeys preserves order).
    unique_indices = list(dict.fromkeys(index_boxes))
    newboxes = [boxes[i] for i in unique_indices if boxes[i].get_score() > obj_threshold]
    return newboxes
Experiment nonmax_suppression
¶
# A very small IOU threshold suppresses aggressively: almost any overlap
# between same-class boxes removes the lower-probability one.
iou_threshold = 0.01
final_boxes = nonmax_suppression(boxes,iou_threshold=iou_threshold,obj_threshold=obj_threshold)
print("{} final number of boxes".format(len(final_boxes)))
Finally, draw the bounding boxes on a warped image¶
# Draw the boxes that survived non-max suppression on the resized input image.
ima = draw_boxes(X_test[0],final_boxes,LABELS,verbose=True)
figsize = (15,15)
plt.figure(figsize=figsize)
plt.imshow(ima);
plt.show()
More examples¶
# Run the full pipeline (predict -> rescale -> threshold -> NMS -> draw)
# on a random sample of training images.  Seeded for reproducibility.
np.random.seed(1)
Nsample = 20
image_nms = list(np.random.choice(os.listdir(train_image_folder),Nsample))
outputRescaler = OutputRescaler(ANCHORS=ANCHORS)
imageReader = ImageReader(IMAGE_H,IMAGE_W=IMAGE_W, norm=lambda image : image / 255.)
# Encode the sampled images into one batch of shape (Nsample, H, W, 3).
X_test = []
for img_nm in image_nms:
    _path = os.path.join(train_image_folder,img_nm)
    out = imageReader.fit(_path)
    X_test.append(out)
X_test = np.array(X_test)
## model
# Zero "hack" input again, now sized for the whole batch.
dummy_array = np.zeros((len(X_test),1,1,1,TRUE_BOX_BUFFER,4))
y_pred = model.predict([X_test,dummy_array])
# Post-process and plot each frame; frames with no candidate boxes are skipped.
# NOTE: reuses obj_threshold, iou_threshold and figsize defined earlier.
for iframe in range(len(y_pred)):
    netout = y_pred[iframe]
    netout_scale = outputRescaler.fit(netout)
    boxes = find_high_class_probability_bbox(netout_scale,obj_threshold)
    if len(boxes) > 0:
        final_boxes = nonmax_suppression(boxes,
                                         iou_threshold=iou_threshold,
                                         obj_threshold=obj_threshold)
        ima = draw_boxes(X_test[iframe],final_boxes,LABELS,verbose=True)
        plt.figure(figsize=figsize)
        plt.imshow(ima);
        plt.show()
FairyOnIce/ObjectDetectionYolo contains this ipython notebook and all the functions that I defined in this notebook.