Yumi's Blog

Extract series of JPEG files from iPhone 6S video

In this blog post, I will show how to extract a series of images (.jpg) from a video recorded with an iPhone.

Preparation

I recorded a 10-minute-and-21-second video with my iPhone 6S. In my current directory, I have the file IMG7367.MOV with:

  • Size: 639.3 MB
  • Dimensions: 1280 × 720

You need cv2

If you do not have cv2, install it with pip (the opencv-python package provides the cv2 module):
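    pip install opencv-python

With cv2 available, a minimal sketch for dumping every frame of IMG7367.MOV as a JPEG file could look like the following (the output naming pattern is my own choice, not taken from the original post):

    import cv2

    # open the video file sitting in the current directory
    video = cv2.VideoCapture("IMG7367.MOV")

    count = 0
    while True:
        success, frame = video.read()   # success becomes False at the end of the video
        if not success:
            break
        # write the frame as a JPEG; zero-padded names keep the files in order
        cv2.imwrite("frame{:06d}.jpg".format(count), frame)
        count += 1

    video.release()
    print("extracted {} frames".format(count))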

Color gray scale images and manga using deep learning

In this blog post, I will try to create a deep learning model that can color a grayscale image. I follow the great blog post Colorizing B&W Photos with Neural Networks.

I will consider two example datasets to train a model:

  • Flickr8K data
  • Hunter x Hunter anime data

Flickr8K is a well-known public dataset in the computer vision community, and I previously analyzed it in my blog. The downloading process is described in Develop an image captioning deep learning model using Flickr 8K data.
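As a rough illustration of the setup (a minimal sketch of my own, not the architecture from the post I follow), a colorization model takes the L (lightness) channel of a LAB image as input and predicts the a and b color channels. In Keras this could look like:

    from keras.models import Sequential
    from keras.layers import Conv2D, UpSampling2D

    # input: the L channel of a LAB image, scaled to [0, 1]
    # output: the a/b channels, scaled to [-1, 1]; layer sizes are illustrative only
    model = Sequential([
        Conv2D(32, (3, 3), activation="relu", padding="same", input_shape=(None, None, 1)),
        Conv2D(64, (3, 3), activation="relu", padding="same", strides=2),
        Conv2D(64, (3, 3), activation="relu", padding="same"),
        UpSampling2D((2, 2)),
        Conv2D(2, (3, 3), activation="tanh", padding="same"),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.summary()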

Color space definitions in python, RGB and LAB

In this blog post, you will learn about the color spaces that are often used in image processing problems. More specifically, after reading the blog, you will be familiar with using the RGB and LAB color spaces.

Let's first load two example images using keras.preprocessing.image.load_img. The list dir_data contains the paths to two jpg images.
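As a sketch of that loading step (the two file names below are placeholders, not the post's actual images, and skimage is just one common choice for the RGB-to-LAB conversion):

    from keras.preprocessing.image import load_img, img_to_array
    from skimage.color import rgb2lab

    dir_data = ["example1.jpg", "example2.jpg"]   # placeholder paths to two jpg images

    for path in dir_data:
        img = load_img(path)                  # PIL image, RGB
        rgb = img_to_array(img) / 255.0       # float array with values in [0, 1]
        lab = rgb2lab(rgb)                    # L in [0, 100]; a, b roughly in [-128, 127]
        print(path, rgb.shape, lab[:, :, 0].min(), lab[:, :, 0].max())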

Assess the robustness of CapsNet

In Understanding and Experimenting Capsule Networks, I experimented with Hinton's Capsule Network.

Dynamic Routing Between Capsules discusses the robustness of Capsule Networks to affine transformations:

"Experiments show that each DigitCaps capsule learns a more robust representation for each class than a traditional convolutional network. Because there is natural variance in skew, rotation, style, etc in hand written digits, the trained CapsNet is moderately robust to small affine transformations of the training data (Section 5.2, page 6)."

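One simple way to probe this kind of robustness (my own sketch of the evaluation idea, not the paper's exact protocol) is to apply small random affine transformations to the test digits and compare a trained model's accuracy before and after. With Keras this can be done with ImageDataGenerator:

    import numpy as np
    from keras.datasets import mnist
    from keras.preprocessing.image import ImageDataGenerator

    (_, _), (x_test, y_test) = mnist.load_data()
    x_test = x_test[..., np.newaxis].astype("float32") / 255.0

    # small rotations, shifts, shears and zooms act as small affine transformations
    gen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             shear_range=0.1,
                             zoom_range=0.1)
    x_affine = next(gen.flow(x_test, batch_size=len(x_test), shuffle=False))

    # compare model.evaluate(x_test, y_test) with model.evaluate(x_affine, y_test),
    # where `model` is the trained CapsNet (or a CNN baseline)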
Learn the breed of a dog using deep learning

My friend asked me if I could figure out the breed of his dog, Loki. As I am not a dog expert, I will ask deep learning for its opinion. Here, I use VGG16 trained on the ImageNet dataset.
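A rough sketch of how this query can be run with Keras ("loki.jpg" below is a placeholder file name for the photo of Loki):

    import numpy as np
    from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions
    from keras.preprocessing.image import load_img, img_to_array

    model = VGG16(weights="imagenet")           # ImageNet-pretrained, expects 224x224 RGB

    img = load_img("loki.jpg", target_size=(224, 224))
    x = preprocess_input(img_to_array(img)[np.newaxis])   # shape (1, 224, 224, 3)

    # top 5 ImageNet classes; for a dog photo these are typically breed labels
    for _, label, prob in decode_predictions(model.predict(x), top=5)[0]:
        print("{:<25s} {:.3f}".format(label, prob))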

What is VGG16 and ImageNet?

According to Wikipedia,

"The ImageNet project is a large visual database designed for use in visual object recognition software research...Since 2010, the annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a competition where research teams evaluate their algorithms on the given data set, and compete to achieve higher accuracy on several visual recognition tasks."

Visualization of Filters with Keras

The goal of this blog post is to understand "what my CNN model is looking at". People call this visualization of the filters. More precisely, what I will do here is to visualize the input images that maximize the (sum of the) activation map (or feature map) of the filters. I will visualize the filters of deep learning models for two different applications.
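As a rough sketch of that idea (the layer name, filter index, image size, and step size below are arbitrary choices of mine, and I use tensorflow.keras with GradientTape for the gradient computation), gradient ascent on a random input image to maximize the mean activation of one filter looks like this:

    import tensorflow as tf
    from tensorflow.keras.applications import VGG16
    from tensorflow.keras.models import Model

    base = VGG16(weights="imagenet", include_top=False)
    layer = base.get_layer("block3_conv1")            # any convolutional layer
    feature_extractor = Model(base.input, layer.output)

    filter_index = 0
    # start from a gray-ish random image and let gradient ascent shape it
    img = tf.Variable(tf.random.uniform((1, 128, 128, 3)) * 0.25 + 0.5)

    for step in range(30):
        with tf.GradientTape() as tape:
            activation = feature_extractor(img)
            # maximize the mean of the chosen filter's activation map
            loss = tf.reduce_mean(activation[:, :, :, filter_index])
        grads = tape.gradient(loss, img)
        grads /= tf.norm(grads) + 1e-8                # normalize the gradient
        img.assign_add(10.0 * grads)                  # one gradient-ascent step

    pattern = img.numpy()[0]                          # the input pattern this filter "likes"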