In this blog post, I will try to create a deep learning model that can colorize a grayscale image. I follow the great blog post Colorizing B&W Photos with Neural Networks.
I will consider two example datasets to train a model:
- Flickr8K data
- Hunter x Hunter anime data
Flickr8K is a famous public dataset in the computer vision community, and it was also previously analyzed on my blog. The downloading process is described in Develop an image captioning deep learning model using Flickr 8K data. I will first use this standard dataset to validate the method with a small data size (8,000 images).
Hunter x Hunter is a Japanese manga series written and illustrated by Yoshihiro Togashi. It has been serialized in Weekly Shōnen Jump magazine since March 3, 1998. Hunter x Hunter was adapted into an anime television series twice, in 1999 and in 2011. While the anime television series has ended, the manga is ongoing in Shonen Jump, and a new chapter is published every week (except when the author "takes a break"). While the anime is colored, the manga is not. So my motivation is this: if we can train a deep learning model with the colored anime images, we may be able to color the manga and enjoy colored manga for free!! With this motivation, I will try to create a deep learning model using Hunter x Hunter anime data.
Reference¶
import matplotlib.pyplot as plt
import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
import keras
import sys, time, os, warnings
import numpy as np
import pandas as pd
from collections import Counter
warnings.filterwarnings("ignore")
print("python {}".format(sys.version))
print("keras version {}".format(keras.__version__)); del keras
print("tensorflow version {}".format(tf.__version__))
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.95
config.gpu_options.visible_device_list = "0"
set_session(tf.Session(config=config))
def set_seed(sd=123):
    from numpy.random import seed
    from tensorflow import set_random_seed
    import random as rn
    ## numpy random seed
    seed(sd)
    ## core python's random number
    rn.seed(sd)
    ## tensorflow's random number
    set_random_seed(sd)
Flickr8k image¶
Load the images. Here, I load the images in LAB format, and my analysis is entirely based on LAB-scale images. The only time I will transform the images into RGB is when I want to plot them with the pyplot.imshow() function.
To learn about the LAB format, please refer to my previous blog post Color space definitions in python, RGB and LAB.
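As a quick, minimal sketch of the round trip (the random array below is only an illustration, not one of the Flickr8K images):
import numpy as np
from skimage.color import rgb2lab, lab2rgb
## rgb2lab expects RGB floats in [0, 1]
rgb = np.random.rand(256, 256, 3)
lab = rgb2lab(rgb)
## the L channel is roughly in [0, 100]; a and b are roughly in [-128, 127]
print(lab[:,:,0].min(), lab[:,:,0].max())
rgb_back = lab2rgb(lab)  ## convert back to RGB only for plotting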
from keras.preprocessing.image import img_to_array, load_img
from skimage.color import rgb2lab, lab2rgb
target_size = (256,256,3)
X = []
dir_data = "../Flickr8k/Flicker8k_Dataset/"
for filenm in os.listdir(dir_data):
    imgrgb = img_to_array(load_img(dir_data+filenm, target_size=target_size))/255.0
    imglab = rgb2lab(imgrgb)
    X.append(imglab)
X = np.array(X)
X.shape
## check the range of each LAB channel:
## L is roughly in [0, 100], while a and b are roughly in [-128, 127]
for i in range(X.shape[-1]):
    vec = X[:,:,:,i]
    print("MIN={:5.3f} MAX={:5.3f}".format(np.min(vec), np.max(vec)))
def standardizeLAB(X):
    ## Standardize the LAB channels
    standX = np.zeros(X.shape)
    ## the standardized L channel takes values between 0 and 1
    standX[:,:,:,0] = X[:,:,:,0]/100.0
    ## the standardized a and b channels take values between -1 and 1
    standX[:,:,:,1:] = X[:,:,:,1:]/128.0
    return(standX)
standX = standardizeLAB(X)
del X
print(standX.shape)
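For reference, the inverse transform could be written as the sketch below; the plotting code later in this post applies the same scaling (x100 for L, x128 for a and b) inline rather than calling a helper like this.
def unstandardizeLAB(standX):
    ## invert standardizeLAB: map the channels back to the original LAB ranges
    X = np.zeros(standX.shape)
    X[:,:,:,0] = standX[:,:,:,0]*100.0
    X[:,:,:,1:] = standX[:,:,:,1:]*128.0
    return(X)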
Divide the standardized images between training and testing:
split = int(0.95*len(standX))
Xtrain = standX[:split]
Xtest = standX[split:]
Define model¶
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from keras.layers import Conv2D, UpSampling2D, InputLayer, Conv2DTranspose
from keras.models import Sequential
def define_model():
    ## Design the neural network: an encoder-decoder of stacked convolutions.
    ## Input: the L channel (256 x 256 x 1); output: the a and b channels.
    model = Sequential()
    model.add(InputLayer(input_shape=(256, 256, 1)))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=2))
    model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
    model.add(Conv2D(128, (3, 3), activation='relu', padding='same', strides=2))
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same', strides=2))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same'))
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
    model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
    model.add(UpSampling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(UpSampling2D((2, 2)))
    model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
    ## tanh keeps the output in [-1, 1], matching the standardized a and b channels
    model.add(Conv2D(2, (3, 3), activation='tanh', padding='same'))
    #model.add(Conv2D(2, (3, 3), padding='same'))
    model.add(UpSampling2D((2, 2)))
    model.compile(optimizer='rmsprop', loss='mse')
    return(model)
# Image transformer for data augmentation
datagen = ImageDataGenerator(
    shear_range=0.2,
    zoom_range=0.2,
    rotation_range=20,
    horizontal_flip=True)
model = define_model()
model.summary()
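As a sanity check (a sketch), the network should map the 256 x 256 L channel to two 256 x 256 output channels, one for a and one for b:
assert model.output_shape == (None, 256, 256, 2)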
Training starts¶
Notice that I removed the rgb2lab line from image_a_b_gen that existed in Colorizing B&W Photos with Neural Networks. The conversion is not necessary because our Xtrain is already converted to LAB format. I also found that removing this line from image_a_b_gen made the training about 10 times faster.
# Generate training data
batch_size = 128
def image_a_b_gen(Xtrain, batch_size):
    for batch in datagen.flow(Xtrain, batch_size=batch_size):
        X_batch = batch[:,:,:,[0]]
        Y_batch = batch[:,:,:,1:]
        yield (X_batch, Y_batch)
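A quick way to check the generator (a sketch): pull one batch and confirm that X_batch holds the L channel and Y_batch the a and b channels.
X_batch, Y_batch = next(image_a_b_gen(Xtrain, batch_size))
print(X_batch.shape, Y_batch.shape)  ## e.g. (128, 256, 256, 1) (128, 256, 256, 2)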
## create a validation data
Ntrain = int(Xtrain.shape[0]*0.8)
X_tr = Xtrain[:Ntrain]
X_val = Xtrain[Ntrain:]
hist = model.fit_generator(image_a_b_gen(X_tr, batch_size),
                           verbose=2,
                           validation_data=(X_val[:,:,:,[0]],
                                            X_val[:,:,:,1:]),
                           steps_per_epoch=100, epochs=5)
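At this point it can be handy to save the trained weights so they could be reused later, for example as a warm start for the Hunter x Hunter model below (a sketch; the filename is a hypothetical choice):
model.save_weights("flickr8k_colorize.h5")  ## hypothetical filename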
Plot the loss¶
for key in hist.history.keys():
    plt.plot(hist.history[key], label=key)
plt.legend()
plt.show()
Evaluate the model performance using the test set¶
Ypred = model.predict(Xtest[:,:,:,[0]])
print("testing MSE={:4.3f}".format(np.mean((Ypred - Xtest[:,:,:,2:])**2)))
Plot the predicted testing images and the true colored testing images¶
Unfortunately, almost all the images are brown-ish. We need to increase the training data size, investigate different models, or consider a narrower space of images. Nevertheless, the model seems to have learned that:
- grass is green
- the sky is blue
- lakes are blue
from copy import copy
def plot_gray_predicted_true_images(Xtest, Ypred, index, target_size):
    Npic = len(index)
    fig = plt.figure(figsize=(10, Npic*3))
    count = 1
    for i in index:
        img = copy(Xtest[i])
        cur_pred = np.zeros(target_size)
        cur_gray = np.zeros(target_size)
        ## undo the standardization before converting back to RGB
        cur_pred[:,:,0] = img[:,:,0]*100
        cur_gray[:,:,0] = img[:,:,0]*100
        cur_pred[:,:,1:] = Ypred[i]*128
        rgb_cur_pred = lab2rgb(cur_pred)
        rgb_cur_gray = lab2rgb(cur_gray)
        ## grayscale image (L channel only)
        ax = fig.add_subplot(Npic, 3, count)
        ax.imshow(rgb_cur_gray)
        ax.set_title("ID={} original (gray scale)".format(i))
        ax.axis("off")
        count += 1
        ## predicted color image
        ax = fig.add_subplot(Npic, 3, count)
        ax.imshow(rgb_cur_pred)
        ax.axis("off")
        ax.set_title("predicted")
        count += 1
        ## original color image
        img_o = np.zeros(target_size)
        img_o[:,:,0] = img[:,:,0]*100
        img_o[:,:,1:] = img[:,:,1:]*128
        ax = fig.add_subplot(Npic, 3, count)
        rgb = lab2rgb(img_o)
        ax.imshow(rgb)
        ax.set_title("original (color)")
        ax.axis("off")
        count += 1
    plt.show()
Npic = 20
exampleIDs = [132,288,47,169,85,191,37,147]
randomIDs = list(np.random.choice(range(Xtest.shape[0]),Npic-len(exampleIDs)))
index = exampleIDs + randomIDs
plot_gray_predicted_true_images(Xtest,Ypred,index,target_size)
del standX
Hunter x Hunter¶
The Hunter x Hunter images were extracted from Google Image search; the procedure is described in my previous post Download all images from Google image search query using python.
I used several queries for downloading the data:
- Hunter x Hunter
- Hunter x Hunter anime
- Hunter x Hunter color
- Hunter x Hunter gon
- Hunter x Hunter killua
- Hunter x Hunter aruka
- Hunter x Hunter kurapika
gon, killua, aruka and kurapika are characters' names. The text files containing the URL links of each image are available on my GitHub.
from keras.preprocessing.image import img_to_array, load_img
from skimage.color import rgb2lab, lab2rgb
target_size = (256,256,3)
## try/except is included because some of the downloaded files cannot be loaded as images
dir_data = "../HunterHunter/image/anime/"
X = []
count = 0
for folder in os.listdir(dir_data):
    for image in os.listdir(dir_data + folder):
        try:
            imgrgb = img_to_array(load_img(dir_data + folder + "/" + image, target_size=target_size))/255.0
            imglab = rgb2lab(imgrgb)
            X.append(imglab)
            count += 1
        except Exception as e:
            pass
X = np.array(X)
print("The total number of images {}".format(X.shape[0]))
Standardize the LAB data¶
standX = standardizeLAB(X)
del X
print(standX.shape)
Training the model¶
The process is the same as the previous analysis with the Flickr8K data.
#Split between training and testing data
split = int(0.95*len(standX))
Xtrain = standX[:split]
Xtest = standX[split:]
## create a validation data
Ntrain = int(Xtrain.shape[0]*0.8)
X_tr = Xtrain[:Ntrain]
X_val = Xtrain[Ntrain:]
model = define_model()
batch_size = 128
## we will use the initial weights from the Flickr8K analysis
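## Note: define_model() above re-initializes the weights. To literally warm-start
## from the Flickr8K run, the weights saved earlier could be re-loaded here
## (a sketch; "flickr8k_colorize.h5" is the hypothetical filename from above):
# model.load_weights("flickr8k_colorize.h5")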
hist = model.fit_generator(image_a_b_gen(X_tr, batch_size),
                           verbose=2,
                           validation_data=(X_val[:,:,:,[0]],
                                            X_val[:,:,:,1:]),
                           steps_per_epoch=100, epochs=5)
Plot the validation loss¶
for key in hist.history.keys():
    plt.plot(hist.history[key], label=key)
plt.legend()
plt.show()
Model validation using test set¶
Ypred = model.predict(Xtest[:,:,:,[0]])
print("testing MSE={:4.3f}".format(np.mean((Ypred - Xtest[:,:,:,2:])**2)))
Plot example images from the test set¶
- Again, the model is not doing a very good job coloring the images. This time, all the images are purple-ish.
- The plots also show that there are some non-anime-related images, indicating that I could do some manual cleaning of the data.
Nevertheless, there are some good things too:
- Killua's hair is correctly colored purple-ish white.
- Gon's face is colored with a skin color in ID=46.
- The super-power-looking background colors seem appropriate.
Nsample = 30
index = [46,69,45, 18,22, 25, 0, 56, 10, 50]
plot_gray_predicted_true_images(Xtest,Ypred,index,target_size)
Can I color manga?¶
Well, my model was not doing the best job coloring anime, but what about coloring manga? Let's give it a try.
Load 5 Hunter x Hunter manga images.
target_size = (256,256,3)
## try/except is included because some of the downloaded files cannot be loaded as images
dir_data = "../HunterHunter/manga/Hunter x Hunter manga/"
X = []
count = 0
for image in ['image00004.jpg',
              'image00005.jpg',
              'image00006.jpg',
              'image00007.jpg',
              'image00008.jpg']:
    try:
        imgrgb = img_to_array(load_img(dir_data + "/" + image, target_size=target_size))/255.0
        imglab = rgb2lab(imgrgb)
        X.append(imglab)
        count += 1
    except Exception as e:
        pass
X = np.array(X)
print("The total number of images {}".format(X.shape[0]))
## standardization
standX = standardizeLAB(X)
#del X
print(standX.shape)
Use the model previously trained on anime to predict the colors of the manga¶
Ypred = model.predict(standX[:,:,:,[0]])
Plot how it looks¶
Ah, we need to train the model with more relevant images! We need colored manga rather than anime to train a model for this purpose.
index = range(standX.shape[0])
plot_gray_predicted_true_images(standX,Ypred,index,target_size)
Next step¶
Consider incorporating a pre-trained network, as sketched below.
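As a sketch of that idea (one possible design I have not tested, not a final implementation): freeze a pre-trained VGG16 encoder, feed it the L channel repeated to 3 channels, and train only a small decoder that predicts the a and b channels.
from keras.applications.vgg16 import VGG16
from keras.layers import Conv2D, UpSampling2D
from keras.models import Model

## frozen pre-trained encoder (ImageNet weights)
encoder = VGG16(weights='imagenet', include_top=False, input_shape=(256, 256, 3))
for layer in encoder.layers:
    layer.trainable = False

## small trainable decoder: upsample the 8 x 8 x 512 features back to 256 x 256 x 2
x = encoder.output
for filters in [256, 128, 64, 32, 16]:
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(filters, (3, 3), activation='relu', padding='same')(x)
x = Conv2D(2, (3, 3), activation='tanh', padding='same')(x)

model_pre = Model(encoder.input, x)
model_pre.compile(optimizer='rmsprop', loss='mse')
## the L-channel input would need to be repeated to 3 channels,
## e.g. np.repeat(X_batch, 3, axis=-1), before being fed to this model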