Steps to execute program
1. Get dataset/images of persons.
2. To get better predictions we first detect faces in each image and use only the face regions for recognition.
To do so we use 'mmod_human_face_detector', a cnn_face_detector which identifies faces in an image and
returns the position of each face.
'dlib' in python uses these weights to detect faces, but dlib doesn't ship this detector in-built;
we must download it ourselves and hand it to the 'dlib' class that extracts faces.
Download mmod_human_face_detector from 'http://dlib.net/files/mmod_human_face_detector.dat.bz2'
As it is a .bz2 file, extract it and feed the resulting .dat file to the 'dlib' class.
Ex:
$ wget http://dlib.net/files/mmod_human_face_detector.dat.bz2
$ bzip2 -dk mmod_human_face_detector.dat.bz2
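If wget/bzip2 are not available, the same download-and-extract step can be done from python; a minimal sketch using only the standard library:
```
import bz2
import shutil
import urllib.request
# Download the compressed detector weights (same URL as above)
urllib.request.urlretrieve(
    'http://dlib.net/files/mmod_human_face_detector.dat.bz2',
    'mmod_human_face_detector.dat.bz2')
# Decompress the .bz2 archive into the .dat file that dlib loads
with bz2.BZ2File('mmod_human_face_detector.dat.bz2') as src, \
     open('mmod_human_face_detector.dat','wb') as dst:
    shutil.copyfileobj(src,dst)
```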
3. Extract the face from each image, crop it, and store the crop as an image in a separate folder, with the person's name as the image name.
Ex: We use images with only one face
```
import cv2
import dlib
# Load CNN face detector with the 'mmod_human_face_detector' weights
dnnFaceDetector=dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")
# Load image
img=cv2.imread('path_to_image')
# Convert to gray scale
gray=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Find faces in image
rects=dnnFaceDetector(gray,1)
left,top,right,bottom=0,0,0,0
# For each face, 'rect' provides the face location in the image as pixel coordinates
for (i,rect) in enumerate(rects):
    left=rect.rect.left()     #x1
    top=rect.rect.top()       #y1
    right=rect.rect.right()   #x2
    bottom=rect.rect.bottom() #y2
    width=right-left
    height=bottom-top
    # Crop the face region
    img_crop=img[top:top+height,left:left+width]
    # Save the cropped face with the person's name as the image name
    # ('path_to_image_as_person_name' is a placeholder for your output path)
    cv2.imwrite(path_to_image_as_person_name,img_crop)
```
The above snippet shows how to extract a face from an image and save it for recognition.
Do the same for all images in the train dataset and test dataset, saving with person names as image names.
Store each person's cropped images in a separate folder, as in the layout sketched below.
Ex: All 'modi_*.jpg' images are saved in the 'modi' folder.
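For example, the crop folders could look like this (folder and file names are illustrative; 'Images_crop'/'Test_Images_crop' match the paths used in step 6):
```
Images_crop/
    modi/
        modi_1.jpg
        modi_2.jpg
    trump/
        trump_1.jpg
        trump_2.jpg
Test_Images_crop/
    modi/
        modi_7.jpg
    trump/
        trump_7.jpg
```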
4. We create embeddings for each face/person, which describe the person as numeric data. Pre-trained networks
like DeepFace and OpenFace provide embeddings in a few lines of code, but we use VGG_Face_net, which was trained on millions of images to recognize faces. The original model takes an image from the WildFace dataset on which VGG_face_net was trained and classifies/recognizes the person in the image. It outputs 2622 values for an image; we take these 2622 values as the embedding of each cropped image for later classification.
VGG_face_net weights are not available for tensorflow or keras models on the official site, but bloggers have converted those unsupported weights to .h5 files, which can easily be used to define a model in keras/tensorflow.
Download the .h5 file for VGG_Face_net from
'https://drive.google.com/uc?id=1CPSeum3HpopfomUEK1gybeuIVoeJT_Eo'
As it is stored on Google Drive, we can download it to local storage with the python 'gdown' package.
Ex: $ gdown https://drive.google.com/uc?id=1CPSeum3HpopfomUEK1gybeuIVoeJT_Eo
We now have the .h5 vgg_face_net weights and use them to build the vgg_face_net model in keras/tensorflow.
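The same download can also be scripted from python; a minimal sketch with the 'gdown' package (the output name 'vgg_face_weights.h5' is chosen to match what the model code in step 5 loads):
```
import gdown
# Download the converted VGG_Face weights (.h5) from Google Drive
url='https://drive.google.com/uc?id=1CPSeum3HpopfomUEK1gybeuIVoeJT_Eo'
gdown.download(url,'vgg_face_weights.h5',quiet=False)
```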
5. To load the weights we must first define the model architecture.
The VGG_Face model in keras:
```
# Tensorflow version == 2.0.0
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential,Model
from tensorflow.keras.layers import ZeroPadding2D,Convolution2D,MaxPooling2D
from tensorflow.keras.layers import Dense,Dropout,Softmax,Flatten,Activation,BatchNormalization
from tensorflow.keras.preprocessing.image import load_img,img_to_array
from tensorflow.keras.applications.imagenet_utils import preprocess_input
import tensorflow.keras.backend as K
# Define VGG_FACE_MODEL architecture
model = Sequential()
# Block 1: two 64-filter conv layers
model.add(ZeroPadding2D((1,1),input_shape=(224,224, 3)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Block 2: two 128-filter conv layers
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Block 3: three 256-filter conv layers
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(256, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Block 4: three 512-filter conv layers
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Block 5: three 512-filter conv layers
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(ZeroPadding2D((1,1)))
model.add(Convolution2D(512, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
# Fully-convolutional classifier head: 4096 -> 4096 -> 2622 outputs
model.add(Convolution2D(4096, (7, 7), activation='relu'))
model.add(Dropout(0.5))
model.add(Convolution2D(4096, (1, 1), activation='relu'))
model.add(Dropout(0.5))
model.add(Convolution2D(2622, (1, 1)))
model.add(Flatten())
model.add(Activation('softmax'))
# Load VGG Face model weights
model.load_weights('vgg_face_weights.h5')
```
Here the output layer is a softmax layer used for recognizing images in the WildFaces dataset. We only
require the embeddings, which are the output of the last-but-one layer, i.e., the Flatten() layer.
So our model is needed only up to the last Flatten() layer.
```
# Remove the last softmax layer and take the model up to the last Flatten() layer, whose output has 2622 units
vgg_face=Model(inputs=model.layers[0].input,outputs=model.layers[-2].output)
```
The above line defines the model up to the Flatten() layer.
Now we can feed any image to get its embedding, which will be used to train our own classifier/recognizer.
6. Prepare train data and test data, where the data for each face is its embedding and the label is the person's
name.
```
# Prepare Train Data
# 'path' is assumed to be the root folder containing Images_crop/ and Test_Images_crop/
import os
x_train=[]
y_train=[]
person_rep=dict()
person_folders=os.listdir(path+'/Images_crop/')
for i,person in enumerate(person_folders):
    person_rep[i]=person
    image_names=os.listdir(path+'/Images_crop/'+person+'/')
    for image_name in image_names:
        img=load_img(path+'/Images_crop/'+person+'/'+image_name,target_size=(224,224))
        img=img_to_array(img)
        img=np.expand_dims(img,axis=0)
        img=preprocess_input(img)
        img_encode=vgg_face(img)
        x_train.append(np.squeeze(K.eval(img_encode)).tolist())
        y_train.append(i)
# Prepare Test Data (folder names/order must match the train folders so labels agree)
x_test=[]
y_test=[]
person_folders=os.listdir(path+'/Test_Images_crop/')
for i,person in enumerate(person_folders):
    image_names=os.listdir(path+'/Test_Images_crop/'+person+'/')
    for image_name in image_names:
        img=load_img(path+'/Test_Images_crop/'+person+'/'+image_name,target_size=(224,224))
        img=img_to_array(img)
        img=np.expand_dims(img,axis=0)
        img=preprocess_input(img)
        img_encode=vgg_face(img)
        x_test.append(np.squeeze(K.eval(img_encode)).tolist())
        y_test.append(i)
```
Earlier we stored each cropped face image in the corresponding person's folder. We walk through each folder and, for each image in it, load the image with keras' built-in load_img() function (which returns a PIL image) with target_size=(224,224), since VGG_face_net expects input images of shape (224,224).
Each loaded image is preprocessed (preprocess_input zero-centers it using the ImageNet channel means) and fed into the vgg_face() model, which outputs a (1,2622) dimensional tensor; this is converted to a list and appended to the train or test data.
Also, for each person we assign a numeric label, like
Ex:
{
0: 'modi',
1: 'trump',
2: 'angelamerkel',
3: 'jinping',
....
....
}
We now have (x_train, y_train) and (x_test, y_test) as lists; to use them in keras models we first convert them to numpy arrays.
Ex:
```
x_train=np.array(x_train)
y_train=np.array(y_train)
x_test=np.array(x_test)
y_test=np.array(y_test)
```
7. We end up with train data and test data where the data is face embeddings and the labels are person encodings.
Now we train a simple softmax classifier and save the model to get predictions for unseen faces.
```
# Softmax regressor to classify images based on encoding
classifier_model=Sequential()
classifier_model.add(Dense(units=100,input_dim=x_train.shape[1],kernel_initializer='glorot_uniform'))
classifier_model.add(BatchNormalization())
classifier_model.add(Activation('tanh'))
classifier_model.add(Dropout(0.3))
classifier_model.add(Dense(units=10,kernel_initializer='glorot_uniform'))
classifier_model.add(BatchNormalization())
classifier_model.add(Activation('tanh'))
classifier_model.add(Dropout(0.2))
classifier_model.add(Dense(units=6,kernel_initializer='he_uniform')) # output units = number of persons/classes
classifier_model.add(Activation('softmax'))
classifier_model.compile(
loss=tf.keras.losses.SparseCategoricalCrossentropy(),optimizer='nadam',metrics=['accuracy'])
```
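The snippet above only defines and compiles the classifier. To actually train it on the prepared embeddings and save it for later use, something like the following works (the epochs/batch_size values and the output file name are illustrative, not values from the original write-up):
```
# Train the softmax classifier on the face embeddings
classifier_model.fit(x_train,y_train,epochs=100,batch_size=8,
                     validation_data=(x_test,y_test))
# Save the trained classifier so it can be reloaded for predictions
tf.keras.models.save_model(classifier_model,'face_classifier_model.h5')
```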
We train the softmax classifier to classify images: it takes a face embedding as input and outputs the corresponding class number, which is the encoding of the person's name.
8. Now we can recognize any face in an image: get the embedding for the face with the help of the vgg_face model,
feed it into the classifier, and get the person's name. With opencv, draw a rectangle around each face and write the person's name on the image, as sketched below.
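A minimal sketch of this recognition loop, reusing dnnFaceDetector, vgg_face, classifier_model and person_rep from the steps above ('path_to_test_image' is a placeholder):
```
# Load a new image and detect faces in it
img=cv2.imread('path_to_test_image')
gray=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
rects=dnnFaceDetector(gray,1)
for (i,rect) in enumerate(rects):
    left,top=rect.rect.left(),rect.rect.top()
    right,bottom=rect.rect.right(),rect.rect.bottom()
    # Crop the face and preprocess it the same way as in training
    # (convert BGR->RGB to match load_img, resize to the 224x224 input size)
    crop=cv2.cvtColor(img[top:bottom,left:right],cv2.COLOR_BGR2RGB)
    crop=cv2.resize(crop,(224,224))
    crop=np.expand_dims(img_to_array(crop),axis=0)
    crop=preprocess_input(crop)
    # Embedding -> classifier -> person name
    embed=K.eval(vgg_face(crop))
    person=person_rep[np.argmax(classifier_model.predict(embed))]
    # Draw a rectangle around the face and write the person's name above it
    cv2.rectangle(img,(left,top),(right,bottom),(0,255,0),2)
    cv2.putText(img,person,(left,top-10),cv2.FONT_HERSHEY_SIMPLEX,0.8,(0,255,0),2)
cv2.imwrite('recognized_image.jpg',img)
```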