What is the exact output of the Inception ResNet V2's feature extraction layer?

Question

I am working with the Inception ResNet V2 model, pre-trained with ImageNet, for face recognition.

However, I'm confused about what exactly the feature extraction layer (i.e. the layer just before the fully connected layer) of Inception ResNet V2 outputs. Can someone clarify this?

(By the way, if you know of a resource that explains Inception ResNet V2 clearly, let me know.)

score 0 · Answer 1 · answered Jul 30 '19 at 02:28

Based on this paper: https://arxiv.org/pdf/1512.00567v3.pdf?source=post_page--------------------------- ,

I tried to flatten the 3-D tensor into a 1-D vector of size 8*8*2048, because the paper lists the pool layer on page 6 as Pool: 8 * 8 * 2048.

But in the end, my code raised this error: ValueError: cannot reshape array of size 33423360 into shape (340,131072)
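The numbers in the error message already hint at the real output shape. The short calculation below is only a sketch: it divides the array size by the number of images and assumes an 8 x 8 spatial grid (consistent with the `conv_7b_ac` line in the model summary further down this page). As far as I can tell, the linked paper (arXiv 1512.00567) describes Inception-v3 rather than Inception ResNet V2, which may be why the 2048 figure does not match.

```python
# Numbers copied from the ValueError message
total_elements = 33423360   # size of the array returned by model.predict
num_images = 340            # first dimension of the requested shape

# Elements per image; the remainder must be 0 for the split to be exact
per_image, remainder = divmod(total_elements, num_images)

# Assumption: a 299x299 input gives an 8 x 8 spatial grid at the last layer
channels = per_image // (8 * 8)

print(per_image)       # 98304
print(channels)        # 1536
print(8 * 8 * 2048)    # 131072, the per-image size the reshape asked for
```

So each image yields 8 * 8 * 1536 = 98304 values, not 8 * 8 * 2048 = 131072, and the reshape fails.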

This is all my code:

from keras.applications.inception_resnet_v2 import InceptionResNetV2
from keras.applications.inception_resnet_v2 import preprocess_input
from keras.models import Model
from keras.preprocessing.image import load_img
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report
from imutils import paths
from keras.applications import imagenet_utils
from keras.preprocessing.image import img_to_array
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from keras.preprocessing import image
import random
import os
import numpy as np 
import cv2


# Paths to all images
image_path = list(paths.list_images('/content/drive/My Drive/casia-299-small'))
# Shuffle the image paths
random.shuffle(image_path)
# Use each image's parent directory name as its label
labels = [p.split(os.path.sep)[-2] for p in image_path]


# Encode the face names as integers
le = LabelEncoder()
labels = le.fit_transform(labels)

# Load Inception ResNet V2; include_top = False drops the fully connected layer
model = InceptionResNetV2(include_top = False, weights = 'imagenet')


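# Load images and resize them to the 299x299 input size Inception ResNet V2 expects
list_image = []
for (j, imagePath) in enumerate(image_path):
    # target_size is (height, width); the channel count is not part of it
    image = load_img(imagePath, target_size = (299, 299))
    image = img_to_array(image)

    image = np.expand_dims(image, 0)
    # Use the model's own preprocess_input (imported above), which scales pixels to [-1, 1];
    # imagenet_utils.preprocess_input defaults to the 'caffe' mode instead
    image = preprocess_input(image)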

    list_image.append(image)

# Use pre-trained model to extract feature
list_image = np.vstack(list_image)
print("List image: ", list_image)
features = model.predict(list_image)
print("feature: ", features)
print("feature shape[0]: ", features.shape[0])
print("feature shape: ", features.shape)
features = features.reshape((features.shape[0], 8*8*2048))

# Split into training and test sets in an 80-20 ratio
x_train, x_test, y_train, y_test = train_test_split(features, labels, test_size = 0.2, random_state =42)

params = {'C': [0.1, 1.0, 10.0, 100.0]}
model = GridSearchCV(LogisticRegression(), params)
model.fit(x_train,y_train)
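# GridSearchCV has no save() method; persist the fitted search with joblib instead
import joblib
joblib.dump(model, '/content/drive/My Drive/casia-299-small/myweight1.pkl')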
print('Best parameter for the model {}'.format(model.best_params_))

preds = model.predict(x_test)
print(classification_report(y_test, preds))
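One way to avoid hard-coding the channel count is to let NumPy infer the flattened length from the array itself. The sketch below uses a small random array as a stand-in for the output of `model.predict` (an assumption, so it runs without the model or the dataset); the stand-in shape follows the 98304 values per image implied by the error message (33423360 / 340).

```python
import numpy as np

# Stand-in for model.predict(list_image): 4 images, 8 x 8 grid, 1536 channels
features = np.random.rand(4, 8, 8, 1536).astype("float32")

# -1 lets NumPy compute the per-image length (8 * 8 * 1536) on its own
flat = features.reshape(features.shape[0], -1)

print(features.shape)  # (4, 8, 8, 1536)
print(flat.shape)      # (4, 98304)
```

With this form, the reshape keeps working even if the assumed channel count turns out to be wrong.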

score 0 · Answer 2 · answered Nov 12 '19 at 13:25

You can use this to inspect the Keras Inception ResNet V2 network.

from keras.applications.inception_resnet_v2 import InceptionResNetV2, preprocess_input
from keras.layers import Input
model = InceptionResNetV2(weights='imagenet', include_top=True)
print(model.summary())

This will output (I'm showing only the last few layers):

__________________________________________________________________________________________________
conv_7b_ac (Activation)         (None, 8, 8, 1536)   0           conv_7b_bn[0][0]                 
__________________________________________________________________________________________________
avg_pool (GlobalAveragePooling2 (None, 1536)         0           conv_7b_ac[0][0]                 
__________________________________________________________________________________________________
predictions (Dense)             (None, 1000)         1537000     avg_pool[0][0]                   
==================================================================================================
Total params: 55,873,736
Trainable params: 55,813,192
Non-trainable params: 60,544
__________________________________________________________________________________________________
None

If we look at the output of the 'avg_pool' layer in the top part of the network, there are 1536 features at the output.
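To illustrate how the (None, 8, 8, 1536) activation becomes (None, 1536), here is a rough NumPy sketch of what global average pooling does for channels-last data: it averages each channel over the 8 x 8 spatial grid. The random array is only a stand-in for the `conv_7b_ac` output, not real model activations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the conv_7b_ac output: 2 images, 8 x 8 grid, 1536 channels
feature_map = rng.random((2, 8, 8, 1536), dtype=np.float32)

# Average over the two spatial axes, leaving one value per channel
pooled = feature_map.mean(axis=(1, 2))

print(pooled.shape)  # (2, 1536)

# Each pooled value is the mean of one channel's 64 spatial positions
manual = feature_map[0, :, :, 0].mean()
print(np.isclose(pooled[0, 0], manual))
```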

You can make a model in this way:

from keras.applications.inception_resnet_v2 import InceptionResNetV2, preprocess_input
from keras.models import Model
from keras.preprocessing import image
import numpy as np

def extract(image_path):
    # Build a model that stops at the 'avg_pool' layer (1536 features)
    base_model = InceptionResNetV2(weights='imagenet', include_top=True)
    model = Model(inputs=base_model.input, outputs=base_model.get_layer('avg_pool').output)

    # Load the image at the 299x299 input size and add a batch dimension
    img = image.load_img(image_path, target_size=(299, 299))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    # Get the prediction.
    features = model.predict(x)
    features = features[0]
    return features


# 'path/to/face.jpg' is a placeholder; pass the path of a real image file
features = extract('path/to/face.jpg')

I couldn't test this code, as I don't have an environment available right now.
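A possible shortcut: the Keras application models accept a `pooling` argument when `include_top=False`, so the pooled features can be obtained without cutting the full model at 'avg_pool'. The sketch below is a minimal shape check only. It passes `weights=None` purely to avoid downloading the ImageNet weights (an assumption for the demo); for real feature extraction you would pass `weights='imagenet'`.

```python
from keras.applications.inception_resnet_v2 import InceptionResNetV2

# weights=None skips the ImageNet download; it is used here only to inspect shapes.
# For actual feature extraction, weights='imagenet' would be needed.
feature_model = InceptionResNetV2(
    include_top=False,           # drop the final Dense classifier
    weights=None,
    input_shape=(299, 299, 3),
    pooling='avg',               # global average pooling over the last feature map
)

# Convert to a plain tuple so it prints the same way across Keras versions
output_shape = tuple(feature_model.output.shape)
print(output_shape)
```

The printed shape should have two dimensions, with 1536 as the last one, so `feature_model.predict` would return one 1536-value vector per image.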
