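Keras ships a model file with weights for VGG19 pre-trained on ImageNet. In the Keras version this post is based on, four other pre-trained models are available as well: VGG16, Xception, ResNet50 and InceptionV3.
VGG19 is defined in Keras as: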

def VGG19(include_top=True, weights='imagenet',
          input_tensor=None, input_shape=None,
          pooling=None,
          classes=1000)
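Parameters:
include_top: whether to include the 3 fully connected layers at the top of the network
weights: 'imagenet' loads the weights pre-trained on the ImageNet dataset; None loads no weights, and the parameters are randomly initialized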
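The signature also takes a `pooling` argument that the list above does not describe. In Keras it only applies when `include_top=False`: `None` returns the 4D output of the last convolutional block, while `'avg'` and `'max'` apply global average or global max pooling, so the output becomes a 2D tensor. The following is a minimal plain-Python sketch of what global average pooling computes, on a tiny made-up feature map; `global_average_pool` is a hypothetical helper written for illustration, not a Keras function.

```python
def global_average_pool(feature_map):
    """Average an H x W x C nested list over H and W, giving C values."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


# Made-up 2 x 2 feature map with 3 channels
# (the last block of VGG19 produces 7 x 7 x 512 for a 224 x 224 input).
fmap = [
    [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]],
    [[5.0, 4.0, 2.0], [7.0, 0.0, 2.0]],
]
pooled = global_average_pool(fmap)
print(pooled)  # one value per channel
```

With `pooling='avg'`, each image is therefore described by one number per channel instead of a full spatial map.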
1) Example: classifying an image with the pre-trained VGG19 model

# coding: utf-8
from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np

base_model = VGG19(weights='imagenet', include_top=True)
img_path = 'cat.jpg'
img = image.load_img(img_path, target_size=(224, 224))  # load the image and resize it to 224x224
x = image.img_to_array(img)  # convert the PIL image to a NumPy array of shape (224, 224, 3)
x = np.expand_dims(x, axis=0)  # add a batch dimension: [batch_size, H, W, channels]
x = preprocess_input(x)  # preprocessing; the default is mode="caffe"
""" -
def preprocess_input(x, data_format=None, mode='caffe'):
    Preprocesses a tensor or Numpy array encoding a batch of images.

    # Arguments
        x: Input Numpy or symbolic tensor, 3D or 4D.
        data_format: Data format of the image tensor/array.
        mode: One of "caffe", "tf".
            - caffe: will convert the images from RGB to BGR,
                then will zero-center each color channel with
                respect to the ImageNet dataset,
                without scaling.
            - tf: will scale pixels between -1 and 1,
                sample-wise.

    # Returns
        Preprocessed tensor or Numpy array.

    # Raises
        ValueError: In case of unknown `data_format` argument.
"""
out = base_model.predict(x)  # predicted class probabilities
print(out.shape)  # (1, 1000): one probability for each of the 1000 ImageNet classes
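To read the (1, 1000) output, the usual step is to pick the class indices with the largest probabilities. Keras offers `decode_predictions` in `keras.applications.vgg19` to map them to ImageNet labels; the sketch below only illustrates the top-k selection itself, in plain Python. The helper `top_k` and the five made-up probabilities are assumptions for illustration, standing in for the real 1000-element vector `out[0]`.

```python
def top_k(probabilities, k=3):
    """Return the k (class_index, probability) pairs with the highest probability."""
    indexed = list(enumerate(probabilities))
    indexed.sort(key=lambda pair: pair[1], reverse=True)
    return indexed[:k]


# Hypothetical 5-class output standing in for the 1000-class vector out[0].
fake_out = [0.05, 0.60, 0.10, 0.20, 0.05]
best = top_k(fake_out, k=3)
print(best)  # (index, probability) pairs, most likely class first
```

On the real output the same call would be `top_k(out[0], k=5)`, after which each index still has to be mapped to a class label.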
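2) Example: extracting the output features of any layer of VGG19 with the pre-trained model
As the previous example shows, Keras wraps the VGG network very well: a few lines of code are enough to classify an image. Another, more common use of the pre-trained VGG models in Keras is extracting the features of any layer of the network.
The following example extracts the output of block5_pool, the pooling layer that ends the fifth convolutional block of VGG19 (once flattened, this is also the input to the first fully connected layer).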

# coding: utf-8
from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np

base_model = VGG19(weights='imagenet', include_top=False)  # convolutional part only
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block5_pool').output)
img_path = 'cat.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
block5_pool_features = model.predict(x)
print(block5_pool_features.shape)  # (1, 7, 7, 512)
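The (1, 7, 7, 512) shape follows from the architecture: each of the five blocks ends in a 2x2 max-pooling layer with stride 2 that halves the height and width, so 224 becomes 224 / 2^5 = 7, and the last block has 512 filters. Flattened, this gives 7 * 7 * 512 = 25088 values, which is the input size of fc1. A small plain-Python sketch of that arithmetic follows; `pooled_size` is a hypothetical helper, not part of Keras.

```python
def pooled_size(size, num_pools):
    """Spatial size after num_pools 2x2 max-pooling layers with stride 2."""
    for _ in range(num_pools):
        size //= 2
    return size


side = pooled_size(224, num_pools=5)
feature_shape = (1, side, side, 512)  # batch, height, width, channels
flattened = side * side * 512         # length of the vector fed to fc1
print(feature_shape, flattened)
```

The same helper with `num_pools=4` gives 14, the spatial size of block4_pool.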
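The VGG19 network can also be loaded with its 3 fully connected layers (include_top=True), which makes the outputs of those layers available as well: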

# coding: utf-8
from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np

base_model = VGG19(weights='imagenet', include_top=True)  # include the fully connected layers
model = Model(inputs=base_model.input, outputs=base_model.get_layer('fc2').output)
img_path = 'cat.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
fc2 = model.predict(x)
print(fc2.shape)  # (1, 4096)
Because the fully connected layers are included, the argument to base_model.get_layer() can also be flatten, fc1, fc2 or predictions.
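The layer names follow a regular pattern, so the valid arguments to `get_layer` can be listed without loading the model. The sketch below rebuilds them from the number of convolution layers per block (2, 2, 4, 4, 4); `vgg19_layer_names` is a hypothetical helper, and the input layer is left out because its name depends on how many input layers have already been created. With a model loaded, `[layer.name for layer in base_model.layers]` gives the authoritative list.

```python
CONVS_PER_BLOCK = (2, 2, 4, 4, 4)  # VGG19: 16 convolution layers in 5 blocks


def vgg19_layer_names(include_top=True):
    """Rebuild the VGG19 layer names (input layer excluded)."""
    names = []
    for block, num_convs in enumerate(CONVS_PER_BLOCK, start=1):
        for conv in range(1, num_convs + 1):
            names.append('block%d_conv%d' % (block, conv))
        names.append('block%d_pool' % block)
    if include_top:
        names += ['flatten', 'fc1', 'fc2', 'predictions']
    return names


names = vgg19_layer_names()
print(len(names))
print(names[-6:])
```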
The names Keras gives to the layers of VGG19 can be obtained with the code below; with a name, the features of that layer are easy to extract:

from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np
import os
import tensorflow as tf
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # use GPU 0
# TensorFlow 1.x API (under TensorFlow 2.x these names live in tf.compat.v1)
gpu_options = tf.GPUOptions(allow_growth=True)  # allocate GPU memory on demand
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
from keras.utils import plot_model
from matplotlib import pyplot as plt

# [0] VGG19 model with the pre-trained weights
base_model = VGG19(weights='imagenet')

# [1] Create a new model whose output is the output of any layer of VGG19
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)
model.summary()  # print the model summary, shown below
plot_model(model, to_file='a simple convnet.png')  # draw the model graph and save it as an image (needs pydot and graphviz)
""" -
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         (None, 224, 224, 3)       0
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_conv4 (Conv2D)        (None, 56, 56, 256)       590080
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_conv4 (Conv2D)        (None, 28, 28, 512)       2359808
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0
=================================================================
Total params: 10,585,152
Trainable params: 10,585,152
Non-trainable params: 0
_________________________________________________________________
-
"""
3) The final feature-extraction code

# -*- coding: UTF-8 -*-
#-------------------------------------------
# Task: extract the features of any intermediate layer with VGG19
# Data: a test image downloaded from the web, 'elephant.jpg'
#-------------------------------------------

from keras.applications.vgg19 import VGG19
from keras.preprocessing import image
from keras.applications.vgg19 import preprocess_input
from keras.models import Model
import numpy as np
import os
import tensorflow as tf
os.environ["CUDA_VISIBLE_DEVICES"] = "6"  # use GPU 6; change this to match your machine
# TensorFlow 1.x API (under TensorFlow 2.x these names live in tf.compat.v1)
gpu_options = tf.GPUOptions(allow_growth=True)  # allocate GPU memory on demand
sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
from keras.utils import plot_model
from matplotlib import pyplot as plt

# [0] VGG19 model with the pre-trained weights
base_model = VGG19(weights='imagenet')

# [1] Create a new model whose output is the output of any layer of VGG19
model = Model(inputs=base_model.input, outputs=base_model.get_layer('block4_pool').output)
model.summary()  # print the model summary
plot_model(model, to_file='a simple convnet.png')  # draw the model graph and save it as an image (needs pydot and graphviz)


# [2] Download an image from the web and save it in the current directory
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))  # load the image and resize it to 224x224

# [3] Convert the image to a 4D tensor
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)

# [4] Preprocess the data
x = preprocess_input(x)  # zero-center with the ImageNet channel means; the preprocess_input docstring is quoted in example 1)

# [5] Extract the features
block4_pool_features = model.predict(x)
print(block4_pool_features.shape)  # (1, 14, 14, 512)
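The Param # column that `model.summary()` prints for this truncated model can be checked by hand: a 3x3 convolution layer with bias has (3 * 3 * in_channels + 1) * out_channels parameters, and pooling layers have none. The plain-Python sketch below adds up the four blocks that remain when the model stops at block4_pool; `conv3x3_params` and `BLOCKS` are names made up for this illustration.

```python
def conv3x3_params(in_channels, out_channels):
    """Weights plus biases of a 3x3 convolution layer."""
    return (3 * 3 * in_channels + 1) * out_channels


# (number of conv layers, filters) for blocks 1 to 4 of VGG19.
BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512)]

total = 0
in_channels = 3  # RGB input
for num_convs, filters in BLOCKS:
    for _ in range(num_convs):
        total += conv3x3_params(in_channels, filters)
        in_channels = filters

print(conv3x3_params(3, 64))  # parameters of block1_conv1
print(total)                  # parameters of the model up to block4_pool
```

The total matches the "Total params" line of the summary, which confirms that the new model keeps only the layers up to block4_pool.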