6-1, Three Ways to Build a Model

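There are three ways to build a model: stack layers in order with Sequential, build a model of arbitrary structure with the functional API, or subclass the Model base class to define a custom model.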

For models with a purely sequential structure, prefer Sequential.

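If the model has multiple inputs or multiple outputs, needs to share weights, or has a non-sequential structure such as residual connections, the functional API is recommended.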
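
One way to read this rule: a model fits Sequential only if its layers form a single chain, where every layer has at most one input and feeds at most one following layer. The plain-Python sketch below encodes two small layer graphs as dictionaries and checks that property. The helper `is_linear_chain` and the layer names are hypothetical, made up for illustration; nothing here is a Keras API, and the check looks only at topology, so it says nothing about shared weights.

```python
# Hypothetical helper, not part of Keras: decide whether a layer graph is a
# single chain (Sequential-friendly) or needs the functional API.
def is_linear_chain(graph):
    """graph maps each layer name to the list of layer names feeding it.

    Assumes the graph is connected and acyclic.
    """
    consumers = {name: 0 for name in graph}
    for name, inputs in graph.items():
        if len(inputs) > 1:          # merge point: several inputs
            return False
        for src in inputs:
            consumers[src] += 1
    # Branch point: one output consumed by several layers
    if any(count > 1 for count in consumers.values()):
        return False
    # Exactly one source (model input) and one sink (model output)
    sources = [n for n, inputs in graph.items() if not inputs]
    sinks = [n for n, count in consumers.items() if count == 0]
    return len(sources) == 1 and len(sinks) == 1


# A plain stack: embedding -> conv -> pool -> dense
stack = {
    "embedding": [],
    "conv": ["embedding"],
    "pool": ["conv"],
    "dense": ["pool"],
}

# A residual connection: the embedding output skips the conv and is added back
residual = {
    "embedding": [],
    "conv": ["embedding"],
    "add": ["embedding", "conv"],
    "dense": ["add"],
}

print(is_linear_chain(stack))     # True
print(is_linear_chain(residual))  # False
```
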

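Unless there is a specific need, avoid building models by subclassing Model where possible. This approach offers great flexibility, but it is also more error-prone.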

The three approaches are demonstrated below on the IMDB movie review classification problem.

    import numpy as np
    import pandas as pd
    import tensorflow as tf
    from tqdm import tqdm
    # Explicit imports instead of a wildcard: these are the submodules used below
    from tensorflow.keras import models, layers, callbacks

    train_token_path = "./data/imdb/train_token.csv"
    test_token_path = "./data/imdb/test_token.csv"

    MAX_WORDS = 10000  # We will only consider the top 10,000 words in the dataset
    MAX_LEN = 200  # We will cut reviews after 200 words
    BATCH_SIZE = 20

    # Build the input pipeline
    def parse_line(line):
        t = tf.strings.split(line,"\t")
        label = tf.reshape(tf.cast(tf.strings.to_number(t[0]),tf.int32),(-1,))
        features = tf.cast(tf.strings.to_number(tf.strings.split(t[1]," ")),tf.int32)
        return (features,label)

    ds_train = tf.data.TextLineDataset(filenames = [train_token_path]) \
        .map(parse_line,num_parallel_calls = tf.data.experimental.AUTOTUNE) \
        .shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
        .prefetch(tf.data.experimental.AUTOTUNE)

    ds_test = tf.data.TextLineDataset(filenames = [test_token_path]) \
        .map(parse_line,num_parallel_calls = tf.data.experimental.AUTOTUNE) \
        .shuffle(buffer_size = 1000).batch(BATCH_SIZE) \
        .prefetch(tf.data.experimental.AUTOTUNE)
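
To make the parsing step concrete, here is a plain-Python sketch of what `parse_line` does to a single record. It assumes each line of the token files has the form `label<TAB>token token ...`, which is what the two split calls imply; the sample record and the name `parse_line_py` are made up for illustration.

```python
def parse_line_py(line):
    """Plain-Python mirror of parse_line, for illustration only."""
    label_text, tokens_text = line.split("\t")
    # tf.strings.to_number yields float32 by default, hence float() before int()
    label = [int(float(label_text))]   # one-element list, like tf.reshape(..., (-1,))
    features = [int(float(tok)) for tok in tokens_text.split(" ")]
    return features, label


sample = "1\t0 0 12 845 9"   # hypothetical record: label 1, five token ids
features, label = parse_line_py(sample)
print(features)  # [0, 0, 12, 845, 9]
print(label)     # [1]
```
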

I. Building a model layer by layer with Sequential

    tf.keras.backend.clear_session()

    model = models.Sequential()
    model.add(layers.Embedding(MAX_WORDS,7,input_length=MAX_LEN))
    model.add(layers.Conv1D(filters = 64,kernel_size = 5,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Conv1D(filters = 32,kernel_size = 3,activation = "relu"))
    model.add(layers.MaxPool1D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(1,activation = "sigmoid"))

    model.compile(optimizer='Nadam',
                loss='binary_crossentropy',
                metrics=['accuracy',"AUC"])

    model.summary()
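
The output shapes and parameter counts of this stack can be worked out by hand. The sketch below does the arithmetic in plain Python, assuming the Keras defaults that the code above relies on (`padding="valid"` and stride 1 for `Conv1D`, pool stride equal to pool size for `MaxPool1D`). The helper names are hypothetical.

```python
MAX_WORDS, MAX_LEN, EMBED_DIM = 10000, 200, 7

def conv1d_out_len(length, kernel_size):
    # 'valid' padding, stride 1
    return length - kernel_size + 1

def conv1d_params(kernel_size, in_channels, filters):
    # kernel weights plus one bias per filter
    return kernel_size * in_channels * filters + filters

params = {}
params["embedding"] = MAX_WORDS * EMBED_DIM

length = conv1d_out_len(MAX_LEN, 5)             # 196
params["conv1d"] = conv1d_params(5, EMBED_DIM, 64)
length = length // 2                            # MaxPool1D(2) -> 98

length = conv1d_out_len(length, 3)              # 96
params["conv1d_1"] = conv1d_params(3, 64, 32)
length = length // 2                            # MaxPool1D(2) -> 48

flat = length * 32                              # Flatten -> 1536 features
params["dense"] = flat + 1                      # weights + bias for one unit

total = sum(params.values())
print(params)
print(total)  # 80017
```
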


    import datetime
    baselogger = callbacks.BaseLogger(stateful_metrics=["AUC"])
    logdir = "./data/keras_model/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
    history = model.fit(ds_train,validation_data = ds_test,
            epochs = 6,callbacks=[baselogger,tensorboard_callback])

    %matplotlib inline
    %config InlineBackend.figure_format = 'svg'

    import matplotlib.pyplot as plt

    def plot_metric(history, metric):
        train_metrics = history.history[metric]
        val_metrics = history.history['val_'+metric]
        epochs = range(1, len(train_metrics) + 1)
        plt.plot(epochs, train_metrics, 'bo--')
        plt.plot(epochs, val_metrics, 'ro-')
        plt.title('Training and validation '+ metric)
        plt.xlabel("Epochs")
        plt.ylabel(metric)
        plt.legend(["train_"+metric, 'val_'+metric])
        plt.show()

    plot_metric(history,"AUC")


II. Building a model of arbitrary structure with the functional API

    tf.keras.backend.clear_session()

    inputs = layers.Input(shape=[MAX_LEN])
    x = layers.Embedding(MAX_WORDS,7)(inputs)

    branch1 = layers.SeparableConv1D(64,3,activation="relu")(x)
    branch1 = layers.MaxPool1D(3)(branch1)
    branch1 = layers.SeparableConv1D(32,3,activation="relu")(branch1)
    branch1 = layers.GlobalMaxPool1D()(branch1)

    branch2 = layers.SeparableConv1D(64,5,activation="relu")(x)
    branch2 = layers.MaxPool1D(5)(branch2)
    branch2 = layers.SeparableConv1D(32,5,activation="relu")(branch2)
    branch2 = layers.GlobalMaxPool1D()(branch2)

    branch3 = layers.SeparableConv1D(64,7,activation="relu")(x)
    branch3 = layers.MaxPool1D(7)(branch3)
    branch3 = layers.SeparableConv1D(32,7,activation="relu")(branch3)
    branch3 = layers.GlobalMaxPool1D()(branch3)

    concat = layers.Concatenate()([branch1,branch2,branch3])
    outputs = layers.Dense(1,activation = "sigmoid")(concat)

    model = models.Model(inputs = inputs,outputs = outputs)

    model.compile(optimizer='Nadam',
                loss='binary_crossentropy',
                metrics=['accuracy',"AUC"])

    model.summary()

    Model: "model"
    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to
    ==================================================================================================
    input_1 (InputLayer)            [(None, 200)]        0
    __________________________________________________________________________________________________
    embedding (Embedding)           (None, 200, 7)       70000       input_1[0][0]
    __________________________________________________________________________________________________
    separable_conv1d (SeparableConv (None, 198, 64)      533         embedding[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_2 (SeparableCo (None, 196, 64)      547         embedding[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_4 (SeparableCo (None, 194, 64)      561         embedding[0][0]
    __________________________________________________________________________________________________
    max_pooling1d (MaxPooling1D)    (None, 66, 64)       0           separable_conv1d[0][0]
    __________________________________________________________________________________________________
    max_pooling1d_1 (MaxPooling1D)  (None, 39, 64)       0           separable_conv1d_2[0][0]
    __________________________________________________________________________________________________
    max_pooling1d_2 (MaxPooling1D)  (None, 27, 64)       0           separable_conv1d_4[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_1 (SeparableCo (None, 64, 32)       2272        max_pooling1d[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_3 (SeparableCo (None, 35, 32)       2400        max_pooling1d_1[0][0]
    __________________________________________________________________________________________________
    separable_conv1d_5 (SeparableCo (None, 21, 32)       2528        max_pooling1d_2[0][0]
    __________________________________________________________________________________________________
    global_max_pooling1d (GlobalMax (None, 32)           0           separable_conv1d_1[0][0]
    __________________________________________________________________________________________________
    global_max_pooling1d_1 (GlobalM (None, 32)           0           separable_conv1d_3[0][0]
    __________________________________________________________________________________________________
    global_max_pooling1d_2 (GlobalM (None, 32)           0           separable_conv1d_5[0][0]
    __________________________________________________________________________________________________
    concatenate (Concatenate)       (None, 96)           0           global_max_pooling1d[0][0]
                                                                     global_max_pooling1d_1[0][0]
                                                                     global_max_pooling1d_2[0][0]
    __________________________________________________________________________________________________
    dense (Dense)                   (None, 1)            97          concatenate[0][0]
    ==================================================================================================
    Total params: 78,938
    Trainable params: 78,938
    Non-trainable params: 0
    __________________________________________________________________________________________________
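
The parameter counts in the summary follow from how a separable convolution is factored: a depthwise kernel with `kernel_size * in_channels` weights, a pointwise 1x1 kernel with `in_channels * filters` weights, and one bias per filter. Below is a short plain-Python check, assuming a depth multiplier of 1 (the Keras default); the helper name is hypothetical.

```python
def separable_conv1d_params(kernel_size, in_channels, filters):
    depthwise = kernel_size * in_channels   # one kernel per input channel
    pointwise = in_channels * filters       # 1x1 convolution mixing channels
    bias = filters
    return depthwise + pointwise + bias

embedding = 10000 * 7
branches = []
for k in (3, 5, 7):
    first = separable_conv1d_params(k, 7, 64)     # applied to the embedding output
    second = separable_conv1d_params(k, 64, 32)   # applied after max pooling
    branches.append((first, second))

dense = 3 * 32 + 1   # 96 concatenated features plus one bias
total = embedding + sum(a + b for a, b in branches) + dense

print(branches)  # [(533, 2272), (547, 2400), (561, 2528)]
print(total)     # 78938
```
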


    import datetime
    logdir = "./data/keras_model/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
    history = model.fit(ds_train,validation_data = ds_test,epochs = 6,callbacks=[tensorboard_callback])

    Epoch 1/6
    1000/1000 [==============================] - 32s 32ms/step - loss: 0.5527 - accuracy: 0.6758 - AUC: 0.7731 - val_loss: 0.3646 - val_accuracy: 0.8426 - val_AUC: 0.9192
    Epoch 2/6
    1000/1000 [==============================] - 24s 24ms/step - loss: 0.3024 - accuracy: 0.8737 - AUC: 0.9444 - val_loss: 0.3281 - val_accuracy: 0.8644 - val_AUC: 0.9350
    Epoch 3/6
    1000/1000 [==============================] - 24s 24ms/step - loss: 0.2158 - accuracy: 0.9159 - AUC: 0.9715 - val_loss: 0.3461 - val_accuracy: 0.8666 - val_AUC: 0.9363
    Epoch 4/6
    1000/1000 [==============================] - 24s 24ms/step - loss: 0.1492 - accuracy: 0.9464 - AUC: 0.9859 - val_loss: 0.4017 - val_accuracy: 0.8568 - val_AUC: 0.9311
    Epoch 5/6
    1000/1000 [==============================] - 24s 24ms/step - loss: 0.0944 - accuracy: 0.9696 - AUC: 0.9939 - val_loss: 0.4998 - val_accuracy: 0.8550 - val_AUC: 0.9233
    Epoch 6/6
    1000/1000 [==============================] - 26s 26ms/step - loss: 0.0526 - accuracy: 0.9865 - AUC: 0.9977 - val_loss: 0.6463 - val_accuracy: 0.8462 - val_AUC: 0.9138

    plot_metric(history,"AUC")


III. Building a custom model by subclassing Model

    # First define a residual block as a custom Layer
    class ResBlock(layers.Layer):
        def __init__(self, kernel_size, **kwargs):
            super(ResBlock, self).__init__(**kwargs)
            self.kernel_size = kernel_size

        def build(self,input_shape):
            self.conv1 = layers.Conv1D(filters=64,kernel_size=self.kernel_size,
                                       activation = "relu",padding="same")
            self.conv2 = layers.Conv1D(filters=32,kernel_size=self.kernel_size,
                                       activation = "relu",padding="same")
            self.conv3 = layers.Conv1D(filters=input_shape[-1],
                                       kernel_size=self.kernel_size,activation = "relu",padding="same")
            self.maxpool = layers.MaxPool1D(2)
            super(ResBlock,self).build(input_shape) # Equivalent to setting self.built = True

        def call(self, inputs):
            x = self.conv1(inputs)
            x = self.conv2(x)
            x = self.conv3(x)
            x = layers.Add()([inputs,x])
            x = self.maxpool(x)
            return x

        # To make a model that combines this custom Layer through the functional API
        # serializable, a get_config method must be defined.
        def get_config(self):
            config = super(ResBlock, self).get_config()
            config.update({'kernel_size': self.kernel_size})
            return config

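
The reason `get_config` matters is that a saved architecture stores each layer's constructor arguments as a config dictionary, and loading rebuilds the layer by passing that dictionary back to the constructor. The plain-Python sketch below imitates that round trip with a stand-in class. `TinyBlock` is hypothetical and only mimics the pattern; it does not call Keras.

```python
import json

class TinyBlock:
    """Stand-in for a custom layer with one constructor argument."""

    def __init__(self, kernel_size, name="tiny_block"):
        self.kernel_size = kernel_size
        self.name = name

    def get_config(self):
        # Everything __init__ needs to recreate an equivalent object
        return {"kernel_size": self.kernel_size, "name": self.name}

    @classmethod
    def from_config(cls, config):
        return cls(**config)


block = TinyBlock(kernel_size=3)
saved = json.dumps(block.get_config())             # what a saved file would hold
restored = TinyBlock.from_config(json.loads(saved))
print(restored.kernel_size)  # 3
```

If `get_config` left out `kernel_size`, the call to `cls(**config)` would fail because the constructor requires it, which is the situation the comment in `ResBlock` warns about.
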
    # Test ResBlock
    resblock = ResBlock(kernel_size = 3)
    resblock.build(input_shape = (None,200,7))
    resblock.compute_output_shape(input_shape=(None,200,7))

    TensorShape([None, 100, 7])

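
The reported shape can be traced by hand. With `padding="same"` each convolution keeps the sequence length, the third convolution restores the channel count to `input_shape[-1]` so the residual addition is valid, and `MaxPool1D(2)` halves the length. Below is a plain-Python trace of those steps, with a hypothetical helper name:

```python
def res_block_output_shape(input_shape):
    batch, length, channels = input_shape
    after_conv1 = (batch, length, 64)         # padding="same" keeps the length
    after_conv2 = (batch, length, 32)
    after_conv3 = (batch, length, channels)   # back to the input channel count
    # The element-wise Add needs both operands to have the same shape
    assert after_conv3 == tuple(input_shape)
    return (batch, length // 2, channels)     # MaxPool1D(2) halves the length

print(res_block_output_shape((None, 200, 7)))  # (None, 100, 7)
```
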
    # Custom model; in practice this one could also be built with Sequential or the functional API
    class ImdbModel(models.Model):
        def __init__(self):
            super(ImdbModel, self).__init__()

        def build(self,input_shape):
            self.embedding = layers.Embedding(MAX_WORDS,7)
            self.block1 = ResBlock(7)
            self.block2 = ResBlock(5)
            self.dense = layers.Dense(1,activation = "sigmoid")
            super(ImdbModel,self).build(input_shape)

        def call(self, x):
            x = self.embedding(x)
            x = self.block1(x)
            x = self.block2(x)
            x = layers.Flatten()(x)
            x = self.dense(x)
            return x

    tf.keras.backend.clear_session()

    model = ImdbModel()
    model.build(input_shape =(None,200))
    model.summary()

    model.compile(optimizer='Nadam',
                loss='binary_crossentropy',
                metrics=['accuracy',"AUC"])

    Model: "imdb_model"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    embedding (Embedding)        multiple                  70000
    _________________________________________________________________
    res_block (ResBlock)         multiple                  19143
    _________________________________________________________________
    res_block_1 (ResBlock)       multiple                  13703
    _________________________________________________________________
    dense (Dense)                multiple                  351
    =================================================================
    Total params: 103,197
    Trainable params: 103,197
    Non-trainable params: 0
    _________________________________________________________________
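
The per-block totals in this summary can be reproduced from the three convolutions inside each `ResBlock`. The plain-Python check below assumes 7 input channels (the embedding size) and that the sequence length is halved by each block, 200 to 100 to 50, before flattening. The helper names are hypothetical.

```python
def conv1d_params(kernel_size, in_channels, filters):
    # kernel weights plus one bias per filter
    return kernel_size * in_channels * filters + filters

def res_block_params(kernel_size, channels=7):
    return (conv1d_params(kernel_size, channels, 64)     # conv1
            + conv1d_params(kernel_size, 64, 32)         # conv2
            + conv1d_params(kernel_size, 32, channels))  # conv3

embedding = 10000 * 7
block1 = res_block_params(7)
block2 = res_block_params(5)
dense = 50 * 7 + 1   # 350 flattened features plus one bias
total = embedding + block1 + block2 + dense

print(block1, block2, dense, total)  # 19143 13703 351 103197
```
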


    import datetime
    logdir = "./tflogs/keras_model/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)
    history = model.fit(ds_train,validation_data = ds_test,
                        epochs = 6,callbacks=[tensorboard_callback])

    Epoch 1/6
    1000/1000 [==============================] - 47s 47ms/step - loss: 0.5629 - accuracy: 0.6618 - AUC: 0.7548 - val_loss: 0.3422 - val_accuracy: 0.8510 - val_AUC: 0.9286
    Epoch 2/6
    1000/1000 [==============================] - 43s 43ms/step - loss: 0.2648 - accuracy: 0.8903 - AUC: 0.9576 - val_loss: 0.3276 - val_accuracy: 0.8650 - val_AUC: 0.9410
    Epoch 3/6
    1000/1000 [==============================] - 42s 42ms/step - loss: 0.1573 - accuracy: 0.9439 - AUC: 0.9846 - val_loss: 0.3861 - val_accuracy: 0.8682 - val_AUC: 0.9390
    Epoch 4/6
    1000/1000 [==============================] - 42s 42ms/step - loss: 0.0849 - accuracy: 0.9706 - AUC: 0.9950 - val_loss: 0.5324 - val_accuracy: 0.8616 - val_AUC: 0.9292
    Epoch 5/6
    1000/1000 [==============================] - 43s 43ms/step - loss: 0.0393 - accuracy: 0.9876 - AUC: 0.9986 - val_loss: 0.7693 - val_accuracy: 0.8566 - val_AUC: 0.9132
    Epoch 6/6
    1000/1000 [==============================] - 44s 44ms/step - loss: 0.0222 - accuracy: 0.9926 - AUC: 0.9994 - val_loss: 0.9328 - val_accuracy: 0.8584 - val_AUC: 0.9052

    plot_metric(history,"AUC")


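If anything in this book needs further discussion with the author, feel free to leave a message on the official account "Python与算法之美". The author's time and energy are limited, so replies are given at the author's discretion.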

You can also reply with the keyword 加群 in the official account's message box to join the reader discussion group.
