1. Embedding layer
Its role is to turn positive integers (indices) into dense vectors of fixed size, e.g. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]].
1.1 API
tf.keras.layers.Embedding

```python
tf.keras.layers.Embedding(
    input_dim,
    output_dim,
    embeddings_initializer='uniform',
    embeddings_regularizer=None,
    activity_regularizer=None,
    embeddings_constraint=None,
    mask_zero=False,
    input_length=None,
    **kwargs
)
```
Argument | Description
---|---
`input_dim` | Integer. Size of the vocabulary, i.e. maximum integer index + 1 (the largest index appearing in the input, plus one).
`output_dim` | Integer. Dimension of the dense embedding, i.e. the length of the vector each integer is mapped to.
`embeddings_initializer` | Initializer for the embeddings matrix (see keras.initializers).
`embeddings_regularizer` | Regularizer function applied to the embeddings matrix (see keras.regularizers).
`embeddings_constraint` | Constraint function applied to the embeddings matrix (see keras.constraints).
`mask_zero` | Boolean, whether the input value 0 is a special "padding" value that should be masked out. Useful with recurrent layers that take variable-length input. If True, all subsequent layers in the model must support masking, or an exception will be raised; index 0 then cannot be used in the vocabulary, so `input_dim` should be set to vocabulary size + 1.
`input_length` | Length of input sequences, when it is constant. Required if you are going to connect a Flatten and then a Dense layer after this layer; without it, the output shape of the Dense layer cannot be inferred.
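Conceptually, an Embedding layer is just a trainable lookup table of shape (input_dim, output_dim), and `mask_zero=True` marks the positions whose index is 0. A minimal numpy sketch of both behaviors (variable names here are illustrative, not part of the Keras API):

```python
import numpy as np

input_dim, output_dim = 65, 64                  # vocabulary size, embedding width
table = np.random.randn(input_dim, output_dim)  # stands in for the trainable weight matrix

indices = np.array([[3, 7, 0, 0]])  # one padded sequence; 0 is the padding index
vectors = table[indices]            # row lookup: shape (1, 4, 64)
mask = indices != 0                 # what mask_zero=True would propagate downstream

print(vectors.shape)  # (1, 4, 64)
print(mask)           # [[ True  True False False]]
```

This is only the lookup semantics; in Keras the table is learned during training and the mask is passed automatically to masking-aware layers.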
1.2 Example
```python
import tensorflow as tf
import numpy as np

model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(65, 64, input_length=10))
# The model takes as input an integer matrix of shape (batch, input_length),
# and the largest integer (i.e. word index) in the input should be no larger
# than 64 (vocabulary size - 1).
# model.output_shape is now (None, 10, 64), where `None` is the batch dimension.

input_array = np.random.randint(65, size=(32, 10))
print(input_array.shape)

model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
print(output_array.shape)
```
#(32, 10)
#(32, 10, 64)
Notes: 65 is the vocabulary size, so `randint` must draw values in [0, 65); an index of 65 or larger would cause an error. The `input_length` argument is the length of each sample: the input shape (32, 10) means batch_size 32 and sequence length 10. The second Embedding argument, 64, is the output dimension: each integer in a sample becomes a vector of length 64.

Shape transformation: the input has shape (32, 10), where 32 is the batch_size and 10 the length of each sample. The Embedding layer is configured with input_length=10 (matching the input) and output_dim=64, so each of the 10 integers per sample becomes a length-64 vector, giving an output of shape (32, 10, 64).
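This shape arithmetic can be checked without TensorFlow: treating the embedding as a plain (65, 64) matrix and indexing it with the (32, 10) integer array reproduces the output shape. A quick sketch (array names are illustrative):

```python
import numpy as np

table = np.zeros((65, 64))                      # stands in for the trained embedding matrix
indices = np.random.randint(65, size=(32, 10))  # same input as the Keras example above

output = table[indices]  # numpy fancy indexing = per-position embedding lookup
print(output.shape)      # (32, 10, 64)
```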
2. Conv1D
tf.keras.layers.Conv1D
2.1 API
Argument | Description
---|---
`filters` | Integer, the dimensionality of the output space, i.e. the number of convolution kernels.
`kernel_size` | An integer or tuple/list of a single integer, specifying the length of the 1D convolution window. The kernel is really two-dimensional; only one dimension is given here because the second dimension always equals the word-vector dimension of the input: for a sentence, the kernel can only slide along the word (time) axis.
`strides` | An integer or tuple/list of a single integer, specifying the stride length of the convolution. Any stride value != 1 is incompatible with any dilation_rate value != 1.
`padding` | One of "valid", "same" or "causal" (case-insensitive). "valid" means no padding. "same" pads evenly left/right so the output has the same length as the input. "causal" gives causal (dilated) convolutions, e.g. output[t] does not depend on input[t+1:]; useful when modeling temporal data where the model should not violate the temporal order. See WaveNet: A Generative Model for Raw Audio, section 2.1.
`data_format` | A string, one of channels_last (default) or channels_first.
`dilation_rate` | An integer or tuple/list of a single integer, specifying the dilation rate for dilated convolution. Currently, any dilation_rate value != 1 is incompatible with any strides value != 1.
`groups` | A positive integer specifying the number of groups the input is split into along the channel axis. Each group is convolved separately with filters / groups filters; the outputs are concatenated along the channel axis. Both the input channels and `filters` must be divisible by `groups`.
`activation` | Activation function to use. If nothing is specified, no activation is applied (i.e. the "linear" activation a(x) = x).
`use_bias` | Boolean, whether the layer uses a bias vector.
`kernel_initializer` | Initializer for the kernel weights matrix (see keras.initializers).
`bias_initializer` | Initializer for the bias vector (see keras.initializers).
`kernel_regularizer` | Regularizer function applied to the kernel weights matrix (see keras.regularizers).
`bias_regularizer` | Regularizer function applied to the bias vector (see keras.regularizers).
`activity_regularizer` | Regularizer function applied to the output of the layer (its "activation") (see keras.regularizers).
`kernel_constraint` | Constraint function applied to the kernel matrix (see keras.constraints).
`bias_constraint` | Constraint function applied to the bias vector (see keras.constraints).
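For stride 1, the three padding modes differ only in output length: "valid" gives n - k + 1 steps, while "same" and "causal" keep the input length. A small pure-Python check of this rule, assuming no dilation (the helper name is hypothetical, not a Keras function):

```python
import math

def conv1d_out_len(n, k, stride=1, padding="valid"):
    """Output length of a 1D convolution with window k over n steps (no dilation)."""
    if padding == "valid":
        # only full windows: floor((n - k) / stride) + 1
        return (n - k) // stride + 1
    # "same" and "causal" both pad so that out = ceil(n / stride)
    return math.ceil(n / stride)

print(conv1d_out_len(10, 3, padding="valid"))   # 8
print(conv1d_out_len(10, 3, padding="same"))    # 10
print(conv1d_out_len(10, 3, padding="causal"))  # 10
```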
2.2 Example
```python
import tensorflow as tf

# The inputs are 128-length vectors with 10 timesteps, and the batch size is 4.
input_shape = (4, 10, 128)
x = tf.random.normal(input_shape)
y = tf.keras.layers.Conv1D(32, 3, activation='relu', input_shape=input_shape[1:])(x)
print(y.shape)
```
#(4, 8, 32)
1D convolution is often used on text, so the example above can be read from a text-processing angle.

The input has shape (4, 10, 128): batch_size 4, sentence length 10, each word represented by a 128-dimensional vector. The first Conv1D argument sets 32 convolution kernels; each kernel is effectively 3 x 128 (see the kernel_size note above). The output shape is therefore (4, 10 - 3 + 1, 32) = (4, 8, 32), where 32 is the number of kernels.
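The shape arithmetic above can be verified with a minimal numpy re-implementation of a "valid" 1D convolution (random untrained weights, no bias or activation; names are illustrative only):

```python
import numpy as np

batch, steps, in_ch = 4, 10, 128  # same input shape as the Keras example
filters, k = 32, 3                # 32 kernels, each effectively (3, 128)

x = np.random.randn(batch, steps, in_ch)
w = np.random.randn(k, in_ch, filters)  # one (3, 128) kernel per output filter

out_steps = steps - k + 1               # valid padding: 10 - 3 + 1 = 8
y = np.zeros((batch, out_steps, filters))
for t in range(out_steps):
    # contract the (batch, 3, 128) window against the (3, 128, filters) kernels
    y[:, t, :] = np.tensordot(x[:, t:t + k, :], w, axes=([1, 2], [0, 1]))

print(y.shape)  # (4, 8, 32)
```

Each output position is a dot product between a 3 x 128 input window and each kernel, which is why only the time dimension shrinks while the channel dimension becomes the number of filters.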