Customization
1. Custom Layers
1.1 Main parameters
- Input parameters
  - kernel: in a Dense layer, this is the weight matrix w
  - bias: the bias term
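For intuition, here is a minimal sketch (the layer sizes are arbitrary, not from the original notes) that builds a stock Dense layer and inspects these two parameters:

import tensorflow as tf

layer = tf.keras.layers.Dense(10)
layer(tf.zeros([5, 3]))      # building happens on the first call
print(layer.kernel.shape)    # (3, 10) -- the weight matrix w
print(layer.bias.shape)      # (10,)   -- the bias vector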
1.2 Methods
To implement your own layer, extend tf.keras.layers.Layer and implement:
- __init__, where you can do all input-independent initialization
- build, where you know the shapes of the input tensors and can do the rest of the initialization
- call, where you do the forward computation
Creating variables in build lets you size them from the input shape; creating them in __init__ instead would require the shapes to be specified explicitly up front.
import tensorflow as tf

class MyDenseLayer(tf.keras.layers.Layer):
  def __init__(self, num_outputs):
    super(MyDenseLayer, self).__init__()
    self.num_outputs = num_outputs

  def build(self, input_shape):
    # The kernel's first dimension comes from the input shape,
    # which is only known here, not in __init__.
    self.kernel = self.add_weight("kernel",
                                  shape=[int(input_shape[-1]),
                                         self.num_outputs])

  def call(self, input):
    return tf.matmul(input, self.kernel)

layer = MyDenseLayer(10)
_ = layer(tf.zeros([10, 5]))  # Calling the layer `.builds` it.
print([var.name for var in layer.trainable_variables])
# ['my_dense_layer/kernel:0']
2. Custom Models
A custom model composes layers and defines the operations between them.
You can also chain the layers together directly with tf.keras.Sequential; both styles are sketched below.
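A minimal sketch of both styles (the two Dense layers here are an arbitrary example, not from the original notes):

class MyModel(tf.keras.Model):
  def __init__(self):
    super(MyModel, self).__init__()
    self.dense1 = tf.keras.layers.Dense(16, activation='relu')
    self.dense2 = tf.keras.layers.Dense(1)

  def call(self, inputs):
    # Any operations between layers go here.
    return self.dense2(self.dense1(inputs))

# The same stack, chained directly with tf.keras.Sequential:
seq_model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1),
])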
3. Custom Training
3.1 Variables
Tensors are immutable, but Python is a stateful programming language, and TensorFlow has stateful operations built in (tf.assign_sub, tf.scatter_update, etc.) that manipulate the value stored in a tf.Variable.
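A quick sketch of this statefulness, using the method forms of these ops on a tf.Variable:

v = tf.Variable(1.0)
v.assign(3.0)        # overwrite the stored value
v.assign_sub(0.5)    # in-place subtraction, the method form of tf.assign_sub
print(v.numpy())     # 2.5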
3.2 Fit a linear model
- Define the model.
- Define a loss function.
- Obtain training data.
- Run through the training data and use an “optimizer” to adjust the variables to fit the data.
3.2.1 Define a model
class Model(object):
  def __init__(self):
    # Initialize the weights to `5.0` and the bias to `0.0`
    # In practice, these should be initialized to random values
    # (for example, with `tf.random.normal`)
    self.W = tf.Variable(5.0)
    self.b = tf.Variable(0.0)

  def __call__(self, x):
    return self.W * x + self.b

model = Model()

# Sanity check: 5.0 * 3.0 + 0.0 == 15.0
assert model(3.0).numpy() == 15.0
3.2.2 Define a loss function
def loss(target_y, predicted_y):
  return tf.reduce_mean(tf.square(target_y - predicted_y))
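As a quick arithmetic check of the mean-squared-error definition (hand-picked values, not from the original notes):

# mean((3-1)^2, (5-5)^2) = mean(4, 0) = 2.0
assert loss(tf.constant([3.0, 5.0]), tf.constant([1.0, 5.0])).numpy() == 2.0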
3.2.3 Obtain training data
# Synthesize training data: y = TRUE_W * x + TRUE_b plus Gaussian noise
TRUE_W = 3.0
TRUE_b = 2.0
NUM_EXAMPLES = 1000

inputs = tf.random.normal(shape=[NUM_EXAMPLES])
noise = tf.random.normal(shape=[NUM_EXAMPLES])
outputs = inputs * TRUE_W + TRUE_b + noise
Visualize the data against the model's current predictions, and print the current loss:
import matplotlib.pyplot as plt
plt.scatter(inputs, outputs, c='b')
plt.scatter(inputs, model(inputs), c='r')
plt.show()
print('Current loss: %1.6f' % loss(outputs, model(inputs)).numpy())
3.2.4 Define a training loop
There are many variants of the gradient descent scheme; they are captured in tf.train.Optimizer implementations, which are recommended in practice (an optimizer-based version is sketched after the training loop below). Here, the basic update is written out by hand:
def train(model, inputs, outputs, learning_rate):
  with tf.GradientTape() as t:
    current_loss = loss(outputs, model(inputs))
  dW, db = t.gradient(current_loss, [model.W, model.b])
  # Gradient-descent step: subtract learning_rate times the gradient
  model.W.assign_sub(learning_rate * dW)
  model.b.assign_sub(learning_rate * db)
model = Model()

# Collect the history of W-values and b-values to plot later
Ws, bs = [], []
epochs = range(10)
for epoch in epochs:
  Ws.append(model.W.numpy())
  bs.append(model.b.numpy())
  current_loss = loss(outputs, model(inputs))

  train(model, inputs, outputs, learning_rate=0.1)
  print('Epoch %2d: W=%1.2f b=%1.2f, loss=%2.5f' %
        (epoch, Ws[-1], bs[-1], current_loss))
# Let's plot it all
plt.plot(epochs, Ws, 'r',
epochs, bs, 'b')
plt.plot([TRUE_W] * len(epochs), 'r--',
[TRUE_b] * len(epochs), 'b--')
plt.legend(['W', 'b', 'True W', 'True b'])
plt.show()
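As noted above, the hand-written update is usually replaced by an optimizer. A sketch of the equivalent step using tf.keras.optimizers.SGD (the eager-mode counterpart of tf.train.Optimizer; model and loss as defined above):

optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

def train_step(model, inputs, outputs):
  with tf.GradientTape() as t:
    current_loss = loss(outputs, model(inputs))
  grads = t.gradient(current_loss, [model.W, model.b])
  # The optimizer applies the plain-SGD update rule for us.
  optimizer.apply_gradients(zip(grads, [model.W, model.b]))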