Customization

08 Jun 2020

1. Custom Layers

1.1 Main Parameters

1.2 Functions

  1. __init__, where you can do all input-independent initialization
  2. build, where you know the shapes of the input tensors and can do the rest of the initialization
  3. call, where you do the forward computation

Creating variables in build lets their shapes be derived from the input tensors, whereas creating them in __init__ would require specifying the shapes explicitly up front.

import tensorflow as tf

class MyDenseLayer(tf.keras.layers.Layer):
  def __init__(self, num_outputs):
    super(MyDenseLayer, self).__init__()
    self.num_outputs = num_outputs

  def build(self, input_shape):
    # The input shape is known here, so the kernel can be sized from it.
    self.kernel = self.add_weight("kernel",
                                  shape=[int(input_shape[-1]),
                                         self.num_outputs])

  def call(self, inputs):
    return tf.matmul(inputs, self.kernel)

layer = MyDenseLayer(10)

_ = layer(tf.zeros([10, 5])) # Calling the layer builds it.

print([var.name for var in layer.trainable_variables])
# ['my_dense_layer/kernel:0']

2. Custom Models

A custom model combines custom layers and the operations between them.

You can also chain the layers together directly with tf.keras.Sequential; a sketch of both styles follows.
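A minimal sketch of both styles, assuming the MyDenseLayer defined above is in scope (the class name MyModel and the layer sizes are illustrative, not from the original):

class MyModel(tf.keras.Model):
  def __init__(self):
    super(MyModel, self).__init__()
    self.dense1 = MyDenseLayer(32)
    self.dense2 = MyDenseLayer(10)

  def call(self, inputs):
    # The operations between layers (here a ReLU) live in `call`.
    x = tf.nn.relu(self.dense1(inputs))
    return self.dense2(x)

my_model = MyModel()
_ = my_model(tf.zeros([1, 5]))  # builds every sublayer

# The same stack, chained directly with tf.keras.Sequential:
seq = tf.keras.Sequential([
    MyDenseLayer(32),
    tf.keras.layers.Activation('relu'),
    MyDenseLayer(10),
])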

3. Custom Training

3.1 Variables

Tensors are immutable; Python, however, is a stateful programming language:

There are operations (tf.assign_sub, tf.scatter_update, etc.) that manipulate the value stored in a TensorFlow variable, tf.Variable.
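A minimal sketch of mutating a variable in place (the values are arbitrary):

v = tf.Variable(1.0)
v.assign(3.0)       # overwrite the stored value
v.assign_sub(0.5)   # in-place subtraction, the method form of tf.assign_sub
print(v.numpy())    # 2.5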

3.2 Fit a linear model

  1. Define the model.
  2. Define a loss function.
  3. Obtain training data.
  4. Run through the training data and use an “optimizer” to adjust the variables to fit the data.

3.2.1 Define a model

class Model(object):
  def __init__(self):
    # Initialize the weights to `5.0` and the bias to `0.0`
    # In practice, these should be initialized to random values (for example, with `tf.random.normal`)
    self.W = tf.Variable(5.0)
    self.b = tf.Variable(0.0)

  def __call__(self, x):
    return self.W * x + self.b

model = Model()
# assert verifies the expression is true: 5.0 * 3.0 + 0.0 == 15.0
assert model(3.0).numpy() == 15.0

3.2.2 Define a loss function

def loss(target_y, predicted_y):
  return tf.reduce_mean(tf.square(target_y - predicted_y))

3.2.3 Obtain training data

TRUE_W = 3.0
TRUE_b = 2.0
NUM_EXAMPLES = 1000

inputs  = tf.random.normal(shape=[NUM_EXAMPLES])
noise   = tf.random.normal(shape=[NUM_EXAMPLES])
outputs = inputs * TRUE_W + TRUE_b + noise

Plot the training data against the model's predictions and check the loss:

import matplotlib.pyplot as plt

plt.scatter(inputs, outputs, c='b')
plt.scatter(inputs, model(inputs), c='r')
plt.show()

print('Current loss: %1.6f' % loss(outputs, model(inputs)).numpy())


3.2.4 Define a training loop

There are many variants of the gradient descent scheme, all captured in tf.train.Optimizer, which is the recommended implementation. In the spirit of building from first principles, though, here we implement the basic math ourselves with tf.GradientTape.

def train(model, inputs, outputs, learning_rate):
  with tf.GradientTape() as t:
    current_loss = loss(outputs, model(inputs))
  dW, db = t.gradient(current_loss, [model.W, model.b])
  model.W.assign_sub(learning_rate * dW)
  model.b.assign_sub(learning_rate * db)

model = Model()

# Collect the history of W-values and b-values to plot later
Ws, bs = [], []
epochs = range(10)
for epoch in epochs:
  Ws.append(model.W.numpy())
  bs.append(model.b.numpy())
  current_loss = loss(outputs, model(inputs))

  train(model, inputs, outputs, learning_rate=0.1)
  print('Epoch %2d: W=%1.2f b=%1.2f, loss=%2.5f' %
        (epoch, Ws[-1], bs[-1], current_loss))

# Let's plot it all
plt.plot(epochs, Ws, 'r',
         epochs, bs, 'b')
plt.plot([TRUE_W] * len(epochs), 'r--',
         [TRUE_b] * len(epochs), 'b--')
plt.legend(['W', 'b', 'True W', 'True b'])
plt.show()
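As noted above, the manual assign_sub update can also be delegated to a built-in optimizer. A minimal sketch of that variant, assuming the same Model and loss defined earlier and using tf.keras.optimizers.SGD (the TF2 counterpart of the tf.train.Optimizer family); train_step is a hypothetical helper name:

optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

def train_step(model, inputs, outputs):
  with tf.GradientTape() as t:
    current_loss = loss(outputs, model(inputs))
  grads = t.gradient(current_loss, [model.W, model.b])
  # The optimizer applies the update rule instead of the manual assign_sub.
  optimizer.apply_gradients(zip(grads, [model.W, model.b]))
  return current_loss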