You might prefer the Functional API over the Sequential API when you need more flexibility, for example when the model has:
- multiple inputs and outputs, as in a multitask learning model or a model with conditioning variables.
- a complicated non-linear topology, with layers feeding into multiple other layers.
With the Sequential API, we didn’t have a separate layer for the data input. We could add the input shape to the first layer, and that would tell the network what size of inputs to expect. The Functional API, however, uses the Input layer from tensorflow.keras.layers to represent the data input explicitly. It also uses the Model class from tensorflow.keras.models.
Unlike in the Sequential API, a layer object is not added to a list in the model. Instead, the layer object is used as a function that is called on the inputs object. For example:
outputs = Conv1D(16, 5, activation='relu')(inputs)
The outputs tensor is then passed as the input to the next layer (again as a function call), and so on through the network. At each stage, we create a layer object and call it on an input tensor to get an output tensor, until the final layer gives us the outputs object. In this way, we define the flow of the model. Finally, build the model:
model = Model(inputs=inputs, outputs=outputs)
Once the model instance is built, everything from here onwards is the same as with the Sequential API:
| model.compile | Compile the model with a loss function, optimizer and metrics. |
| model.fit | Fit the model on training data, optionally with held-out validation data. |
| model.evaluate | Evaluate the model on test data. |
| model.predict | Make model predictions. |
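The whole workflow can be sketched end to end as follows. This is a minimal illustration rather than a recommended architecture: the random toy data, the layer sizes after the Conv1D layer, and the choice of optimizer and loss are all assumptions made only so the example runs.

```python
import numpy as np
from tensorflow.keras.layers import Input, Conv1D, Flatten, Dense
from tensorflow.keras.models import Model

# Hypothetical toy data: 64 sequences of length 32 with 1 channel, binary labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(64, 32, 1)).astype("float32")
y = rng.integers(0, 2, size=(64, 1)).astype("float32")

# Each layer object is called like a function on the previous tensor.
inputs = Input(shape=(32, 1))
h = Conv1D(16, 5, activation="relu")(inputs)
h = Flatten()(h)
outputs = Dense(1, activation="sigmoid")(h)
model = Model(inputs=inputs, outputs=outputs)

# From here on, the calls are the same as with the Sequential API.
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
history = model.fit(X[:48], y[:48], validation_split=0.25, epochs=1, verbose=0)
loss, acc = model.evaluate(X[48:], y[48:], verbose=0)
preds = model.predict(X[48:], verbose=0)
print(preds.shape)  # (16, 1)
```

With random labels the accuracy is meaningless; the point is only that compile, fit, evaluate and predict are called exactly as they would be on a Sequential model.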
Multiple Inputs and Outputs
Techniques such as conditioning variables require multiple inputs, and often multiple outputs as well, and the model has to include all of them. For example:
inputs = Input(shape=(32, 1))
aux_inputs = Input(shape=(12,))
...  # layers that merge inputs and aux_inputs into one tensor h (assumed name)
outputs = Dense(20, activation='sigmoid')(h)
aux_outputs = Dense(1, activation='linear')(h)
...
model = Model(inputs=[inputs, aux_inputs],
              outputs=[outputs, aux_outputs])
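A Dense layer expects a single input tensor, so the two inputs have to be merged somewhere before the output layers. One possible way to do this, sketched below, is to flatten the sequence branch and concatenate it with the auxiliary input. The Conv1D, Flatten and Concatenate layers here are assumptions for illustration; the original leaves the intermediate layers unspecified.

```python
from tensorflow.keras.layers import Input, Conv1D, Flatten, Concatenate, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(32, 1))
aux_inputs = Input(shape=(12,))

# Assumed intermediate layers: reduce the sequence input to a flat vector,
# then concatenate it with the auxiliary input into one tensor.
h = Conv1D(16, 5, activation="relu")(inputs)
h = Flatten()(h)
h = Concatenate()([h, aux_inputs])

# Both output layers read from the same merged tensor.
outputs = Dense(20, activation="sigmoid")(h)
aux_outputs = Dense(1, activation="linear")(h)

model = Model(inputs=[inputs, aux_inputs], outputs=[outputs, aux_outputs])
print(len(model.inputs), len(model.outputs))  # 2 2
```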
We need to be careful about how we compile and train such a model. The two outputs will most likely have different loss functions, so we pass a list of loss functions, in the same order as the outputs given when creating the model. A gradient-based optimizer needs a single scalar objective, so we also combine the losses into a weighted sum by specifying loss_weights:
model.compile(loss=['binary_crossentropy', 'mse'], loss_weights=[1, 0.4], ...)
The fit method also needs multiple inputs and outputs, passed as lists in the same order. The same goes for the evaluate method, while predict takes only the list of inputs.
history = model.fit([X_train, X_aux],
                    [y_train, y_aux], ...)
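Compiling and training a two-output model might look like the sketch below. The random arrays, the simple Flatten and Concatenate merge, and the adam optimizer are assumptions chosen only to make the example self-contained; the loss list and weights are the ones used in the text.

```python
import numpy as np
from tensorflow.keras.layers import Input, Flatten, Concatenate, Dense
from tensorflow.keras.models import Model

# Hypothetical random data, only to exercise the API.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 32, 1)).astype("float32")
X_aux = rng.normal(size=(40, 12)).astype("float32")
y_train = rng.integers(0, 2, size=(40, 20)).astype("float32")
y_aux = rng.normal(size=(40, 1)).astype("float32")

inputs = Input(shape=(32, 1))
aux_inputs = Input(shape=(12,))
h = Concatenate()([Flatten()(inputs), aux_inputs])  # assumed merge step
outputs = Dense(20, activation="sigmoid")(h)
aux_outputs = Dense(1, activation="linear")(h)
model = Model(inputs=[inputs, aux_inputs], outputs=[outputs, aux_outputs])

# One loss per output, in output order; the total loss is the weighted sum.
model.compile(loss=["binary_crossentropy", "mse"],
              loss_weights=[1, 0.4],
              optimizer="adam")

# Inputs and targets are lists in the same order as in Model(...).
history = model.fit([X_train, X_aux], [y_train, y_aux], epochs=1, verbose=0)

# predict takes the list of inputs and returns one array per output.
pred_main, pred_aux = model.predict([X_train, X_aux], verbose=0)
print(pred_main.shape, pred_aux.shape)  # (40, 20) (40, 1)
```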
Variables and Tensors
Variables and tensors are lower-level objects in TensorFlow.
The kernel matrix and bias in the Dense layer are variable objects, which persist throughout your program, but contain values that can be changed over the course of your program. We can create variables independently of any model.
Usually, we’ll be using variables as part of our models, and when we call model.fit to launch a training run, the optimization changes the variable values for us. We can also change the values manually: every variable has an assign() method, to which we pass the new value for the variable to store. We can also convert the variable directly to a NumPy array with its numpy() method.
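As a small sketch of the two methods just mentioned, the snippet below creates a variable outside of any model, overwrites its value with assign(), and reads it back with numpy(). The variable name and the values are arbitrary.

```python
import tensorflow as tf

# A variable created independently of any model.
my_var = tf.Variable([-1, 2], dtype=tf.float32, name="my_var")

# Overwrite the stored value in place; shape and dtype must stay compatible.
my_var.assign([3.5, -1.0])

# Convert the current value to a NumPy array.
x = my_var.numpy()
print(type(x).__name__, x.tolist())  # ndarray [3.5, -1.0]
```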
Tensors are multidimensional arrays, generalizing vectors and matrices. When we build neural network models, we are actually defining a computational graph in which input data is processed through the layers of the network, all the way to the outputs. Tensors are the objects that get passed around and capture those computations within the graph. For example, adding a constant value to a variable produces a tensor output.
import tensorflow as tf

my_var = tf.Variable([-1, 2], dtype=tf.float32, name='my_var')
h = my_var + [5, 4]
print(h)
# tf.Tensor([4. 6.], shape=(2,), dtype=float32)
Another example: a Dense layer takes an input tensor, does some computation on it, and returns an output tensor. These tensors carry the information that flows through our neural network, or computational graph.
inputs = Input(shape=(5,))
h = Dense(16, activation='sigmoid')(inputs)
print(h)
# e.g. Tensor("dense/Identity:0", shape=(None, 16), dtype=float32); the exact repr depends on the TensorFlow version
The value of this tensor is not yet known, since it depends on the inputs fed into the model. The tensor only gets a concrete value when the graph is executed, that is, when the network is run on a given input. The output tensor will also be a batch of outputs, and the None in its shape captures the variable batch size.
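The difference between the symbolic tensor and a concrete value can be seen in a short sketch. Wrapping the layer in a Model and feeding it a batch of three rows of ones are assumptions made for illustration.

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(5,))
h = Dense(16, activation="sigmoid")(inputs)

# Symbolic tensor: the batch dimension is not known yet.
print(h.shape)  # (None, 16)

# Running the network on actual data gives a tensor with concrete values,
# and the batch dimension takes the size of the batch that was fed in.
model = Model(inputs=inputs, outputs=h)
batch = np.ones((3, 5), dtype="float32")
values = model(batch)
print(values.shape)  # (3, 16)
```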
Finally, you may also create a constant tensor:
x = tf.constant([[5, 2], [1, 3]])
print(x)
# tf.Tensor(
# [[5 2]
#  [1 3]], shape=(2, 2), dtype=int32)
tf.ones and tf.zeros can also create constant tensors filled with ones or zeros.
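For instance, a minimal sketch with an arbitrarily chosen shape of (2, 3):

```python
import tensorflow as tf

# Both functions take the desired shape; the dtype defaults to float32.
ones = tf.ones(shape=(2, 3))
zeros = tf.zeros(shape=(2, 3), dtype=tf.int32)

print(ones.dtype.name, zeros.dtype.name)  # float32 int32
```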
Accessing Model Layers
We can take a look at the layers of a model through the model.layers attribute, which is a list of layer objects. This means we can retrieve a specific layer by indexing the list and then start digging into that layer. For example, to access the weights of the second layer:
# retrieve a layer by index
print(model.layers[1].weights)
# or return a list of NumPy arrays
print(model.layers[1].get_weights())
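For convolutional and dense layers, we can also access the kernel and bias variables independently, through the layer's kernel and bias attributes. Once the model is built, we can access each layer's input and output tensors, and the model's own input and output tensors are available directly from the model object as model.input and model.output.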
# retrieve a layer by name
print(model.get_layer('some_layer').input)
print(model.get_layer('some_layer').output)
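Put together on a small model, these accessors might be used as in the sketch below. The model itself and the layer names 'features', 'hidden' and 'head' are hypothetical, chosen only for illustration.

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# A small hypothetical model with named layers.
inputs = Input(shape=(5,), name="features")
h = Dense(16, activation="relu", name="hidden")(inputs)
outputs = Dense(1, name="head")(h)
model = Model(inputs=inputs, outputs=outputs)

# Kernel and bias variables of a single layer.
hidden = model.get_layer("hidden")
print(hidden.kernel.shape, hidden.bias.shape)  # (5, 16) (16,)

# Input and output tensors of a layer, and of the whole model.
print(hidden.input.shape, hidden.output.shape)
print(model.input.shape, model.output.shape)
```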
Tensors inside a network are easy to access, and we can use them to build new models out of old ones. Since every layer has an input and an output tensor, and every model has an input and an output tensor, we can build new models using tensors from inside other models:
- Create a new model by removing some layers from an existing model.
- Create a new model by adding extra new layers to an existing model.
- Either the Sequential API or the Functional API can be used for this.
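Both ideas can be sketched with the Functional API. The base model, its layer names, and the new three-class head below are hypothetical; in practice the base would usually be a pretrained model.

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

# Hypothetical existing model.
inputs = Input(shape=(8,))
h = Dense(16, activation="relu", name="hidden_1")(inputs)
h = Dense(8, activation="relu", name="hidden_2")(h)
outputs = Dense(1, name="old_head")(h)
base = Model(inputs=inputs, outputs=outputs)

# Remove layers: stop at an inner tensor instead of the original output.
features = base.get_layer("hidden_2").output
truncated = Model(inputs=base.input, outputs=features)

# Add layers: call new layers on a tensor taken from the old model.
new_outputs = Dense(3, activation="softmax", name="new_head")(features)
extended = Model(inputs=base.input, outputs=new_outputs)

print(truncated.output.shape, extended.output.shape)  # (None, 8) (None, 3)
```

The new models reuse the layer objects of the base model rather than copying them, so the weights are shared between all three.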
Freezing Layers and Models
In transfer learning, it is very typical to take part of a pretrained model and use it as a feature extractor to build a new model. When we do this, we might want the parameters of the feature extractor to stay fixed (frozen) during training, so that we only train the new layers we’ve added to the model. This can be done by specifying trainable=False when creating a layer.
h = Conv2D(16, 3, activation='relu', ..., trainable=False)
Another method is to freeze the layer after the model is built (but before the model is compiled):
model.get_layer('conv2d_layer').trainable = False
We can also freeze entire models:
model = load_model('pretrained_model')
model.trainable = False
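The effect of each freezing method can be checked by counting the model's trainable weights, as in the sketch below. It uses small Dense layers with hypothetical names instead of the Conv2D layer and the loaded pretrained model from the text, purely so that it runs on its own.

```python
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model

inputs = Input(shape=(8,))
# Method 1: freeze a layer when it is created.
h = Dense(16, activation="relu", name="frozen_at_creation", trainable=False)(inputs)
h = Dense(8, activation="relu", name="frozen_later")(h)
outputs = Dense(1, name="head")(h)
model = Model(inputs=inputs, outputs=outputs)

# Each Dense layer has a kernel and a bias; two layers are still trainable.
n_initial = len(model.trainable_weights)
print(n_initial)  # 4

# Method 2: freeze a layer after the model is built, before compiling.
model.get_layer("frozen_later").trainable = False
n_after_layer = len(model.trainable_weights)
print(n_after_layer)  # 2

# Method 3: freeze the entire model.
model.trainable = False
n_after_model = len(model.trainable_weights)
print(n_after_model)  # 0
```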
For more on the Keras Functional API, please refer to the wonderful course here: https://www.coursera.org/learn/customising-models-tensorflow2
I am Kesler Zhu, thank you for visiting my website. Check out more course reviews at https://KZHU.ai