Code, Explained: Training a model in TensorFlow

Jessica Yung | Artificial Intelligence, Self-Driving Car ND

In a previous post, we went through the TensorFlow code for a multilayer perceptron. Now we will discuss how we train the model with TensorFlow, specifically in a TensorFlow Session.

We will use Aymeric Damien’s implementation in this post. I recommend skimming through the code first and keeping it open in a separate window. I have included the key portions of the code below.

Procedures within a TensorFlow Session

Let’s take a look at the portion of code that works inside a TensorFlow session first.
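Here is that portion, closely following Damien’s script (the data object mnist and the hyperparameters training_epochs, batch_size and display_step are defined earlier in his code; minor details may differ between versions):

```python
# Training section of Damien's multilayer perceptron script (TensorFlow 1.x).
# x, y (placeholders), optimizer and cost are defined earlier in the file.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())      # 1. Initialise variables

    for epoch in range(training_epochs):             # 2. For each epoch:
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples / batch_size)

        for i in range(total_batch):                 # 2.1 For each batch:
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Run (1) the optimisation op (backprop) and (2) the cost op
            _, c = sess.run([optimizer, cost],
                            feed_dict={x: batch_x, y: batch_y})
            avg_cost += c / total_batch              # running average cost

        if epoch % display_step == 0:                # 2.2 Display average cost
            print("Epoch:", '%04d' % (epoch + 1),
                  "cost =", "{:.9f}".format(avg_cost))
```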

Within a session, we

  1. Initialise variables
  2. For each epoch:
    1. For each batch:
      1. Run (1) optimisation op (backprop) and (2) cost op (to get loss value)
      2. Accumulate the average cost across batches
    2. Display average cost for this epoch
[Figure: tf_train_model_diagram.png, the same procedure, slightly more graphical]

Unpacking the optimisation and cost operations

The meat of this lies in running the optimisation and cost operations, so let’s unpack that section. The code (from Damien’s script) is:
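```python
# Run both ops in a single sess.run() call, feeding in one batch of data.
_, c = sess.run([optimizer, cost], feed_dict={x: batch_x, y: batch_y})
```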

This runs the optimiser and cost operations with the inputs x = batch_x and y = batch_y. We’re feeding the model its input, hence the name feed_dict.

Return values:

  • _ holds whatever the optimizer op returns when run in the Session. We don’t need that output, because the optimiser updates the model’s weights and biases in place, so by convention we assign it to the throwaway name _.
  • c is the variable we save the cost (the loss value for the batch) to.

What’s happening when we run optimizer and cost? We go further back into the code to find:
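In Damien’s script, those two ops are defined along these lines (the argument names for the cross-entropy op changed slightly across TensorFlow 1.x releases):

```python
# cost: the average softmax cross-entropy between the network's
# predictions (pred, the logits) and the one-hot labels (y).
cost = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))

# optimizer: an op that applies one step of Adam gradient descent
# to minimise the cost. Running it updates the weights and biases.
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
```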

Looking further back, we can see where pred comes from:
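pred is the output of the network itself:

```python
# pred: the logits the multilayer perceptron produces for input x.
# multilayer_perceptron, weights and biases are defined earlier in the script.
pred = multilayer_perceptron(x, weights, biases)
```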

What does all this mean? We can summarise it in a diagram:

[Figure: tf_pred_cost_optimiser.png, how pred, cost and optimizer fit together]

Further reading: