Simple Optimization in TensorFlow
Actually using TensorFlow to optimize/fit a model is similar to the workflow we outlined in the Basics section, but with a few crucial additions:
- Define placeholder variables for `X` and `y`
- Define a `loss` function
- Select the `Optimizer` object you want to use
- Make a `train` node that uses the `Optimizer` to minimize the `loss`
- Run your `Session()` to fetch the `train` node, passing your placeholders `X` and `y` with `feed_dict`
Another Iris Example
Assuming comfort with the general intuition of Logistic Regression, we’ll spin up a trivial example to demonstrate setting up the problem in TensorFlow.
from sklearn.datasets import load_iris
import tensorflow as tf
data = load_iris()
X = data.data
X[:5]
array([[ 5.1, 3.5, 1.4, 0.2],
[ 4.9, 3. , 1.4, 0.2],
[ 4.7, 3.2, 1.3, 0.2],
[ 4.6, 3.1, 1.5, 0.2],
[ 5. , 3.6, 1.4, 0.2]])
y = data.target
y = (y == 0).astype(float)  # binary target: 1.0 for setosa (class 0), 0.0 for the other two classes
y[48:54]
array([ 1., 1., 0., 0., 0., 0.])
print(X.shape, y.shape)
(150, 4) (150,)
The model
We use `tf.placeholder()` to create the nodes we’ll use to pass observations into the graph.
x = tf.placeholder(tf.float32, shape=[None, 4])  # any number of rows, 4 features each
y_true = tf.placeholder(tf.float32, shape=None)  # the target labels (0.0 or 1.0) for each row
The weights `w` and bias `b` will update in each iteration. We’ll initialize them to zeros and let TensorFlow do the rest. `y_pred` leverages `w` and `b` at each step, applying the sigmoid function to the linear combination of features.
w = tf.Variable([[0, 0, 0, 0]], dtype=tf.float32, name='weights')  # one weight per feature, shape (1, 4)
b = tf.Variable(0, dtype=tf.float32, name='bias')
y_pred = tf.sigmoid(tf.matmul(w, tf.transpose(x)) + b)  # predicted probabilities, shape (1, m)
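Before wiring up a loss, a quick sanity check can confirm the graph does what we expect: with `w` and `b` initialized to zeros, every prediction should come out as sigmoid(0) = 0.5. A minimal sketch, using a throwaway session just for the check:
# sanity check (sketch): with zero weights and bias, every prediction is sigmoid(0) = 0.5
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(y_pred, feed_dict={x: X[:3]}))  # a (1, 3) array of 0.5s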
We need to define a `loss` function for TensorFlow to evaluate against. The most popular cost function for binary classification is `tf.nn.sigmoid_cross_entropy_with_logits`, with `labels` set to your targets and `logits` set to your model’s output node in the execution graph. Wrapping it in `reduce_mean()` gives us the “one over m, times the sum” leading term in our cost function.
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=y_pred)
loss = tf.reduce_mean(loss)
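One subtlety worth knowing: `sigmoid_cross_entropy_with_logits` performs the sigmoid transformation internally (for numerical stability), so it is usually fed the raw linear scores, with `tf.sigmoid` reserved for turning those scores into probabilities at prediction time. A minimal sketch of that wiring, with `logits`, `probs`, and `loss_from_logits` as illustrative names:
logits = tf.matmul(w, tf.transpose(x)) + b  # raw linear scores, shape (1, m)
probs = tf.sigmoid(logits)                  # probabilities, for making predictions
loss_from_logits = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_true, logits=logits))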
Finally, we define an optimization strategy and use it to build a `train` node.
learning_rate = 0.5
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train = optimizer.minimize(loss)
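Any of the optimizers in `tf.train` slot in the same way without touching the rest of the graph. For example, a sketch of swapping in plain gradient descent instead of Adam, reusing the same learning rate:
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
train = optimizer.minimize(loss)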
Execute the graph
All told, actually running this model requires initializing the global variables and a call to `tf.Session().run()` to fetch the `train` node, passing in our training observations with `feed_dict`.
NUM_STEPS = 25
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for step in range(NUM_STEPS):
        sess.run(train, feed_dict={x: X, y_true: y})  # one full-batch gradient step
        if step % 5 == 0:
            print(step, sess.run([w, b]))
    print(NUM_STEPS, sess.run([w, b]))
0 [array([[-0.49999967, -0.49999914, -0.49999964, -0.49999908]], dtype=float32), -0.49999782]
5 [array([[-1.63306212, -1.6268971 , -1.63869369, -1.6400671 ]], dtype=float32), -1.6299324]
10 [array([[-2.16804147, -2.15893531, -2.17636108, -2.1783905 ]], dtype=float32), -2.1634178]
15 [array([[-2.47912383, -2.4683075 , -2.48900652, -2.49141765]], dtype=float32), -2.4736314]
20 [array([[-2.66975784, -2.65789342, -2.68059874, -2.68324351]], dtype=float32), -2.6637332]
25 [array([[-2.76915216, -2.75674129, -2.78049254, -2.78325939]], dtype=float32), -2.7628503]
Printing the weights, we can see that within a few quick iterations the model is already learning better values for our features to minimize the loss.
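From here, the same session could also turn `y_pred` into hard class calls and compare them against the labels. A rough sketch, assuming it runs inside the `with tf.Session()` block above after the training loop, and using a hypothetical 0.5 cutoff:
    pred_probs = sess.run(y_pred, feed_dict={x: X})    # predicted probabilities, shape (1, 150)
    preds = (pred_probs.ravel() > 0.5).astype(float)   # hypothetical 0.5 decision threshold
    print('training accuracy:', (preds == y).mean())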