Basics of Tensorflow Crash Course

In [6]:
import tensorflow as tf
print(tf.VERSION)

import numpy as np

# to time operations and compare speed across frameworks
import time
1.11.0

Computation in TensorFlow follows a different paradigm from what we are used to: first we build a graph of operations, and only then do we run data through it inside a session.

The fundamental Hello World!

In [7]:
h = tf.constant("Hello")
w = tf.constant(" World!")

# the + operator on tensors is overloaded and adds a tf.add node to the graph
hw = h+w
print(hw)

with tf.Session() as sess:
    ans = sess.run(hw)
    
print(ans)
Tensor("add_2:0", shape=(), dtype=string)
b'Hello World!'

We create the independent nodes of the graph:

In [8]:
a = tf.constant(5)
b = tf.constant(2)
c = tf.constant(32)

print(a)
Tensor("Const_12:0", shape=(), dtype=int32)

and from these we create other nodes that depend on the previous ones:

In [9]:
d = tf.multiply(a,b)
e = tf.add(c,b)
f = tf.subtract(d,e)

Notice that the most dependent node is \textbf{f}, whose value will be the result of the flow along the graph. So in order to compute this value we need to open a session and run it. Until then, unlike what happens in plain Python, none of the nodes has actually been evaluated: printing one of them shows only the tensor object, not a number.

In [10]:
print(a)
Tensor("Const_12:0", shape=(), dtype=int32)
In [11]:
sess = tf.Session()

# the output is the result of a session run
out_a = sess.run(a)
out_f = sess.run(f)

# and now close the session in order to free its resources
sess.close()

# now the Python variables hold actual values, not just tensor objects
print(out_a  , out_f)
5 -24

In order to record the value of a node, in this case \textbf{a}, we need to explicitly assign the result of its run to a variable outside the graph. Moreover, rebinding a Python name does not update the nodes built from it. In the next example we point \textbf{a} at a new constant, but because the nodes \textbf{d} and \textbf{e} on which \textbf{f} depends still reference the old one, the output of the session does not change.

In [12]:
# rebind the Python name a to a new node; d and e still reference the old constant
a = tf.constant(4)
f = tf.subtract(d,e)

sess = tf.Session()

out_a = sess.run(a)
out_d = sess.run(d)
out_f = sess.run(f)

sess.close()

print(out_a , out_d , out_f)
4 10 -24
In [13]:
# this time rebuild d, e and f so the whole chain depends on the new constant
a = tf.constant(40)
d = tf.multiply(a,b)
e = tf.add(c,b)
f = tf.subtract(d,e)

sess = tf.Session()

out_a = sess.run(a)
out_d = sess.run(d)
out_f = sess.run(f)

sess.close()

print(out_a , out_d , out_f)
40 80 46

Of course there is a much better way to produce the outputs of the nodes in question: declare a list with the nodes and run the session over it in a single call.

In [14]:
sess = tf.Session()

varas = [a,b,c,d,e,f]

outs = sess.run(varas)

sess.close()

print(outs)
[40, 2, 32, 80, 34, 46]

Data Types

There is a pertinent distinction between the nodes and the edges of a TensorFlow graph. The nodes are operations, and the edges carry the data that flows between them. In the previous examples, when we called \textbf{tf.add()}, we were creating an operation node whose inputs and output flow along the edges of the graph. Because the most general data structure that flows along these edges is a tensor, just the multidimensional generalization of a matrix, the name of the framework comes as no surprise.

The default graph in TensorFlow, once initialized, is just a structure that stores operation objects (nodes) linked by tensor objects (edges).
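
We can peek at this structure directly; a minimal sketch using \textbf{tf.get_default_graph()}:

In [ ]:
g = tf.get_default_graph()

# every tf.constant / tf.add / ... call above added an operation node here
print(len(g.get_operations()))

# each operation has a unique name and produces output tensors (the edges)
for op in g.get_operations()[:3]:
    print(op.name, [out.dtype for out in op.outputs])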

Among all the tensors that can flow through the graph, the most basic are numbers (0-tensors) and strings.

In [15]:
aa = tf.constant(3.09,dtype = tf.float64)
print(aa)
print(aa.dtype)
Tensor("Const_17:0", shape=(), dtype=float64)
<dtype: 'float64'>

Casting has an obvious importance in any language. In TensorFlow it is done through the function \textbf{tf.cast()}.

In [16]:
x = tf.constant([1,2,3],name='x' , dtype=tf.float32)
x = tf.cast(x, dtype = tf.int32)
print(x)
print(x.dtype)
Tensor("Cast:0", shape=(3,), dtype=int32)
<dtype: 'int32'>
In [17]:
y = tf.constant(np.array([1,2,3]),name = 'y' , dtype = tf.float32)
print(y.dtype)
print(y.name)
print(y.shape)
<dtype: 'float32'>
y:0
(3,)

Most of the time a tensor is initialized from a NumPy nd-array.

In [18]:
# a 3-tensor is just a one-dimensional array of matrices. The generalization follows that path.
nparray_init = np.random.random((20,30,10))

atf = tf.constant(nparray_init)

print(atf)
# instead of printing we can use the get_shape() method to extract the tuple with the tensor dimensions
atf.get_shape()
Tensor("Const_18:0", shape=(20, 30, 10), dtype=float64)
Out[18]:
TensorShape([Dimension(20), Dimension(30), Dimension(10)])
In [19]:
# there are several initializers that follow the same patterns as numpy
teste1 = tf.random_normal((10,3),0,1)
print(teste1.get_shape())

teste2 = tf.linspace(10.0,11,20)
print(teste2.get_shape())
(10, 3)
(20,)
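
A few more constructors in the same NumPy spirit, as a quick sketch (\textbf{tf.zeros}, \textbf{tf.ones} and \textbf{tf.fill} mirror \textbf{np.zeros}, \textbf{np.ones} and \textbf{np.full}):

In [ ]:
z = tf.zeros((2,3))
o = tf.ones((2,3))
s = tf.fill((2,3), 7.0)   # a (2,3) tensor filled with the value 7.0

print(z.get_shape(), o.get_shape(), s.get_shape())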

A core operation that we often need is \textbf{tf.matmul()} (the product of two matrices). We need to ensure that the inner dimensions of the two matrices match for this operation to be defined. In NumPy we would use the \textbf{np.reshape()} function for this; the TensorFlow equivalent is \textbf{tf.reshape()}, used as follows.

In [20]:
teste2 = tf.linspace(0.0,10,10)
# in this moment it is simply a 1-tensor
print(teste2.get_shape())

# tf.reshape does not modify its argument; it returns a new tensor.
# Since we discard the return value here, this call
tf.reshape(teste2,[10,1])
# leaves the shape of teste2 unchanged
print(teste2.get_shape())

# we need to rebind the name so that the change takes effect
teste2 = tf.reshape(teste2,[10,1])
print(teste2.get_shape())
(10,)
(10,)
(10, 1)

As said before, to run the necessary computations we first need to build a graph and then create a session, in which we run the operations and produce the outputs.

However, if we need to perform some operations on the side, we can use \textbf{tf.InteractiveSession()}. An interactive session installs itself as the default session, so tensors can be evaluated with \textbf{.eval()} without passing the session object around explicitly, which is convenient in notebooks and shells.
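
A minimal sketch of what the interactive session buys us:

In [ ]:
sess = tf.InteractiveSession()
# .eval() works without an explicit session argument because the
# interactive session registered itself as the default one
print(tf.constant(3).eval())
sess.close()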

In [21]:
matA = np.random.uniform(0,5,(2,10))
matAtf = tf.constant(matA)
print(matAtf)
print(teste2)
Tensor("Const_19:0", shape=(2, 10), dtype=float64)
Tensor("Reshape_1:0", shape=(10, 1), dtype=float32)
In [22]:
# we need to perform a cast because teste2.dtype = float32 while matAtf.dtype = float64
teste2 = tf.cast(teste2, dtype=tf.float64)

result = tf.matmul(matAtf , teste2)

sess = tf.InteractiveSession()
outs = sess.run(result)
sess.close()

print(outs)
[[144.33185168]
 [129.78904184]]

Names and Name Scopes

Within a graph there cannot be more than one node with the same name. As a practical consequence, if we assign the same name to more than one node, TensorFlow automatically appends an index suffix (as in \textbf{c_1} below) to keep the names unique.

In [23]:
c1 = tf.constant(4.0 , name = 'c')
c2 = tf.constant(4.0 , name = 'c')

print(c1.name,c2.name)
c:0 c_1:0

Using name scopes, we can group a set of tensors under the same name prefix.

In [24]:
c1 = tf.constant(4.0 , name = 'c1')
with tf.name_scope('test_name'):
    c2 = tf.constant(4.0 , name = 'c1')
    c3 = tf.constant(4.0 , name = 'c3')

print(c1.name , c2.name , c3.name)
c1:0 test_name/c1:0 test_name/c3:0
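
Name scopes also nest, which is handy for organizing larger graphs; a quick sketch:

In [ ]:
with tf.name_scope('outer'):
    with tf.name_scope('inner'):
        c4 = tf.constant(4.0, name='c4')

# the prefixes accumulate along the nesting
print(c4.name)   # outer/inner/c4:0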

Variables and Placeholders

While ordinary nodes have their values recomputed on every session run, it can be useful for some to keep their value from one run to the next, until it must be updated. These are managed by \textbf{Variable()}. Although the name may sound a little contradictory, the concept simply means that we only change the value of a variable by direct order.

In languages like C and C++, the most common method to initialize a variable is to reserve memory and then assign a value to it. In TensorFlow things come in the opposite order: we first declare which value will initialize the variable, and only when the initializer op runs inside a session is the memory actually allocated and the value assigned.

In [25]:
init_val = tf.random_normal((1,5),0,1)
var = tf.Variable(init_val,name = 'var')
init = tf.global_variables_initializer()

print("Before the session runs: ", var)

# the beginning of the session
sess = tf.InteractiveSession()
sess.run(init)
post_var = sess.run(var)
sess.close()

print("After the session runs: ", post_var)
Before the session runs:  <tf.Variable 'var:0' shape=(1, 5) dtype=float32_ref>
After the session runs:  [[-2.1106927   0.09882971 -0.44557652 -1.0070975  -0.84188193]]
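
To change a variable "by direct order", we run an explicit assignment op. A minimal sketch using \textbf{tf.assign()} (not shown in the cells above):

In [ ]:
counter = tf.Variable(0, name='counter')
increment = tf.assign(counter, counter + 1)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # the variable keeps its value between runs and only changes
    # when the assign op is executed
    for _ in range(3):
        print(sess.run(increment))   # 1, 2, 3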

A placeholder is just a memory space reserved for some type of data structure, to be filled when the session runs. One fundamental keyword here is \textbf{None}: using it in a shape leaves that dimension unspecified. The next example makes this clear.

In [26]:
mat1 = np.random.uniform(0,1,(10,5))
mat2 = np.random.uniform(0,1,(5,1))

# We create a placeholder that will receive a matrix with undefined number of rows 
# but with a fixed size of 5 columns.
myph1 = tf.placeholder(tf.float32,shape=(None,5))

# The second placeholder accepts 5 rows and any number of columns.
# This way, the matrix product between what fills the two placeholders is always defined.
myph2 = tf.placeholder(tf.float32 , shape = (5,None))
result = tf.matmul(myph1 , myph2)

sess = tf.InteractiveSession()
# we feed the placeholders using a dictionary
outs = sess.run(result,feed_dict={myph1:mat1 , myph2:mat2})
sess.close()

print(outs)
[[0.77183646]
 [1.1297846 ]
 [0.65974027]
 [0.88213825]
 [1.0029517 ]
 [0.8556793 ]
 [0.86641806]
 [0.77480614]
 [1.0566782 ]
 [1.1952276 ]]
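
Because the first dimension of \textbf{myph1} was left as \textbf{None}, the same graph accepts inputs with any number of rows; a quick sketch:

In [ ]:
with tf.Session() as sess:
    for n_rows in (3, 7):
        # feed matrices of different heights into the same placeholder
        out = sess.run(result, feed_dict={myph1: np.random.uniform(0,1,(n_rows,5)),
                                          myph2: mat2})
        print(out.shape)   # (3, 1) and then (7, 1)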