.. _shape_info:

============================================
How shape informations are handled by Theano
============================================

It is not possible to enforce strict shape into a Theano variable when
building a graph. The given parameter of theano.function can change the
shape any TheanoVariable in a graph.

Currently shape informations are used for 2 things in Theano:

- When the exact shape is known, we use it to generate faster c code for
  the 2d convolution on the cpu and gpu.

- To remove computations in the graph when we only want to know the
  shape, but not the actual value of a variable. This is done with the
  `Op.infer_shape <http://deeplearning.net/software/theano/extending/cop.html#Op.infer_shape>`_
  method.

  ex:
.. code-block:: python

   import theano
   x = theano.tensor.matrix('x')
   f = theano.function([x], (x**2).shape)
   theano.printing.debugprint(f)
   #MakeVector [@43860304] ''   2
   # |Shape_i{0} [@43424912] ''   1
   # | |x [@43423568]
   # |Shape_i{1} [@43797968] ''   0
   # | |x [@43423568]

The output of this compiled function do not contain any multiplication
or power. Theano has removed them to compute directly the shape of the
output.

Shape inference problem
=======================

Theano propagates shape information in the graph. Sometimes this
can lead to errors. For example:

.. code-block:: python

   import numpy
   import theano
   x = theano.tensor.matrix('x')
   y = theano.tensor.matrix('y')
   z = theano.tensor.join(0,x,y)
   xv = numpy.random.rand(5,4)
   yv = numpy.random.rand(3,3)

   f = theano.function([x,y], z.shape)
   theano.printing.debugprint(f)
   #MakeVector [@23910032] ''   4
   # |Elemwise{Add{output_types_preference=transfer_type{0}}}[(0, 0)] [@24055120] ''   3
   # | |Shape_i{0} [@23154000] ''   1
   # | | |x [@23151760]
   # | |Shape_i{0} [@23593040] ''   2
   # | | |y [@23151888]
   # |Shape_i{1} [@23531152] ''   0
   # | |x [@23151760]

   #MakeVector [@56338064] ''   4
   # |Elemwise{Add{output_types_preference=transfer_type{0}}}[(0, 0)] [@56483152] ''   3
   # | |Shape_i{0} [@55586128] ''   1
   # | | |<TensorType(float64, matrix)> [@55583888]
   # | |Shape_i{0} [@56021072] ''   2
   # | | |<TensorType(float64, matrix)> [@55584016]
   # |Shape_i{1} [@55959184] ''   0
   # | |<TensorType(float64, matrix)> [@55583888]

   print f(xv,yv)# DOES NOT RAISE AN ERROR AS SHOULD BE.
   #[8,4]

   f = theano.function([x,y], z)# Do not take the shape.
   theano.printing.debugprint(f)
   #Join [@44540496] ''   0
   # |0 [@44540432]
   # |x [@44540240]
   # |y [@44540304]

   f(xv,yv)
   # Raise a dimensions mismatch error.

As you see, when you ask for the shape of some computation (join in the
example), we sometimes compute an inferred shape directly, without executing
the computation itself (there is no join in the first output or debugprint).

This makes the computation of the shape faster, but it can hide errors. In
the example, the computation of the shape of join is done on the first
theano variable in the join, not on the other.

This can probably happen with many other op as elemwise, dot, ...
Indeed, to make some optimizations (for speed or stability, for instance),
Theano can assume that the computation is correct and consistent
in the first place, this is the case here.

You can detect those problem by running the code without this
optimization, with the Theano flag
`optimizer_excluding=local_shape_to_shape_i`. You can also have the
same effect by running in the mode FAST_COMPILE (it will not apply this
optimization, nor most other optimizations) or DEBUG_MODE (it will test
before and after all optimizations (much slower)).


Specifing exact shape
=====================

Currently, specifying a shape is not as easy as we want. We plan some
upgrade, but this is the current state of what can be done.

- You can pass the shape info directly to the `ConvOp` created
  when calling conv2d. You must add the parameter image_shape
  and filter_shape to that call. They but most be tuple of 4
  elements. Ex:

.. code-block:: python

    theano.tensor.nnet.conv2d(..., image_shape=(7,3,5,5), filter_shape=(2,3,4,4))

- You can use the SpecifyShape op to add shape anywhere in the
  graph. This allows to do some optimizations. In the following example,
  this allows to precompute the Theano function to a constant.

.. code-block:: python

   import theano
   x = theano.tensor.matrix()
   x_specify_shape = theano.tensor.specify_shape(x, (2,2))
   f = theano.function([x], (x_specify_shape**2).shape)
   theano.printing.debugprint(f)
   # [2 2] [@72791376]

Future plans
============

- Add the parameter "constant shape" to theano.shared(). This is probably
  the most frequent use case when we will use it. This will make the code
  simpler and we will be able to check that the shape does not change when
  we update the shared variable.
