
.. _extending_theano:

****************
Extending Theano
****************

Theano graphs
-------------

- Theano works with symbolic graphs
- Those graphs are bi-partite graphs (graph with 2 types of nodes)
- The 2 types of nodes are Apply and Variable nodes
- Each Apply node has a link to the Op that it executes

Inputs and Outputs are lists of Theano variables

.. image:: ../hpcs2011_tutorial/pics/apply_node.png
    :width: 500 px

Op contract
-----------


.. code-block:: python

    import theano

    class MyOp(theano.Op):
        def make_node(self, *inputs):
            pass

        def __eq__(self, other):
            pass

        def __hash__(self):
            pass

        def __str__(self):
            pass

        # Python implementation:
        def perform(self, node, inputs_storage, output_storage):
            pass

        # C implementation: [see theano web site for other functions]
	def c_code(...):
	# ...
            pass

        # others implementation (pycuda, ...):
        def make_thunk(self, node, storage_map, _, _2):
            pass

        # optional:
        def __init__(self, ...):
            pass

        def grad(self, inputs, g):
            pass

	def R_op(self, inputs, eval_points):
            pass

        def infer_shape(node, (i0_shapes, ...))
            pass

.. ../extending/op.txt

There are 2 mandatory methods that one needs to implement.
The first one is :func:`make_node`. The second one 
would describe the computations that are required to be done
at run time. Currently there are 2 different possibilites:
implement the :func:`perform`
and/or :func:`c_code <Op.c_code>` (and other related :ref:`c methods
<cop>`), or the :func:`make_thunk` method. The ``perform`` allows
to easily wrap an existing python function into Theano. The ``c_code``
and related methods allow the op to generate c code that will be 
compiled and linked by Theano. On the other hand, the ``make_thunk``
method will be called only once during compilation and should generate
a ``thunk``: a standalone function that when called will do the wanted computations.
This is useful if you want to generate code and compile it yourself. For
example, this allows you to use PyCUDA to compile gpu code.

Also there are 2 methods that are highly recommended to be implemented. They are
needed in order to merge duplicate computations involving your op. So if you
do not want Theano to execute your op multiple times with the same inputs,
do implement them. Those methods are :func:`__eq__` and
:func:`__hash__`.

The :func:`infer_shape` method allows to infer shape of some variable, somewhere in the
middle of the computational graph without actually computing the outputs (when possible).
This could be helpful if one only needs the shape of the output instead of the actual outputs.

The :func:`grad` method is required if you want to differentiate some cost whose expression
includes your op.

The :func:`__str__` method is useful in order to provide a more meaningful
string representation of your Op.

The :func:`R_op` method is needed if you want `theano.tensor.Rop` to
work with your op.

Op example
----------

.. code-block:: python

    import theano

    class DoubleOp(theano.Op):
        def __eq__(self, other):
            return type(self) == type(other)

        def __hash__(self):
            return hash(type(self))

        def __str__(self):
            return self.__class__.__name__

        def make_node(self, x):
            x = theano.tensor.as_tensor_variable(x)
            return theano.Apply(self, [x], [x.type()])

        def perform(self, node, inputs, output_storage):
            x = inputs[0]
            z = output_storage[0]
            z[0] = x * 2

        def infer_shape(self, node, i0_shapes):
            return i0_shapes

        def grad(self, inputs, output_grads):
            return [output_grads[0] * 2]

	def R_op(self, inputs, eval_points):
            # R_op can receive None as eval_points.
            # That mean there is no diferientiable path through that input
            # If this imply that you cannot compute some outputs,
            # return None for those.
            if eval_points[0] is None:
                return eval_points
            return self.grad(inputs, eval_points)

Test it!

.. code-block:: python

    x = theano.tensor.matrix()
    f = theano.function([x], DoubleOp()(x))
    import numpy
    inp = numpy.random.rand(5, 5)
    out = f(inp)
    assert numpy.allclose(inp * 2, out)
    print inp
    print out

Testing the gradient
--------------------

The function :ref:`verify_grad <validating_grad>`
verifies the gradient of an Op or Theano graph. It compares the
analytic (symbolically computed) gradient and the numeric
gradient (computed through the Finite Difference Method).

To verify the grad method of the DoubleOp, you can use this:

.. code-block:: python

   import numpy
   import theano.tests
   theano.tests.unittest_tools.verify_grad(DoubleOp(), [numpy.random.rand(5,7,2)])

If nothing happens, then it works! If you want to see it fail, you can
implement a wrong gradient (for instance removing the multiplication by 2).

Testing the Rop
---------------

The functions :func:`RopLop_checker.check_mat_rop_lop`,
:func:`RopLop_checker.check_rop_lop` and :func:`RopLop_checker.check_nondiff_rop` allow to test the implemntation of the Rop of one function.

To verify the Rop method of the DoubleOp, you can use this:

.. code-block:: python

   import numpy
   import theano.tests
   from theano.tensor.tests.test_rop import RopLop_checker
   class test_Double(RopLop_checker):
       def setUp(self):
           super(test_Double, self).setUp()
       def test_double_rop(self):
           self.check_rop_lop(DoubleOp()(self.x), self.in_shape)
	   assert False

You can use `nosetests` to run it as all other test in Theano or you can run it like that in a python shell:

.. code-block:: python

   t = test_Double("test_double_rop")
   t.setUp()
   t.test_double_rop()

Exercises 8
-----------

- Run the code in the file double_op.py.
- Modify and execute to compute: x * y
- Modify and execute the example to return 2 outputs: x + y and x - y

  - Our current element-wise fusion generates computation with only 1 output.



