.. _cop:

====================================
Implementing the arithmetic Ops in C
====================================

Now that we have set up our ``double`` type properly to allow C
implementations for operations that work on it, all we have to do now
is to actually define these operations in C.


How does it work?
=================

Before a C :ref:`op` is executed, the variables related to each of its
inputs will be declared and will be filled appropriately, either from
an input provided by the end user (using c_extract) or it might simply
have been calculated by another operation. For each of the outputs,
the variables associated to them will be declared and initialized.

The operation then has to compute what it needs to using the
input variables and place the variables in the output variables.


What needs to be defined
========================

There are less methods to define for an Op than for a Type:

.. class:: Op

    .. method:: c_code(node, name, input_names, output_names, sub)

      This must return C code that carries the computation we want to do.

      sub is a dictionary of strings for you to substitute into your code.
      It's not clear if it ever contains anything other than 'fail'.
      sub['fail'] is a string of code that you should execute (after calling
      PyErr_Format) if your C code needs to raise an exception.

    .. method:: c_code_cleanup(node, name, input_names, output_names, sub)

      This must return C code that cleans up whatever c_code allocated and
      that we must free.

      *Default:* The default behavior is to do nothing.

    .. method:: c_headers()
    .. method:: c_header_dirs()
    .. method:: c_libraries()
    .. method:: c_lib_dirs()

      Allows you to specify headers, libraries, and their directories,

    .. method:: c_compile_args()
    .. method:: c_no_compile_args()

      Allows you to specify special g++ arguments to add/exclude

    .. method:: c_support_code()

      Allows you to specify helper functions/structs that the
      :ref:`op` needs.  That code will be reused for each apply of
      this op. It will be inserted at global scope.

    .. method:: c_support_code_apply(node, name)

      Allows you to specify helper functions/structs specialized for a
      particular apply of an :ref:`op`. Use `c_support_code` if the
      code is the same for each apply of an op.
      It will be inserted at global scope.

    .. method:: infer_shape(node, (i0_shapes,i1_shapes,...))

      Allow optimizations to lift the Shape op over this op.
      An example of why this is good is when we only need the shape of a
      variable: we will be able to obtain it without computing the variable
      itself.
      Must return a list where each element is a tuple representing the shape
      of one output.
      For example, for the matrix-matrix product ``infer_shape`` will have as
      inputs (node, ((x0,x1), (y0,y1))) and should return [(x0, y1)]. Both the
      inputs and the return value may be Theano variables.

    .. method:: c_code_cache_version()

       Should return a tuple of hashable objects like integers. This
       specifies the version of the code. It is used to cache the
       compiled code. You MUST change the returned tuple for each
       change in the code. If you don't want to cache the compiled code
       return an empty tuple or don't implement it.

The ``name`` argument is currently given an invalid value, so steer
away from it. As was the case with Type, ``sub['fail']`` provides
failure code that you *must* use if you want to raise an exception,
after setting the exception message.

The ``node`` argument is an :ref:`apply` node representing an
application of the current Op on a list of inputs, producing a list of
outputs. ``input_names`` and ``output_names`` arguments contain as
many strings as there are inputs and outputs to the application of the
Op and they correspond to the ``name`` that is passed to the type of
each Variable in these lists. For example, if ``node.inputs[0].type ==
double``, then ``input_names[0]`` is the ``name`` argument passed to
``double.c_declare`` etc. when the first input is processed by Theano.

In a nutshell, ``input_names`` and ``output_names`` parameterize the
names of the inputs your operation needs to use and the outputs it
needs to put variables into. But this will be clear with the examples.


Defining the methods
====================

We will be defining C code for the multiplication Op on doubles.

**c_code**

.. If you modify this code, also change :
.. theano/tests/test_tutorial.py:T_extending.test_extending_2

.. code-block:: python

   def c_code(node, name, input_names, output_names, sub):
       x_name, y_name = input_names[0], input_names[1]
       output_name = output_names[0]
       return """
       %(output_name)s = %(x_name)s * %(y_name)s;
       """ % locals()
   mul.c_code = c_code

And that's it. As we enter the scope of the C code we are defining in
the method above, many variables are defined for us. Namely, the
variables x_name, y_name and output_name are all of the primitive C
``double`` type and they were declared using the C code returned by
``double.c_declare``.

Implementing multiplication is as simple as multiplying the two input
doubles and setting the output double to what comes out of it. If you
had more than one output, you would just set the variable(s) for
each output to what they should be.

.. warning::
   Do *NOT* use C's ``return`` statement to return the variable(s) of
   the computations. Set the output variables directly as shown
   above. Theano will pick them up for you.


**c_code_cleanup**

There is nothing to cleanup after multiplying two doubles. Typically,
you won't need to define this method unless you malloc() some
temporary storage (which you would free() here) or create temporary
Python objects (which you would Py_XDECREF() here).


Final version
=============

As before, I tried to organize the code in order to minimize
repetition. You can check that mul produces the same C code in this
version that it produces in the code I gave above.


.. If you modify this code, also change :
.. theano/tests/test_tutorial.py:T_extending.test_extending_2

.. code-block:: python

   from theano import gof

   class BinaryDoubleOp(gof.Op):

       def __init__(self, name, fn, ccode):
           self.name = name
           self.fn = fn
           self.ccode = ccode

       def make_node(self, x, y):
           if isinstance(x, (int, float)):
               x = gof.Constant(double, x)
           if isinstance(y, (int, float)):
               y = gof.Constant(double, y)
           if x.type != double or y.type != double:
               raise TypeError('%s only works on doubles' % self.name)
           return gof.Apply(self, [x, y], [double()])

       def perform(self, node, inp, out):
           x, y = inp
           z, = out
           z[0] = self.fn(x, y)

       def __str__(self):
           return self.name

       def c_code(self, node, name, inp, out, sub):
           x, y = inp
           z, = out
           return self.ccode % locals()


   add = BinaryDoubleOp(name = 'add',
                        fn = lambda x, y: x + y,
                        ccode = "%(z)s = %(x)s + %(y)s;")

   sub = BinaryDoubleOp(name = 'sub',
                        fn = lambda x, y: x - y,
                        ccode = "%(z)s = %(x)s - %(y)s;")

   mul = BinaryDoubleOp(name = 'mul',
                        fn = lambda x, y: x * y,
                        ccode = "%(z)s = %(x)s * %(y)s;")

   div = BinaryDoubleOp(name = 'div',
                        fn = lambda x, y: x / y,
                        ccode = "%(z)s = %(x)s / %(y)s;")
