.. _halo_profiling:

Halo Profiling
==============
.. sectionauthor:: Britton Smith <britton.smith@colorado.edu>

The HaloProfiler provides a means of performing analysis on multiple points in a dataset at 
once.  This is primarily intended for use with cosmological simulations, in which  
gravitationally bound structures composed of dark matter and gas, called halos, form and 
become the hosts for galaxies and galaxy clusters.

The HaloProfiler performs two primary functions: radial profiles and projections.
With only a few exceptions, discussed below, all of the HaloProfiler's machinery can
be run in parallel.  This requires `mpi4py <http://code.google.com/p/mpi4py/>`_ to be
installed and your script to be run inside an ``mpirun`` call with the ``--parallel``
flag at the end of the command.

Configuring the HaloProfiler
----------------------------

A sample script to run the HaloProfiler can be found in :ref:`cookbook-run_halo_profiler`.
To run the HaloProfiler on a dataset, instantiate a HaloProfiler object with the path to
the dataset as the only positional argument:

.. code-block:: python

  import yt.extensions.HaloProfiler as HP
  hp = HP.HaloProfiler("DD0242/DD0242")

Most of the HaloProfiler's options are configured with keyword arguments given at 
instantiation.  These options are:

 * **halos** (*str*): "multiple" for profiling more than one halo.  In this mode halos are read in from a list or identified with a `halo finder <../cookbook/running_halofinder.html>`_.  In "single" mode, the one and only halo center is identified automatically as the location of the peak in the density field.  Default: "multiple".

 * **halo_list_file** (*str*): name of file containing the list of halos.  The HaloProfiler will look for this file in the data directory.  Default: "HopAnalysis.out".

 * **halo_list_format** (*str* or *dict*): the format of the halo list file.  "yt_hop" for the format given by yt's halo finders.  "enzo_hop" for the format written by enzo_hop.  This keyword can also be given in the form of a dictionary specifying the column in which various properties can be found.  For example, {"id": 0, "center": [1, 2, 3], "mass": 4, "radius": 5}.  Default: "yt_hop".

 * **halo_finder_function** (*function*): if halos is set to multiple and the file given by halo_list_file does not exist, the halo finding function specified here will be called.  Default: HaloFinder (yt_hop).

 * **halo_finder_args** (*tuple*): args given with call to halo finder function.  Default: None.

 * **halo_finder_kwargs** (*dict*): kwargs given with call to halo finder function. Default: None.

 * **use_density_center** (*bool*): re-center halos before performing profiles, using a center of mass weighted by overdensity.  This is generally not needed.  Default: False.

 * **density_center_exponent** (*float*): when use_density_center is set to True, this specifies the exponent, alpha, such that the halo center calculation is weighted by overdensity^alpha.  Default: 1.0.

 * **use_field_max_center** (*str*): another alternative for halo re-centering by selecting the location of the maximum of the field given by this keyword.  This is generally not needed.  Default: None.

 * **halo_radius** (*float*): if no halo radii are provided in the halo list file, this parameter is used to specify the radius out to which radial profiles will be made.  This keyword is also used when halos is set to single.  Default: 0.1.

 * **radius_units** (*str*): the units of **halo_radius**.  Default: "1" (code units).

 * **n_profile_bins** (*int*): the number of bins in the radial profiles.  Default: 50.

 * **profile_output_dir** (*str*): the subdirectory, inside the data directory, in which radial profile output files will be created.  The directory will be created if it does not exist.  Default: "radial_profiles".

 * **projection_output_dir** (*str*): the subdirectory, inside the data directory, in which projection output files will be created.  The directory will be created if it does not exist.  Default: "projections".

 * **projection_width** (*float*): the width of halo projections.  Default: 8.0.

 * **projection_width_units** (*str*): the units of projection_width. Default: "mpc".

 * **project_at_level** (*int* or "max"): the maximum refinement level to be included in projections.  Default: "max" (maximum level within the dataset).

 * **velocity_center** (*list*): the method by which the halo bulk velocity is calculated (used for the calculation of radial and tangential velocities).  Valid options are:

   - ["bulk", "halo"] (Default): the velocity provided in the halo list.
   - ["bulk", "sphere"]: the bulk velocity of the sphere centered on the halo center.
   - ["max", field]: the velocity of the cell that is the location of the maximum of the specified field (used only when halos is set to single).

 * **filter_quantities** (*list*): quantities from the original halo list file to be written out in the filtered list file.  Default: ['id','center'].
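
To illustrate what a column dictionary for **halo_list_format** describes, the sketch
below applies one to a single whitespace-separated line.  The helper
``parse_halo_line`` and the sample line are hypothetical and are not part of the
HaloProfiler, whose own reader may behave differently (for example, in how it treats
comment lines).

.. code-block:: python

  def parse_halo_line(line, columns):
      """Map one whitespace-separated line onto named halo properties."""
      values = line.split()
      halo = {}
      for name, index in columns.items():
          if isinstance(index, (list, tuple)):
              # A list of column numbers gathers several columns into one entry.
              halo[name] = [float(values[i]) for i in index]
          elif name == "id":
              halo[name] = int(values[index])
          else:
              halo[name] = float(values[index])
      return halo

  # Same layout as the dictionary example given for halo_list_format.
  columns = {"id": 0, "center": [1, 2, 3], "mass": 4, "radius": 5}
  halo = parse_halo_line("7 0.25 0.5 0.75 2.0e14 0.01", columns)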

.. warning:: The HaloProfiler runs in parallel in a round-robin style, evenly distributing the list of halos among all processors.  Hence, the HaloProfiler will not work in parallel when **halos** is set to single.
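
The round-robin distribution described in the warning can be pictured with the
following sketch.  It is an illustration only: ``round_robin`` is a hypothetical helper,
not the HaloProfiler's internal code, and the halo ids are made up.

.. code-block:: python

  def round_robin(items, n_procs):
      """Return one list of items per processor, dealt out in turn."""
      return [items[rank::n_procs] for rank in range(n_procs)]

  halo_ids = [0, 1, 2, 3, 4, 5, 6]
  shares = round_robin(halo_ids, 3)

  # With a single halo there is nothing to hand to the other processors.
  single = round_robin([0], 3)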

Profiles
--------

Once the HaloProfiler object has been instantiated, fields can be added for profiling with 
the :meth:`add_profile` method:

.. code-block:: python

  hp.add_profile('CellVolume', weight_field=None, accumulation=True)
  hp.add_profile('TotalMassMsun', weight_field=None, accumulation=True)
  hp.add_profile('Density', weight_field=None, accumulation=False)
  hp.add_profile('Temperature', weight_field='CellMassMsun', accumulation=False)
  hp.make_profiles()

The :meth:`make_profiles` method will begin the profiling.

.. image:: profiles.png
   :width: 500

Radial profiles of Overdensity (left) and Temperature (right) for five halos.
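
The :meth:`add_profile` calls above set ``accumulation=True`` for CellVolume and
TotalMassMsun.  The sketch below shows the distinction with made-up bin values.  It
assumes that accumulation means a running total from the innermost bin outward, which
this section does not state explicitly.

.. code-block:: python

  from itertools import accumulate

  # Hypothetical mass in each radial bin, ordered from the center outward.
  mass_per_bin = [1.0, 2.0, 4.0, 8.0]

  # With accumulation, bin n holds the total of bins 0 through n,
  # that is, the mass enclosed within that bin's outer edge.
  enclosed_mass = list(accumulate(mass_per_bin))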

Projections
-----------

The process of making projections is similar to that of profiles:

.. code-block:: python

  hp.add_projection('Density', weight_field=None)
  hp.add_projection('Temperature', weight_field='Density')
  hp.add_projection('Metallicity', weight_field='Density')
  hp.make_projections(axes=[0, 1, 2], save_cube=True, save_images=True, halo_list="filtered")

If **save_cube** is set to True, the projection data will be written to a set of HDF5 files
in the directory given by **projection_output_dir**.  The **halo_list** keyword selects
which halos are projected: the full list of halos ("all"), the filtered list ("filtered"),
or an entirely new list given in the form of a file name.  See :ref:`filter_functions` for a
discussion of filtering halos.

.. image:: projections.png
   :width: 500

Projections of Density (top) and Temperature, weighted by Density (bottom), in the x (left), 
y (middle), and z (right) directions for a single halo with a width of 8 Mpc.

Halo Filters
------------

Filters can be added to create a refined list of halos based on their profiles or to avoid 
profiling halos altogether based on information given in the halo list file.

.. _filter_functions:

Filter Functions
^^^^^^^^^^^^^^^^

It is often the case that one is looking to identify halos with a specific set of 
properties.  This can be accomplished through the creation of filter functions.  A filter 
function can take as many args and kwargs as you like, as long as the first argument is a 
profile object, or at least a dictionary which contains the profile arrays for each field.  
Filter functions must return a list of two things.  The first is a True or False indicating 
whether the halo passed the filter.  The second is a dictionary containing quantities 
calculated for that halo that will be written to a file if the halo passes the filter.  A 
sample filter function based on virial quantities can be found in 
``yt/extensions/HaloFilters.py``.
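
As an illustration of this calling convention, the sketch below defines a hypothetical
filter, ``PeakTemperatureFilter``, that is not part of yt.  It assumes the profile can be
indexed like a dictionary of per-bin sequences and that a 'Temperature' profile has been
added.

.. code-block:: python

  def PeakTemperatureFilter(profile, min_temperature=1e7):
      """Pass halos whose peak profile temperature reaches min_temperature.

      Returns [passed, quantities], the two-item list a filter must return.
      """
      peak = max(profile['Temperature'])
      passed = peak >= min_temperature
      # These quantities are only written out if the halo passes.
      quantities = {'PeakTemperature': peak}
      return [passed, quantities]

  # Stand-in for a real profile: a plain dictionary of per-bin values.
  sample_profile = {'Temperature': [3.0e7, 1.5e7, 4.0e6]}
  result = PeakTemperatureFilter(sample_profile)

Such a function would be registered in the same way as the VirialFilter, passing the
function itself followed by its keyword arguments to :meth:`add_halo_filter`.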

Halo filtering takes place during the call to :meth:`make_profiles`.  The 
:meth:`add_halo_filter` method is used to add a filter to be used during the profiling:

.. code-block:: python

  hp.add_halo_filter(HP.VirialFilter, must_be_virialized=True,
                     overdensity_field='ActualOverdensity',
                     virial_overdensity=200,
                     virial_filters=[['TotalMassMsun', '>=', '1e14']],
                     virial_quantities=['TotalMassMsun', 'RadiusMpc'])

The addition above will calculate and return virial quantities, mass and radius, for an 
overdensity of 200.  In order to pass the filter, at least one point in the profile must be 
above the specified overdensity and the virial mass must be at least 1e14 solar masses.  If 
the VirialFilter function has been added to the filter list, the HaloProfiler will make 
sure that the fields necessary for calculating virial quantities are added.  As 
many filters as desired can be added.  If filters have been added, the next call to 
:meth:`make_profiles` will filter by all of the added filter functions:

.. code-block:: python

  hp.make_profiles(filename="FilteredQuantities.out")

If the **filename** keyword is set, a file will be written with all of the filtered halos 
and the quantities returned by the filter functions.

.. note:: If the profiles have already been run, the HaloProfiler will read in the previously created output files instead of re-running the profiles.  The HaloProfiler will check to make sure the output file contains all of the requested halo fields.  If not, the profile will be made again from scratch.

.. _halo_profiler_pre_filters:

Pre-filters
^^^^^^^^^^^

A single dataset can contain thousands or tens of thousands of halos.  Significant time can
be saved by not profiling halos that are certain to fail the filter functions in place.
Simple filters based on quantities provided in the initial halo list can be used to remove
unwanted halos before profiling, using the **prefilters** keyword:

.. code-block:: python

  hp.make_profiles(filename="FilteredQuantities.out",
                   prefilters=["halo['mass'] > 1e13"])

Arguments provided with the **prefilters** keyword should be given as a list of strings.
Each string in the list will be evaluated with Python's ``eval``.
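
The following sketch shows how such a string could be applied to one entry of the halo
list.  It assumes each entry is available as a dictionary named ``halo``, as the example
string suggests.  The helper ``passes_prefilters`` and the sample halos are hypothetical.

.. code-block:: python

  def passes_prefilters(halo, prefilters):
      """Return True only if every prefilter string evaluates to True."""
      # Each string is evaluated with the name `halo` bound to the current entry.
      return all(eval(expression, {}, {'halo': halo})
                 for expression in prefilters)

  prefilters = ["halo['mass'] > 1e13"]
  halos = [{'id': 0, 'mass': 5.0e12}, {'id': 1, 'mass': 3.0e14}]

  # Keep only the ids of halos that satisfy every prefilter.
  kept = [halo['id'] for halo in halos if passes_prefilters(halo, prefilters)]

Because the strings are executed as Python code, prefilters should only come from a
trusted source.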

.. note:: If a VirialFilter function has been added with a filter based on mass (as in the example above), a prefilter will be added automatically.  Depending on the conditional of the filter, it removes halos whose masses in the halo list are more than a factor of ten below or above the specified virial mass.
