=========
cssselect
=========

``cssselect`` is a parser for `CSS Selectors`_ Level 3 that can also translate
selectors to `XPath 1.0`_ queries. Such queries can be used in lxml_ to find
the matching elements in an XML or HTML document.

This module used to live inside of lxml as ``lxml.cssselect`` before it was
extracted as a stand-alone project.

.. _CSS Selectors: http://www.w3.org/TR/selectors/
.. _XPath 1.0: http://www.w3.org/TR/xpath/
.. _lxml: http://lxml.de/


.. contents::
..
   1  The CSSSelector class
   2  CSS Selectors
     2.1  Namespaces
   3  Limitations


The CSSSelector class
=====================

The most important class in the ``cssselect`` module is ``CSSSelector``.  It
provides the same interface as lxml’s XPath_ class, but accepts a CSS selector
expression as input:

.. _XPath: http://lxml.de/xpathxslt.html#xpath

.. sourcecode:: pycon

    >>> from cssselect import CSSSelector
    >>> sel = CSSSelector('div.content')
    >>> sel  #doctest: +ELLIPSIS
    <CSSSelector ... for 'div.content'>
    >>> sel.css
    'div.content'

The selector actually compiles to XPath, and you can see the
expression by inspecting the object:

.. sourcecode:: pycon

    >>> sel.path
    "descendant-or-self::div[contains(concat(' ', normalize-space(@class), ' '), ' content ')]"

To use the selector, simply call it with a document or element
object:

.. sourcecode:: pycon

    >>> from lxml.etree import fromstring
    >>> h = fromstring('''<div id="outer">
    ...   <div id="inner" class="content body">
    ...       text
    ...   </div></div>''')
    >>> [e.get('id') for e in sel(h)]
    ['inner']
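
The expression stored in ``sel.path`` pads the ``class`` attribute with
spaces so that ``content`` matches only as a whole word.  As a minimal sketch
of that idea, the string could be assembled as follows.  The helper
``class_selector_to_xpath`` is hypothetical and not part of ``cssselect``,
whose translator covers the full selector grammar:

.. sourcecode:: python

    def class_selector_to_xpath(tag, class_name):
        """Build an XPath test for ``tag.class_name`` (illustration only)."""
        # Pad the normalized class attribute with spaces so that a class
        # name matches only as a whole word, never as a substring.
        padded = "concat(' ', normalize-space(@class), ' ')"
        return "descendant-or-self::%s[contains(%s, ' %s ')]" % (
            tag, padded, class_name)

    # Reproduces the expression shown for 'div.content' above.
    xpath = class_selector_to_xpath('div', 'content')

Because of the padding, an element with ``class="content body"`` matches,
while one with ``class="contents"`` does not.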


CSS Selectors
=============

This library attempts to implement CSS selectors `as described in
the W3C specification
<http://www.w3.org/TR/2001/CR-css3-selectors-20011113/>`_.  Many of
the pseudo-classes do not apply in this context, including all
`dynamic pseudo-classes
<http://www.w3.org/TR/2001/CR-css3-selectors-20011113/#dynamic-pseudos>`_.
In particular, these are not available:

* link state: ``:link``, ``:visited``, ``:target``
* actions: ``:hover``, ``:active``, ``:focus``
* UI states: ``:enabled``, ``:disabled``, ``:indeterminate``
  (``:checked`` and ``:unchecked`` *are* available)

Also, none of the pseudo-elements apply, because the selector only
returns elements and pseudo-elements select portions of text, like
``::first-line``.
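
If selectors come from an untrusted or external source, it can help to screen
them for the pseudo-classes listed above before compiling them.  The following
is a naive sketch under stated assumptions: ``unavailable_pseudo_classes`` is
a hypothetical helper, not part of ``cssselect``, and the regular expression
does not understand quoted strings, so a colon inside an attribute value may
produce a false positive:

.. sourcecode:: python

    import re

    # Pseudo-classes this document lists as unavailable.
    UNAVAILABLE = frozenset([
        'link', 'visited', 'target',             # link state
        'hover', 'active', 'focus',              # actions
        'enabled', 'disabled', 'indeterminate',  # UI states
    ])

    # A single colon followed by a name; "::" pseudo-elements are skipped.
    _PSEUDO_CLASS = re.compile(r'(?<!:):([a-zA-Z-]+)')

    def unavailable_pseudo_classes(selector):
        """Return the unavailable pseudo-class names found in ``selector``."""
        names = _PSEUDO_CLASS.findall(selector)
        return [name for name in names if name.lower() in UNAVAILABLE]

    found = unavailable_pseudo_classes('a:hover, input:checked, p::first-line')

Here ``found`` contains only ``'hover'``: ``:checked`` is available, and
``::first-line`` is a pseudo-element, which this check deliberately ignores.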


Namespaces
----------

In CSS you can use ``namespace-prefix|element``, similar to
``namespace-prefix:element`` in an XPath expression.  In fact, it maps
one-to-one, and the same rules are used to map namespace prefixes to
namespace URIs.
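
The one-to-one mapping can be illustrated without ``cssselect`` itself.  In
the sketch below, ``css_type_to_xpath`` is a hypothetical helper that handles
only a bare ``prefix|element`` type selector, and the query is run with the
standard library's ``xml.etree.ElementTree``, which also resolves prefixes
through a prefix-to-URI dictionary:

.. sourcecode:: python

    import xml.etree.ElementTree as ET

    def css_type_to_xpath(selector):
        """Translate ``prefix|element`` to ``prefix:element`` (illustration)."""
        prefix, sep, name = selector.partition('|')
        # Without a "|" there is no namespace prefix to carry over.
        return '%s:%s' % (prefix, name) if sep else selector

    # The prefix is arbitrary; only the URI it maps to must match the document.
    namespaces = {'svg': 'http://www.w3.org/2000/svg'}
    doc = ET.fromstring(
        '<root xmlns:s="http://www.w3.org/2000/svg">'
        '<s:rect id="r1"/><rect id="r2"/></root>')

    path = './/' + css_type_to_xpath('svg|rect')
    ids = [e.get('id') for e in doc.findall(path, namespaces)]

Only ``r1`` is in the SVG namespace, so ``ids`` is ``['r1']``, even though the
document declares the prefix ``s`` and the query uses ``svg``.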


Limitations
===========

The following applicable pseudo-classes are not yet implemented:

* ``:lang(language)``
* ``*:first-of-type``, ``*:last-of-type``, ``*:nth-of-type``,
  ``*:nth-last-of-type``, ``*:only-of-type``.  All of these work when
  you specify an element type, but not with ``*``.

Unlike XPath, you cannot provide parameters in your expressions -- all
expressions are completely static.

XPath 1.0 has no escape mechanism inside string literals: a literal may be
delimited by single or double quotes, but it cannot contain its own
delimiter.  If you use expressions that contain characters that require
quoting, you might therefore have problems with the translation from CSS
to XPath.
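
A common workaround for quoting arbitrary strings in XPath 1.0, shown here as
a sketch rather than as what ``cssselect`` does internally, is to pick
whichever quote character the string does not contain and to fall back to
``concat()`` when it contains both.  The helper name ``xpath_literal`` is
hypothetical:

.. sourcecode:: python

    def xpath_literal(text):
        """Return an XPath 1.0 expression that evaluates to ``text``."""
        if "'" not in text:
            return "'%s'" % text
        if '"' not in text:
            return '"%s"' % text
        # Both quote characters occur: split on single quotes, wrap each
        # piece in single quotes, and rejoin the pieces with a
        # double-quoted single quote inside concat().
        pieces = ["'%s'" % part for part in text.split("'")]
        return 'concat(%s)' % ', "\'", '.join(pieces)

For a string such as ``a'b"c`` this produces ``concat('a', "'", 'b"c')``,
which an XPath 1.0 engine evaluates back to the original text.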
