Metadata-Version: 2.0
Name: sitemap-python
Version: 0.2.0
Summary: UNKNOWN
Home-page: https://github.com/socrateslee/sitemap_python
Author: Lichun
Author-email: UNKNOWN
License: MIT
Platform: UNKNOWN
Classifier: Environment :: Console
Classifier: Framework :: Django
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Requires-Dist: six

sitemap\_python
===============

A Python utility for building sitemaps.

Usage
-----

Generate sitemap
~~~~~~~~~~~~~~~~

::

    import datetime
    import sitemap.generator as generator

    sitemap = generator.Sitemap()
    sitemap.add("http://www.example.com",
                lastmod=datetime.datetime.now(),
                changefreq="monthly",
                priority="1.0")
    sitemap_xml = sitemap.generate()


    # A sitemap index lists other sitemap files instead of pages.
    sitemap_index = generator.Sitemap(type='sitemapindex')
    sitemap_index.add("http://www.example.com/sitemap01.xml",
                      lastmod=datetime.datetime.now())
    sitemap_index_xml = sitemap_index.generate()
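
For orientation, the following is a minimal standard-library sketch of the kind of ``urlset`` document the sitemaps.org protocol defines. It is not this package's implementation: ``build_urlset`` and its dict-based entries are hypothetical names used only for illustration, and the exact output of ``generator.Sitemap`` may differ.

```python
import datetime
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_urlset(entries):
    """Hypothetical helper: serialize entries into a <urlset> XML string.

    Each entry is a dict with a required 'loc' key and optional
    'lastmod' (date or datetime), 'changefreq' and 'priority' keys.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = entry["loc"]
        lastmod = entry.get("lastmod")
        if lastmod is not None:
            # The protocol accepts W3C dates such as YYYY-MM-DD.
            ET.SubElement(url, "lastmod").text = lastmod.strftime("%Y-%m-%d")
        for key in ("changefreq", "priority"):
            if entry.get(key) is not None:
                ET.SubElement(url, key).text = str(entry[key])
    return ET.tostring(urlset, encoding="unicode")


example_xml = build_urlset([
    {
        "loc": "http://www.example.com",
        "lastmod": datetime.date(2024, 1, 15),
        "changefreq": "monthly",
        "priority": "1.0",
    }
])
```

A sitemap index follows the same pattern, with ``sitemapindex`` and ``sitemap`` elements in place of ``urlset`` and ``url``.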

Ping search engine
~~~~~~~~~~~~~~~~~~

Pinging Google and Bing with sitemap URLs is currently supported.

::

    import sitemap.ping as ping

    ping.ping("google", "http://www.example.com/sitemap.xml")
    ping.ping_urls("bing", ["http://www.example.com/sitemap.xml"])
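
A ping of this kind is typically a plain HTTP GET with the sitemap URL passed as a percent-encoded query parameter. The sketch below only builds such a URL and sends nothing. The endpoint strings and the ``build_ping_url`` helper are assumptions for illustration; they are not taken from this package, and the search engines may have changed or retired these endpoints.

```python
from urllib.parse import quote

# Assumed endpoint prefixes, for illustration only. They are not taken
# from sitemap.ping and may no longer be served by the search engines.
PING_ENDPOINTS = {
    "google": "http://www.google.com/ping?sitemap=",
    "bing": "http://www.bing.com/ping?sitemap=",
}


def build_ping_url(engine, sitemap_url):
    """Hypothetical helper: return the GET URL that would notify `engine`."""
    # safe="" percent-encodes ':' and '/' so the sitemap URL survives
    # intact as a single query-string value.
    return PING_ENDPOINTS[engine] + quote(sitemap_url, safe="")


ping_url = build_ping_url("google", "http://www.example.com/sitemap.xml")
```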

Push URLs to Baidu
~~~~~~~~~~~~~~~~~~

Push URLs directly to Baidu. The related documentation is available
`here <http://zhanzhang.baidu.com/college/courseinfo?id=267&page=2#h2_article_title14>`__.

::

    import sitemap.baidu as baidu
    bp = baidu.BaiduPush("http://www.example.com", "<YOUR_KEY>")
    bp.add("http://www.example.com/example.html")
    bp.flush()

Verify the spider IP address
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**sitemap.spider** can be used to verify whether a spider's IP address
is genuine.

Example:

::

    from sitemap.spider import get_verified_spider_name

    # spider_name will be None if no search engine is matched
    spider_name = get_verified_spider_name("66.249.65.219")

The method **get\_verified\_spider\_name** uses
*socket.gethostbyaddr*, which may be slow in some cases. To avoid
unnecessary lookups, **guess\_spider\_name\_from\_ua** can be called first
to filter requests by User-Agent, so that only likely spiders are verified.

::

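    from sitemap.spider import get_verified_spider_name, guess_spider_name_from_ua

    # Placeholder values; in practice both come from the incoming request.
    spider_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    spider_ip = "66.249.65.219"

    # Cheap User-Agent check first; do the slow DNS lookup only on a match.
    spider_name = guess_spider_name_from_ua(spider_ua)
    if spider_name:
        spider_name = get_verified_spider_name(spider_ip)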
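
Verification based on *socket.gethostbyaddr* is usually done as a forward-confirmed reverse DNS check: resolve the IP to a hostname, check that the hostname belongs to a known crawler domain, then resolve the hostname back and confirm it maps to the same IP. The sketch below illustrates that general technique with the standard library only. ``verify_spider`` and ``SPIDER_DOMAINS`` are hypothetical names, and this is an assumption about the approach, not the code in **sitemap.spider**.

```python
import socket

# Illustrative table of crawler hostname suffixes (an assumption, not the
# list used by sitemap.spider).
SPIDER_DOMAINS = {
    "google": (".googlebot.com", ".google.com"),
    "bing": (".search.msn.com",),
}


def verify_spider(ip,
                  reverse_lookup=socket.gethostbyaddr,
                  forward_lookup=socket.gethostbyname_ex):
    """Return the spider name if `ip` passes the check, otherwise None.

    The lookup functions are injectable so the logic can be exercised
    without network access.
    """
    try:
        # Step 1: reverse lookup, IP -> hostname.
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return None
    for name, suffixes in SPIDER_DOMAINS.items():
        # Step 2: the hostname must end with a known crawler domain.
        if hostname.endswith(suffixes):
            try:
                # Step 3: forward lookup, hostname -> list of IPs.
                addresses = forward_lookup(hostname)[2]
            except OSError:
                return None
            # Step 4: the original IP must be among the resolved addresses.
            return name if ip in addresses else None
    return None
```

Because the reverse lookup can block on slow DNS servers, a cheap User-Agent filter in front of it keeps the cost limited to requests that claim to be spiders.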


