Metadata-Version: 2.4
Name: tarzi
Version: 0.0.12
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Dist: maturin>=1.0,<2.0 ; extra == 'dev'
Requires-Dist: pytest>=6.0 ; extra == 'dev'
Requires-Dist: pytest-cov>=4.0 ; extra == 'dev'
Requires-Dist: black>=22.0 ; extra == 'dev'
Requires-Dist: ruff>=0.1.0 ; extra == 'dev'
Requires-Dist: isort>=5.0 ; extra == 'dev'
Requires-Dist: autoflake>=2.0 ; extra == 'dev'
Requires-Dist: twine>=4.0.0 ; extra == 'dev'
Requires-Dist: build>=0.10.0 ; extra == 'dev'
Requires-Dist: pytest>=6.0 ; extra == 'test'
Requires-Dist: pytest-cov>=4.0 ; extra == 'test'
Requires-Dist: pytest-asyncio>=0.20.0 ; extra == 'test'
Requires-Dist: sphinx>=6.0.0 ; extra == 'docs'
Requires-Dist: sphinx-copybutton>=0.5.2 ; extra == 'docs'
Requires-Dist: myst-parser>=2.0.0 ; extra == 'docs'
Requires-Dist: sphinx-tabs>=3.4.1 ; extra == 'docs'
Requires-Dist: sphinx-design>=0.5.0 ; extra == 'docs'
Requires-Dist: furo>=2023.9.10 ; extra == 'docs'
Requires-Dist: sphinx-autoapi>=3.0.0 ; extra == 'docs'
Provides-Extra: dev
Provides-Extra: test
Provides-Extra: docs
License-File: LICENSE
Summary: Rust-native lite search for AI applications
Keywords: web-scraping,search-engine,ai-tools,rust,browser-automation
Author: xmingc <chenxm35@gmail.com>
Author-email: xmingc <chenxm35@gmail.com>
Maintainer-email: xmingc <chenxm35@gmail.com>
License: Apache-2.0
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/mirasurf/tarzi.rs
Project-URL: Documentation, https://tarzi.readthedocs.io/
Project-URL: Repository, https://github.com/mirasurf/tarzi.rs
Project-URL: Bug Tracker, https://github.com/mirasurf/tarzi.rs/issues

<div align="center">
  <img src="https://github.com/mirasurf/tarzi.rs/blob/4e751f8d389c0ac7f2061afa9286d2d7fa551aaf/static/tarzi-320.png" alt="Tarzi Logo" width="200" height="200">
</div>
<h1 align="center">tarzi.rs</h1>  
<p align="center">
  <a href="https://crates.io/crates/tarzi">
    <img src="https://img.shields.io/crates/v/tarzi.svg?style=flat-square" alt="Crate Version" />
  </a>
  <a href="https://pypi.org/project/tarzi/">
    <img src="https://img.shields.io/pypi/v/tarzi.svg?style=flat-square" alt="PyPI Version" />
  </a>
  <!-- CI and Docs -->
  <a href="https://github.com/mirasurf/tarzi.rs/actions/workflows/rust-ci.yml">
    <img src="https://github.com/mirasurf/tarzi.rs/actions/workflows/rust-ci.yml/badge.svg" alt="Rust CI" />
  </a>
  <a href="https://github.com/mirasurf/tarzi.rs/actions/workflows/python-ci.yml">
    <img src="https://github.com/mirasurf/tarzi.rs/actions/workflows/python-ci.yml/badge.svg" alt="Python CI" />
  </a>
  <a href="https://tarzirs.readthedocs.io/en/latest/">
    <img src="https://app.readthedocs.org/projects/tarzirs/badge/?version=latest&style=flat" alt="Docs" />
  </a>
  <!-- License -->
  <a href="https://www.apache.org/licenses/LICENSE-2.0">
    <img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg?style=flat-square" alt="License" />
  </a>
  <!-- X (formerly Twitter) -->
  <a href="https://x.com/mirasurf_ai">
    <img src="https://img.shields.io/twitter/follow/mirasurf_ai?label=@mirasurf_ai&style=flat-square" alt="X Follow" />
  </a>
</p>

## 🐒 Tarzi

**Tarzi** is a unified search interface designed for **Retrieval-Augmented Generation (RAG)** and **agentic systems** built on large language models. Search is a core function of these systems, yet most search engine providers (SEPs) put their APIs behind paywalls or strict rate limits. **Tarzi** uses browser automation and web crawling to remove these barriers, supporting token-free queries across multiple search engines. With a single dependency, you can integrate several SEPs and switch between them as needed.

<div align="center">
  <img src="static/tariz-workflow.png" alt="Tarzi workflow" width="100%">
</div>

## ⚙️ Core Capabilities

- 🦀 **Dual Implementation**: Native Rust library and Python wrapper with CLI tools
- 🔄 **Content Conversion**: Convert raw HTML into LLM-ready Markdown, JSON, or YAML
- 🔍 **Search Integration**: Fetch fully rendered result pages with a unified interface for both browser (token-free) and API (token-required) modes
- 🧠 **Multi-Engine Support**: Works with Bing, Google, DuckDuckGo, Brave Search, Tavily, and more  
- 🛡️ **Proxy Support**: Use a proxy to get around network bans and reach global SEPs
- 🚀 **End-to-End Workflow**: Full pipeline from search to content extraction for AI and automation use cases
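
To make the content-conversion step concrete, here is a minimal sketch of turning raw HTML into Markdown using only the Python standard library. It is an illustration of the idea, not tarzi's API: the `MarkdownSketch` class and `html_to_markdown` function are hypothetical names, and the sketch handles only headings, paragraphs, and links.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Toy HTML-to-Markdown converter (hypothetical, not part of tarzi).

    Handles h1-h3 headings, paragraphs, and links; everything else is
    passed through as plain text.
    """

    def __init__(self):
        super().__init__()
        self.parts = []    # output fragments, joined at the end
        self._href = None  # target of the link currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            # "h2" -> "## "
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3", "p"):
            # Block elements end with a blank line
            self.parts.append("\n\n")
        elif tag == "a":
            self.parts.append(f"]({self._href or ''})")
            self._href = None

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html: str) -> str:
    """Convert a small HTML fragment to Markdown text."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


if __name__ == "__main__":
    sample = '<h1>Title</h1><p>See <a href="https://example.com">docs</a>.</p>'
    print(html_to_markdown(sample))
```

Real pages need far more care (nested lists, tables, scripts, rendered JavaScript), which is the part a dedicated library is meant to handle.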

## 🧪 Advanced Features (Support required)

- 🖥️ **Custom Browser Controls**: Set screen size, viewport, and locale for realistic behavior  
- 🕵️‍♂️ **Anti-Bot Evasion**: Use fingerprint spoofing, proxy rotation, and human-like actions to avoid detection  
- 🧠 **Smarter Queries**: Improve search results with prompt rewriting and intent-aware queries 
- 🔗 **Workflow Automation**: Chain steps like search, click, form fill, and scraping into automated flows  
- 🤖 **Agent Integration (MCP)**: Connect with agent frameworks for context-aware, distributed task execution  
- 📊 **Observability**: Monitor success rate, latency, CAPTCHA frequency, and export logs for analysis

## Install

```
pip install tarzi
```
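
After installing, you may want to confirm that the package is visible to your interpreter and that the interpreter meets the declared `Requires-Python: >=3.10` constraint. The following is a small sketch using only the standard library; `installed_version` and `python_is_supported` are hypothetical helper names, not part of tarzi.

```python
import sys
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(minimum=(3, 10)) -> bool:
    """Check the running interpreter against a minimum (major, minor) version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("tarzi version:", installed_version("tarzi"))
```

If the second line prints `None`, the package is not installed in the environment this interpreter is using.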

## Usage Examples

* Examples in Python and Rust: [examples](/examples/)

## Alternatives

* LangChain [PlayWrightBrowserToolkit](https://python.langchain.com/docs/integrations/tools/playwright/)

## Contributors

Thank you ❤ to all human and non-human contributors.

[![tarzi contributors](https://contrib.rocks/image?repo=mirasurf/tarzi.rs "tarzi contributors")](https://github.com/mirasurf/tarzi.rs/graphs/contributors)

