Metadata-Version: 2.4
Name: parsec-python
Version: 0.1.0
Summary: A Monadic Parser Combinator for Python.
Author-email: luminox <lunex_nocty@qq.com>
License-Expression: LGPL-2.1
Project-URL: Homepage, https://github.com/lunexnocty/parsec-python
Project-URL: Issues, https://github.com/lunexnocty/parsec-python/issues
Classifier: Development Status :: 3 - Alpha
Classifier: Topic :: Software Development :: Libraries
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Typing :: Typed
Requires-Python: >=3.13
Description-Content-Type: text/markdown
License-File: LICENSE
Provides-Extra: lint
Requires-Dist: ruff; extra == "lint"
Requires-Dist: pyright>=1.1.400; extra == "lint"
Provides-Extra: test
Provides-Extra: dev
Requires-Dist: twine>=6.1.0; extra == "dev"
Requires-Dist: build>=1.2.2.post1; extra == "dev"
Dynamic: license-file

<h1 align="center"> Parsec <a href="https://deepwiki.com/lunexnocty/parsec-python"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a></h1>
<p  align="center">
  <em>A Monadic Parser Combinator for Python.</em>
</p>
<p align="center">
    <img src="https://img.shields.io/github/license/lunexnocty/parsec-python.svg?style=flat-square" alt="License"/>
    <img src="https://github.com/lunexnocty/parsec-python/actions/workflows/typecheck.yml/badge.svg" alt="Typed"/>
    <img src="https://github.com/lunexnocty/parsec-python/actions/workflows/unittest.yml/badge.svg" alt="Tests"/>
</p>
<p align="center">
  <a href="https://deepwiki.com/lunexnocty/parsec-python"><strong>Documents</strong></a>
  ·
  <a href="https://github.com/lunexnocty/parsec-python/tree/main/examples">Example</a>
  ·
  <a href="https://github.com/lunexnocty/parsec-python/issues">Issue</a>
</p>

---
Parsec provides a declarative, modular way to build complex text parsers. By leveraging the powerful expressiveness of Monads, you can compose simple parsers like building blocks to handle complex grammar structures, while elegantly managing state and errors during the parsing process.

Unlike traditional parser generators (such as Lark or PLY), Parsec lets you define and combine your parsing logic directly in Python code, without separate grammar files, which makes parser construction more flexible and dynamic.

## Features

* [X] **Monadic** A Parser is a monad.
* [X] **Declarative** Declarative grammar definition
* [X] **Operator** Operator-based combinators (`<<`, `>>`, `&`, `|`, `/`, `@`)
* [X] **Lazy Evaluation** Lazy evaluation for recursion
* [X] **Curried** Curried functional interfaces
* [X] **Typed** Supports type inference

## Installation
### Requirements
For full type support, Python 3.13+ is required.
### Install from source code
```bash
git clone https://github.com/lunexnocty/parsec-python.git
cd parsec-python
uv pip install .
```

## Quick Start

The `parsec.text` module provides a set of fundamental text parsers, such as `number`, `char`, `blank`, etc. The `parsec.combinator` module offers a rich, curried functional interface for combinators, enabling you to compose these basic parsers into powerful and expressive parsing logic.

To make parser composition even more intuitive, several operators have been overloaded:

- `p << f` is equivalent to `f(p)`. It applies combinators to a parser one after another, giving a pipeline style of composition.
- `p >> f` is equivalent to `p.bind(f)`, the monadic bind operation: the result of parser `p` is passed to the function `f`, which returns the next parser to run. This enables context-sensitive, dependent parsing, as in the Monad interface in functional programming.
- `p @ f` is equivalent to `p.map(f)`: it maps `f` over the result of the parser `p`. This corresponds to the Functor's `map` (or `fmap`) operation and lets you transform the output of a parser in a declarative, compositional way.
- The `&` operator combines multiple parsers in sequence and collects their results into a tuple.
- The `|` operator tries each parser in sequence and returns the first successful result.
- The `/` operator is similar to `|`, but never backtracks.
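
To make the operator semantics concrete, here is a minimal, self-contained sketch of how `>>`, `@`, `&`, and `|` could be layered on a toy parser type. `ToyParser`, `pure`, and `char` are hypothetical names written for this illustration; this is not the library's implementation, and the toy `&` only builds pairs. The `/` operator is left out because its exact behavior is not specified here beyond "no backtracking".

```python
from typing import Any, Callable, Optional, Tuple

# A toy parse function: (text, position) -> (value, new_position), or None on failure.
ParseFn = Callable[[str, int], Optional[Tuple[Any, int]]]


class ToyParser:
    def __init__(self, fn: ParseFn):
        self.fn = fn

    def bind(self, f):
        # Run self, then feed its value to `f` to obtain the next parser.
        def run(text, pos):
            res = self.fn(text, pos)
            if res is None:
                return None
            value, nxt = res
            return f(value).fn(text, nxt)
        return ToyParser(run)

    def map(self, f):
        # Transform the result without consuming further input.
        return self.bind(lambda v: pure(f(v)))

    __rshift__ = bind  # p >> f
    __matmul__ = map   # p @ f

    def __and__(self, other):
        # p & q: run both in sequence and pair up the results.
        return self.bind(lambda a: other.map(lambda b: (a, b)))

    def __or__(self, other):
        # p | q: try p; if it fails, retry q from the same position.
        def run(text, pos):
            res = self.fn(text, pos)
            return res if res is not None else other.fn(text, pos)
        return ToyParser(run)


def pure(value):
    # A parser that always succeeds with `value` and consumes nothing.
    return ToyParser(lambda text, pos: (value, pos))


def char(c):
    # A parser that matches exactly the character `c`.
    def run(text, pos):
        if pos < len(text) and text[pos] == c:
            return (c, pos + 1)
        return None
    return ToyParser(run)


digit = char("1") | char("2")
pair = digit & char("+")                   # sequence, results as a pair
doubled = digit @ int @ (lambda n: n * 2)  # map twice over the result
same = digit >> (lambda d: char(d))        # next parser depends on the result
```

In this sketch `same` accepts `"11"` and `"22"` but rejects `"12"`, which is the kind of dependent parsing that `bind` makes possible.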

Parsers can also be defined lazily to support recursive grammar definitions, which is essential for constructs like nested expressions or parentheses.
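
The idea behind lazy definition can be sketched without the library: create a placeholder first, refer to it while building the grammar, and supply its body afterwards. The `Lazy` class and the `parens` function below are hypothetical and only illustrate the pattern, assuming a parser is a plain function from `(text, pos)` to `(value, new_pos)` or `None`.

```python
class Lazy:
    """Hypothetical placeholder parser whose body is supplied later."""

    def __init__(self):
        self._fn = None

    def define(self, fn):
        self._fn = fn

    def __call__(self, text, pos):
        if self._fn is None:
            raise RuntimeError("parser used before define()")
        return self._fn(text, pos)


# Forward declaration: `nested` can be referenced before it has a body.
nested = Lazy()


def parens(text, pos):
    # nested := '(' nested ')' | empty
    # Returns (nesting depth, new position), or None on failure.
    if pos < len(text) and text[pos] == "(":
        res = nested(text, pos + 1)  # recursive reference via the placeholder
        if res is not None:
            depth, nxt = res
            if nxt < len(text) and text[nxt] == ")":
                return depth + 1, nxt + 1
        return None
    return 0, pos


nested.define(parens)  # tie the knot
```

Because the placeholder is only dereferenced when parsing actually runs, the recursive reference inside `parens` is resolved after `define` has been called.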

Below is an arithmetic expression grammar supporting operator precedence, parentheses, and left-associative chaining:

```ebnf
expr           := <add_or_sub_exp>
add_or_sub_exp := <mul_or_div_exp> (('+' | '-') <mul_or_div_exp>)*
mul_or_div_exp := <factor> (('*' | '/') <factor>)*
factor         := '(' <expr> ')' | <num>
num            := { number }
```

Leveraging these features, you can build a fully functional arithmetic calculator with ease:

```python
from parsec import combinator as C
from parsec import text as T
from parsec.utils import curry

def calc(op: str):
    @curry
    def _(x: int | float, y: int | float):
        return {"+": x + y, "-": x - y, "*": x * y, "/": x / y}[op]
    return _

expr = T.Parser[str, int | float]()
num = T.number << C.trim(T.blank)
factor = expr << C.between(T.open_round)(T.close_round) | num
mul_or_div = T.char("*") | T.char("/")  # operator `|`
mul_or_div_op = (mul_or_div << C.trim(T.blank)) @ calc  # parentheses are required: `@` binds tighter than `<<`
mul_or_div_expr = factor << C.chainl1(mul_or_div_op)
add_or_sub = T.item << C.range("+-")  # use `range` combinator
add_or_sub_op = (add_or_sub << C.trim(T.blank)) @ calc
add_or_sub_expr = mul_or_div_expr << C.chainl1(add_or_sub_op)
expr.define(add_or_sub_expr)  # Lazy definition

src = "(1. + .2e-1) * 100 - 1 / 2.5 "
assert T.parse(expr, src) == eval(src)  # True
```

The `parsec.combinator.chainl1` combinator handles left-associative chaining of operations, parsing one or more occurrences of a parser `p` separated by an operator parser, and combining results in a left-associative manner.
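
The left-associative combination step can be sketched in plain Python, independent of the library. The helper `fold_left` below is a hypothetical name; it assumes the operands and operators have already been parsed into a first value plus a list of `(operator, operand)` pairs, and it uses ordinary two-argument functions rather than curried ones.

```python
import operator
from functools import reduce


def fold_left(first, rest):
    """Combine `first` with (op, operand) pairs from left to right."""
    return reduce(lambda acc, pair: pair[0](acc, pair[1]), rest, first)


# "10 - 3 - 2" groups as (10 - 3) - 2 when folded from the left.
left = fold_left(10, [(operator.sub, 3), (operator.sub, 2)])

# For contrast, a right-associative grouping would give 10 - (3 - 2).
right = 10 - (3 - 2)
```

Here `left` is `5` while `right` is `9`, which is why subtraction and division need left-associative chaining to match ordinary arithmetic.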

This approach is highly extensible: you can add additional operators, functions, or syntax features by composing and reusing combinators.

- For more basic text parsers, see [`parsec.text`](./parsec/text.py)
- For more parser combinators, see [`parsec.combinator`](./parsec/combinator.py)

## Architecture
A `parser` is a function that takes a `Context[I]` as input and returns a `Result[I, R]`, where `I` and `R` are generic type parameters. Here, `I` represents the type of each element in the input stream, and `R` denotes the type of the value produced by the parser.
```python
parser[I, R]: Context[I] -> Result[I, R]
```

The parsing function is wrapped in the `Parser[I, R]` class, endowing it with a monadic interface for functional composition and declarative parsing.
```python
class Parser[I, R]:
  def bind[S](self, fn: R -> Parser[I, S]) -> Parser[I, S]: ...
  def okay(self, value: R) -> Parser[I, R]: ...
  def fail(self, error: ParseErr) -> Parser[I, R]: ...
```

A `Context[I]` consists of two primary components: a `stream` of type `IStream[I]`, which provides access to the underlying input sequence, and a `state` of type `IState[I]`, which manages auxiliary parsing state. If you need to parse data types other than text, you can extend `IStream` and `IState` to implement custom stream and state management logic, enabling the parsing of arbitrary data sequences.

```python
class Context[I]:
  stream: IStream[I]
  state: IState[I]
```

The `Result[I, R]` type represents the outcome of a parsing operation, containing the updated context, the parsing result (either a successfully parsed value or an error), and the number of input elements consumed during parsing.

```python
class Result[I, R]:
  context: Context[I]
  outcome: Okay[R] | Fail
  consumed: int
```
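
As a rough illustration of the `Context -> Result` shape, the following self-contained sketch models a parser over a sequence of integers instead of text. All names here (`ToyContext`, `ToyResult`, `ToyOkay`, `ToyFail`, `item`) are hypothetical stand-ins, and a plain offset replaces the stream and state interfaces; the library's real classes may differ.

```python
from dataclasses import dataclass
from typing import Generic, Sequence, TypeVar, Union

I = TypeVar("I")
R = TypeVar("R")


@dataclass(frozen=True)
class ToyContext(Generic[I]):
    stream: Sequence[I]  # the underlying input sequence
    offset: int = 0      # stand-in for auxiliary parsing state


@dataclass(frozen=True)
class ToyOkay(Generic[R]):
    value: R


@dataclass(frozen=True)
class ToyFail:
    message: str


@dataclass(frozen=True)
class ToyResult(Generic[I, R]):
    context: ToyContext[I]
    outcome: Union[ToyOkay[R], ToyFail]
    consumed: int


def item(ctx):
    """Consume one element of the stream, or fail at end of input."""
    if ctx.offset >= len(ctx.stream):
        return ToyResult(ctx, ToyFail("unexpected end of input"), 0)
    value = ctx.stream[ctx.offset]
    return ToyResult(ToyContext(ctx.stream, ctx.offset + 1), ToyOkay(value), 1)


ok = item(ToyContext([7, 8]))
eof = item(ToyContext([]))
```

On success the returned context has advanced by one element and `consumed` is `1`; on failure the context is returned unchanged with `consumed` equal to `0`.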

The `combinator` module provides a curried functional interface for composing parsers, and also supports method chaining. Both styles are equivalent in expressive power.

`Parsec` does not support left-recursive grammars.
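
The usual workaround is to rewrite a left-recursive rule such as `expr := expr '-' num | num` into the iterative form `expr := num ('-' num)*`. The plain-Python sketch below, which uses hypothetical function names and a pre-tokenized input rather than the library, shows why the first form cannot work in a top-down parser and how the second form avoids the problem.

```python
def expr_left_recursive(tokens, pos=0):
    # expr := expr '-' num | num
    # The first step is to parse `expr` again at the same position,
    # so no input is ever consumed and the recursion never ends.
    return expr_left_recursive(tokens, pos)


def expr_iterative(tokens):
    # expr := num ('-' num)*
    # Parse one number, then loop over the ('-' num) repetitions,
    # accumulating from the left.
    total = tokens[0]
    pos = 1
    while pos + 1 < len(tokens) and tokens[pos] == "-":
        total -= tokens[pos + 1]
        pos += 2
    return total


result = expr_iterative([10, "-", 3, "-", 2])
```

Calling `expr_left_recursive` raises `RecursionError` in CPython, whereas `expr_iterative` yields the left-associative result.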
