Small parsec-like parser combinators library with semi-automatic error recovery.
pip install reparsecWith reparsec, simple arithmetic expression parser and evaluator could be written like this:
from typing import Callable
from reparsec import Delay
from reparsec.scannerless import literal, parse, regexp
from reparsec.sequence import eof
def op_action(op: str) -> Callable[[int, int], int]:
    return {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }[op]
spaces = regexp(r"\s*")
number = regexp(r"\d+").fmap(int) << spaces
mul_op = regexp(r"[*]").fmap(op_action) << spaces
add_op = regexp(r"[-+]").fmap(op_action) << spaces
l_paren = literal("(") << spaces
r_paren = literal(")") << spaces
expr = Delay[str, int]()
expr.define(
    (
        number |
        expr.between(l_paren, r_paren)
    )
    .chainl1(mul_op)
    .chainl1(add_op)
)
parser = expr << eof()This parser can:
- evaluate an expression:
>>> parser.parse("1 + 2 * (3 + 4)").unwrap()
15- report first syntax error:
>>> parser.parse("1 + 2 * * (3 + 4 5)").unwrap()
Traceback (most recent call last):
...
reparsec.types.ParseError: at 8: expected '('- attempt to recover and report multiple syntax errors:
>>> parser.parse("1 + 2 * * (3 + 4 5)", recover=True).unwrap()
Traceback (most recent call last):
...
reparsec.types.ParseError: at 8: expected '(' (skipped 2 tokens), at 17: expected ')' (skipped 1 token)- automatically repair input and return some result:
>>> parser.parse("1 + 2 * * (3 + 4 5)", recover=True).unwrap(recover=True)
15- track line and column numbers:
>>> parse(parser, """1 +
... 2 * * (
... 3 + 4 5)""", recover=True).unwrap()
Traceback (most recent call last):
...
reparsec.types.ParseError: at 2:5: expected '(' (skipped 2 tokens), at 3:7: expected ')' (skipped 1 token)More examples: