
Merge pull request #959 from lark-parser/v1.0-merge-master

Erez Shinan committed 4 years ago (via GitHub)
commit b79c449dc3
20 changed files with 510 additions and 236 deletions
1. README.md (+1 -1)
2. docs/classes.rst (+2 -0)
3. docs/index.rst (+1 -1)
4. docs/json_tutorial.md (+4 -4)
5. docs/visitors.rst (+5 -0)
6. examples/advanced/python3.lark (+88 -53)
7. examples/standalone/json_parser_main.py (+3 -1)
8. lark/ast_utils.py (+2 -2)
9. lark/common.py (+12 -0)
10. lark/exceptions.py (+32 -13)
11. lark/lark.py (+15 -9)
12. lark/lexer.py (+74 -61)
13. lark/load_grammar.py (+123 -12)
14. lark/parse_tree_builder.py (+33 -28)
15. lark/parser_frontends.py (+9 -10)
16. lark/parsers/lalr_interactive_parser.py (+1 -1)
17. lark/parsers/lalr_parser.py (+2 -2)
18. lark/utils.py (+35 -17)
19. tests/test_grammar.py (+48 -1)
20. tests/test_parser.py (+20 -20)

README.md (+1 -1)

@@ -26,7 +26,7 @@ Most importantly, Lark will save you time and prevent you from getting parsing h


- [Documentation @readthedocs](https://lark-parser.readthedocs.io/)
- [Cheatsheet (PDF)](/docs/_static/lark_cheatsheet.pdf)
- [Online IDE (very basic)](https://lark-parser.github.io/lark/ide/app.html)
- [Online IDE](https://lark-parser.github.io/ide)
- [Tutorial](/docs/json_tutorial.md) for writing a JSON parser.
- Blog post: [How to write a DSL with Lark](http://blog.erezsh.com/how-to-write-a-dsl-in-python-with-lark/)
- [Gitter chat](https://gitter.im/lark-parser/Lobby)


docs/classes.rst (+2 -0)

@@ -66,6 +66,8 @@ UnexpectedInput


.. autoclass:: lark.exceptions.UnexpectedCharacters


.. autoclass:: lark.exceptions.UnexpectedEOF

InteractiveParser
-----------------




docs/index.rst (+1 -1)

@@ -113,7 +113,7 @@ Resources


.. _Examples: https://github.com/lark-parser/lark/tree/master/examples
.. _Third-party examples: https://github.com/ligurio/lark-grammars
.. _Online IDE: https://lark-parser.github.io/lark/ide/app.html
.. _Online IDE: https://lark-parser.github.io/ide
.. _How to write a DSL: http://blog.erezsh.com/how-to-write-a-dsl-in-python-with-lark/
.. _Program Synthesis is Possible: https://www.cs.cornell.edu/~asampson/blog/minisynth.html
.. _Cheatsheet (PDF): _static/lark_cheatsheet.pdf


docs/json_tutorial.md (+4 -4)

@@ -427,9 +427,9 @@ I measured memory consumption using a little script called [memusg](https://gist
| Lark - Earley *(with lexer)* | 42s | 4s | 1167M | 608M |
| Lark - LALR(1) | 8s | 1.53s | 453M | 266M |
| Lark - LALR(1) tree-less | 4.76s | 1.23s | 70M | 134M |
| PyParsing ([Parser](http://pyparsing.wikispaces.com/file/view/jsonParser.py)) | 32s | 3.53s | 443M | 225M |
| funcparserlib ([Parser](https://github.com/vlasovskikh/funcparserlib/blob/master/funcparserlib/tests/json.py)) | 8.5s | 1.3s | 483M | 293M |
| Parsimonious ([Parser](https://gist.githubusercontent.com/reclosedev/5222560/raw/5e97cf7eb62c3a3671885ec170577285e891f7d5/parsimonious_json.py)) | ? | 5.7s | ? | 1545M |
| PyParsing ([Parser](https://github.com/pyparsing/pyparsing/blob/master/examples/jsonParser.py)) | 32s | 3.53s | 443M | 225M |
| funcparserlib ([Parser](https://github.com/vlasovskikh/funcparserlib/blob/master/tests/json.py)) | 8.5s | 1.3s | 483M | 293M |
| Parsimonious ([Parser](https://gist.github.com/reclosedev/5222560)) | ? | 5.7s | ? | 1545M |




I added a few other parsers for comparison. PyParsing and funcparselib fair pretty well in their memory usage (they don't build a tree), but they can't compete with the run-time speed of LALR(1).
@@ -442,7 +442,7 @@ Once again, shout-out to PyPy for being so effective.


This is the end of the tutorial. I hoped you liked it and learned a little about Lark.


To see what else you can do with Lark, check out the [examples](examples).
To see what else you can do with Lark, check out the [examples](/examples).


For questions or any other subject, feel free to email me at erezshin at gmail dot com.



docs/visitors.rst (+5 -0)

@@ -107,3 +107,8 @@ Discard
-------


.. autoclass:: lark.visitors.Discard

VisitError
-------

.. autoclass:: lark.exceptions.VisitError

examples/advanced/python3.lark (+88 -53)

@@ -21,7 +21,7 @@ decorators: decorator+
decorated: decorators (classdef | funcdef | async_funcdef) decorated: decorators (classdef | funcdef | async_funcdef)


async_funcdef: "async" funcdef async_funcdef: "async" funcdef
funcdef: "def" NAME "(" parameters? ")" ["->" test] ":" suite
funcdef: "def" NAME "(" [parameters] ")" ["->" test] ":" suite


parameters: paramvalue ("," paramvalue)* ["," SLASH] ["," [starparams | kwparams]]
| starparams
@@ -29,25 +29,36 @@ parameters: paramvalue ("," paramvalue)* ["," SLASH] ["," [starparams | kwparams


SLASH: "/" // Otherwise the it will completely disappear and it will be undisguisable in the result SLASH: "/" // Otherwise the it will completely disappear and it will be undisguisable in the result
starparams: "*" typedparam? ("," paramvalue)* ["," kwparams] starparams: "*" typedparam? ("," paramvalue)* ["," kwparams]
kwparams: "**" typedparam
kwparams: "**" typedparam ","?


?paramvalue: typedparam ["=" test]
?typedparam: NAME [":" test]
?paramvalue: typedparam ("=" test)?
?typedparam: NAME (":" test)?


varargslist: (vfpdef ["=" test] ("," vfpdef ["=" test])* ["," [ "*" [vfpdef] ("," vfpdef ["=" test])* ["," ["**" vfpdef [","]]] | "**" vfpdef [","]]]
| "*" [vfpdef] ("," vfpdef ["=" test])* ["," ["**" vfpdef [","]]]
| "**" vfpdef [","])


vfpdef: NAME
lambdef: "lambda" [lambda_params] ":" test
lambdef_nocond: "lambda" [lambda_params] ":" test_nocond
lambda_params: lambda_paramvalue ("," lambda_paramvalue)* ["," [lambda_starparams | lambda_kwparams]]
| lambda_starparams
| lambda_kwparams
?lambda_paramvalue: NAME ("=" test)?
lambda_starparams: "*" [NAME] ("," lambda_paramvalue)* ["," [lambda_kwparams]]
lambda_kwparams: "**" NAME ","?



?stmt: simple_stmt | compound_stmt ?stmt: simple_stmt | compound_stmt
?simple_stmt: small_stmt (";" small_stmt)* [";"] _NEWLINE ?simple_stmt: small_stmt (";" small_stmt)* [";"] _NEWLINE
?small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt | import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
?expr_stmt: testlist_star_expr (annassign | augassign (yield_expr|testlist)
| ("=" (yield_expr|testlist_star_expr))*)
annassign: ":" test ["=" test]
?testlist_star_expr: (test|star_expr) ("," (test|star_expr))* [","]
!augassign: ("+=" | "-=" | "*=" | "@=" | "/=" | "%=" | "&=" | "|=" | "^=" | "<<=" | ">>=" | "**=" | "//=")
?small_stmt: (expr_stmt | assign_stmt | del_stmt | pass_stmt | flow_stmt | import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
expr_stmt: testlist_star_expr
assign_stmt: annassign | augassign | assign

annassign: testlist_star_expr ":" test ["=" test]
assign: testlist_star_expr ("=" (yield_expr|testlist_star_expr))+
augassign: testlist_star_expr augassign_op (yield_expr|testlist)
!augassign_op: "+=" | "-=" | "*=" | "@=" | "/=" | "%=" | "&=" | "|=" | "^=" | "<<=" | ">>=" | "**=" | "//="
?testlist_star_expr: test_or_star_expr
| test_or_star_expr ("," test_or_star_expr)+ ","? -> tuple
| test_or_star_expr "," -> tuple

// For normal and annotated assignments, additional restrictions enforced by the interpreter
del_stmt: "del" exprlist
pass_stmt: "pass"
@@ -71,43 +82,52 @@ global_stmt: "global" NAME ("," NAME)*
nonlocal_stmt: "nonlocal" NAME ("," NAME)* nonlocal_stmt: "nonlocal" NAME ("," NAME)*
assert_stmt: "assert" test ["," test] assert_stmt: "assert" test ["," test]


compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
?compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated | async_stmt
async_stmt: "async" (funcdef | with_stmt | for_stmt) async_stmt: "async" (funcdef | with_stmt | for_stmt)
if_stmt: "if" test ":" suite ("elif" test ":" suite)* ["else" ":" suite]
if_stmt: "if" test ":" suite elifs ["else" ":" suite]
elifs: elif_*
elif_: "elif" test ":" suite
while_stmt: "while" test ":" suite ["else" ":" suite] while_stmt: "while" test ":" suite ["else" ":" suite]
for_stmt: "for" exprlist "in" testlist ":" suite ["else" ":" suite] for_stmt: "for" exprlist "in" testlist ":" suite ["else" ":" suite]
try_stmt: ("try" ":" suite ((except_clause ":" suite)+ ["else" ":" suite] ["finally" ":" suite] | "finally" ":" suite))
with_stmt: "with" with_item ("," with_item)* ":" suite
try_stmt: "try" ":" suite except_clauses ["else" ":" suite] [finally]
| "try" ":" suite finally -> try_finally
finally: "finally" ":" suite
except_clauses: except_clause+
except_clause: "except" [test ["as" NAME]] ":" suite

with_stmt: "with" with_items ":" suite
with_items: with_item ("," with_item)*
with_item: test ["as" expr] with_item: test ["as" expr]
// NB compile.c makes sure that the default except clause is last // NB compile.c makes sure that the default except clause is last
except_clause: "except" [test ["as" NAME]]
suite: simple_stmt | _NEWLINE _INDENT stmt+ _DEDENT suite: simple_stmt | _NEWLINE _INDENT stmt+ _DEDENT


?test: or_test ("if" or_test "else" test)? | lambdef
?test: or_test ("if" or_test "else" test)?
| lambdef
?test_nocond: or_test | lambdef_nocond ?test_nocond: or_test | lambdef_nocond
lambdef: "lambda" [varargslist] ":" test
lambdef_nocond: "lambda" [varargslist] ":" test_nocond

?or_test: and_test ("or" and_test)* ?or_test: and_test ("or" and_test)*
?and_test: not_test ("and" not_test)* ?and_test: not_test ("and" not_test)*
?not_test: "not" not_test -> not
?not_test: "not" not_test -> not_test
| comparison | comparison
?comparison: expr (_comp_op expr)*
?comparison: expr (comp_op expr)*
star_expr: "*" expr star_expr: "*" expr
?expr: xor_expr ("|" xor_expr)*

?expr: or_expr
?or_expr: xor_expr ("|" xor_expr)*
?xor_expr: and_expr ("^" and_expr)*
?and_expr: shift_expr ("&" shift_expr)*
?shift_expr: arith_expr (_shift_op arith_expr)*
?arith_expr: term (_add_op term)*
?term: factor (_mul_op factor)*
?factor: _factor_op factor | power
?factor: _unary_op factor | power


!_factor_op: "+"|"-"|"~"
!_unary_op: "+"|"-"|"~"
!_add_op: "+"|"-"
!_shift_op: "<<"|">>"
!_mul_op: "*"|"@"|"/"|"%"|"//"
// <> isn't actually a valid comparison operator in Python. It's here for the
// sake of a __future__ import described in PEP 401 (which really works :-)
!_comp_op: "<"|">"|"=="|">="|"<="|"<>"|"!="|"in"|"not" "in"|"is"|"is" "not"
!comp_op: "<"|">"|"=="|">="|"<="|"<>"|"!="|"in"|"not" "in"|"is"|"is" "not"


?power: await_expr ("**" factor)? ?power: await_expr ("**" factor)?
?await_expr: AWAIT? atom_expr ?await_expr: AWAIT? atom_expr
@@ -118,61 +138,75 @@ AWAIT: "await"
| atom_expr "." NAME -> getattr | atom_expr "." NAME -> getattr
| atom | atom


?atom: "(" [yield_expr|tuplelist_comp] ")" -> tuple
| "[" [testlist_comp] "]" -> list
| "{" [dict_comp] "}" -> dict
| "{" set_comp "}" -> set
?atom: "(" yield_expr ")"
| "(" _tuple_inner? ")" -> tuple
| "(" comprehension{test_or_star_expr} ")" -> tuple_comprehension
| "[" _testlist_comp? "]" -> list
| "[" comprehension{test_or_star_expr} "]" -> list_comprehension
| "{" _dict_exprlist? "}" -> dict
| "{" comprehension{key_value} "}" -> dict_comprehension
| "{" _set_exprlist "}" -> set
| "{" comprehension{test} "}" -> set_comprehension
| NAME -> var | NAME -> var
| number | string+
| number
| string_concat
| "(" test ")" | "(" test ")"
| "..." -> ellipsis | "..." -> ellipsis
| "None" -> const_none | "None" -> const_none
| "True" -> const_true | "True" -> const_true
| "False" -> const_false | "False" -> const_false


?testlist_comp: test | tuplelist_comp
tuplelist_comp: (test|star_expr) (comp_for | ("," (test|star_expr))+ [","] | ",")

?string_concat: string+

_testlist_comp: test | _tuple_inner
_tuple_inner: test_or_star_expr (("," test_or_star_expr)+ [","] | ",")

?test_or_star_expr: test
| star_expr

?subscriptlist: subscript ?subscriptlist: subscript
| subscript (("," subscript)+ [","] | ",") -> subscript_tuple | subscript (("," subscript)+ [","] | ",") -> subscript_tuple
subscript: test | ([test] ":" [test] [sliceop]) -> slice
?subscript: test | ([test] ":" [test] [sliceop]) -> slice
sliceop: ":" [test] sliceop: ":" [test]
exprlist: (expr|star_expr)
| (expr|star_expr) (("," (expr|star_expr))+ [","]|",") -> exprlist_tuple
testlist: test | testlist_tuple
?exprlist: (expr|star_expr)
| (expr|star_expr) (("," (expr|star_expr))+ [","]|",")
?testlist: test | testlist_tuple
testlist_tuple: test (("," test)+ [","] | ",") testlist_tuple: test (("," test)+ [","] | ",")
dict_comp: key_value comp_for
| (key_value | "**" expr) ("," (key_value | "**" expr))* [","]
_dict_exprlist: (key_value | "**" expr) ("," (key_value | "**" expr))* [","]


key_value: test ":" test key_value: test ":" test


set_comp: test comp_for
| (test|star_expr) ("," (test | star_expr))* [","]
_set_exprlist: test_or_star_expr ("," test_or_star_expr)* [","]


classdef: "class" NAME ["(" [arguments] ")"] ":" suite classdef: "class" NAME ["(" [arguments] ")"] ":" suite




arguments: argvalue ("," argvalue)* ("," [ starargs | kwargs])?
| starargs
| kwargs
| test comp_for
| comprehension{test}


starargs: "*" test ("," "*" test)* ("," argvalue)* ["," kwargs]
starargs: stararg ("," stararg)* ("," argvalue)* ["," kwargs]
stararg: "*" test
kwargs: "**" test kwargs: "**" test


?argvalue: test ("=" test)? ?argvalue: test ("=" test)?




comp_iter: comp_for | comp_if | async_for
async_for: "async" "for" exprlist "in" or_test [comp_iter]
comp_for: "for" exprlist "in" or_test [comp_iter]
comp_if: "if" test_nocond [comp_iter]
comprehension{comp_result}: comp_result comp_fors [comp_if]
comp_fors: comp_for+
comp_for: [ASYNC] "for" exprlist "in" or_test
ASYNC: "async"
?comp_if: "if" test_nocond


// not used in grammar, but may appear in "node" passed from Parser to Compiler
encoding_decl: NAME


yield_expr: "yield" [yield_arg]
yield_arg: "from" test | testlist

yield_expr: "yield" [testlist]
| "yield" "from" test -> yield_from


number: DEC_NUMBER | HEX_NUMBER | BIN_NUMBER | OCT_NUMBER | FLOAT_NUMBER | IMAG_NUMBER
string: STRING | LONG_STRING
@@ -181,6 +215,7 @@ string: STRING | LONG_STRING
%import python (NAME, COMMENT, STRING, LONG_STRING)
%import python (DEC_NUMBER, HEX_NUMBER, OCT_NUMBER, BIN_NUMBER, FLOAT_NUMBER, IMAG_NUMBER)



// Other terminals // Other terminals


_NEWLINE: ( /\r?\n[\t ]*/ | COMMENT )+ _NEWLINE: ( /\r?\n[\t ]*/ | COMMENT )+


examples/standalone/json_parser_main.py (+3 -1)

@@ -10,7 +10,9 @@ Standalone Parser


import sys


from json_parser import Lark_StandAlone, Transformer, inline_args
from json_parser import Lark_StandAlone, Transformer, v_args

inline_args = v_args(inline=True)


class TreeToJson(Transformer):
@inline_args


lark/ast_utils.py (+2 -2)

@@ -38,8 +38,8 @@ def create_transformer(ast_module: types.ModuleType, transformer: Optional[Trans
Classes starting with an underscore (`_`) will be skipped.


Parameters:
ast_module - A Python module containing all the subclasses of `ast_utils.Ast`
transformer (Optional[Transformer]) - An initial transformer. Its attributes may be overwritten.
ast_module: A Python module containing all the subclasses of ``ast_utils.Ast``
transformer (Optional[Transformer]): An initial transformer. Its attributes may be overwritten.
""" """
t = transformer or Transformer() t = transformer or Transformer()




lark/common.py (+12 -0)

@@ -1,4 +1,5 @@
from types import ModuleType
from copy import deepcopy


from .utils import Serialize
from .lexer import TerminalDef, Token
@@ -40,6 +41,17 @@ class LexerConf(Serialize):
def _deserialize(self):
self.terminals_by_name = {t.name: t for t in self.terminals}


def __deepcopy__(self, memo=None):
return type(self)(
deepcopy(self.terminals, memo),
self.re_module,
deepcopy(self.ignore, memo),
deepcopy(self.postlex, memo),
deepcopy(self.callbacks, memo),
deepcopy(self.g_regex_flags, memo),
deepcopy(self.skip_validation, memo),
deepcopy(self.use_bytes, memo),
)




class ParserConf(Serialize):


lark/exceptions.py (+32 -13)

@@ -41,8 +41,9 @@ class UnexpectedInput(LarkError):


Used as a base class for the following exceptions: Used as a base class for the following exceptions:


- ``UnexpectedToken``: The parser received an unexpected token
- ``UnexpectedCharacters``: The lexer encountered an unexpected string - ``UnexpectedCharacters``: The lexer encountered an unexpected string
- ``UnexpectedToken``: The parser received an unexpected token
- ``UnexpectedEOF``: The parser expected a token, but the input ended


After catching one of these exceptions, you may call the following helper methods to create a nicer error message.
"""
@@ -136,10 +137,13 @@ class UnexpectedInput(LarkError):




class UnexpectedEOF(ParseError, UnexpectedInput): class UnexpectedEOF(ParseError, UnexpectedInput):

"""An exception that is raised by the parser, when the input ends while it still expects a token.
"""
expected: 'List[Token]' expected: 'List[Token]'


def __init__(self, expected, state=None, terminals_by_name=None): def __init__(self, expected, state=None, terminals_by_name=None):
super(UnexpectedEOF, self).__init__()

self.expected = expected
self.state = state
from .lexer import Token
@@ -149,7 +153,6 @@ class UnexpectedEOF(ParseError, UnexpectedInput):
self.column = -1
self._terminals_by_name = terminals_by_name


super(UnexpectedEOF, self).__init__()


def __str__(self): def __str__(self):
message = "Unexpected end-of-input. " message = "Unexpected end-of-input. "
@@ -158,12 +161,17 @@ class UnexpectedEOF(ParseError, UnexpectedInput):




class UnexpectedCharacters(LexError, UnexpectedInput): class UnexpectedCharacters(LexError, UnexpectedInput):
"""An exception that is raised by the lexer, when it cannot match the next
string of characters to any of its terminals.
"""


allowed: Set[str] allowed: Set[str]
considered_tokens: Set[Any] considered_tokens: Set[Any]


def __init__(self, seq, lex_pos, line, column, allowed=None, considered_tokens=None, state=None, token_history=None,
terminals_by_name=None, considered_rules=None):
super(UnexpectedCharacters, self).__init__()

# TODO considered_tokens and allowed can be figured out using state
self.line = line
self.column = column
@@ -182,7 +190,6 @@ class UnexpectedCharacters(LexError, UnexpectedInput):
self.char = seq[lex_pos] self.char = seq[lex_pos]
self._context = self.get_context(seq) self._context = self.get_context(seq)


super(UnexpectedCharacters, self).__init__()


def __str__(self):
message = "No terminal matches '%s' in the current parser context, at line %d col %d" % (self.char, self.line, self.column)
@@ -198,10 +205,15 @@ class UnexpectedToken(ParseError, UnexpectedInput):
"""An exception that is raised by the parser, when the token it received """An exception that is raised by the parser, when the token it received
doesn't match any valid step forward. doesn't match any valid step forward.


The parser provides an interactive instance through `interactive_parser`,
which is initialized to the point of failture, and can be used for debugging and error handling.
Parameters:
token: The mismatched token
expected: The set of expected tokens
considered_rules: Which rules were considered, to deduce the expected tokens
state: A value representing the parser state. Do not rely on its value or type.
interactive_parser: An instance of ``InteractiveParser``, that is initialized to the point of failure,
and can be used for debugging and error handling.


see: ``InteractiveParser``.
Note: These parameters are available as attributes of the instance.
""" """


expected: Set[str] expected: Set[str]
@@ -209,6 +221,8 @@ class UnexpectedToken(ParseError, UnexpectedInput):
interactive_parser: 'InteractiveParser' interactive_parser: 'InteractiveParser'


def __init__(self, token, expected, considered_rules=None, state=None, interactive_parser=None, terminals_by_name=None, token_history=None): def __init__(self, token, expected, considered_rules=None, state=None, interactive_parser=None, terminals_by_name=None, token_history=None):
super(UnexpectedToken, self).__init__()
# TODO considered_rules and expected can be figured out using state
self.line = getattr(token, 'line', '?')
self.column = getattr(token, 'column', '?')
@@ -223,7 +237,6 @@ class UnexpectedToken(ParseError, UnexpectedInput):
self._terminals_by_name = terminals_by_name
self.token_history = token_history


super(UnexpectedToken, self).__init__()


@property @property
def accepts(self) -> Set[str]: def accepts(self) -> Set[str]:
@@ -245,18 +258,24 @@ class VisitError(LarkError):
"""VisitError is raised when visitors are interrupted by an exception """VisitError is raised when visitors are interrupted by an exception


It provides the following attributes for inspection: It provides the following attributes for inspection:
- obj: the tree node or token it was processing when the exception was raised
- orig_exc: the exception that cause it to fail

Parameters:
rule: the name of the visit rule that failed
obj: the tree-node or token that was being processed
orig_exc: the exception that cause it to fail

Note: These parameters are available as attributes
""" """


obj: 'Union[Tree, Token]'
orig_exc: Exception


def __init__(self, rule, obj, orig_exc): def __init__(self, rule, obj, orig_exc):
self.obj = obj
self.orig_exc = orig_exc

message = 'Error trying to process rule "%s":\n\n%s' % (rule, orig_exc)
super(VisitError, self).__init__(message)


self.rule = rule
self.obj = obj
self.orig_exc = orig_exc

###} ###}
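How the reworked hierarchy looks from the caller's side, as a minimal sketch (the grammar and inputs are illustrative assumptions, not taken from this diff):

    from lark import Lark
    from lark.exceptions import UnexpectedCharacters, UnexpectedEOF, UnexpectedToken

    parser = Lark('start: "a" "b"', parser='lalr')

    for text in ("ab", "ax", "a"):
        try:
            parser.parse(text)
        except UnexpectedCharacters as e:   # lexer error: no terminal matches
            print(text, "-> UnexpectedCharacters at", e.line, e.column)
        except UnexpectedEOF as e:          # input ended while a token was still expected
            print(text, "-> UnexpectedEOF, expected", e.expected)
        except UnexpectedToken as e:        # parser error: the token fits no valid step
            print(text, "-> UnexpectedToken", e.token.type, "expected", e.expected)

All three inherit from ``UnexpectedInput``, so a single ``except UnexpectedInput`` also works when the distinction doesn't matter.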

lark/lark.py (+15 -9)

@@ -79,7 +79,7 @@ class LarkOptions(Serialize):
Applies the transformer to every parse tree (equivalent to applying it after the parse, but faster)
propagate_positions
Propagates (line, column, end_line, end_column) attributes into all tree branches.
Accepts ``False``, ``True``, or "ignore_ws", which will trim the whitespace around your trees.
Accepts ``False``, ``True``, or a callable, which will filter which nodes to ignore when propagating.
maybe_placeholders maybe_placeholders
When ``True``, the ``[]`` operator returns ``None`` when not matched. When ``True``, the ``[]`` operator returns ``None`` when not matched.
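A minimal sketch of the new callable form (the grammar and filter below are illustrative assumptions): the callable receives each child node and returns whether it may contribute to its parent's propagated positions.

    from lark import Lark, Token

    def ignore_comments(child):
        # Only children for which this returns True are considered when
        # computing the parent's line/column/end_* meta attributes.
        return not (isinstance(child, Token) and child.type == 'COMMENT')

    parser = Lark(r'''
        start: COMMENT? NAME
        COMMENT: /#[^\n]*/
        %import common.CNAME -> NAME
        %import common.WS
        %ignore WS
    ''', parser='lalr', propagate_positions=ignore_comments)

    tree = parser.parse("# a comment\nvalue")
    print(tree.meta.line)   # 2 -- the leading COMMENT token was filtered out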


@@ -137,7 +137,7 @@ class LarkOptions(Serialize):
A List of either paths or loader functions to specify from where grammars are imported
source_path
Override the source of from where the grammar was loaded. Useful for relative imports and unconventional grammar loading
**=== End Options ===**
**=== End of Options ===**
""" """
if __doc__: if __doc__:
__doc__ += OPTIONS_DOC __doc__ += OPTIONS_DOC
@@ -195,7 +195,7 @@ class LarkOptions(Serialize):
assert_config(self.parser, ('earley', 'lalr', 'cyk', None)) assert_config(self.parser, ('earley', 'lalr', 'cyk', None))


if self.parser == 'earley' and self.transformer: if self.parser == 'earley' and self.transformer:
raise ConfigurationError('Cannot specify an embedded transformer when using the Earley algorithm.'
raise ConfigurationError('Cannot specify an embedded transformer when using the Earley algorithm. '
'Please use your transformer on the resulting parse tree, or use a different algorithm (i.e. LALR)') 'Please use your transformer on the resulting parse tree, or use a different algorithm (i.e. LALR)')


if o: if o:
@@ -484,11 +484,11 @@ class Lark(Serialize):
d = f
else:
d = pickle.load(f)
memo = d['memo']
memo_json = d['memo']
data = d['data'] data = d['data']


assert memo
memo = SerializeMemoizer.deserialize(memo, {'Rule': Rule, 'TerminalDef': TerminalDef}, {})
assert memo_json
memo = SerializeMemoizer.deserialize(memo_json, {'Rule': Rule, 'TerminalDef': TerminalDef}, {})
options = dict(data['options'])
if (set(kwargs) - _LOAD_ALLOWED_OPTIONS) & set(LarkOptions._defaults):
raise ConfigurationError("Some options are not allowed when loading a Parser: {}"
@@ -545,11 +545,11 @@ class Lark(Serialize):


Lark.open_from_package(__name__, "example.lark", ("grammars",), parser=...) Lark.open_from_package(__name__, "example.lark", ("grammars",), parser=...)
""" """
package = FromPackageLoader(package, search_paths)
full_path, text = package(None, grammar_path)
package_loader = FromPackageLoader(package, search_paths)
full_path, text = package_loader(None, grammar_path)
options.setdefault('source_path', full_path) options.setdefault('source_path', full_path)
options.setdefault('import_paths', []) options.setdefault('import_paths', [])
options['import_paths'].append(package)
options['import_paths'].append(package_loader)
return cls(text, **options) return cls(text, **options)


def __repr__(self): def __repr__(self):
@@ -560,6 +560,8 @@ class Lark(Serialize):
"""Only lex (and postlex) the text, without parsing it. Only relevant when lexer='standard' """Only lex (and postlex) the text, without parsing it. Only relevant when lexer='standard'


When dont_ignore=True, the lexer will return all tokens, even those marked for %ignore. When dont_ignore=True, the lexer will return all tokens, even those marked for %ignore.

:raises UnexpectedCharacters: In case the lexer cannot find a suitable match.
""" """
if not hasattr(self, 'lexer') or dont_ignore: if not hasattr(self, 'lexer') or dont_ignore:
lexer = self._build_lexer(dont_ignore) lexer = self._build_lexer(dont_ignore)
@@ -602,6 +604,10 @@ class Lark(Serialize):
If a transformer is supplied to ``__init__``, returns whatever is the
result of the transformation. Otherwise, returns a Tree instance.


:raises UnexpectedInput: On a parse error, one of these sub-exceptions will rise:
``UnexpectedCharacters``, ``UnexpectedToken``, or ``UnexpectedEOF``.
For convenience, these sub-exceptions also inherit from ``ParserError`` and ``LexerError``.

""" """
return self.parser.parse(text, start=start, on_error=on_error) return self.parser.parse(text, start=start, on_error=on_error)




lark/lexer.py (+74 -61)

@@ -158,20 +158,20 @@ class Token(str):


def __new__(cls, type_, value, start_pos=None, line=None, column=None, end_line=None, end_column=None, end_pos=None):
try:
self = super(Token, cls).__new__(cls, value)
inst = super(Token, cls).__new__(cls, value)
except UnicodeDecodeError: except UnicodeDecodeError:
value = value.decode('latin1') value = value.decode('latin1')
self = super(Token, cls).__new__(cls, value)
self.type = type_
self.start_pos = start_pos
self.value = value
self.line = line
self.column = column
self.end_line = end_line
self.end_column = end_column
self.end_pos = end_pos
return self
inst = super(Token, cls).__new__(cls, value)
inst.type = type_
inst.start_pos = start_pos
inst.value = value
inst.line = line
inst.column = column
inst.end_line = end_line
inst.end_column = end_column
inst.end_pos = end_pos
return inst


def update(self, type_: Optional[str]=None, value: Optional[Any]=None) -> 'Token': def update(self, type_: Optional[str]=None, value: Optional[Any]=None) -> 'Token':
return Token.new_borrow_pos( return Token.new_borrow_pos(
@@ -234,15 +234,13 @@ class LineCounter:




class UnlessCallback: class UnlessCallback:
def __init__(self, mres):
self.mres = mres
def __init__(self, scanner):
self.scanner = scanner


def __call__(self, t): def __call__(self, t):
for mre, type_from_index in self.mres:
m = mre.match(t.value)
if m:
t.type = type_from_index[m.lastindex]
break
res = self.scanner.match(t.value, 0)
if res:
_value, t.type = res
return t return t




@@ -257,6 +255,11 @@ class CallChain:
return self.callback2(t) if self.cond(t2) else t2 return self.callback2(t) if self.cond(t2) else t2




def _get_match(re_, regexp, s, flags):
m = re_.match(regexp, s, flags)
if m:
return m.group(0)

def _create_unless(terminals, g_regex_flags, re_, use_bytes):
tokens_by_type = classify(terminals, lambda t: type(t.pattern))
assert len(tokens_by_type) <= 2, tokens_by_type.keys()
@@ -268,40 +271,54 @@ def _create_unless(terminals, g_regex_flags, re_, use_bytes):
if strtok.priority > retok.priority:
continue
s = strtok.pattern.value
m = re_.match(retok.pattern.to_regexp(), s, g_regex_flags)
if m and m.group(0) == s:
if s == _get_match(re_, retok.pattern.to_regexp(), s, g_regex_flags):
unless.append(strtok)
if strtok.pattern.flags <= retok.pattern.flags:
embedded_strs.add(strtok)
if unless:
callback[retok.name] = UnlessCallback(build_mres(unless, g_regex_flags, re_, match_whole=True, use_bytes=use_bytes))

terminals = [t for t in terminals if t not in embedded_strs]
return terminals, callback


def _build_mres(terminals, max_size, g_regex_flags, match_whole, re_, use_bytes):
# Python sets an unreasonable group limit (currently 100) in its re module
# Worse, the only way to know we reached it is by catching an AssertionError!
# This function recursively tries less and less groups until it's successful.
postfix = '$' if match_whole else ''
mres = []
while terminals:
pattern = u'|'.join(u'(?P<%s>%s)' % (t.name, t.pattern.to_regexp() + postfix) for t in terminals[:max_size])
if use_bytes:
pattern = pattern.encode('latin-1')
try:
mre = re_.compile(pattern, g_regex_flags)
except AssertionError: # Yes, this is what Python provides us.. :/
return _build_mres(terminals, max_size//2, g_regex_flags, match_whole, re_, use_bytes)
callback[retok.name] = UnlessCallback(Scanner(unless, g_regex_flags, re_, match_whole=True, use_bytes=use_bytes))


mres.append((mre, {i: n for n, i in mre.groupindex.items()}))
terminals = terminals[max_size:]
return mres
new_terminals = [t for t in terminals if t not in embedded_strs]
return new_terminals, callback




def build_mres(terminals, g_regex_flags, re_, use_bytes, match_whole=False):
return _build_mres(terminals, len(terminals), g_regex_flags, match_whole, re_, use_bytes)

class Scanner:
def __init__(self, terminals, g_regex_flags, re_, use_bytes, match_whole=False):
self.terminals = terminals
self.g_regex_flags = g_regex_flags
self.re_ = re_
self.use_bytes = use_bytes
self.match_whole = match_whole

self.allowed_types = {t.name for t in self.terminals}

self._mres = self._build_mres(terminals, len(terminals))

def _build_mres(self, terminals, max_size):
# Python sets an unreasonable group limit (currently 100) in its re module
# Worse, the only way to know we reached it is by catching an AssertionError!
# This function recursively tries less and less groups until it's successful.
postfix = '$' if self.match_whole else ''
mres = []
while terminals:
pattern = u'|'.join(u'(?P<%s>%s)' % (t.name, t.pattern.to_regexp() + postfix) for t in terminals[:max_size])
if self.use_bytes:
pattern = pattern.encode('latin-1')
try:
mre = self.re_.compile(pattern, self.g_regex_flags)
except AssertionError: # Yes, this is what Python provides us.. :/
return self._build_mres(terminals, max_size//2)

mres.append((mre, {i: n for n, i in mre.groupindex.items()}))
terminals = terminals[max_size:]
return mres

def match(self, text, pos):
for mre, type_from_index in self._mres:
m = mre.match(text, pos)
if m:
return m.group(0), type_from_index[m.lastindex]
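The idea behind ``Scanner`` in a stripped-down form (plain ``re``, not lark's API): every terminal becomes a named group in one alternation, and the group that matched identifies the terminal. The terminal names and patterns below are made up for illustration.

    import re

    terminals = {'NUMBER': r'\d+', 'NAME': r'[a-zA-Z_]\w*', 'PLUS': r'\+'}
    mre = re.compile('|'.join('(?P<%s>%s)' % kv for kv in terminals.items()))

    m = mre.match('foo+1', 0)
    print(m.group(0), m.lastgroup)   # foo NAME

Scanner additionally splits the terminals into several such regexes when Python's ``re`` group limit is hit, and maps ``m.lastindex`` back to the terminal name.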




def _regexp_has_newline(r): def _regexp_has_newline(r):
@@ -390,9 +407,9 @@ class TraditionalLexer(Lexer):
self.use_bytes = conf.use_bytes self.use_bytes = conf.use_bytes
self.terminals_by_name = conf.terminals_by_name self.terminals_by_name = conf.terminals_by_name


self._mres = None
self._scanner = None


def _build(self) -> None:
def _build_scanner(self):
terminals, self.callback = _create_unless(self.terminals, self.g_regex_flags, self.re, self.use_bytes)
assert all(self.callback.values())


@@ -403,20 +420,16 @@ class TraditionalLexer(Lexer):
else: else:
self.callback[type_] = f self.callback[type_] = f


self._mres = build_mres(terminals, self.g_regex_flags, self.re, self.use_bytes)
self._scanner = Scanner(terminals, self.g_regex_flags, self.re, self.use_bytes)


@property @property
def mres(self) -> List[Tuple[REPattern, Dict[int, str]]]:
if self._mres is None:
self._build()
assert self._mres is not None
return self._mres

def match(self, text: str, pos: int) -> Optional[Tuple[str, str]]:
for mre, type_from_index in self.mres:
m = mre.match(text, pos)
if m:
return m.group(0), type_from_index[m.lastindex]
def scanner(self):
if self._scanner is None:
self._build_scanner()
return self._scanner

def match(self, text, pos):
return self.scanner.match(text, pos)


def lex(self, state: LexerState, parser_state: Any) -> Iterator[Token]:
with suppress(EOFError):
@@ -428,7 +441,7 @@ class TraditionalLexer(Lexer):
while line_ctr.char_pos < len(lex_state.text):
res = self.match(lex_state.text, line_ctr.char_pos)
if not res:
allowed = {v for m, tfi in self.mres for v in tfi.values()} - self.ignore_types
allowed = self.scanner.allowed_types - self.ignore_types
if not allowed:
allowed = {"<END-OF-FILE>"}
raise UnexpectedCharacters(lex_state.text, line_ctr.char_pos, line_ctr.line, line_ctr.column,


lark/load_grammar.py (+123 -12)

@@ -10,7 +10,7 @@ from numbers import Integral
from contextlib import suppress
from typing import List, Tuple, Union, Callable, Dict, Optional


from .utils import bfs, logger, classify_bool, is_id_continue, is_id_start, bfs_all_unique
from .utils import bfs, logger, classify_bool, is_id_continue, is_id_start, bfs_all_unique, small_factors
from .lexer import Token, TerminalDef, PatternStr, PatternRE from .lexer import Token, TerminalDef, PatternStr, PatternRE


from .parse_tree_builder import ParseTreeBuilder from .parse_tree_builder import ParseTreeBuilder
@@ -176,27 +176,136 @@ RULES = {
} }




# Value 5 keeps the number of states in the lalr parser somewhat minimal
# It isn't optimal, but close to it. See PR #949
SMALL_FACTOR_THRESHOLD = 5
# The threshold for whether a repeat via ~ is split up into different rules
# 50 is chosen since it keeps the number of states low and therefore lalr analysis time low,
# while not being too overaggressive and unnecessarily creating rules that might create shift/reduce conflicts.
# (See PR #949)
REPEAT_BREAK_THRESHOLD = 50


@inline_args
class EBNF_to_BNF(Transformer_InPlace):
def __init__(self):
self.new_rules = []
self.rules_by_expr = {}
self.rules_cache = {}
self.prefix = 'anon'
self.i = 0
self.rule_options = None


def _add_recurse_rule(self, type_, expr):
if expr in self.rules_by_expr:
return self.rules_by_expr[expr]

new_name = '__%s_%s_%d' % (self.prefix, type_, self.i)
def _name_rule(self, inner):
new_name = '__%s_%s_%d' % (self.prefix, inner, self.i)
self.i += 1 self.i += 1
t = NonTerminal(new_name)
tree = ST('expansions', [ST('expansion', [expr]), ST('expansion', [t, expr])])
self.new_rules.append((new_name, tree, self.rule_options))
self.rules_by_expr[expr] = t
return new_name

def _add_rule(self, key, name, expansions):
t = NonTerminal(name)
self.new_rules.append((name, expansions, self.rule_options))
self.rules_cache[key] = t
return t return t


def _add_recurse_rule(self, type_, expr):
try:
return self.rules_cache[expr]
except KeyError:
new_name = self._name_rule(type_)
t = NonTerminal(new_name)
tree = ST('expansions', [
ST('expansion', [expr]),
ST('expansion', [t, expr])
])
return self._add_rule(expr, new_name, tree)

def _add_repeat_rule(self, a, b, target, atom):
"""Generate a rule that repeats target ``a`` times, and repeats atom ``b`` times.

When called recursively (into target), it repeats atom for x(n) times, where:
x(0) = 1
x(n) = a(n) * x(n-1) + b

Example rule when a=3, b=4:

new_rule: target target target atom atom atom atom

"""
key = (a, b, target, atom)
try:
return self.rules_cache[key]
except KeyError:
new_name = self._name_rule('repeat_a%d_b%d' % (a, b))
tree = ST('expansions', [ST('expansion', [target] * a + [atom] * b)])
return self._add_rule(key, new_name, tree)

def _add_repeat_opt_rule(self, a, b, target, target_opt, atom):
"""Creates a rule that matches atom 0 to (a*n+b)-1 times.

When target matches n times atom, and target_opt 0 to n-1 times target_opt,

First we generate target * i followed by target_opt, for i from 0 to a-1
These match 0 to n*a - 1 times atom

Then we generate target * a followed by atom * i, for i from 0 to b-1
These match n*a to n*a + b-1 times atom

The created rule will not have any shift/reduce conflicts so that it can be used with lalr

Example rule when a=3, b=4:

new_rule: target_opt
| target target_opt
| target target target_opt

| target target target
| target target target atom
| target target target atom atom
| target target target atom atom atom

"""
key = (a, b, target, atom, "opt")
try:
return self.rules_cache[key]
except KeyError:
new_name = self._name_rule('repeat_a%d_b%d_opt' % (a, b))
tree = ST('expansions', [
ST('expansion', [target]*i + [target_opt]) for i in range(a)
] + [
ST('expansion', [target]*a + [atom]*i) for i in range(b)
])
return self._add_rule(key, new_name, tree)

def _generate_repeats(self, rule, mn, mx):
"""Generates a rule tree that repeats ``rule`` exactly between ``mn`` to ``mx`` times.
"""
# For a small number of repeats, we can take the naive approach
if mx < REPEAT_BREAK_THRESHOLD:
return ST('expansions', [ST('expansion', [rule] * n) for n in range(mn, mx + 1)])

# For large repeat values, we break the repetition into sub-rules.
# We treat ``rule~mn..mx`` as ``rule~mn rule~0..(diff=mx-mn)``.
# We then use small_factors to split mn and diff up into values [(a, b), ...]
# These values are used with the help of _add_repeat_rule and _add_repeat_opt_rule
# to generate a complete rule/expression that matches the corresponding number of repeats
mn_target = rule
for a, b in small_factors(mn, SMALL_FACTOR_THRESHOLD):
mn_target = self._add_repeat_rule(a, b, mn_target, rule)
if mx == mn:
return mn_target

diff = mx - mn + 1 # We add one because _add_repeat_opt_rule generates rules that match one less
diff_factors = small_factors(diff, SMALL_FACTOR_THRESHOLD)
diff_target = rule # Match rule 1 times
diff_opt_target = ST('expansion', []) # match rule 0 times (e.g. up to 1 -1 times)
for a, b in diff_factors[:-1]:
diff_opt_target = self._add_repeat_opt_rule(a, b, diff_target, diff_opt_target, rule)
diff_target = self._add_repeat_rule(a, b, diff_target, rule)

a, b = diff_factors[-1]
diff_opt_target = self._add_repeat_opt_rule(a, b, diff_target, diff_opt_target, rule)

return ST('expansions', [ST('expansion', [mn_target] + [diff_opt_target])])

def expr(self, rule, op, *args):
if op.value == '?':
empty = ST('expansion', [])
@@ -221,7 +330,9 @@ class EBNF_to_BNF(Transformer_InPlace):
mn, mx = map(int, args)
if mx < mn or mn < 0:
raise GrammarError("Bad Range for %s (%d..%d isn't allowed)" % (rule, mn, mx))
return ST('expansions', [ST('expansion', [rule] * n) for n in range(mn, mx+1)])

return self._generate_repeats(rule, mn, mx)

assert False, op assert False, op


def maybe(self, rule):
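The effect of the new expansion is easiest to see from the grammar side; a small sketch in the spirit of the tests added in this PR (the grammar itself is an assumption):

    from lark import Lark
    from lark.exceptions import UnexpectedInput

    # "A"~15..100 is now decomposed into helper rules via small_factors(),
    # instead of one alternative per possible repeat count.
    parser = Lark('start: "A"~15..100', parser='lalr')

    print(parser.parse("A" * 40))    # Tree('start', []) -- anonymous tokens are filtered out
    try:
        parser.parse("A" * 14)       # one "A" short of the allowed range
    except UnexpectedInput as e:
        print("rejected:", type(e).__name__)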


lark/parse_tree_builder.py (+33 -28)

@@ -22,54 +22,59 @@ class ExpandSingleChild:




class PropagatePositions: class PropagatePositions:
def __init__(self, node_builder):
def __init__(self, node_builder, node_filter=None):
self.node_builder = node_builder self.node_builder = node_builder
self.node_filter = node_filter


def __call__(self, children):
res = self.node_builder(children)


# local reference to Tree.meta reduces number of presence checks
if isinstance(res, Tree): if isinstance(res, Tree):
res_meta = res.meta
# Calculate positions while the tree is streaming, according to the rule:
# - nodes start at the start of their first child's container,
# and end at the end of their last child's container.
# Containers are nodes that take up space in text, but have been inlined in the tree.


src_meta = self._pp_get_meta(children)
if src_meta is not None:
res_meta.line = src_meta.line
res_meta.column = src_meta.column
res_meta.start_pos = src_meta.start_pos
res_meta.empty = False
res_meta = res.meta


src_meta = self._pp_get_meta(reversed(children))
if src_meta is not None:
res_meta.end_line = src_meta.end_line
res_meta.end_column = src_meta.end_column
res_meta.end_pos = src_meta.end_pos
res_meta.empty = False
first_meta = self._pp_get_meta(children)
if first_meta is not None:
if not hasattr(res_meta, 'line'):
# meta was already set, probably because the rule has been inlined (e.g. `?rule`)
res_meta.line = getattr(first_meta, 'container_line', first_meta.line)
res_meta.column = getattr(first_meta, 'container_column', first_meta.column)
res_meta.start_pos = getattr(first_meta, 'container_start_pos', first_meta.start_pos)
res_meta.empty = False

res_meta.container_line = getattr(first_meta, 'container_line', first_meta.line)
res_meta.container_column = getattr(first_meta, 'container_column', first_meta.column)

last_meta = self._pp_get_meta(reversed(children))
if last_meta is not None:
if not hasattr(res_meta, 'end_line'):
res_meta.end_line = getattr(last_meta, 'container_end_line', last_meta.end_line)
res_meta.end_column = getattr(last_meta, 'container_end_column', last_meta.end_column)
res_meta.end_pos = getattr(last_meta, 'container_end_pos', last_meta.end_pos)
res_meta.empty = False

res_meta.container_end_line = getattr(last_meta, 'container_end_line', last_meta.end_line)
res_meta.container_end_column = getattr(last_meta, 'container_end_column', last_meta.end_column)


return res return res


def _pp_get_meta(self, children):
for c in children:
if self.node_filter is not None and not self.node_filter(c):
continue
if isinstance(c, Tree):
if not c.meta.empty:
return c.meta
elif isinstance(c, Token):
return c


class PropagatePositions_IgnoreWs(PropagatePositions):
def _pp_get_meta(self, children):
for c in children:
if isinstance(c, Tree):
if not c.meta.empty:
return c.meta
elif isinstance(c, Token):
if c and not c.isspace(): # Disregard whitespace-only tokens
return c


def make_propagate_positions(option): def make_propagate_positions(option):
if option == "ignore_ws":
return PropagatePositions_IgnoreWs
if callable(option):
return partial(PropagatePositions, node_filter=option)
elif option is True:
return PropagatePositions
elif option is False:


lark/parser_frontends.py (+9 -10)

@@ -39,8 +39,7 @@ class MakeParsingFrontend:
lexer_conf.lexer_type = self.lexer_type
return ParsingFrontend(lexer_conf, parser_conf, options)


@classmethod
def deserialize(cls, data, memo, lexer_conf, callbacks, options):
def deserialize(self, data, memo, lexer_conf, callbacks, options):
parser_conf = ParserConf.deserialize(data['parser_conf'], memo)
parser = LALR_Parser.deserialize(data['parser'], memo, callbacks, options.debug)
parser_conf.callbacks = callbacks
@@ -92,26 +91,26 @@ class ParsingFrontend(Serialize):
def _verify_start(self, start=None):
if start is None:
start = self.parser_conf.start
if len(start) > 1:
raise ConfigurationError("Lark initialized with more than 1 possible start rule. Must specify which start rule to parse", start)
start ,= start
start_decls = self.parser_conf.start
if len(start_decls) > 1:
raise ConfigurationError("Lark initialized with more than 1 possible start rule. Must specify which start rule to parse", start_decls)
start ,= start_decls
elif start not in self.parser_conf.start:
raise ConfigurationError("Unknown start rule %s. Must be one of %r" % (start, self.parser_conf.start))
return start


def parse(self, text, start=None, on_error=None): def parse(self, text, start=None, on_error=None):
start = self._verify_start(start)
chosen_start = self._verify_start(start)
stream = text if self.skip_lexer else LexerThread(self.lexer, text) stream = text if self.skip_lexer else LexerThread(self.lexer, text)
kw = {} if on_error is None else {'on_error': on_error} kw = {} if on_error is None else {'on_error': on_error}
return self.parser.parse(stream, start, **kw)
return self.parser.parse(stream, chosen_start, **kw)
def parse_interactive(self, text=None, start=None): def parse_interactive(self, text=None, start=None):
start = self._verify_start(start)
chosen_start = self._verify_start(start)
if self.parser_conf.parser_type != 'lalr':
raise ConfigurationError("parse_interactive() currently only works with parser='lalr' ")
stream = text if self.skip_lexer else LexerThread(self.lexer, text)
return self.parser.parse_interactive(stream, start)
return self.parser.parse_interactive(stream, chosen_start)




def get_frontend(parser, lexer): def get_frontend(parser, lexer):


lark/parsers/lalr_interactive_parser.py (+1 -1)

@@ -65,7 +65,7 @@ class InteractiveParser(object):
"""Print the output of ``choices()`` in a way that's easier to read.""" """Print the output of ``choices()`` in a way that's easier to read."""
out = ["Parser choices:"] out = ["Parser choices:"]
for k, v in self.choices().items(): for k, v in self.choices().items():
out.append('\t- %s -> %s' % (k, v))
out.append('\t- %s -> %r' % (k, v))
out.append('stack size: %s' % len(self.parser_state.state_stack))
return '\n'.join(out)




lark/parsers/lalr_parser.py (+2 -2)

@@ -178,8 +178,8 @@ class _Parser(object):
for token in state.lexer.lex(state):
state.feed_token(token)


token = Token.new_borrow_pos('$END', '', token) if token else Token('$END', '', 0, 1, 1)
return state.feed_token(token, True)
end_token = Token.new_borrow_pos('$END', '', token) if token else Token('$END', '', 0, 1, 1)
return state.feed_token(end_token, True)
except UnexpectedInput as e:
try:
e.interactive_parser = InteractiveParser(self, state, state.lexer)


lark/utils.py (+35 -17)

@@ -61,14 +61,13 @@ class Serialize(object):
fields = getattr(self, '__serialize_fields__')
res = {f: _serialize(getattr(self, f), memo) for f in fields}
res['__type__'] = type(self).__name__
postprocess = getattr(self, '_serialize', None)
if postprocess:
postprocess(res, memo)
if hasattr(self, '_serialize'):
self._serialize(res, memo)
return res return res


@classmethod @classmethod
def deserialize(cls, data, memo): def deserialize(cls, data, memo):
namespace = getattr(cls, '__serialize_namespace__', {})
namespace = getattr(cls, '__serialize_namespace__', [])
namespace = {c.__name__:c for c in namespace} namespace = {c.__name__:c for c in namespace}


fields = getattr(cls, '__serialize_fields__') fields = getattr(cls, '__serialize_fields__')
@@ -82,9 +81,10 @@ class Serialize(object):
setattr(inst, f, _deserialize(data[f], namespace, memo))
except KeyError as e:
raise KeyError("Cannot find key for class", cls, e)
postprocess = getattr(inst, '_deserialize', None)
if postprocess:
postprocess()

if hasattr(inst, '_deserialize'):
inst._deserialize()

return inst return inst




@@ -163,7 +163,7 @@ def get_regexp_width(expr):
return 1, sre_constants.MAXREPEAT
else:
return 0, sre_constants.MAXREPEAT
###}




@@ -198,14 +198,6 @@ def dedup_list(l):
return [x for x in l if not (x in dedup or dedup.add(x))] return [x for x in l if not (x in dedup or dedup.add(x))]




def compare(a, b):
if a == b:
return 0
elif a > b:
return 1
return -1


class Enumerator(Serialize):
def __init__(self):
self.enums = {}
@@ -253,7 +245,7 @@ except ImportError:


class FS:
exists = os.path.exists
@staticmethod
def open(name, mode="r", **kwargs):
if atomicwrites and "w" in mode:
@@ -324,3 +316,29 @@ def _serialize(value, memo):
return {key:_serialize(elem, memo) for key, elem in value.items()}
# assert value is None or isinstance(value, (int, float, str, tuple)), value
return value




def small_factors(n, max_factor):
"""
Splits n up into smaller factors and summands <= max_factor.
Returns a list of [(a, b), ...]
so that the following code returns n:

n = 1
for a, b in values:
n = n * a + b

Currently, we also keep a + b <= max_factor, but that might change
"""
assert n >= 0
assert max_factor > 2
if n <= max_factor:
return [(n, 0)]

for a in range(max_factor, 1, -1):
r, b = divmod(n, a)
if a + b <= max_factor:
return small_factors(r, max_factor) + [(a, b)]
assert False, "Failed to factorize %s" % n
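A quick way to sanity-check the documented invariant: folding the returned (a, b) pairs left-to-right must reconstruct n. This assumes the ``small_factors`` defined above is importable from ``lark.utils`` on this branch.

    from lark.utils import small_factors

    def rebuild(pairs):
        n = 1
        for a, b in pairs:
            n = n * a + b
        return n

    for n in (7, 50, 8191):
        pairs = small_factors(n, 5)   # 5 == SMALL_FACTOR_THRESHOLD in load_grammar.py
        assert rebuild(pairs) == n, (n, pairs)
        print(n, pairs)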

tests/test_grammar.py (+48 -1)

@@ -3,7 +3,7 @@ from __future__ import absolute_import
import sys
from unittest import TestCase, main


from lark import Lark, Token, Tree
from lark import Lark, Token, Tree, ParseError, UnexpectedInput
from lark.load_grammar import GrammarError, GRAMMAR_ERRORS, find_grammar_errors
from lark.load_grammar import FromPackageLoader


@@ -198,6 +198,53 @@ class TestGrammar(TestCase):
x = find_grammar_errors(text)
assert [e.line for e, _s in find_grammar_errors(text)] == [2, 6]


def test_ranged_repeat_terms(self):
g = u"""!start: AAA
AAA: "A"~3
"""
l = Lark(g, parser='lalr')
self.assertEqual(l.parse(u'AAA'), Tree('start', ["AAA"]))
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AA')
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAAA')

g = u"""!start: AABB CC
AABB: "A"~0..2 "B"~2
CC: "C"~1..2
"""
l = Lark(g, parser='lalr')
self.assertEqual(l.parse(u'AABBCC'), Tree('start', ['AABB', 'CC']))
self.assertEqual(l.parse(u'BBC'), Tree('start', ['BB', 'C']))
self.assertEqual(l.parse(u'ABBCC'), Tree('start', ['ABB', 'CC']))
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAAB')
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAABBB')
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'ABB')
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAAABB')

def test_ranged_repeat_large(self):
g = u"""!start: "A"~60
"""
l = Lark(g, parser='lalr')
self.assertGreater(len(l.rules), 1, "Expected that more than one rule will be generated")
self.assertEqual(l.parse(u'A' * 60), Tree('start', ["A"] * 60))
self.assertRaises(ParseError, l.parse, u'A' * 59)
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'A' * 61)

g = u"""!start: "A"~15..100
"""
l = Lark(g, parser='lalr')
for i in range(0, 110):
if 15 <= i <= 100:
self.assertEqual(l.parse(u'A' * i), Tree('start', ['A']*i))
else:
self.assertRaises(UnexpectedInput, l.parse, u'A' * i)

# 8191 is a Mersenne prime
g = u"""start: "A"~8191
"""
l = Lark(g, parser='lalr')
self.assertEqual(l.parse(u'A' * 8191), Tree('start', []))
self.assertRaises(UnexpectedInput, l.parse, u'A' * 8190)
self.assertRaises(UnexpectedInput, l.parse, u'A' * 8192)




if __name__ == '__main__':


tests/test_parser.py (+20 -20)

@@ -94,6 +94,26 @@ class TestParsers(unittest.TestCase):
r = g.parse('a')
self.assertEqual( r.children[0].meta.line, 1 )


def test_propagate_positions2(self):
g = Lark("""start: a
a: b
?b: "(" t ")"
!t: "t"
""", propagate_positions=True)

start = g.parse("(t)")
a ,= start.children
t ,= a.children
assert t.children[0] == "t"

assert t.meta.column == 2
assert t.meta.end_column == 3

assert start.meta.column == a.meta.column == 1
assert start.meta.end_column == a.meta.end_column == 4



def test_expand1(self): def test_expand1(self):


g = Lark("""start: a g = Lark("""start: a
@@ -2183,27 +2203,7 @@ def _make_parser_test(LEXER, PARSER):
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAAABB') self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAAABB')




def test_ranged_repeat_terms(self):
g = u"""!start: AAA
AAA: "A"~3
"""
l = _Lark(g)
self.assertEqual(l.parse(u'AAA'), Tree('start', ["AAA"]))
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AA')
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAAA')


g = u"""!start: AABB CC
AABB: "A"~0..2 "B"~2
CC: "C"~1..2
"""
l = _Lark(g)
self.assertEqual(l.parse(u'AABBCC'), Tree('start', ['AABB', 'CC']))
self.assertEqual(l.parse(u'BBC'), Tree('start', ['BB', 'C']))
self.assertEqual(l.parse(u'ABBCC'), Tree('start', ['ABB', 'CC']))
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAAB')
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAABBB')
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'ABB')
self.assertRaises((ParseError, UnexpectedInput), l.parse, u'AAAABB')


@unittest.skipIf(PARSER=='earley', "Priority not handled correctly right now") # TODO XXX
def test_priority_vs_embedded(self):

