Stream: beginners

Topic: Recursive parser


view this post on Zulip Aurélien Geron (Oct 17 2024 at 09:52):

I'm trying to write an SGF-parser for an Exercism exercise. I'd like to use the roc-parser library.
The grammar is recursive. Simplifying a bit, suppose we just want to parse nested expressions like these: (AA;BB;CC(DD;EE(FF;GG)(HH))).

I have most of the grammar working, except for the recursive part: it won't compile because the compiler thinks it's an infinite recursion:

gameTree =
    P.const (\nodes -> \subTrees -> (nodes, subTrees))
    |> P.skip (S.utf8 ['('])
    |> P.keep (P.oneOrMore node)
    |> P.keep (P.lazy \_ -> P.many gameTree)
    |> P.skip (S.utf8 [')'])

Any idea how to solve this?

I looked at the recursive part of the XMLparser, and I don't see how its recursion is any different.

Thanks for your help!

view this post on Zulip drew (Oct 17 2024 at 14:22):

I added some ;'s to make the grammar more regular, but here's a solution https://gist.github.com/drewolson/1c4cace8d0afaf4baab4c71e6a7427dc

view this post on Zulip drew (Oct 17 2024 at 14:23):

you need some kind of terminal in your parser -- a parser that doesn't recur. in my example, parseLiteral is the parser that doesn't recur

view this post on Zulip Aurélien Geron (Oct 17 2024 at 17:56):

Got it, thanks so much @drew !

view this post on Zulip Aurélien Geron (Oct 17 2024 at 22:52):

I finally got the SGF parser to work! :smiley:
If you're interested, here's the code.

view this post on Zulip drew (Oct 17 2024 at 22:54):

awesome

view this post on Zulip Luke Boswell (Oct 17 2024 at 23:09):

Nice, looks sufficiently challenging for an exercism exercise. :smiley:

view this post on Zulip Aurélien Geron (Oct 18 2024 at 02:13):

Yeah, it's the highest difficulty level on the site (level 9)

view this post on Zulip Niklas Konstenius (Oct 20 2024 at 13:00):

I have most of the grammar working, except for the recursive part: it won't compile because the compiler thinks it's an infinite recursion:

@Aurélien Geron What kind of error did you get? A compiler crash or compiler error?

When I compile my solution the compiler crashes with stack overflow:

$ roc check SgfParsing.roc
0 errors and 0 warnings found in 22 ms

$  roc build  SgfParsing.roc

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
zsh: abort      roc build SgfParsing.roc

I'm a little bit hesitant to look at your code to not spoil the solution. :smile:

view this post on Zulip Aurélien Geron (Oct 20 2024 at 18:03):

Mmh, I think my error was more explicit: it told me that it ran into an infinitely recursive type. The trick was to use lazy to break the infinite recursion, you can take a look at the XML parser in the roc-parser package for inspiration. Hope this helps.

view this post on Zulip Niklas Konstenius (Oct 21 2024 at 18:04):

I have tried using lazy in multiple places without luck. The compiler does not like the code at all... :grinning:

For example this code:

gameTreeP : SgfParser GameTree
gameTreeP =
    const \nodes -> \children -> createGameTree nodes children
    |> skip (string "(")
    |> keep sequenceP
    |> keep (lazy foo)
    |> skip (string ")")

foo : {} -> SgfParser (List GameTree)
foo = \_ -> many gameTreeP

Results in this error:

$ roc build main.roc

thread 'main' panicked at crates/compiler/gen_llvm/src/llvm/build.rs:5748:19:
Error in alias analysis: error in module ModName("UserApp"), function definition FuncName(".\x00\x00\x00\x0f\x00\x00\x00\x19\x89\x91\'%J\xdf\xf9"), definition of value binding ValueId(3): expected type '("UserApp"::"G\xbb\xc5\xe4\xc8\xe1F\xfd",)', found type '((((((((heap_cell,), (heap_cell, bag<()>)), ()), ((((((heap_cell,), (heap_cell, bag<()>)), ()), (((((((((heap_cell,), (heap_cell, bag<()>)), ()), ((), ())), ((heap_cell,), (heap_cell, bag<()>))), ()), (((((heap_cell,), (heap_cell, bag<()>)), ()), ((), ())), ((heap_cell,), (heap_cell, bag<()>)))), ((), ((), (((), ()), ())))), ())), ()), ((((heap_cell,), (heap_cell, bag<()>)), ()), (((((((((heap_cell,), (heap_cell, bag<()>)), ()), ((), ())), ((heap_cell,), (heap_cell, bag<()>))), ()), (((((heap_cell,), (heap_cell, bag<()>)), ()), ((), ())), ((heap_cell,), (heap_cell, bag<()>)))), ((), ((), (((), ()), ())))), ())))), ((), ())), (heap_cell, bag<"UserApp"::"\x0e\x08IH\xc7\x89n\x88">)), ((heap_cell,), (heap_cell, bag<()>))),)'
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

But if I inline the foo-lambda:

gameTreeP : SgfParser GameTree
gameTreeP =
    const \nodes -> \children -> createGameTree nodes children
    |> skip (string "(")
    |> keep sequenceP
    |> keep (lazy \_ -> many gameTreeP)
    |> skip (string ")")

foo : {} -> SgfParser (List GameTree)
foo = \_ -> many gameTreeP

another error is produced:

$ roc build main.roc

An internal compiler expectation was broken.
This is definitely a compiler bug.
Please file an issue here: <https://github.com/roc-lang/roc/issues/new/choose>
I thought a non-nullable-unwrapped variant for a lambda set was impossible: how could such a lambda set be created without a base case?
Location: crates/compiler/mono/src/layout.rs:1705:61

And if I comment the (unused) foo function:

gameTreeP : SgfParser GameTree
gameTreeP =
    const \nodes -> \children -> createGameTree nodes children
    |> skip (string "(")
    |> keep sequenceP
    |> keep (lazy \_ -> many gameTreeP)
    |> skip (string ")")

# foo : {} -> SgfParser (List GameTree)
# foo = \_ -> many gameTreeP

I get yet another error:

$ roc build main.roc

thread '<unknown>' has overflowed its stack
fatal runtime error: stack overflow
zsh: abort      roc build main.roc

I'm really not sure what I'm doing wrong here. Any help appreciated. :smile:

view this post on Zulip Aurélien Geron (Oct 23 2024 at 22:48):

Oh no, that's not great. :cry:
Perhaps you can try running my solution just to see whether it might be a compiler bug linked to your platform (I'm on MacOS, and the tests also pass on Ubuntu, but perhaps there's an issue on Windows). Also, what character encoding are you using in your Roc file? The function name in the first error message starts with a bunch of zeroes, I'm not sure that's normal.

view this post on Zulip Aurélien Geron (Oct 23 2024 at 22:49):

https://github.com/exercism/roc/blob/main/exercises/practice/sgf-parsing/.meta/Example.roc

view this post on Zulip Aurélien Geron (Oct 23 2024 at 22:52):

Also, maybe try lazy \_ -> foo in your first code example

view this post on Zulip Aurélien Geron (Oct 23 2024 at 22:54):

With lazy foo, the compiler will probably inspect foo right away, leading to a loop and stack overflow, whereas lazy \_ -> foo might not


Last updated: Jul 05 2025 at 12:14 UTC