In a recent meeting, @Richard Feldman suggested that Roc might never provide first-party support for regular expressions. My JS-soaked brain was shocked at this, but now I see that FPLs like Elm (& Haskell?) prefer parser libraries like elm-parser. Do we expect to want a roc-parser port of elm-parser? If so, should that be first-party (Roc standard library / builtin) or third-party (one or more decentralized libraries, perhaps JanCVanB/roc-parser)?
To be clear, parsers sound lovely and I'm intrigued :)
Yeah, libraries for building parsers via parser combinators are very common in FP. That's the approach elm-parser also seems to use, but I'm not terribly familiar with it. I think we'll want a community library around it, but I don't think a general parser library is something that should be provided by the standard library.
Oh yes, there will be a parser library at some point for sure! For one thing, there's no way to stop anyone from writing one! It's all pure functions, no builtin support needed, so there's no reason for it to be in the standard lib.
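To make the "it's all pure functions" point concrete, here is a minimal sketch of the parser-combinator idea. It is written in Python purely for illustration, not in Roc, and the names (char_where, many1, fmap, sep_by1) are made up for this example; they are not the API of elm-parser or of any Roc library.

```python
# A parser here is just a pure function: (text, pos) -> (value, new_pos) or None.
# All names below are hypothetical, chosen only for this illustration.

def char_where(pred):
    """Parse a single character that satisfies `pred`."""
    def parse(text, pos):
        if pos < len(text) and pred(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def many1(p):
    """Apply `p` one or more times and collect the results in a list."""
    def parse(text, pos):
        results = []
        step = p(text, pos)
        while step is not None:
            value, pos = step
            results.append(value)
            step = p(text, pos)
        return (results, pos) if results else None
    return parse


def fmap(f, p):
    """Transform the value produced by `p` with `f`."""
    def parse(text, pos):
        step = p(text, pos)
        if step is None:
            return None
        value, new_pos = step
        return f(value), new_pos
    return parse


def sep_by1(p, sep):
    """Parse one or more `p`, separated by `sep`."""
    def parse(text, pos):
        first = p(text, pos)
        if first is None:
            return None
        value, pos = first
        results = [value]
        while True:
            s = sep(text, pos)
            if s is None:
                break
            nxt = p(text, s[1])
            if nxt is None:
                break  # leave the dangling separator unconsumed
            value, pos = nxt
            results.append(value)
        return results, pos
    return parse


# Small parsers compose into bigger ones with ordinary function calls.
digit = char_where(str.isdigit)
integer = fmap(lambda ds: int("".join(ds)), many1(digit))
comma = char_where(lambda c: c == ",")
int_list = sep_by1(integer, comma)
```

Calling int_list("1,22,333", 0) walks the whole string and returns the list of integers together with the position where parsing stopped. Nothing here needs language support, which is the argument for keeping it out of the standard library.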
There are many approaches to parsing. For parsing PLs, for example, lex/yacc and their derivatives are very popular (in fact, OCaml's parser is entirely generated by a yacc-like tool). They all have their tradeoffs, and performance characteristics can vary drastically depending on what you're parsing, so it's good to have a variety of different options offered by the community.
In Roc it's interesting because you can go with parser combinators in the style of elm-parser or Haskell attoparsec. But you can probably also do something that compiles to a more imperative approach.
It should be really efficient to "walk" over the bytes and accumulate some state along the way. But I don't know how nice it would be to compose sub-parsers together. Probably you end up with something like a "pull parser", a style that is meant to be fast.
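A rough sketch of that "walk over the bytes and accumulate state" style, again in Python for illustration only. The function name csv_events and the event shapes are assumptions invented for this example, and the CSV handling is deliberately simplified (no "\r\n" handling, no error reporting). The consumer pulls one event at a time, which is the "pull parser" shape.

```python
QUOTE = ord('"')
COMMA = ord(",")
NEWLINE = ord("\n")


def csv_events(data):
    """Walk over `data` (bytes) once, yielding events on demand.

    Hypothetical event shapes: ("field", text) and ("end_row", "").
    """
    field = bytearray()   # bytes of the field being accumulated
    in_quotes = False     # state: are we inside a quoted field?
    row_open = False      # state: has the current row seen any input?
    i = 0
    while i < len(data):
        b = data[i]
        if in_quotes:
            if b != QUOTE:
                field.append(b)
            elif i + 1 < len(data) and data[i + 1] == QUOTE:
                field.append(QUOTE)  # a doubled quote is an escaped quote
                i += 1
            else:
                in_quotes = False
        elif b == QUOTE:
            in_quotes = True
            row_open = True
        elif b == COMMA:
            yield ("field", field.decode("utf-8"))
            field = bytearray()
            row_open = True
        elif b == NEWLINE:
            yield ("field", field.decode("utf-8"))
            yield ("end_row", "")
            field = bytearray()
            row_open = False
        else:
            field.append(b)
            row_open = True
        i += 1
    if row_open:
        # Input ended without a trailing newline: flush the last row.
        yield ("field", field.decode("utf-8"))
        yield ("end_row", "")


# The caller decides when to ask for the next event.
events = csv_events(b'a,"b,c"\n1,2')
first_event = next(events)
```

The state is only a couple of flags and a buffer, which is why this style tends to be fast. The cost is the one raised above: it is less obvious how to compose two such loops into a bigger parser.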
What about including read and show functions (like in Haskell, which also sees extensive use of parser combinators)? The show function turns a value into a string, which can be extremely helpful while debugging or for quickly storing state.
Yeah we have some things like that called Encoding and Decoding "abilities".
https://github.com/roc-lang/roc/blob/main/crates/compiler/builtins/roc/Encode.roc
https://github.com/roc-lang/roc/blob/main/crates/compiler/builtins/roc/Decode.roc
Mainly focused on things like JSON or CSV for now, though debug printing has been discussed at some point.
Would e.g. JSON require an implementation for a specific record or would generics (not the <T> kind but these) be used for such tasks?
It will require an implementation: Roc does not provide a mechanism for runtime type information like Haskell's Generic does. However, the compiler will derive an implementation for structural types when they are used for encoding/decoding. For example, the following works:
app "test" imports [Encode, Decode, Json] provides [main] to "./platform"
main =
    when Str.toUtf8 "{\"outer\":{\"inner\":\"a\"},\"other\":{\"one\":\"b\",\"two\":10}}" |> Decode.fromBytes Json.fromUtf8 is
        Ok {outer: {inner: "a"}, other: {one: "b", two: 10u8}} -> "ab10"
        _ -> "something went wrong"
Internally, the compiler creates an implementation for decoding the record being matched in the when branch, and that implementation is the one used at runtime.
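To give a feel for what a decoder specialised to that record shape has to do, here is a hand-written Python equivalent. This is only a conceptual sketch: it is not the code the Roc compiler generates, all function names are hypothetical, and in Roc the work is derived at compile time from the type rather than written out by hand.

```python
import json


def decode_str(value):
    if isinstance(value, str):
        return value
    raise ValueError("expected a string")


def decode_u8(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool) and 0 <= value <= 255:
        return value
    raise ValueError("expected an integer in 0..255")


def decode_field(obj, name, decode_value):
    """Look up `name` in a JSON object and decode it with `decode_value`."""
    if not isinstance(obj, dict) or name not in obj:
        raise ValueError(f"missing field {name!r}")
    return decode_value(obj[name])


def decode_outer(value):
    return {"inner": decode_field(value, "inner", decode_str)}


def decode_other(value):
    return {
        "one": decode_field(value, "one", decode_str),
        "two": decode_field(value, "two", decode_u8),
    }


def decode_record(raw):
    """Decoder for the shape {outer: {inner: Str}, other: {one: Str, two: U8}}."""
    value = json.loads(raw)
    return {
        "outer": decode_field(value, "outer", decode_outer),
        "other": decode_field(value, "other", decode_other),
    }


def main(raw):
    try:
        record = decode_record(raw)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return "something went wrong"
    if record == {"outer": {"inner": "a"}, "other": {"one": "b", "two": 10}}:
        return "ab10"
    return "something went wrong"
```

Every field name and field type in the record shows up as one line of decoding logic, which is exactly the boilerplate that deriving the implementation from the structural type saves you from writing.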
JanCVanB said:
In a recent meeting, Richard Feldman suggested that Roc might never provide first-party support for regular expressions. My JS-soaked brain was shocked at this, but now I see that FPLs like Elm (& Haskell?) prefer parser libraries like elm-parser. Do we expect to want a roc-parser port of elm-parser? If so, should that be first-party (Roc standard library / builtin) or third-party (one or more decentralized libraries, perhaps JanCVanB/roc-parser)?
(Opt-in) arbitrary look-ahead (which elm-parser calls 'backtracking') and providing context are not unique to elm-parser; they are seen in most Parsec descendants (cf. the try and label functions in the OG Parsec).
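For readers who have not met these two ideas, here is a small Python sketch of how opt-in backtracking and labelling can work. It follows the general Parsec convention that a choice only tries its second branch when the first one failed without consuming input. The names (Ok, Err, literal, either, attempt, label) are invented for this illustration and are not the actual Parsec or elm-parser API.

```python
from typing import NamedTuple


class Ok(NamedTuple):
    value: object
    pos: int        # position after the parsed input


class Err(NamedTuple):
    pos: int        # position where the failure happened
    expected: str   # what the parser was looking for


def literal(s):
    """Match the exact string `s`, consuming input character by character."""
    def parse(text, pos):
        for offset, ch in enumerate(s):
            if pos + offset >= len(text) or text[pos + offset] != ch:
                return Err(pos + offset, repr(s))
        return Ok(s, pos + len(s))
    return parse


def either(p, q):
    """Try `q` only if `p` failed without consuming any input."""
    def parse(text, pos):
        result = p(text, pos)
        if isinstance(result, Err) and result.pos == pos:
            return q(text, pos)
        return result
    return parse


def attempt(p):
    """Opt-in backtracking: on failure, pretend `p` consumed nothing."""
    def parse(text, pos):
        result = p(text, pos)
        if isinstance(result, Err):
            return Err(pos, result.expected)
        return result
    return parse


def label(name, p):
    """Report `name` instead of low-level details when `p` fails up front."""
    def parse(text, pos):
        result = p(text, pos)
        if isinstance(result, Err) and result.pos == pos:
            return Err(pos, name)
        return result
    return parse


let_kw = literal("let")
lambda_kw = literal("lambda")

committed = either(let_kw, lambda_kw)            # fails on "lambda"
backtracking = either(attempt(let_kw), lambda_kw)
keyword = label("keyword", either(attempt(let_kw), attempt(lambda_kw)))
```

On the input "lambda", committed fails because "let" already consumed the "l" before failing, while backtracking rewinds and succeeds. On the input "x", keyword reports that it expected a "keyword" rather than one specific literal.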
The example parser for CSV that I started to write is very much in this style as well and could easily support it :blush:.
A big disadvantage of PCRE-style regular expressions is that they do not compose. You have a very large chunk of 'special syntax' which cannot be split up into smaller functions. Another is that, except when your language has special support to either evaluate them at compile-time or do dynamic compilation at runtime, such a regex string has to be turned into a parser automaton over and over again each time it is used.
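A tiny illustration of both points, using Python's re module as a stand-in (it uses Perl-style syntax but is not PCRE itself). The fragment names SIGN and DIGITS are made up for this example.

```python
import re

# Two pattern fragments we would like to reuse like small functions.
SIGN = r"\+|-"
DIGITS = r"[0-9]+"

# Naive composition by string concatenation gives the pattern  \+|-[0-9]+
# which means: a lone "+", OR a "-" followed by digits.
naive = re.compile(SIGN + DIGITS)

# To compose correctly, the author has to know the fragment's internals
# and wrap it in a (non-capturing) group by hand.
grouped = re.compile(f"(?:{SIGN}){DIGITS}")

# Compiling once up front and reusing the compiled object avoids rebuilding
# the matcher on every call. (Python's re module also keeps an internal cache
# of recently used patterns, which hides part of this cost.)
def is_signed_int(text):
    return grouped.fullmatch(text) is not None
```

Here naive.fullmatch("+12") fails while naive.fullmatch("+") succeeds, which is almost certainly not what was intended. With parser combinators the equivalent pieces are ordinary values, so combining them cannot silently change what each piece means.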
Of course, there are advantages to supporting PCRE as well. The main one I can think of is familiarity.
Last updated: Jun 16 2026 at 16:19 UTC