I'm curious what you think about creating a data notation language (like JSON, XML, YAML, one of those) that is a subset of the Roc language. Like JSON is to JavaScript, this data notation language would be to Roc. The subset would include primitive values, lists, records, tags and comments, but not things like functions or type definitions.
There are a lot of existing data notation languages, and I think it's fair to ask whether creating another one is a good idea. One thing that often frustrates me with most data languages is that they do not support ADTs/tags, so when serializing/deserializing data in a programming language that does have them (such as Roc, Rust, Elm, or Haskell), there's some data transformation happening that makes the serialized data look quite a bit different from the values we work with in the program. I'd love to be able to use a data notation language that's made for ADT-supporting languages.
I think Roc could be a particularly suitable template for a data notation language because of its support for open tag unions: you can write a value like Tree { leaves: 4 } and Roc can infer a type for it. There's no need to add a type annotation specifying the type of Tree and any other constructors that exist for that type, which would require adding type annotations to the data language and make it more difficult to use.
Some other things I think such a data notation language might bring:
Curious what you think!
This sounds cool, not something I am really qualified to comment on, but interested to see where this idea leads. I guess the roc tooling like LSP and formatter would also play nicely with this.
I also share your concern about creating yet another data notation language, but nevertheless, this sounds great! Having tag unions would be very nice. This could be implemented fully in user space so it would be very doable to try it out.
(The one caveat is that we can’t decode serialized data into tag unions yet which would be a partial blocker.)
Ohh, interesting, thanks for sharing that! I had started writing an encoder/decoder in Roc for this data language, taking a cue from the roc-json library (thanks Luke!). Now that you mention decoding into tag unions not (yet) being possible, I notice DecoderFormatting is missing a tagfield.
Do you know of any conversations/plans/ideas around tag union decoding? I'd love to learn more about that. Seems like a tricky design question.
we definitely plan to support it, it's just not implemented yet :big_smile:
I think the most recent discussion about it is https://roc.zulipchat.com/#narrow/stream/316715-contributing/topic/json.20null.20handling/near/401990781
:thinking: I didn't intend for this to be true, but I think if we use true and false for the Bool values then this might be a superset of JSON
actually nm JSON requires quotes around record fields and Roc doesn't support those
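To make the difference concrete, here is a small check, sketched in Python rather than Roc, using only the standard json module. The sample strings are made up for illustration; the Roc-style one borrows the record syntax from the Tree { leaves: 4 } example earlier in the thread.

```python
import json

# JSON object keys must be double-quoted strings.
json_text = '{"leaves": 4}'
# Roc-style record syntax: the field name is a bare identifier.
roc_style_text = '{ leaves: 4 }'


def is_valid_json(text: str) -> bool:
    """Return True if the standard JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_valid_json(json_text))       # True
print(is_valid_json(roc_style_text))  # False
```

So a JSON document with quoted keys would not parse as a Roc record, and a Roc record with bare field names would not parse as JSON.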
Thanks for the link to the earlier conversation, cool stuff!
Not being a superset of JSON, I guess being a sibling of YAML is out.
My dream of having a format called RocOn is coming true :laughing: :guitar: https://roc.zulipchat.com/#narrow/stream/304641-ideas/topic/Roc.20Object.20Notation.20-.20RocON/near/369069652
I recently saw pkl, maybe it would work well for your use cases: https://pkl-lang.org/main/current/language-tutorial/01_basic_config.html
It supports rust/swift style enums, though not full tags or fancy type inference.
I do think your idea is cool, but I, like others, think fewer config standards is generally better. The advantage I see in pkl is that it's designed to be convertible to JSON, YAML or XML by default, helping mitigate the many-standards problem.
If RocOn existed, could we implement a document database using it? Would be awesome to have algebraic datatypes in the database. At least semantically, no clue if it could be made performant
YES, I want such a database so badly too! There's so much effort and complexity involved in mapping ADTs to a relational database. I definitely agree there's a parallel there with data notation languages, where most of the existing solutions out there seem to be built for classical languages without ADTs. I think we could still have things like indexes and foreign keys in such a Roc-based database BTW. I have many thoughts on this, we should maybe start a separate thread.
That said, I think in a database we might want to store the data in a binary format, instead of the textual format that RocOn would be. I half-remember there's an effort to define a default binary encoding for Roc already happening, don't know where though.
I'm making some progress on this by the way. I've written first versions of roc encoders and decoders, encoding and decoding into/from a compact format (no whitespace). I'll do a second pass to do whitespace correctly. Will share some code soon in case anyone's interested!
I'm very interested! :grinning:
Wow, sounds cool
I don't like being the one bogged down in minor implementation difficulties when talking about an idea this big, but how are we going to handle adding a tag to an existing tag union? I mean, the tags [A, X, Z] would be represented as 1, 2, 3 at runtime (maybe even backed by the smallest number of bytes, so this would be a 1-byte number?).
Let's say our function returns this union. Then we modify the code (for example, by adding a branch to our function) so that it can also return B.
A stays 1, but X and Z are now 3 and 4, because B is now 2. If we stored the tags in a db before modifying our function, those are invalid now. I can only see a database that is very tightly coupled to Roc and has no problem applying such a tag-change to all of the records whenever the application code updates. I don't know if that makes it bad, just something to think about. The problem kinda feels like wanting to store raw C structs as binary data inside a db, because that would be more performant and convenient.
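The renumbering described above can be sketched like this. It is written in Python rather than Roc, and the numbering rule (alphabetical order, starting at 1) is only an assumption taken from the A/X/Z example in this message, not a statement about Roc's actual runtime layout. tag_ids is a hypothetical helper.

```python
def tag_ids(tags):
    """Number tags 1..n in alphabetical order (assumed rule, for illustration)."""
    return {tag: i for i, tag in enumerate(sorted(tags), start=1)}


old_ids = tag_ids(["A", "X", "Z"])       # {"A": 1, "X": 2, "Z": 3}
new_ids = tag_ids(["A", "B", "X", "Z"])  # {"A": 1, "B": 2, "X": 3, "Z": 4}

# A value written to storage by the old build of the program...
stored = old_ids["X"]  # 2

# ...is read back by the new build as a different tag.
new_names = {i: tag for tag, i in new_ids.items()}
print(new_names[stored])  # B

# Storing the tag's name instead of its number survives the change.
stored_by_name = "X"
print(stored_by_name in new_ids)  # True
```

A textual format that writes the tag name itself would not depend on this numbering, so the concern mainly applies to a binary encoding that stores the runtime discriminant directly.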
Yeah, good questions, I'm way out of my depth and have no idea about the answers. I'm peripherally aware that backwards compatibility is a topic that binary formats like Protocol Buffers and Cap'n Proto take into account, and kind of extrapolated from there that it would maybe make sense in a database storage format as well, but really I know very little about database internals, so that was a very uninformed statement :sweat_smile:.
Here's my current progress on the rocon encoder/decoder. I've got basic encoders/decoders for everything except decodeRecord (which is still missing some field-skipping logic) and decodeTuple. It also has no support for whitespace between values, plus a couple of other TODOs.
https://github.com/jwoudenberg/rocon/blob/main/package/Rocon.roc
(Github syntax highlighting is great! :heart_eyes:)
(I'm on the fence about the name RocON. I like the "Roc On!" joke, but as a file extension rocon is a bit long and the pronunciation is unclear. I'm wondering if rocn or rocd or rocl would be better.)
I'm currently stuck on completing field skipping logic for record decoding, I'm getting a weird error I think might be a compiler bug (?). If anyone has any ideas, would love some help! I have my decodeRecord implementation, which has a type signature like this:
decodeRecord :
    state,
    (state, Str -> [Keep (Decoder state Rocon), Skip]),
    (state -> Result val DecodeError)
    -> Decoder val Rocon
If the decoder tells me to skip a field, then I need to parse whatever value that field contains just to find out where that value ends and the next field begins. That value might contain another Roc record, so I thought I'd reuse the record decoding logic I already have and call decodeRoc from my field skipping logic. I tried that, using {} for the state and val types, but when I do so I get a lot of compiler errors in unrelated parts of the code, as well as this error that I imagine gives the closest idea of what's going on:
── TYPE MISMATCH in package/Rocon.roc ──────────────────────────────────────────

This 1st argument to decodeRecord has an unexpected type:

710│ (decodeRecord {} stepField finalizer)
                   ^^

The argument is a record of type:

    {}

But decodeRecord needs its 1st argument to be:

    state

Tip: The type annotation uses the type variable state to say that this
definition can produce any type of value. But in the body I see that
it will only produce a record value of a single specific type. Maybe
change the type annotation to be more specific? Maybe change the code
to be more general?
Is this a bug? Feels like what the compiler is saying is similar to saying you cannot pass {} to identity : a -> a, because a is more general than {}.
(I'm passing compatible types for stepField and finalizer)
I get this error when uncommenting this block and removing the crash line above it.
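For anyone following along, the field-skipping step described in this message (reading past a value only to find where it ends) can also be done without a full decoder, by counting bracket depth. Below is a rough sketch in Python rather than Roc. It is not how Rocon.roc does it. skip_value is a hypothetical helper, and it assumes the compact, whitespace-free format mentioned earlier in the thread, with double-quoted strings and backslash escapes. It does not handle comments or single-quoted literals.

```python
OPEN = "{[("
CLOSE = "}])"


def skip_value(text: str, start: int) -> int:
    """Return the index just past the value that begins at text[start].

    The value ends at the first ',' or closing bracket seen at nesting
    depth 0, or at the end of the input.
    """
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch == '"':
            # Jump to the closing quote so brackets inside strings are ignored.
            i += 1
            while i < len(text) and text[i] != '"':
                i += 2 if text[i] == "\\" else 1
        elif ch in OPEN:
            depth += 1
        elif ch in CLOSE:
            if depth == 0:
                return i  # closing bracket of the enclosing record/list
            depth -= 1
        elif ch == "," and depth == 0:
            return i  # separator before the next field
        i += 1
    return min(i, len(text))


text = '{a:{b:"}",c:[1,2]},d:5}'
end = skip_value(text, 3)  # the value of field a starts at index 3
print(text[3:end])         # {b:"}",c:[1,2]}
```

The trade-off is that this only finds the end of the value and does not check that the skipped text is well-formed, which reusing the real record decoder would do.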
Cool! On the name side ron (roc object notation) is also an option. The downside is that we never really use the word object in Roc
rvn (roc value notation)
rvn could be pronounced "raven"
The mythical bird of prey gets a less mythical smaller cousin, a jack of all trades.
I like rvn and raven!
I'm so happy to see this coming along. I want to use it to cache sessions in memory in basic-webserver. I just need to finish this PR (add Effect for a KV cache) and then we can use it :smiley:
The name RON is already taken by Rusty Object Notation, so rvn sounds like a great alternative :+1:
rvn is a great name
According to Wikipedia,
The collective noun for a group of ravens is an "unkindness". In practice, most people use the more generic "flock".
Hmm, Unkindness DB sounds a bit off-putting
The Latin name for the raven is Corvus corax. Corax DB? :thinking:
Though Raven DB is actually pretty good on its own as well. No need to get fancy
A murder of crows and an unkindness of ravens.... People are so unfair to these birds.
Haha, let's create some good PR for Ravens!
I think I understand the error I was running into now. Definitely my mistake, but I think the compiler led me astray a bit too :sweat_smile:. I created an issue for that here: https://github.com/roc-lang/roc/issues/6592
Another update!
https://github.com/jwoudenberg/rvn
Rvn now does:
I'm pretty happy with where it's at currently. A couple of things I still want to add at some point:
"\u{40}"
'🤷'
The two codepoint-related ones are more of a completionist thing than anything else. They're blocked on being able to pull in the unicode library.
All feedback welcome!
I guess Rvn is not a strict subset of Roc because Rvn is not indentation-sensitive. I don't think there's any ambiguity arising from that. Rvn could be made to fail decoding if indentation rules are not observed, but I think that would have mostly downsides?
yeah I don't think that should matter for literals
this is really awesome progress! :star_struck:
Last updated: Jun 16 2026 at 16:19 UTC