Stream: ideas

Topic: XML parser


Johannes Maas (Feb 24 2024 at 10:31):

I'm about to play around with writing an XML parser for Roc. (I want to parse RSS and Atom feeds.)

My current approach is to port https://github.com/miniBill/elm-xml-parser/blob/master/src/XmlParser.elm using https://github.com/lukewilliamboswell/roc-parser.

If anyone has already worked on an XML parser or is interested in helping out, feel free to reach out. Currently I'm just toying around with it locally.

Luke Boswell (Feb 24 2024 at 10:40):

I was thinking about this just this week. I haven't started or anything. Happy to help :smiling_face:

Johannes Maas (Feb 24 2024 at 10:46):

I was just researching the license issues with porting. Do you have any ideas on that? :D Right now I'm guessing, I'll just program it from scratch, orienting myself around specs.

Otherwise, feel free to suggest how you would like to collaborate. I can't guarantee that I'll be reliably working on this, so if you're planning on writing one anyway, it might be smoother if I contribute to your code? Just throwing out ideas, we'll find a way that works for us. :)

Anton (Feb 24 2024 at 11:16):

You are free to port the code as long as you include the elm-xml-parser license in your project.

Johannes Maas (Feb 24 2024 at 11:17):

Anton said:

You are free to port the code as long as you include the elm-xml-parser license in your project.

Yes, thanks. I think it'll still be easier to do it from scratch so that we can choose our own license.

I've already started out simplifying it a lot, so that I can just parse the types of files I'm interested in. So I think this should work out.

Anton (Feb 24 2024 at 11:26):

it'll still be easier to do it from scratch so that we can choose our own license.

You can still choose your own license, you just need to have the elm-xml-parser license in your project and in any distribution (e.g. release archive).

Johannes Maas (Feb 24 2024 at 11:26):

Oh, good to know, thanks!

Anton (Feb 24 2024 at 11:26):

This kind of confusion is why I like the CC0-1.0 license :)

Ricardo Valero de la Rosa (Feb 24 2024 at 16:32):

I started on this a week ago XD using roc-parser; I didn't know about elm-xml-parser.
I stopped because I couldn't figure out something about graphemes (and went down that rabbit hole), like how to use "bigger" graphemes that were not U8.
I'm not at my laptop right now, but I'm curious how to solve that.

Anton (Feb 24 2024 at 16:49):

like how to use "bigger" graphemes that were not U8.

We plan to release the roc unicode package for that but I think it still needs some work.

Ricardo Valero de la Rosa (Feb 24 2024 at 17:48):

Any tips on how to get started? How can I help?

Anton (Feb 24 2024 at 18:14):

Thanks @Ricardo Valero de la Rosa :)
The main problem I'm aware of is that all code in Scalar.roc is commented out, can you try checking if that just works when uncommented?
Graphemes and scalars got removed from Roc in PR#6395, so that may be useful to look at as well.

Richard Feldman (Feb 24 2024 at 18:57):

Ricardo Valero de la Rosa said:

I stopped because I couldn't figure out something about graphemes (and went down that rabbit hole), like how to use "bigger" graphemes that were not U8.

I'm curious what the thing was! Do you remember by any chance?

Ricardo Valero de la Rosa (Feb 24 2024 at 19:07):

When trying to write a spec-compliant parser, there are other graphemes that roc-parser wouldn't accept.

For example NameChar (sorry this is the name in the spec) uses NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] | [#x203F-#x2040]
https://www.w3.org/TR/xml/#NT-NameChar
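
For illustration, the production quoted above can be checked one code point at a time. This is only a minimal Python sketch, assuming the NameStartChar and NameChar ranges from XML 1.0 (Fifth Edition); the function names are made up for this example and are not part of roc-parser or any Roc package.

```python
# Ranges from the XML 1.0 (Fifth Edition) NameStartChar production.
NAME_START_RANGES = [
    (0x3A, 0x3A),      # ":"
    (0x41, 0x5A),      # A-Z
    (0x5F, 0x5F),      # "_"
    (0x61, 0x7A),      # a-z
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF),
    (0x370, 0x37D), (0x37F, 0x1FFF),
    (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF),
    (0xF900, 0xFDCF), (0xFDF0, 0xFFFD),
    (0x10000, 0xEFFFF),
]

# Extra ranges that NameChar allows on top of NameStartChar.
NAME_EXTRA_RANGES = [
    (0x2D, 0x2E),      # "-" and "."
    (0x30, 0x39),      # 0-9
    (0xB7, 0xB7),      # #xB7
    (0x0300, 0x036F),
    (0x203F, 0x2040),
]


def _in_ranges(cp, ranges):
    """Return True if code point cp falls inside any (low, high) range."""
    return any(low <= cp <= high for low, high in ranges)


def is_name_start_char(cp):
    """Hypothetical helper: does cp match NameStartChar?"""
    return _in_ranges(cp, NAME_START_RANGES)


def is_name_char(cp):
    """Hypothetical helper: does cp match NameChar?"""
    return is_name_start_char(cp) or _in_ranges(cp, NAME_EXTRA_RANGES)


def is_xml_name(text):
    """Name ::= NameStartChar (NameChar)*, checked per code point."""
    if not text:
        return False
    cps = [ord(ch) for ch in text]
    return is_name_start_char(cps[0]) and all(is_name_char(cp) for cp in cps[1:])
```

With these ranges, `is_xml_name("atom:link")` holds while `is_xml_name("-foo")` does not, since "-" is a NameChar but not a NameStartChar.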

Ricardo Valero de la Rosa (Feb 24 2024 at 19:08):

It's dirty but I have some comments in here

https://github.com/lukewilliamboswell/roc-parser/pull/12

Richard Feldman (Feb 24 2024 at 20:23):

ah, so those would be code points - they fit in U32, and shouldn't need to go all the way to graphemes! :big_smile:
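
To make the byte / code point distinction concrete, here is a rough Python sketch (not Roc, and not the API of the roc unicode package) that turns UTF-8 bytes into code points. The function name is hypothetical, and the decoder deliberately skips checks for overlong encodings and surrogates, so it is an illustration rather than a validating decoder.

```python
def decode_code_points(data):
    """Decode UTF-8 bytes into a list of integer code points.

    Simplified: rejects bad lead/continuation bytes and truncated input,
    but does not reject overlong encodings or surrogate code points.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                 # 0xxxxxxx: one byte
            cp, extra = b, 0
        elif b >> 5 == 0b110:        # 110xxxxx: two bytes
            cp, extra = b & 0x1F, 1
        elif b >> 4 == 0b1110:       # 1110xxxx: three bytes
            cp, extra = b & 0x0F, 2
        elif b >> 3 == 0b11110:      # 11110xxx: four bytes
            cp, extra = b & 0x07, 3
        else:
            raise ValueError(f"invalid lead byte at index {i}")
        if i + extra >= len(data):
            raise ValueError("truncated sequence")
        for j in range(1, extra + 1):
            c = data[i + j]
            if c >> 6 != 0b10:       # continuation bytes look like 10xxxxxx
                raise ValueError(f"invalid continuation byte at index {i + j}")
            cp = (cp << 6) | (c & 0x3F)
        out.append(cp)
        i += extra + 1
    return out


# U+00B7 takes two bytes and U+203F takes three, but each is one code point.
middle_dot = "\u00b7".encode("utf-8")
undertie = "\u203f".encode("utf-8")
# "e" followed by a combining acute accent: two code points, one grapheme.
combined = "e\u0301".encode("utf-8")
```

Every value the decoder returns is at most 0x10FFFF, so it fits in a U32; grouping code points into graphemes (as in the last example) is a separate step that the NameChar ranges do not require.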

Luke Boswell (Feb 24 2024 at 21:27):

Just noticed unicode needs updating for Nat removal, I can fix that today so we can use it for xml

Luke Boswell (Feb 24 2024 at 21:29):

Specifically the CodePoint.roc parts.

Johannes Maas (Feb 25 2024 at 14:13):

I've had some fun playing around with parsing XML. The current result is here: https://github.com/j-maas/roc-xml-parser

Feel free to look around and use this code. If someone would like to just take over the code by forking, please do so!

Johannes Maas (Feb 25 2024 at 14:14):

However, Roc seems to hang when running roc test package/main.roc... Not sure what is going on.

Johannes Maas (Feb 25 2024 at 14:14):

roc test dependencies/roc-parser/package/main.roc works, and from playing around with it, it seems that as soon as I import something from the parser package in package/Xml.roc, it just hangs without output.

Isaac Van Doren (Feb 25 2024 at 14:56):

I’m not sure if it is related in your case, but if you import a package in an app module and then want to use that package in an interface module, you need to also import the module you are going to use in the app module (even if you aren’t using it there)

Johannes Maas (Feb 25 2024 at 14:59):

I only have a package, no app, and I'm exporting the module there. I'm also importing another package to be used in my module. If I remove the imports in my module, it doesn't hang.

Richard Feldman (Feb 25 2024 at 15:26):

ah this is probably the known bug about packages not being able to depend on other packages currently

Johannes Maas (Feb 25 2024 at 18:32):

For reference, I believe this is the issue with not being able to import other packages: https://github.com/roc-lang/roc/issues/5654


Last updated: Jun 16 2026 at 16:19 UTC