I've just packaged up the Html module created by @Brian Carroll. Here's the repo and the docs, and here's an example app:
app "html-example"
    packages {
        pf: "https://github.com/roc-lang/basic-cli/releases/download/0.3.1/97mY3sUwo433-pcnEQUlMhn-sWiIf_J9bPhcAFZoqY4.tar.br",
        html: "https://github.com/Hasnep/roc-html/releases/download/v0.1.0/BIQytwqKV_7C1cK1outBaik7Kn5EQpSE4mAAaH6g8Vw.tar.br",
    }
    imports [
        pf.Stdout,
        html.Html,
        html.Attribute,
    ]
    provides [main] to pf

main =
    page = Html.html [] [
        Html.body [] [
            Html.h1 [] [Html.text "Roc"],
            Html.p [] [
                Html.text "You should really check out ",
                Html.a [Attribute.href "https://roc-lang.org/"] [Html.text "Roc"],
                Html.text "!",
            ],
        ],
    ]
    Stdout.line (Html.render page)
which outputs this string:
<!DOCTYPE html><html><body><h1>Roc</h1><p>You should really check out <a href="https://roc-lang.org/">Roc</a>!</p></body></html>
Love the docs generation too. I'm going to try to do the same for the Json package.
Here's the workflow that generates and deploys the docs to GitHub Pages. You just need to go into the repo's settings and choose "deploy github pages from action".
Cool, nice to see it as a package!
I noticed that roc-html doesn't seem to escape text, which could lead to cross-site scripting (XSS) vulnerabilities if someone passed user input to it. For example, this test fails:
expect
    userInput = "<script>alert('do something bad here')</script>"
    exampleDocument = html [] [body [] [p [(attribute "example") "test"] [text userInput]]]
    out = render exampleDocument
    out != "<!DOCTYPE html><html><body><p example=\"test\"><script>alert('do something bad here')</script></p></body></html>"
@Hannes Would you be open to a PR to make it escape text and attribute values?
Is that something that should be handled by the user of this library? I imagine the tradeoff is performance? We could look at how Elm does it for inspiration? Do you know @Ajai Nelson if this functionality is common for this kind of library?
Yeah, I think it’s common practice for libraries like this and template engines to escape tags in text nodes. I don’t think I have ever used one that doesn’t.
As far as I know, this functionality is extremely common for this kind of library. Elm might not have to deal with it directly because it’s on the client side, and most DOM methods do the escaping for you. And I think most libraries still provide an escape hatch (no pun intended) if you don’t want to escape something, so we could do that.
Actually, Elm does escape HTML in text nodes and a bunch of other places
It has to because it’s super common to show content that’s provided by a different user’s input.
It doesn’t necessarily have to do that itself directly because there are browser APIs that escape HTML automatically. I know some JS libraries do some escaping manually at compile time for performance reasons, but I’m under the impression that they usually rely on the browser APIs. It’s still safe either way.
Anyway, that’s not relevant for server-side rendering in Roc because we’re not in the browser, so I think we should definitely do the escaping
Ajai Nelson said:
It doesn’t necessarily have to do that itself directly because there are browser APIs that escape HTML automatically. I know some JS libraries do some escaping manually at compile time for performance reasons, but I’m under the impression that they usually rely on the browser APIs. It’s still safe either way.
Yeah, but to keep purity Elm also has to go further than what those APIs can do by preventing <script> tags, javascript: URLs, and event handler attributes when you're generating HTML. This is especially important for the security of the package ecosystem. This article has information on the topic.
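To make those three categories concrete, here is a rough Python sketch (Python is used only for illustration; this is not Elm's or roc-html's code). The function names `is_safe_tag` and `is_safe_attribute` and the list of URL-carrying attributes are assumptions made for this example, and the checks are deliberately conservative and not exhaustive.

```python
# Hypothetical, non-exhaustive checks; not taken from Elm or roc-html.
URL_ATTRIBUTES = {"href", "src", "action", "formaction"}  # assumed list


def is_safe_tag(tag: str) -> bool:
    """Reject <script> elements regardless of letter case."""
    return tag.lower() != "script"


def is_safe_attribute(name: str, value: str) -> bool:
    """Reject event-handler attributes and javascript: URLs."""
    lowered = name.lower()
    # Event handlers (onclick, onerror, ...) all start with "on".
    # This over-blocks any other attribute starting with "on", which is
    # the safe direction for a sketch.
    if lowered.startswith("on"):
        return False
    if lowered in URL_ATTRIBUTES:
        # Drop spaces and control characters before comparing, since
        # leading whitespace or embedded tabs/newlines can hide the scheme.
        cleaned = "".join(ch for ch in value if ch > " ").lower()
        if cleaned.startswith("javascript:"):
            return False
    return True
```

A renderer could call these before emitting each element and attribute. A real implementation would need a much more careful threat model than this.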
Oh, I see, that makes sense
Ajai Nelson said:
Anyway, that’s not relevant for server-side rendering in Roc because we’re not in the browser, so I think we should definitely do the escaping
Agreed. I can see why this wasn't needed for static site generation in cases where you control the input, but I think that case is quite rare actually.
And even in that case, it’s nice not to have to escape <s and stuff manually
yeah it's definitely best practice for server-side HTML rendering to escape it :thumbs_up:
I wonder if you could get chatGPT to translate https://github.com/OWASP/java-html-sanitizer to roc
I think sanitization might be a lot more complicated than we need right now for text nodes. If I understand right, sanitization assumes that you want to display HTML written by the user, so it renders that HTML safely by removing unsafe elements and attributes and stuff. Escaping just means making sure it renders as text instead of HTML. (See https://web.dev/sanitizer/#the-difference-between-escaping-and-sanitizing.)
sanitize('<em>hello world</em><img src="" onerror=alert(0)>') == '<em>hello world</em><img src="">'
escape('<em>hello world</em><img src="" onerror=alert(0)>') == '&lt;em&gt;hello world&lt;/em&gt;&lt;img src=&quot;&quot; onerror=alert(0)&gt;'
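The escaping half of that comparison can be reproduced with Python's standard library, shown here only as an illustration (it is not part of roc-html). `html.escape` with `quote=True` also converts double and single quotes, so the result is safe inside quoted attribute values as well as text nodes.

```python
import html

# Same input as the example above.
user_input = '<em>hello world</em><img src="" onerror=alert(0)>'

# quote=True (the default) escapes " and ' in addition to &, < and >.
escaped = html.escape(user_input, quote=True)
print(escaped)
```

The browser then shows the markup as literal text instead of interpreting it, which is the behaviour wanted for text nodes.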
I think escaping is what we want most of the time anyway for text nodes, and it's much simpler, safer, and presumably faster. I've been reading about escaping today and looking at web frameworks in a few different languages. It seems like all escaping requires is replacing:
's with &#39;
&s with &amp;
"s with &quot;
<s with &lt;
>s with &gt;
The same procedure works for both text nodes and attribute values. The only footgun I've found is that this replacement needs to be done with codepoints, not graphemes. The string something<h1>͏bad</h1>͏something contains the codepoint for <, but it doesn't include < as a grapheme because it's combined with a combining character. The browser still considers that string to have an HTML tag, so it's presumably important to do the replacement at the codepoint level. (Due to the way UTF-8 works, the bytes for ', &, ", <, and > can never appear as a byte inside a larger codepoint, so just looking at the byte level works too.)
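As a minimal sketch of the byte-level approach described above (in Python for illustration, not the Roc code from roc-html): the name `escape_html`, the replacement table, and the choice of `&#39;` for the apostrophe are assumptions made for this example. It walks the UTF-8 bytes once, so an `&` produced by one replacement is never escaped a second time.

```python
# Replacement table keyed by the single ASCII byte of each special
# character. The entity chosen for the apostrophe is an assumption.
REPLACEMENTS = {
    ord("&"): b"&amp;",
    ord("<"): b"&lt;",
    ord(">"): b"&gt;",
    ord('"'): b"&quot;",
    ord("'"): b"&#39;",
}


def escape_html(text: str) -> str:
    """Escape HTML special characters by scanning UTF-8 bytes."""
    out = bytearray()
    for byte in text.encode("utf-8"):
        # Every byte of a multi-byte UTF-8 sequence is >= 0x80, so it can
        # never collide with the ASCII keys in the table.
        out += REPLACEMENTS.get(byte, bytes([byte]))
    return out.decode("utf-8")


# A special character followed by a combining character (U+034F) is
# still replaced, because matching never looks at graphemes.
print(escape_html("a<\u034fb"))
```

If the replacements were instead applied one after another over the whole string, `&` would have to be handled first so that the ampersands introduced by the other entities are not escaped again.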
I think I have the escaping working in roc-html, and I created an opaque type SafeStr to make it harder to mess up. I can submit a PR soon if people think that's a good idea.
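The idea behind a wrapper type like this can be sketched in Python (illustration only; the actual SafeStr in roc-html is a Roc opaque type and its API may differ). The names `escape`, `dangerously_mark_safe`, and `render_text_node` are hypothetical. Python cannot make the wrapper truly opaque, so the check happens at runtime, whereas an opaque type enforces it at compile time.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class SafeStr:
    """Hypothetical wrapper: text that is safe to embed in HTML."""
    value: str  # invariant: already escaped, or explicitly trusted


def escape(raw: str) -> SafeStr:
    """The normal way in: escape untrusted text."""
    return SafeStr(html.escape(raw, quote=True))


def dangerously_mark_safe(trusted_html: str) -> SafeStr:
    """Escape hatch for markup the caller vouches for."""
    return SafeStr(trusted_html)


def render_text_node(content: SafeStr) -> str:
    """Only accept SafeStr, so raw strings cannot slip through."""
    if not isinstance(content, SafeStr):
        raise TypeError("expected SafeStr; call escape() on raw strings")
    return content.value
```

Because the renderer only accepts the wrapper, forgetting to escape becomes an error instead of a silent XSS hole.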
makes sense!
Ajai Nelson said:
I think I have the escaping working in roc-html, and I created an opaque type SafeStr to make it harder to mess up. I can submit a PR soon if people think that's a good idea.
Sounds great! One of the reasons I wanted to package it up was so that it could get PRs like this :)
Just released v0.2.0, thanks @Ajai Nelson! Here's an example app:
app "html-example"
    packages {
        pf: "https://github.com/roc-lang/basic-cli/releases/download/0.3.1/97mY3sUwo433-pcnEQUlMhn-sWiIf_J9bPhcAFZoqY4.tar.br",
        html: "https://github.com/Hasnep/roc-html/releases/download/v0.2.0/5fqQTpMYIZkigkDa2rfTc92wt-P_lsa76JVXb8Qb3ms.tar.br",
    }
    imports [
        pf.Stdout,
        html.Html,
    ]
    provides [main] to pf

main =
    Html.html [] [
        Html.body [] [
            Html.h1 [] [Html.text "Epic Hacking Website"],
            Html.p [] [
                Html.text "Here's some sneaky JavaScript code to hack your computer:",
                Html.text "<script>alert('You have been hacked!')</script>",
            ],
        ],
    ]
    |> Html.render
    |> Stdout.line
<!DOCTYPE html>
<html>
<body>
<h1>Epic Hacking Website</h1>
<p>
Here&#39;s some sneaky JavaScript
code to hack your computer:&lt;script&gt;alert(&#39;You have been
hacked!&#39;)&lt;/script&gt;
</p>
</body>
</html>
Hi @Hannes. Maybe this is a silly question, but is roc-html targeted only at generating HTML strings, or do you support (or plan to support) using it for DOM nodes as well, in case Roc wants to do things similar to what Elm does?
I can answer that! I wrote the code that became roc-html.
A virtual dom, like elm has, is a much much bigger job than the static HTML library. I started working on it last year but the recursive data structures triggered some compiler bugs, so I put it on hold.
I think most of the bugs are fixed now. But as far as I know the code still doesn't run. I am working on other projects at the moment.
So it may get picked up again at some point but it's not being actively worked on.
I've just released roc-html v0.7.0 which supports Roc v0.0.0-alpha2. Thanks @Luke Boswell :)
LoL I literally needed this yesterday so cloned it and changed locally. Thanks for the update!
@Hannes Does roc-html only use the parenthesis function calls now that it's using Roc v0.0.0-alpha2? Asking for an elm-html fan who could still live with parens <3
Both styles still work right now
Last updated: Jul 06 2025 at 12:14 UTC