Stream: show and tell

Topic: Roc Html


view this post on Zulip Hannes (May 31 2023 at 01:11):

I've just packaged up the Html module created by @Brian Carroll, here's the repo and the docs, and here's an example app:

app "html-example"
    packages {
        pf: "https://github.com/roc-lang/basic-cli/releases/download/0.3.1/97mY3sUwo433-pcnEQUlMhn-sWiIf_J9bPhcAFZoqY4.tar.br",
        html: "https://github.com/Hasnep/roc-html/releases/download/v0.1.0/BIQytwqKV_7C1cK1outBaik7Kn5EQpSE4mAAaH6g8Vw.tar.br",
    }
    imports [
        pf.Stdout,
        html.Html,
        html.Attribute,
    ]
    provides [main] to pf

main =
    page = Html.html [] [
        Html.body [] [
            Html.h1 [] [Html.text "Roc"],
            Html.p [] [
                Html.text "You should really check out ",
                Html.a [Attribute.href "https://roc-lang.org/"] [Html.text "Roc"],
                Html.text "!",
            ],
        ],
    ]
    Stdout.line (Html.render page)

which outputs this string:

<!DOCTYPE html><html><body><h1>Roc</h1><p>You should really check out <a href="https://roc-lang.org/">Roc</a>!</p></body></html>

view this post on Zulip Luke Boswell (May 31 2023 at 04:26):

Love the docs generation too. I'm going to try and do the same for the Json package.

view this post on Zulip Hannes (May 31 2023 at 04:51):

Here's the workflow that generates and deploys the docs to GitHub Pages. You just need to go into the repo's settings and choose "deploy github pages from action".

view this post on Zulip Brian Carroll (May 31 2023 at 06:52):

Cool, nice to see it as a package!

view this post on Zulip Ajai Nelson (May 31 2023 at 20:30):

I noticed that roc-html doesn't seem to escape text, which could lead to cross-site scripting (XSS) vulnerabilities if someone passed user input to it. For example, this test fails:

expect
    userInput = "<script>alert('do something bad here')</script>"
    exampleDocument = html [] [body [] [p [(attribute "example") "test"] [text userInput]]]
    out = render exampleDocument
    out != "<!DOCTYPE html><html><body><p example=\"test\"><script>alert('do something bad here')</script></p></body></html>"

@Hannes Would you be open to a PR to make it escape text and attribute values?

view this post on Zulip Luke Boswell (May 31 2023 at 22:16):

Is that something that should be handled by the user of this library? I imagine the tradeoff is performance? We could look at how Elm does it for inspiration? Do you know @Ajai Nelson if this functionality is common for this kind of library?

view this post on Zulip Agus Zubiaga (May 31 2023 at 22:23):

Yeah, I think it’s common practice for libraries like this and template engines to escape tags in text nodes. I don’t think I have ever used one that doesn’t.

view this post on Zulip Ajai Nelson (May 31 2023 at 22:25):

As far as I know, this functionality is extremely common for this kind of library. Elm might not have to deal with it directly because it’s on the client side, and most DOM methods do the escaping for you. And I think most libraries still provide an escape hatch (no pun intended) if you don’t want to escape something, so we could do that.

view this post on Zulip Agus Zubiaga (May 31 2023 at 22:27):

Actually, Elm does escape HTML in text nodes and a bunch of other places

view this post on Zulip Agus Zubiaga (May 31 2023 at 22:28):

It has to because it’s super common to show content that’s provided by a different user’s input.

view this post on Zulip Ajai Nelson (May 31 2023 at 22:34):

It doesn’t necessarily have to do that itself directly because there are browser APIs that escape HTML automatically. I know some JS libraries do some escaping manually at compile time for performance reasons, but I’m under the impression that they usually rely on the browser APIs. It’s still safe either way.

view this post on Zulip Ajai Nelson (May 31 2023 at 22:36):

Anyway, that’s not relevant for server-side rendering in Roc because we’re not in the browser, so I think we should definitely do the escaping

view this post on Zulip Agus Zubiaga (May 31 2023 at 22:40):

Ajai Nelson said:

It doesn’t necessarily have to do that itself directly because there are browser APIs that escape HTML automatically. I know some JS libraries do some escaping manually at compile time for performance reasons, but I’m under the impression that they usually rely on the browser APIs. It’s still safe either way.

Yeah, but to keep purity Elm also has to go further than what those APIs can do by preventing <script> tags, javascript: URLs, and event handler attributes when you're generating HTML. This is especially important for the security of the package ecosystem. This article has information on the topic.

view this post on Zulip Ajai Nelson (May 31 2023 at 22:42):

Oh, I see, that makes sense

view this post on Zulip Agus Zubiaga (May 31 2023 at 22:45):

Ajai Nelson said:

Anyway, that’s not relevant for server-side rendering in Roc because we’re not in the browser, so I think we should definitely do the escaping

Agreed. I can see why this wasn't needed for static site generation in cases where you control the input, but I think that case is quite rare actually.

view this post on Zulip Ajai Nelson (May 31 2023 at 22:49):

And even in that case, it’s nice not to have to escape <s and stuff manually

view this post on Zulip Richard Feldman (May 31 2023 at 23:35):

yeah it's definitely best practice for server-side HTML rendering to escape it :thumbs_up:

view this post on Zulip Richard Feldman (May 31 2023 at 23:36):

I wonder if you could get ChatGPT to translate https://github.com/OWASP/java-html-sanitizer to Roc

view this post on Zulip Ajai Nelson (Jun 01 2023 at 00:44):

I think sanitization might be a lot more complicated than we need right now for text nodes. If I understand right, sanitization assumes that you want to display HTML written by the user, so it renders that HTML safely by removing unsafe elements and attributes and stuff. Escaping just means making sure it renders as text instead of HTML. (See https://web.dev/sanitizer/#the-difference-between-escaping-and-sanitizing.)

sanitize('<em>hello world</em><img src="" onerror=alert(0)>') == '<em>hello world</em><img src="">'
escape('<em>hello world</em><img src="" onerror=alert(0)>') == '&lt;em&gt;hello world&lt;/em&gt;&lt;img src=&quot;&quot; onerror=alert(0)&gt;'

I think escaping is what we want most of the time anyway for text nodes, and it's much simpler, safer, and presumably faster. I've been reading about escaping today and looking at web frameworks in a few different languages. It seems like all escaping requires is replacing the characters &, <, >, ", and ' with their corresponding HTML entities.

The same procedure works for both text nodes and attribute values. The only footgun I've found is that this replacement needs to be done with codepoints, not graphemes. The string something؀<h1>͏bad؀</h1>͏something contains the codepoint for <, but it doesn't include < as a grapheme because it's combined with a combining character. The browser still considers that string to have an HTML tag, so it's presumably important to do the replacement at the codepoint level. (Due to the way UTF-8 works, the bytes for ', &, ", <, and > can never appear as a byte inside a larger codepoint, so just looking at the byte level works too.)
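
To make the byte-level point concrete, here is a minimal sketch in Python (not Roc, and not roc-html's actual code). The function name `escape_html` and the exact entity spellings are assumptions chosen for illustration; the entities follow the output shown elsewhere in this thread. The scan looks at one UTF-8 byte at a time and never groups codepoints into graphemes.

```python
# Hypothetical helper, not roc-html's implementation: maps the five special
# ASCII bytes to HTML entities (entity spellings are an assumption).
REPLACEMENTS = {
    ord("&"): b"&amp;",
    ord("<"): b"&lt;",
    ord(">"): b"&gt;",
    ord('"'): b"&quot;",
    ord("'"): b"&#39;",
}


def escape_html(text: str) -> str:
    """Escape `text` by scanning its UTF-8 bytes one at a time."""
    out = bytearray()
    for byte in text.encode("utf-8"):
        # Every byte of a multi-byte UTF-8 sequence is >= 0x80, so an ASCII
        # byte such as 0x3C can only ever be a real "<".
        out += REPLACEMENTS.get(byte, bytes([byte]))
    return out.decode("utf-8")


# "<" followed by a combining acute accent (U+0301) forms a single grapheme,
# but the "<" codepoint is still found and escaped.
tricky = "a<\u0301b>"
escaped = escape_html(tricky)
print(escaped)
```

A grapheme-based replacement would treat "<" plus the combining mark as one unit and miss it, which is the footgun described above.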

I think I have the escaping working in roc-html, and I created an opaque type SafeStr to make it harder to mess up. I can submit a PR soon if people think that's a good idea.
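
The thread does not show how SafeStr is defined, so the following Python sketch only illustrates the general idea of a wrapper type for already-escaped text. All names here (`SafeStr`, `escape`, `dangerously_mark_safe`, `element`) are hypothetical. Python cannot make the constructor truly private, so this is a convention rather than the compile-time guarantee an opaque type would give.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class SafeStr:
    """Hypothetical wrapper for text that is already safe to emit as HTML.

    By convention, build it only through escape() or dangerously_mark_safe().
    """
    value: str


def escape(raw: str) -> SafeStr:
    # html.escape replaces &, <, > and, with quote=True, both quote characters.
    return SafeStr(html.escape(raw, quote=True))


def dangerously_mark_safe(trusted: str) -> SafeStr:
    # Escape hatch: the caller vouches that `trusted` is safe markup.
    return SafeStr(trusted)


def element(tag: str, children: list[SafeStr]) -> SafeStr:
    # Only SafeStr children are accepted, so a raw str cannot slip through.
    for child in children:
        if not isinstance(child, SafeStr):
            raise TypeError("children must be SafeStr; call escape() first")
    inner = "".join(child.value for child in children)
    return SafeStr(f"<{tag}>{inner}</{tag}>")


user_input = "<script>alert('hi')</script>"
page = element("p", [escape(user_input)])
print(page.value)
```

Passing `user_input` directly to `element` raises a `TypeError`, which is the "harder to mess up" property: unescaped text has to go through `escape` or the explicitly named escape hatch.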

view this post on Zulip Richard Feldman (Jun 01 2023 at 02:23):

makes sense!

view this post on Zulip Hannes (Jun 01 2023 at 07:22):

Ajai Nelson said:

I think I have the escaping working in roc-html, and I created an opaque type SafeStr to make it harder to mess up. I can submit a PR soon if people think that's a good idea.

Sounds great! One of the reasons I wanted to package it up was so that it could get PRs like this :)

view this post on Zulip Hannes (Jun 05 2023 at 02:31):

Just released v0.2.0, thanks @Ajai Nelson! Here's an example app:

app "html-example"
    packages {
        pf: "https://github.com/roc-lang/basic-cli/releases/download/0.3.1/97mY3sUwo433-pcnEQUlMhn-sWiIf_J9bPhcAFZoqY4.tar.br",
        html: "https://github.com/Hasnep/roc-html/releases/download/v0.2.0/5fqQTpMYIZkigkDa2rfTc92wt-P_lsa76JVXb8Qb3ms.tar.br",
    }
    imports [
        pf.Stdout,
        html.Html,
    ]
    provides [main] to pf

main =
    Html.html [] [
        Html.body [] [
            Html.h1 [] [Html.text "Epic Hacking Website"],
            Html.p [] [
                Html.text "Here's some sneaky JavaScript code to hack your computer:",
                Html.text "<script>alert('You have been hacked!')</script>",
            ],
        ],
    ]
    |> Html.render
    |> Stdout.line

<!DOCTYPE html>
<html>
  <body>
    <h1>Epic Hacking Website</h1>
    <p>
      Here&#39;s some sneaky JavaScript
      code to hack your computer:&lt;script&gt;alert(&#39;You have been
      hacked!&#39;)&lt;/script&gt;
    </p>
  </body>
</html>

view this post on Zulip Éber Freitas Dias (Jun 06 2023 at 15:00):

Hi @Hannes. Maybe this is a silly question, but is roc-html targeted only at generating HTML strings, or do you support (or plan to support) using it for DOM nodes as well, in case Roc wants to do things similar to what Elm does?

view this post on Zulip Brian Carroll (Jun 06 2023 at 22:06):

I can answer that! I wrote the code that became roc-html.

A virtual DOM, like Elm has, is a much, much bigger job than the static HTML library. I started working on it last year but the recursive data structures triggered some compiler bugs, so I put it on hold.

I think most of the bugs are fixed now. But as far as I know the code still doesn't run. I am working on other projects at the moment.

So it may get picked up again at some point but it's not being actively worked on.

view this post on Zulip Hannes (Feb 11 2025 at 12:02):

I've just released roc-html v0.7.0 which supports Roc v0.0.0-alpha2. Thanks @Luke Boswell :)

view this post on Zulip Kilian Vounckx (Feb 11 2025 at 12:13):

LoL I literally needed this yesterday, so I cloned it and changed it locally. Thanks for the update!

view this post on Zulip Drew Lazzeri (Mar 18 2025 at 17:43):

@Hannes Does roc-html only use the parenthesized function call style now that it's using Roc v0.0.0-alpha2? Asking for an elm-html fan who could still live with parens <3

view this post on Zulip Anton (Mar 18 2025 at 18:15):

Both styles still work right now


Last updated: Jul 06 2025 at 12:14 UTC