Stream: contributing

Topic: Pygments


view this post on Zulip batyoullfly (Aug 07 2024 at 23:06):

Has anyone looked into adding Roc to the Pygments Python library? It seems to be what Zulip uses for code block syntax highlighting, so if a lexer for Roc was added, eventually Roc code blocks would work in Zulip (I think?)

view this post on Zulip Luke Boswell (Aug 07 2024 at 23:30):

I don't think it has been discussed before

view this post on Zulip Luke Boswell (Aug 07 2024 at 23:30):

Sounds awesome though :sunglasses:

view this post on Zulip batyoullfly (Aug 07 2024 at 23:34):

I took a look earlier and I’m not quite sure how to actually implement it even after looking at the existing elm lexer but I might take a stab at it…

view this post on Zulip Kiryl Dziamura (Aug 08 2024 at 15:31):

It actually was discussed, but never implemented I think

view this post on Zulip batyoullfly (Aug 08 2024 at 16:45):

If I were to implement this where would be the best place to get a full list of all keywords and other syntax?

view this post on Zulip Anton (Aug 09 2024 at 09:09):

I think this one is not completely up to date but a good overview anyway.

view this post on Zulip batyoullfly (Aug 09 2024 at 11:50):

Thanks! Don’t know when I’ll have time but I read through the documentation last night for adding a new lexer and it was very in depth and it seems pretty simple, so I’ll definitely do it

view this post on Zulip batyoullfly (Aug 10 2024 at 15:16):

Making progress!
With Pygments' Elm lexer:
elm.png
With the new Roc lexer:
roc.png

view this post on Zulip batyoullfly (Aug 10 2024 at 15:19):

I didn't realize there would be so many decisions to make. Like the Rust lexer treats number literal suffixes (u32, i32, etc.) as keywords so they're highlighted differently from the rest of the number, but using the Roc language server for Zed, the suffixes are highlighted the same color...
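
A minimal sketch of the two choices described above, using Pygments' `RegexLexer`. Both lexer classes and the suffix list are hypothetical and only cover integer literals; they are not taken from the Rust lexer or from the Roc lexer in this thread.

```python
from pygments.lexer import RegexLexer, bygroups
from pygments.token import Keyword, Number, Whitespace

# Illustrative subset of suffixes (an assumption, not a complete list).
SUFFIX = r"(?:u8|u16|u32|u64|i8|i16|i32|i64)"


class SuffixAsKeywordLexer(RegexLexer):
    """Hypothetical: emit the suffix as a separate Keyword.Type token."""
    name = "SuffixAsKeywordSketch"
    tokens = {
        "root": [
            # Two capture groups, two token types.
            (r"(\d[\d_]*)(" + SUFFIX + r")\b",
             bygroups(Number.Integer, Keyword.Type)),
            (r"\d[\d_]*", Number.Integer),
            (r"\s+", Whitespace),
        ],
    }


class SuffixAsNumberLexer(RegexLexer):
    """Hypothetical: keep the suffix inside the number token."""
    name = "SuffixAsNumberSketch"
    tokens = {
        "root": [
            (r"\d[\d_]*" + SUFFIX + r"?\b", Number.Integer),
            (r"\s+", Whitespace),
        ],
    }


def tokens_of(lexer_cls, text):
    """Return (token type, text) pairs, skipping whitespace-only tokens."""
    return [(t, v) for t, v in lexer_cls().get_tokens(text) if v.strip()]


print(tokens_of(SuffixAsKeywordLexer, "42u32"))
print(tokens_of(SuffixAsNumberLexer, "42u32"))
```

With the first lexer the theme can color `u32` differently from `42`; with the second the whole literal gets one color, which is closer to what the Zed highlighting is described as doing.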

view this post on Zulip batyoullfly (Aug 18 2024 at 13:45):

I think it might be hard to get a PR accepted considering Roc isn’t at a 1.0 release yet and elements of the syntax are still changing somewhat often, but there is a way to write a plugin for Pygments, which might be a better solution until things stabilize a bit. Would this be possible with the way this Zulip instance is hosted? I’m not really familiar with how this is all set up.

view this post on Zulip Richard Feldman (Aug 18 2024 at 13:50):

we got it accepted on GitHub - I'd think mentioning that in the PR would mean we'd have a good chance, yeah? :big_smile:

view this post on Zulip batyoullfly (Aug 18 2024 at 13:55):

Would almost definitely help, just also concerned about having to raise a PR every three months when something changes as I’m not sure how quickly things move for them typically. For reference the PR for adding Gleam support has been sitting for months

view this post on Zulip batyoullfly (Aug 18 2024 at 13:57):

Would certainly be easier to maintain a plugin that could be immediately updated as features are added and removed and I’m not sure there would be any demand for Roc support in this library outside of this Zulip

view this post on Zulip Richard Feldman (Aug 18 2024 at 14:00):

I don't think we need to update it that frequently honestly

view this post on Zulip Richard Feldman (Aug 18 2024 at 14:01):

the syntax doesn't change that much haha

view this post on Zulip Richard Feldman (Aug 18 2024 at 14:02):

almost all of the benefit is just getting the basic language in there...if we add some new sugar or something that isn't highlighted right away, it's not a big deal

view this post on Zulip batyoullfly (Aug 18 2024 at 14:07):

Fair enough, I’ll see if I can go ahead and get a PR raised today. Just need to do a little more testing

view this post on Zulip Sam Mohr (Aug 18 2024 at 15:08):

@batyoullfly can you add me to that? I've made some changes recently and I'd like to validate that they make it in!

view this post on Zulip batyoullfly (Aug 18 2024 at 15:17):

Yeah sure I haven’t published a fork or branch or anything yet but I’ll share it here later

view this post on Zulip batyoullfly (Aug 18 2024 at 21:28):

@Sam Mohr Here's the fork: https://github.com/ethannixon66/pygments

view this post on Zulip batyoullfly (Aug 18 2024 at 23:05):

Fixed a few issues and here's how things are looking now (using roc-parser as the example code, and the Zulip CSS theme):
image.png
image.png
image.png

view this post on Zulip batyoullfly (Aug 18 2024 at 23:06):

Would love some feedback if some things aren't looking quite right to anyone

view this post on Zulip batyoullfly (Aug 19 2024 at 01:05):

Just noticed an issue in one of those screenshots with the character regex matching the wrong apostrophe. I gotta fix that lol
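
The actual pattern isn't shown in the thread, so this is only a guess at how this class of bug tends to arise: a greedy pattern pairs the first apostrophe on a line with the last one. The sketch below uses plain `re`, and the "stricter" pattern is an assumption about the shape of a character literal (one ordinary character or one backslash escape), not the fix that was applied.

```python
import re

# Naive: greedy, so it spans from the first apostrophe to the last one.
NAIVE_CHAR = re.compile(r"'.*'")

# Stricter (assumed literal shape): exactly one ordinary character,
# or one backslash escape, between the apostrophes.
STRICT_CHAR = re.compile(r"'(?:\\.|[^'\\\n])'")

# The text being scanned is:  c == 'a' || c == '\''
line = "c == 'a' || c == '\\''"

print(NAIVE_CHAR.findall(line))   # one over-long match
print(STRICT_CHAR.findall(line))  # two separate character literals
```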

view this post on Zulip batyoullfly (Aug 19 2024 at 01:06):

other than that the only thing I’m not sure on is if the \ should be considered an operator or not. Considering it as punctuation makes it the same color as identifiers which seems harder to read

view this post on Zulip Richard Feldman (Aug 19 2024 at 01:13):

this is looking sweet! :smiley:

view this post on Zulip Richard Feldman (Aug 19 2024 at 01:13):

personally I like having \ and { and [ be highlighted :thumbs_up:

view this post on Zulip Richard Feldman (Aug 19 2024 at 01:14):

I also like to not highlight tag names like Ok and Err because they're not language keywords, they're just user-chosen names (so I prefer them to be highlighted like record field names or variable names, which are also user-chosen)

view this post on Zulip Sam Mohr (Aug 19 2024 at 01:47):

Looks like everything will work (from my phone), with the exception of Rich's requests. Can you check the new builder syntax looks good? Otherwise, I think it's pretty much ready for a PR?

view this post on Zulip batyoullfly (Aug 19 2024 at 02:11):

Richard Feldman said:

I also like to not highlight tag names like Ok and Err because they're not language keywords, they're just user-chosen names (so I prefer them to be highlighted like record field names or variable names, which are also user-chosen)

Hmmm, not sure if there’s a good way to deal with any of these other than \ which I can reclassify as an operator. Tag names are highlighted because I started from a base of the existing Elm lexer, which considers any token to be a type if it starts with a capital and has valid characters. And brackets are white because that’s how the theme styles punctuation tokens, and there’s not really anything else I could reclassify those as, I think.
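
A rough sketch of the rule being described, assuming a `RegexLexer` where any capitalized identifier becomes `Keyword.Type`, the backslash is an operator, and brackets are punctuation. The class name and the exact patterns are hypothetical and much smaller than a real lexer.

```python
from pygments.lexer import RegexLexer
from pygments.token import Keyword, Name, Operator, Punctuation, Whitespace


class CapitalizedNamesLexer(RegexLexer):
    """Hypothetical sketch: capital letter first means 'type'."""
    name = "CapitalizedNamesSketch"
    tokens = {
        "root": [
            # Tags, types, and module names all match this one rule.
            (r"[A-Z][A-Za-z0-9_]*", Keyword.Type),
            (r"[a-z_][A-Za-z0-9_]*", Name),
            # Backslash (lambda) and arrow classified as operators.
            (r"\\|->", Operator),
            (r"[{}\[\](),]", Punctuation),
            (r"\s+", Whitespace),
        ],
    }


source = "\\x -> Ok [x]"  # the Roc snippet:  \x -> Ok [x]
tokens = [(t, v) for t, v in CapitalizedNamesLexer().get_tokens(source) if v.strip()]
for tok, text in tokens:
    print(tok, repr(text))
```

Because the rule only looks at the first letter, `Ok` comes out as `Keyword.Type` even though it is a tag here. Swapping `Keyword.Type` for another token type in that one rule changes how every capitalized name is styled.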

view this post on Zulip batyoullfly (Aug 19 2024 at 02:12):

Could definitely narrow the type tokens down to just those that appear in a type annotation but doing the parsing for that using regex is gonna be a lot more difficult than just finding capitals lol

view this post on Zulip Richard Feldman (Aug 19 2024 at 02:45):

oh I think types should also be highlighted the same way as variables - does that help?

view this post on Zulip batyoullfly (Aug 19 2024 at 10:55):

Like everything would be white except for keywords and symbols?

view this post on Zulip batyoullfly (Aug 19 2024 at 11:05):

I guess the issue here is more just the theme Zulip uses. Looking through the lexers for other ML adjacent languages, they all seem to use the namespace, class, or type token for types / module names so I think one of those is what we should stick with probably
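
For reference, Pygments' HTML output tags each token with a short CSS class, and a theme decides the color per class. A small sketch that prints the classes for the token types mentioned here, using the `STANDARD_TYPES` mapping from `pygments.token`; which of these classes the Zulip theme gives distinct colors is not something this shows.

```python
from pygments.token import STANDARD_TYPES, Keyword, Name, Operator, Punctuation

# Token types that came up in the discussion.
candidates = [Keyword, Keyword.Type, Name.Namespace, Name.Class, Operator, Punctuation]

# Map each token type's name to the CSS class Pygments emits for it.
css_classes = {str(tok): STANDARD_TYPES[tok] for tok in candidates}

for name, css in css_classes.items():
    print(f"{name:<22} .{css}")
```

If a theme styles `.k` and `.kt` the same way, types and tags lexed as `Keyword.Type` will look like keywords, which is one reason to try `Name.Class` or `Name.Namespace` instead.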

view this post on Zulip Richard Feldman (Aug 19 2024 at 23:34):

batyoullfly said:

Like everything would be white except for keywords and symbols?

sorry, I should have been more specific on that - I don't like tags being highlighted the same way as language keywords because I think things that are hardcoded into the language should get separate highlighting from things that are user-specified

view this post on Zulip Richard Feldman (Aug 19 2024 at 23:34):

so for example, I don't think Account (whether it's a type or a tag) should be highlighted the same way as then

view this post on Zulip Richard Feldman (Aug 19 2024 at 23:35):

because Account is either a module name or a tag or a type, but regardless, a user chose that name

view this post on Zulip Richard Feldman (Aug 19 2024 at 23:36):

personally I actually like having it be highlighted the same way as either string or number literals if it's a tag, but I realize it may not be realistic for the parser to tell whether it's a tag vs a module name vs a type :big_smile:

view this post on Zulip Richard Feldman (Aug 19 2024 at 23:38):

so maybe a better way to say it is that my preference is:

view this post on Zulip batyoullfly (Aug 19 2024 at 23:56):

Oh okay. I can try switching to Class or Namespace to see if that highlights different from keywords, and I can reclassify , and \ as operators or something similar

view this post on Zulip batyoullfly (Aug 20 2024 at 01:22):

Okay, here it is with \ and , reclassified as operators and types/tags/modules reclassified as Classes
image.png

view this post on Zulip batyoullfly (Aug 20 2024 at 01:24):

Should = and : also be classified as operators rather than punctuation?

view this post on Zulip batyoullfly (Aug 20 2024 at 01:31):

That would look like this
image.png

view this post on Zulip batyoullfly (Aug 20 2024 at 01:35):

Sam Mohr said:

Looks like everything will work (from my phone), with the exception of Rich's requests. Can you check the new builder syntax looks good? Otherwise, I think it's pretty much ready for a PR?

image.png
This syntax?

view this post on Zulip Sam Mohr (Aug 20 2024 at 01:52):

batyoullfly said:

Sam Mohr said:

Looks like everything will work (from my phone), with the exception of Rich's requests. Can you check the new builder syntax looks good? Otherwise, I think it's pretty much ready for a PR?

image.png
This syntax?

Yes that one, looks good!

view this post on Zulip Luke Boswell (Aug 20 2024 at 02:29):

Can we check the unicode literal interpretation syntax "\u(001b)[32m" please?
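
One way to sanity-check that case outside the lexer is a small `re` sketch that splits a string body into escape and plain-text pieces. The escape shape (`\u(` followed by hex digits and `)`) is inferred only from the example in this message, and `split_string_body` is a hypothetical helper, not part of Pygments or the Roc lexer.

```python
import re

# Assumed escape shape, based on the example: \u( hex digits )
UNICODE_ESCAPE = re.compile(r"\\u\([0-9A-Fa-f]+\)")


def split_string_body(body):
    """Split the inside of a string literal into ('escape'|'text', piece) pairs."""
    pieces = []
    pos = 0
    for match in UNICODE_ESCAPE.finditer(body):
        if match.start() > pos:
            pieces.append(("text", body[pos:match.start()]))
        pieces.append(("escape", match.group()))
        pos = match.end()
    if pos < len(body):
        pieces.append(("text", body[pos:]))
    return pieces


# Raw string so Python leaves the backslash alone.
print(split_string_body(r"\u(001b)[32m"))
```

In a Pygments lexer the "escape" piece would typically be emitted as `String.Escape` and the rest as a plain string token, so the `[32m` that follows should not be picked up as brackets.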


Last updated: Jul 05 2025 at 12:14 UTC