I know this probably should be another topic... but if we're doing multi-line strings like that, I'd vote for just two quotes "". My thinking is that inline strings are contained between two " characters, and so multi-line strings are also using two "", and it's less to type out, and my editor automatically adds two when I type " or four if I type """ (so I'd need to remove the last " using delete if we went with three).
I also don't mind zig's syntax. In a world where Zig and Roc are mainstream it would be nice to use the same syntax.
How would you write an empty string then?
You don’t want that to accidentally trigger this multiline behavior
I guess we can't do the zig thing because we don't have a ; delimiter
I guess then mine is just a comment that three """ is a pain right now with my editor. But maybe that could be changed using some config?
Sounds like multiline strings will already be a problem for you, for the same reason.
Luke Boswell said:
I guess we can't do the zig thing because we don't have a ; delimiter
Or maybe we could? just between \\ and \n
Yeah the zig double backslash syntax would work fine
Also surely this editor does the right thing for python (which already does """ strings)
Joshua Warner said:
Sounds like multiline strings will already be a problem for you, for the same reason.
No it's not currently... if I type five " I get six, and then I just move the cursor back to the middle and split it.
Ohhhh interesting
What editor is this? I feel like this ought to be fixable with proper per-language configuration
Zed
Fortunately I know someone who can make sure this works properly there :wink:
Yeah, it feels like a really minor point to be bringing up
Side by side
multiline_str =
\\{
\\ "foo": "bar",
\\ "baz": true
\\}
multiline_str =
"""{
""" "foo": "bar",
""" "baz": true
"""}
I feel like odd number quotes never works correctly for me
Helix and I don't recall it working in my vim setup before that
Always have to manually delete 1 cause it makes even pairings
I'm open to experimenting with other syntaxes for that!
What does helix do for python's """ strings?
We aren't currently using backticks (`) for anything, what about using those Zig-style?
multiline_str =
`{
` "foo": "bar",
` "baz": true
`}
I don't think it would have any problems for markdown snippets since only one at a time doesn't confuse full ``` blocks
And though this wouldn't work for inline markdown blocks, this is specifically a multiline syntax, so I don't think that's a problem
I also think just single quotes would work for this, since they currently need to end on the same line for char literals:
multiline_str =
'{
' "foo": "bar",
' "baz": true
'}
I don't like backticks in language syntax because they mess up markdown when you're trying to talk about code :sweat_smile:
single quote is interesting!
:thinking: I suppose we could even do just normal double quotes at that point
Actually yeah lmao
Just do double quotes
just allow unclosed double quotes as long as there's more than one of them consecutively
It was too obvious
A downside to using double quotes for single-line and multi-line quotes is that auto-"bracket" closing in editors will usually make a second quote when you type the first quote
So when you're trying to type these, you'll have to manage the second quote created all the time
Maybe we'd want to avoid adding double quotes to the auto closing "brackets" in community-maintained editor configs
good point
do they do that with single quotes?
Zed does
hey, you know what's unused?
multiline_str =
;{
; "foo": "bar",
; "baz": true
;}
You know what? Sure
And it means that we can't add them for anything else
Which is the real win
I would like this because it's so easy to see where the edge is...
multiline_str =
|{
| "foo": "bar",
| "baz": true
|}
...but then you couldn't use them in patterns
which is probably fine?
Richard Feldman said:
do they do that with single quotes?
It's per-language config, seems like Rust doesn't auto-pair single quotes
but at the same time, something that feels it should work
I really like how the pipes line up
But or for pattern alternation makes less and less sense in the brackets-dominated Roc we're moving towards
I think it'd be okay, though
you think or in patterns would be ok?
Yes
I do too, although I don't have strong feelings about it
I just still think | is much more visually distinct between adjacent patterns, so I'd prefer to find something else for multiline string prefixes
I think the auto-pair thing for strings is really minor, and using " for all strings just Makes Sense
multiline_str =
"{
" "foo": "bar",
" "baz": true
"}
definitely confuses today's formatters when there are quotes in there :laughing:
multiline_str =
"Let's think the unthinkable,
"let's do the undoable.
"Let us prepare to grapple with the ineffable itself,
"and see if we may not eff it after all.
How much do multiline strings have single quotes in them?
I dunno, but I think double quotes seem like the most reasonable default choice
just because you look at it and are like "yep, got it"
They'd all need to be escaped if they aren't followed by something else
I’d bet a lot of use cases for multiline strings have double quotes in them
oh to be clear, I think that's fine
actually...the lexer might not think it's fine :sweat_smile:
without looking ahead a line
I mean it's doable but not efficient
Which was the whole point of having a prefix per line
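To make the "no lookahead" point concrete, here is a minimal sketch in Python (not Roc, and not the actual Roc lexer). It assumes the Zig-style \\ prefix discussed above; `classify_line` and its return labels are made up for illustration. Each line is classified on its own, without inspecting the previous or next line.

```python
PREFIX = "\\\\"  # two backslash characters, as in the Zig-style proposal


def classify_line(line: str) -> tuple[str, str]:
    """Classify one source line without looking at its neighbours.

    Assumption: a line whose first non-whitespace characters are the
    prefix is a multiline-string line, and everything after the prefix
    is taken verbatim (no escapes).
    """
    stripped = line.lstrip(" \t")
    if stripped.startswith(PREFIX):
        return ("string_line", stripped[len(PREFIX):])
    return ("code", line)


source = [
    r'multiline_str =',
    r'    \\{',
    r'    \\ "foo": "bar",',
    r'    \\ "baz": true',
    r'    \\}',
]
kinds = [classify_line(line)[0] for line in source]
```

With a bare " as the prefix, the same function could not tell an ordinary single-line string from a multiline-string line without more context, which is the lexer concern raised above.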
Would be nice to have a good answer to “the regex problem” - ie regexes start to be really hard to read if you have to escape things both at the regex level and the string level
well, a prefix that wouldn't need to be escaped I suppose
Currently roc multiline strings do that pretty well
Don’t want to lose that
personally, I would describe the regex problem as the problem where you don't have a more ergonomic way to describe type-safe string pattern searching than regexes
I think regexes should be an end-userspace thing
e.g. you're writing a tool that supports letting the user do regex search
that is, I think hardcoding regexes in code is a code smell
in significant part because they're either so trivial you could write them without a regex, or else they're so complicated that doing them as a regex (e.g. instead of a simple, composable parser library) makes them "write once, read never"
but anyway, neither here nor there :big_smile:
multiline_str =
"""{
""" "foo": "bar",
""" "baz": true
"""}
Yeah, trying to use that as a typical example, but far from the only case
the triple quotes work for that purpose
since triple quotes tend not to occur in string literals
Is there a way to get popular editors to put the triple quote prefix on the next line when you hit enter?
If so, then it's okay
I know helix does that for comments
Maybe there's some equivalent for VS code?
yeah and bullet points in markdown
not sure how much it generalizes
Oh, that would probably generalize
I'll look into it later
JK, we just use "comments" in Helix: https://github.com/helix-editor/helix/issues/12782
VS code has an actual tool for this which seems to be per-language: https://code.visualstudio.com/api/language-extensions/language-configuration-guide#on-enter-rules
Okay, so since we can get the editor to add the """ it should be a good candidate
Assuming we don't use | after changing to or for pattern alternation
A downside to using a single " as the prefix is that then you couldn't have a single line multi line string with unescaped double quotes in it. I like that feature of the current multiline string syntax
"""{"these quotes": "aren't manually escaped""""
I'm a bit worried that this style of multiline strings will make it annoying to copy and paste text.
When you want to paste multiline text in roc, then you will have to add e.g. """ before each line.
And when you want to copy this text out of roc then you will have to remove the """ before each line.
"""Lorem ipsum dolor sit amet
"""consectetur adipiscing elit,
"""sed do eiusmod tempor.
Where this problem doesn't occur in the current syntax:
"""
Lorem ipsum dolor sit amet
consectetur adipiscing elit,
sed do eiusmod tempor.
"""
repeated """ on every line feels strictly worse than wrapping with """. Still has the odd number quote problem and doesn't really have any gains for users.
I think repeated anything on every line is probably worse. However, it solves the question of “is the indentation included in the resulting string or not?”
!No question,
!the resulting string won’t have any
!indentation at the start of each line
yeah, if you want to copy/paste a big chunk of text, I'd say putting it in a separate file and importing it as a Str is a reasonable choice
the problem with most multiline string implementations is that either they silently trim leading whitespace or else they silently mean something else if you change the indentation level of whatever code block they're in, both of which are really undesirable properties of a string literal
a "leading edge is marked explicitly" design may be less convenient to write, but once you've written it, it's really clear what the actual string in question is, and you can move it around in the code base without changing its meaning
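A rough Python sketch of that property, assuming a per-line """ prefix (the helper `prefixed_value` is hypothetical, not part of any Roc tooling). The value is the same at any indentation level, whereas a block-delimited literal (Python's own, used here only for contrast) changes when the indentation changes.

```python
def prefixed_value(lines: list[str], prefix: str = '"""') -> str:
    """Join prefix-marked lines; whitespace before the prefix is ignored."""
    parts = []
    for line in lines:
        stripped = line.lstrip(" \t")
        if not stripped.startswith(prefix):
            raise ValueError(f"line does not start with {prefix!r}: {line!r}")
        parts.append(stripped[len(prefix):])
    return "\n".join(parts)


shallow = ['"""{', '"""  "foo": "bar"', '"""}']
deep = ["        " + line for line in shallow]

# Re-indenting the code does not change the string.
same_value = prefixed_value(shallow) == prefixed_value(deep)

# Block-delimited literals: the same text at two indentation levels
# gives two different values unless something trims it afterwards.
block_shallow = """
{
}
"""
block_deep = """
        {
        }
        """
blocks_differ = block_shallow != block_deep
```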
as an aside, I could see an argument for optionally allowing a \ at the end of each line of these
they'd get discarded by the parser, but the purpose would be:
an alternative way to support that use case would be to require a closing delimiter on every line of the multiline string, but that would be annoying and not beneficial in 99% of cases :sweat_smile:
I also heard someone mention in another thread that using quotes on every line makes parsing easier, more parallelizable, and more error tolerant, cuz a parser can figure out if it's in the middle of a multi-line string.
But is this ambiguous?
x =
"".string_method()
Is that the string ".string_method()", or is it actually calling the method on the empty string?
The nice thing about an odd number of quotes is that you know you're in the middle of a string, and haven't closed it.
(But I also have editor problems with 3 quotes)
yeah triple quotes seems like the frontrunner
Could always go with the basic solution of multiple single line strings (though that doesn't help with character escaping)
In languages that allow that, there's a significant footgun where you can be making a list of strings (each on a separate line), and you forget the comma between two of them (at the end of the line) and -boom- surprising behavior.
> is commonly used to mean a text quote. It's also only one character instead of """. Would that be a reasonable symbol?
>This is how we Roc, this is how we win
>This is how we move it
>This is how we Roc, this is how we win
>This is how we move it
I think that's ambiguous with greater than:
is_greater_than =
long_variable_name
> other_long_variable_name
multiline_text =
>start of text
> other_long_variable_name
We can disambiguate by knowing if some first line starts with >, but then we have to know the shape of prior lines to tokenize
Great, because I still prefer:
"""
foo
bar
"""
:big_smile:
Yep, that's what most languages use, it's really obvious
I wonder if this is the only feature that blocks us from being able to tokenize a file in chunks
It’s also not clear whether (and how much) of the indentation white space is included
Prefixing each line with something makes that super clear
Also this is the last remaining “indentation sensitive” thing (for exactly that reason)
It’s kinda ambiguous how much indentation we should remove if the multiline string didn’t start on its own line (eg perhaps there were some utf8 chars earlier on the line - how wide are those?)
yeah my overall feeling is that maximum clarity is good with these
so when you look at it you can instantly and unequivocally tell where the leading edge is
no stress about indentation level potentially changing anything
Separately, maybe requiring a backslash at the end of lines that want to preserve trailing whitespace would be good? Seems invisible otherwise
Can we just do the basic quotes that auto merge. I guess it doesn't help with escape characters and that is the main drawback:
str =
"This is line one"
" This is line two with whitespace. "
"Line 3"
Yep, escape chars look better
I also worry about putting those in a list and not realizing you missed a comma at the end of one line
(e.g. they were meant to be separate strings but they were accidentally combined)
Good point
Yeah, python has that syntax with the idea that you can split long strings into multiple lines with the lines automatically being concatenated. Accidentally using that syntax has caught me and my colleagues out so many times that I always enable the linting rule to warn about it. :sweat_smile:
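For anyone who hasn't hit this, a small Python example of the footgun (the list contents are made up):

```python
colors_intended = [
    "red",
    "green",
    "blue",
]

# The comma after "green" is missing, so Python joins the two adjacent
# literals into one string instead of reporting an error.
colors_typo = [
    "red",
    "green"
    "blue",
]

print(len(colors_intended))  # 3
print(colors_typo)           # ['red', 'greenblue']
```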
I want to register that I _really_ like the idea of the | syntax. But I also really like or for pattern alternatives.
I think you can also keep | for pattern alternatives even using it for multiline strings - if we don't allow multiline string patterns
Do we even allow multiline string patterns today?
Here's the worst case scenario doing that:
Blue | Red ->
|This is either red or blue.
|It's a color I like.
|This is my poem.
Green | Yellow ->
|I don't know how I feel
|about these colors.
So pattern alternation on either side of a multiline string with the | syntax
Here's another interesting take that's less ambiguous:
Blue | Red ->
"|This is either red or blue.
|It's a color I like.
|This is my poem.
"
Green | Yellow ->
"|I don't know how I feel
|about these colors.
"
A double quote followed by a | means we consume the rest of this line, and all following lines that begin (after whitespace) with a |, until we encounter a line that has " as the first non-whitespace character, and then we stop. We trim all | and the trailing newline and concat the lines with newlines.
The only downside I see here is that a | at the start of a string would have to be escaped in a normal string
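The rule described above can be sketched in Python like this. It is only an illustration under assumptions: `parse_quote_pipe` is a hypothetical helper, it finds the opening "| with a naive search, and it ignores escapes and interpolation entirely.

```python
def parse_quote_pipe(lines: list[str], start: int) -> tuple[str, int]:
    """Parse a '"|' multiline string whose opening '"|' is on lines[start].

    Returns the string value and the index of the line after the
    closing quote.
    """
    first = lines[start]
    open_at = first.find('"|')  # naive: assumes no earlier string on the line
    if open_at == -1:
        raise ValueError('expected "| on the first line')
    parts = [first[open_at + 2:]]

    i = start + 1
    while i < len(lines):
        stripped = lines[i].lstrip(" \t")
        if stripped.startswith('"'):
            # Closing quote is the first non-whitespace character: stop.
            return "\n".join(parts), i + 1
        if not stripped.startswith("|"):
            raise ValueError(f"line {i} must start with | or a closing quote")
        parts.append(stripped[1:])
        i += 1
    raise ValueError("unterminated multiline string")


lines = [
    'Blue | Red ->',
    '    "|This is either red or blue.',
    "     |It's a color I like.",
    '     |This is my poem.',
    '    "',
    'Green | Yellow ->',
]
value, next_line = parse_quote_pipe(lines, 1)
```

Note that the loop has to carry state from the opening line to the closing quote, so this form cannot be tokenized one line at a time.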
Blue | Red ->
"|This is either red or blue.
|It's a color I like.
|This is my poem.
"
just some polishing of the idea from my side. "| and | are not aligned. there's a whitespace before the |. maybe the first line might be without the vertical line? the same way as the last one?
Blue | Red ->
"This is either red or blue.
|It's a color I like.
|This is my poem.
"
from what I understand - the main problem is that there should be a delimiter to help visually see the indentation of the multiline string contents? and avoid much of the hassle of escaping?
I don't know what to suggest for the first problem, but for the second, I really like how raw strings in rust work
We can’t do this. We need to know it’s multi line without looking at the next line
Kiryl Dziamura said:
Blue | Red ->
"|This is either red or blue.
|It's a color I like.
|This is my poem.
"
just some polishing of the idea from my side. "| and | are not aligned. there's a whitespace before the |. maybe the first line might be without the vertical line? the same way as the last one?
I guess I personally don’t see how they are not aligned? All white space before the | on the continuing lines is ignored so you can line them up perfectly.
I mean, the number of spaces. Not a problem for the formatter ofc
| alone doesn't work because that looks waaaay too much like the start of a closure
"| doesn't work because that looks waaaay too much like a string that happens to start with |
"| could be made to work if you had to escape pipes in strings (or at a minimum, pipes that occur as the first character of a string), but IMO that's getting rather crazy
Joshua Warner said:
"| could be made to work if you had to escape pipes in strings (or at a minimum, pipes that occur as the first character of a string), but IMO that's getting rather crazy
This is exactly what I suggested. It's either that or go back to """ at the beginning of the line like Richard originally suggested
Anthony Bullard said:
The only downside I see here is that a | at the start of a string would have to be escaped in a normal string
For reference Joshua.
Multiline strings will be rare, right? The tradeoff of or for patterns and | for multiline strings vs. | for patterns and """ for multiline strings is mainly predicated on frequency for me
And I'm pretty sure that we'll have dozens of times more match statements than multiline strings
That's a fair point, and why I'm suggesting "| alternative syntax
Which I actually kinda love now, and think pipes in the initial position of a string having to be escaped is a small price to pay
I think that even if multiline strings are rare, a string beginning with a pipe will be even more rare
I like the look, works for me
A completely different alternative is... remove multiline strings in favor of imported files
Nah, we should have at least some awkward option instead of nothing
It's good to colocate multiline strings with their usage
I actually have found in Go that I use //go:embed for multiline strings more often than not
My biggest usage of them is SQL
(Until I discovered sqlc)
sqlc and equivalent libraries are bae
I prefer both """ and \\ over "|
I also prefer status quo (indentation-sensitive """), as bad as it is, over "|
I don't think all forms of | are DOA, but IMO anything that requires you to escape pipes inside strings is going to cause a bunch of confusion for users
What about:
Blue | Red -> "
|This is either red or blue.
|It's a color I like.
|This is my poem.
"
Would there be trouble in tokenizing a match branch that starts with a pipe?
Note the hanging quote on the previous line
Yes, but now we can't tokenize per-line, right?
Or was that not gonna be an option
True, but that wasn't an option with the "| syntax either
This would require us to know that the last line started with a "
I also don't think that's a huge loss
Is it an option with """?
Yes
Seems like a nice benefit for larger files
We only need to read backwards (or forwards) until we see the hanging ". It's no longer per-line, but probably close enough in many cases.
The practical benefit of being able to tokenize lines separately is per-line regex-based syntax highlighting "just works"
Not sure if that's something that we need to optimize for tho?
Is there a good reason for distinguishing between single line and multiline string syntax?
I was thinking about this:
singleline = "hello world"
singleline_multiline = "hello\nworld"
multiline = "
hello
world
"
multiline_singleline = "
hello \
world
"
Although confusing for C/C++ people
That would likely introduce some difficulties in parsing although I'm not sure how serious those difficulties would be.
One of the nice parts of multiline strings is not needing to escape anything. You lose that with making it like single line strings. Have to escape quotes.
Why not raw strings then? Similar to rust but with interpolation
I think a very important property that we currently have and shouldn't give up is that you can indent multiline strings as much or as little as you want without affecting their meaning
it's a huge pain point to me that Rust's multiline strings don't have this property, and I don't think we should reintroduce that pain point
our original multiline string design didn't have that problem, but it did have the problem that it wasn't obvious where the leading edge was, plus it was whitespace-sensitive. I think it would be desirable to maintain that property too.
I've been thinking about the earlier discussion regarding whether newlines should be preserved (e.g. the "should you put a \ at the end?" question) and one idea we hadn't discussed is that we could have that at the beginning of the line instead, e.g.
multiline_str =
\"{
\" "foo": "bar",
\" "baz": true
\"}
# as if you'd ended each line with \
multiline_str =
\\{
\\ "foo": "bar",
\\ "baz": true
\\}
so the second one wouldn't end up with any newlines, it's just written in a multiline style
the syntax idea would basically be that you put the \ at the start of the line (replacing the quote) instead of at the end, and both styles are equally concise
How to understand where the multiline ends then? It was raised above - how would you put multiline to an array or tuple?
Comma on a new line should work well. But then, how to be sure that no comma was missed as it was pointed above?
What do you think about raw strings tho? Is it possible to have them? I can see how good they might work with comptime custom strings.
Kiryl Dziamura said:
What do you think about raw strings tho?
I think the indentation problem is a deal-breaker. I don't think Rust ever should have shipped them in their current form. :sweat_smile:
also I'm not worried about missing commas being a problem at all. I've used multiline strings in several languages and I can't think of a time I've ever seen one in a comma-delimited collection literal - so even if it did turn out to be a source of mistakes (which I don't think it would be), and then even if people didn't notice and correct the problem quickly (which I think they would), the blast radius of impact would be so small I don't think it's worth worrying about :smile:
another idea:
multiline_str =
\"{
\+ "foo": "bar",
\+ "baz": true
\+}
kinda like how some languages do:
multiline_str =
"{"
+ "\"foo\": \"bar\","
+ "\"baz\": true"
+ "}"
but without the indentation problems
eh I still prefer how this looks for that use case:
multiline_str =
\\{
\\ "foo": "bar",
\\ "baz": true
\\}
But trailing whitespace suffers then
I see these problems with multiline strings. Different approaches cover different needs
Is there anything else to add to the list?
I think it should always be safe to delete all trailing whitespace from a .roc file without changing what the program does
so an advantage of having a multiline string syntax where the lines get concatenated together without newlines is that you can put spaces at the start of a line instead
another potential design: only have one syntax for multiline strings (e.g. """ or \\ like Zig) and make it so all the strings get concatenated together without newlines, and you insert \n wherever you want a newline
this would be straightforward to understand but annoying when trying to write something "verbatim" in tests
another option: always keep the newlines, and then you can do something like .replace_each("\n", " ") - which would get evaluated at compile time
as an aside, something that's growing on me about zig's \\ syntax compared to """ is that it's super obvious that quotes don't need to be escaped inside zig's literals, and there's no awkwardness when you want to start out with a quote (\\"quoted" vs """"quoted")
Just don't start from backslash for win path :grinning_face_with_smiling_eyes:
Only joking
Richard Feldman said:
so an advantage of having a multiline string syntax where the lines get concatenated together without newlines is that you can put spaces at the start of a line instead
That sounds pretty awful. And if the line as a whole ends with a space you still hit issues.
Though trailing white space is rare in general...especially trailing significant whitespace...so not sure how much it matters
Personally my favorite so far is the simple \\ with no other rules. It just takes all text after that verbatim. No special line splitting. No special trailing white space handling...
That said, I feel like either of those could fit, but I think it gets cumbersome to think about
Brendan Hansknecht said:
Though trailing white space is rare in general...especially trailing significant whitespace...so not sure how much it matters
I'm definitely intending to emit a compiler warning for it. It's super error-prone to have trailing whitespace have semantic meaning.
Yeah, I both agree and debate if it matters for multiline strings..... I guess if you really want trailing whitespace, you can use txt file imports instead.
In markdown, trailing whitespaces are used to force a line break
https://www.markdownguide.org/basic-syntax/#line-breaks
yes, my experience with this design choice was what convinced me that semantic trailing whitespace is categorically a mistake :laughing:
I assume that the compiler just puts out trailing whitespaces as they are for multiline strings. I think it's strange to have those (WTF markdown?), but would not assume that the compiler just removes/ignores them.
My 2 cents: I love Zig's multi-line string syntax. I notice that in Zig I've never forgotten how multi-line string syntax works (because it's so simple), whereas in other languages I frequently find myself looking up the details to ensure I use it right (ruby in particular comes to mind).
Zig format's behavior of keeping whitespace at the end of lines has tripped me up though, in particular in tests. It's a combination of not seeing that the whitespace is there in the string literal, and then string-comparison asserts not giving the clearest errors when the difference between two strings is only in whitespace.
A formatter that removes all trailing whitespace seems like a net-positive to me.
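A small Python illustration of why whitespace-only differences are hard to spot in test failures (the strings are made up; this is not Zig's or Roc's test output). Printing both values looks identical, while `repr()` plus a line diff makes the stray space visible.

```python
import difflib

expected = "name: roc\nkind: language\n"
actual = "name: roc \nkind: language\n"  # note the space after "roc"

# A plain comparison only tells us the strings differ, not where.
strings_differ = expected != actual

# repr() quotes each line, so the trailing space shows up inside the
# quotes, and ndiff marks which line changed.
diff = list(difflib.ndiff(
    [repr(line) for line in expected.splitlines()],
    [repr(line) for line in actual.splitlines()],
))
print("\n".join(diff))
```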
Tossing some ideas around... A way to indicate the format of the thing inside the string, that could be used by editors to provide syntax highlight, and maybe change the formatter behaviour wrt trailing whitespaces? Or is it too much complexity?
Trailing whitespaces can be added with interpolation anyway
or Unicode escape sequences :smile:
Last updated: Jun 16 2026 at 16:19 UTC