I'll see what I can do to help @Martin Janiczek
I'll get started on implementing toU8 and work my way through all the Num types. Once that's done, I can use them all for toNum.
it's easier to make a toNum builtin I think
then toU8 can just be a specialized version of that
I can make some zig foundations for that too
Don't feel pressured @Anton , thanks for the willingness to help :)
@Martin Janiczek it's cool, I am happy to do it!
This might be a hard first builtin :sweat_smile:
I mostly got the backend part I think
an open question is what the errors should be. Zig gives some info but we probably don't want to just follow that
with backend I mean the lowlevel
in fact, I can push my current state which kinda should be good enough for integers, and you can work from there @Anton ?
Pushed my work to the str-to-num branch
it should have working code for the integers, floats should follow from that straightforwardly, decimal I didn't really look at yet
and I guess for now we can make the type Str -> Result (Num *) {}?
Awesome, thanks @Folkert de Vries I'll check it out
I think a type of Str -> Result (Num *) [ InvalidStr ]* would be best
the wording on the error type for this one is tricky because (for example) if Num * becomes I8 then a string of "1.1" becomes an invalid input, and an error type of InvalidNum might look weird there ("huh? 1.1 is a valid number!")
and we can't really get more specific than that without making exhaustiveness checks weird - e.g. if it were like [ IntWithDecimalPoint, UnexpectedNonDigits ]* the first one wouldn't be possible if Num * unified to F64
fortunately, in practice I expect it to be extremely rare for anyone to want a more specific error than "it didn't work," so I think we're okay just having one error type :big_smile:
maybe, Str -> Result (Num *) [ NotANumber ]*
lol
I love these tags
I'm kinda joking, InvalidStr is fine
NaN is totally a valid tag, but of course if num == NaN won't type check if num is actually a number :big_smile:
:thinking: actually I wonder if we'll end up wanting to special-case error messages around the NaN tag in case people actually try to do that!
example?
like what kind of error message
oh just like a hint "In Roc, NaN is not a number type, but rather a tag like Ok, Err, True, False, etc."
amusingly, == NaN would always return false in languages where NaN is a number, so arguably the type mismatch Roc would give you is actually more helpful than having it "Just Work" :laughing:
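For reference, a minimal Rust sketch of the "== NaN always returns false" behaviour mentioned above. Rust is only standing in here as a language where NaN is a floating-point value; the function names are made up for illustration.

```rust
/// Compares `x` against NaN with `==`.
/// Under IEEE 754 this is false for every input, including NaN itself.
pub fn equals_nan(x: f64) -> bool {
    let nan = f64::NAN;
    x == nan
}

/// The check that actually works when NaN is a number value.
pub fn is_nan(x: f64) -> bool {
    x.is_nan()
}
```

Calling `equals_nan(f64::NAN)` gives `false`, while `is_nan(f64::NAN)` gives `true`, which is the trap a Roc type mismatch would sidestep.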
How about InvalidNumStr? InvalidStr makes me think I have an improperly decoded String or something.
The original ExpectedNum a is also pretty good, or can't we make builtins with type variables in the errors yet?
InvalidNumStr sounds good! :thumbs_up:
I think we shouldn't actually do the ExpectedNum a thing because it wouldn't work with an Abilities design for numbers
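A small Rust sketch of the "one error tag for everything" idea settled on above. The `NumStrError` enum and the `to_i8` / `to_f64` functions are hypothetical stand-ins, not the Roc builtins; they only show that "1.1" is invalid for one target type and valid for another while the error stays the same.

```rust
/// Hypothetical single-variant error, mirroring the `InvalidNumStr` tag.
#[derive(Debug, PartialEq)]
pub enum NumStrError {
    InvalidNumStr,
}

/// Every failure (decimal point, non-digits, overflow) collapses to one tag.
pub fn to_i8(s: &str) -> Result<i8, NumStrError> {
    s.parse::<i8>().map_err(|_| NumStrError::InvalidNumStr)
}

/// Same error type, but "1.1" is a perfectly good input here.
pub fn to_f64(s: &str) -> Result<f64, NumStrError> {
    s.parse::<f64>().map_err(|_| NumStrError::InvalidNumStr)
}
```

Because there is only one tag, a caller never has to match on a case that cannot occur for the type they asked for.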
Richard Feldman said:
amusingly, == NaN would always return false in languages where NaN is a number, so arguably the type mismatch Roc would give you is actually more helpful than having it "Just Work" :laughing:
I see so this is more for giving a helping hand to people coming from a language like JavaScript for example
cool idea, that kind of friendliness goes a long way
Anton passed me the baton on this, I'll see if I can wrap it up within the hr or so
how do I use a Dec again?
I have to force it in the type signature right?
yes
ok, I almost have this done, I'm just running into an issue with the bitcode functions atm
'Unrecognized builtin function: "roc_builtins.str.to_int.i64"'
I see the issue
you know Anton was working on this too?
He told me to take over and that he'll pick up where I left off tomorrow
I'm using his branch
thread 'main' panicked at 'Found StructValue(StructValue { struct_value: Value { name: "call_builtin", address: 0x600001773e20, is_const: false, is_null: false, is_undef: false, llvm_value: " %call_builtin = call %\"num.NumParseResult(i64)\" @roc_builtins.str.to_int.i64(%str.RocStr %\"#arg1\"), !dbg !488", llvm_type: "%\"num.NumParseResult(i64)\" = type { i8, i64 }" } }) but expected PointerValue variant', /Users/rvcas/.cargo/git/checkouts/inkwell-85610d8ccb0c28f9/14b78d9/src/values/enums.rs:285:13
I kinda have no clue what that means tbh
set RUST_BACKTRACE=1 to figure out where that cast was called
oh right
Screen-Shot-2021-12-01-at-4.45.33-PM.png
yo this M1 is so fast, I have zero fear of a full rebuild takes me max 48s lmao
seems to be related to how things are returned from zig
I probably need to look at fromUtf8C
ok I know what to do, just a sec
hmm for decimal it may be that zig returns via a pointer because the { errcode, value } struct is bigger than an i128
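To make the size argument concrete, here is a Rust sketch that measures a C-layout struct shaped like the `{ errorcode, value }` result. The struct is a hypothetical mirror of the Zig type; exact sizes depend on the target's alignment rules, so the only claim is that the `i128` version is larger than an `i128`.

```rust
use std::mem::size_of;

/// Hypothetical mirror of the Zig-side `{ errorcode, value }` struct.
#[repr(C)]
pub struct NumParseResult<T> {
    pub errorcode: u8,
    pub value: T,
}

/// Returns (size of i128, size of the i64 result, size of the i128 result).
pub fn result_sizes() -> (usize, usize, usize) {
    (
        size_of::<i128>(),
        size_of::<NumParseResult<i64>>(),
        size_of::<NumParseResult<i128>>(),
    )
}
```

The `i64` variant still fits in 16 bytes, but the `i128` variant cannot, which is consistent with the guess that the decimal version is returned through a pointer.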
oh sorry, Int doesn't work yet
I think I'm stuck, trying to mimic fromUtf8C didn't work :'(
Folkert de Vries said:
hmm for decimal it may be that zig returns via a pointer
the above output was from running Str.toNum "1"
so where is that invalid cast?
also when dealing with records, keep in mind that we reorder fields
and so on the zig side the fields must also be in the right order
Folkert de Vries said:
so where is that invalid cast?
oh I'm dumb, I should have walked down the stack trace further
let me do that again
it's hard to tell to be honest, I just ran the stack trace again
I have everything as is pushed, I didn't bother committing my attempt at mimicking what fromUtf8 does
found the issue, can't fix it now. The problem is that the zig code returns a struct { errcode, value }, but the roc function needs to return a result
we have some examples of that
num_overflow_checked in can/src/builtins.rs is probably a good example
ah I see, so my suspicions were correct but I was looking at a bad example. Thanks, I'll see if I can work with that
I'm about to continue this
@Folkert de Vries I tried your suggestion, I think I've made progress but it's not quite there yet
I pushed those changes, I use an If Expr to check the returned record's "error code" field for a value greater than zero
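The shape of that wrapper, sketched in Rust rather than in the compiler's own expression builder. `NumParseResult`, `InvalidNumStr`, and `into_result` are illustrative names; the only part taken from the discussion is the rule "error code greater than zero means failure".

```rust
/// Hypothetical mirror of the record returned by the low-level function.
#[repr(C)]
pub struct NumParseResult<T> {
    pub errorcode: u8,
    pub value: T,
}

/// Hypothetical stand-in for the Roc error tag.
#[derive(Debug, PartialEq)]
pub struct InvalidNumStr;

/// Turn the raw record into a result by branching on the error code.
pub fn into_result<T>(raw: NumParseResult<T>) -> Result<T, InvalidNumStr> {
    if raw.errorcode > 0 {
        Err(InvalidNumStr)
    } else {
        Ok(raw.value)
    }
}
```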
I might be doing something wrong with the vars tho, it's a little hard to tell for me
got a test working, pushed, see bottom of gen_str.rs
nice
thanks
cool, I see your commit, a few subtle but key changes
I'm going to have to think about how to do Float
Screen-Shot-2021-12-02-at-2.50.00-PM.png
that layout makes sense because how would it even know the string contains a float in it before hand
I'm not sure it would make sense to try the parseFloat if parseInt fails
is there a run time cost to that?
or does it make more sense to have 3 builtins for this instead ?
float should work in an actual program
@Richard Feldman thoughts here? I'm not sure there is anything good we can do in this case
default to float?
Folkert de Vries said:
float should work in an actual program
I see, cause it can infer it based on usage
makes sense
it would be nice for it to just work in the repl though
but how could it?
I know, that's the question :'(
x = Str.toNum "1.0"
x + 2.0
this should work tho like you said
modulo error handling
yes
right right lol
I can add a test case for that at least
ok that passes now
should we bother supporting dec right now? or should we take the WIP prefix off the PR
dec has a fromstr already, so might as well implement that one?
oh right
I just need to switch the function that gets called in the build.rs match case then
I have a failing test case written already so I'll go ahead and try to make it pass
ok almost there: 'LLVM error: Did not get return value from bitcode function "roc_builtins.dec.from_str"'
pushed the changes as well
pub fn fromStr(arg: RocStr) callconv(.C) num_.NumParseResult(i128) {
    if (@call(.{ .modifier = always_inline }, RocDec.fromStr, .{arg})) |dec| {
        return .{ .errorcode = 0, .value = dec.num };
    } else {
        return .{ .errorcode = 1, .value = 0 };
    }
}
not exactly sure what I'm missing here
look at the LLVM IR,
how do I dump that from the test?
is there a print statement somewhere I can uncomment or add?
yes, in compiler/test_gen/src/helpers/llvm.rs
line 242 I think
right ok, that's what I thought, I'm going to take a break and come back to this in a bit
so I think this edge case means it can't be Num.fromStr
and instead the Roc API needs to be like Str.toU8, Str.toF64, etc. :sweat_smile:
because other things depend on any Num * value being an integer
and if we do Num.fromStr "1.1" and that fails because it was expecting an int, that's super confusing
whereas Str.toF64 "1.1" will obviously succeed, and Str.toI64 "1.1" will obviously fail
right
Str.toI64 "1.0" is a little less clear, but I think either design is fine; we could either allow it because it's an integer or give an error because it has a decimal point
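The two possible designs for "1.0" can be sketched side by side in Rust. Both functions are hypothetical: `to_i64_strict` rejects anything with a decimal point, and `to_i64_lenient` accepts a fractional part made only of zeros.

```rust
/// Design 1: a decimal point is always an error for an integer target.
pub fn to_i64_strict(s: &str) -> Option<i64> {
    s.parse::<i64>().ok()
}

/// Design 2: allow "1.0" because its value is an integer.
pub fn to_i64_lenient(s: &str) -> Option<i64> {
    match s.split_once('.') {
        // Accept a fractional part only if it is non-empty and all zeros.
        Some((whole, frac)) if !frac.is_empty() && frac.bytes().all(|b| b == b'0') => {
            whole.parse::<i64>().ok()
        }
        // Any other fractional part means the value is not an integer.
        Some(_) => None,
        // No decimal point at all: plain integer parsing.
        None => s.parse::<i64>().ok(),
    }
}
```

Both agree on "42" and on "1.1"; they differ only on inputs like "1.0".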
What if we had the tag be an argument?
Str.toNum "1.0" F64
:thinking: what would the type of that function be?
oh
A tag union could be the signature for the second arg?
I mean what's the type of that Str.toNum function?
And we fill it in with all available num types?
(like what's the return type specifically)
Hm right
e.g. Str.toI64 : Str -> Result I64 [ InvalidI64Str ]* - we know it gives an I64
but yeah
could it still not be Num?
I think if we try to have a more flexible function than that, we end up returning Num * which has the edge case problem
like we could try to do Str.toNum : Str, a -> Result (Num a) [ InvalidNumStr ]*
The return type isn’t exactly the issue it’s not knowing the type within the string for the layout without some usage for inference
unless I misunderstood the question
but now you can actually pass an a value, which we don't want because that's supposed to be a phantom type in Num
so there's no runtime overhead
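A rough Rust analogue of a phantom type parameter, for readers unfamiliar with the term. `Num`, `Integer`, and `Fraction` here are hypothetical and are not Roc's actual definitions; the sketch only shows that a type parameter can exist purely at the type level and take no space at runtime.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Marker types that never hold a value.
pub struct Integer;
pub struct Fraction;

/// Hypothetical `Num a`-like wrapper: `Kind` is only a compile-time label.
pub struct Num<Kind> {
    pub bits: u64,
    _kind: PhantomData<Kind>,
}

impl<Kind> Num<Kind> {
    pub fn from_bits(bits: u64) -> Self {
        Num { bits, _kind: PhantomData }
    }
}

/// True if the phantom parameter adds no bytes to the value.
pub fn phantom_is_free() -> bool {
    size_of::<Num<Integer>>() == size_of::<u64>()
        && size_of::<Num<Fraction>>() == size_of::<u64>()
}
```

Passing a real `a` value as an argument, as in the signature above, would defeat this: there would have to be something to pass at runtime.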
oh I see
yea so it works fine with usage allowing F64 to be inferred and then it hits the correct branch
but just stand alone in the repl it defaults to Int I64
x = Str.toNum "1.1" F64 |> Result.withDefault 0
x + 999999999999
so there we're saying "hey parse this as an F64" but its return type is still Num *
yea
Then I can match on the layout of ARG_2
so when we add it to 9999999999, all the Num.add is going to know is that it's a Num * + Num *
yep
that info won't make it to Num.add
dam
lol
I guess separate functions makes the most sense? But realistically it’ll be fine as is in a real program
yeah I think separate functions is the way to go
but we can probably get some good code reuse behind the scenes
Str.toInt and Str.toFloat then
the reverse doesn’t have that problem of course
Num.toStr works fine
This probably explains why most systems langs have different functions for that
agreed! :100:
Also Str.toDec of course
I’ll adjust that today
I think Str.toNum still works right?
oops
I know it’s been confusing me too
I mean Num.toStr still works
100%
one thing we could do is use the word parse
yea
Num.toStr takes Num * as the arg so we concretely know the layout before hand
like Num.parseI64 or Str.parseI64 or something
parse is cool, although I like the consistency with toStr
Do we want I64 etc. or all Int *?
So parseInt?
oh that's interesting
Oh actually you’re right
It has to be per type
so Int is probably fine, but if we did a parseFrac then it might need to be different between float vs dec because of NaN, Infinity, and -Infinity
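As an illustration of why floats are a special case, here is how Rust's standard parsers treat those three strings. This describes Rust's `f64` and `i64` parsing only; how Roc's float and Dec parsers should behave is exactly the open question above, and a fixed-point decimal presumably has no value to return for any of them.

```rust
/// True if Rust's standard library accepts `s` as an f64.
pub fn parses_as_f64(s: &str) -> bool {
    s.parse::<f64>().is_ok()
}

/// True if Rust's standard library accepts `s` as an i64.
pub fn parses_as_i64(s: &str) -> bool {
    s.parse::<i64>().is_ok()
}

/// Parses "NaN" as an f64 and reports whether the result is NaN.
pub fn nan_string_is_nan() -> bool {
    "NaN".parse::<f64>().map(|x| x.is_nan()).unwrap_or(false)
}
```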
Cause again it defaults to I64
oh yeah that's another point
:joy:
Ima sit down to eat brb
yeah like what if you do parseInt on a number that's too big for I64, but it would have fit in I128?
yea you right or even like if you want an I32 but it defaults to I64
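The width problem is easy to reproduce in any language with fixed-size integers. A Rust sketch, with a hypothetical generic helper `fits`, showing one string that overflows I64 but fits I128, and another that overflows I32 but fits I64:

```rust
use std::str::FromStr;

/// True if `s` parses successfully as the requested number type.
pub fn fits<T: FromStr>(s: &str) -> bool {
    s.parse::<T>().is_ok()
}

/// One more than the largest I64 value.
pub const JUST_PAST_I64: &str = "9223372036854775808";

/// Larger than the largest I32 value (2147483647).
pub const JUST_PAST_I32: &str = "3000000000";
```

With a single parseInt that silently defaults to I64, `fits::<i64>(JUST_PAST_I64)` is the failing case even though the caller may have wanted an I128, which is the argument for one function per type.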
If I'm following this correctly these are issues that arise only in the REPL because you don't have the full program to do type inference.
Could we say that the REPL input must have type annotations in certain cases? Do these issues go away then? Or is it too hard to make good error messages for that?
Like maybe in REPL mode it's an error to define a value whose type we can't resolve. You have to annotate it? Maybe that's too annoying, just an idea.
there is not really a good place for them to go
haskell has type applications for this sort of thing, so you can say identity @Word32 42 and that would provide all the required type info
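For comparison, Rust has a similar call-site mechanism (the "turbofish"). This is only a rough analogue of Haskell's type applications, shown with a hypothetical `identity` function and `u32` standing in for `Word32`:

```rust
/// Generic identity function, like Haskell's `identity`.
pub fn identity<T>(x: T) -> T {
    x
}

/// Picks the number type at the call site instead of annotating a binding.
pub fn demo() -> (u32, u8) {
    // `::<u32>` fixes the type of the literal 42.
    let a = identity::<u32>(42);
    // `::<u8>` tells the parser which number type to produce.
    let b = "42".parse::<u8>().unwrap();
    (a, b)
}
```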
these are issues that arise only in the REPL because you don't have the full program to do type inference.
they can happen even in a full program, e.g.
x = 5
if x - 1 > 0 then
should the branch get taken? In order to decide that, we need to evaluate x - 1 where both x and 1 have the type Num *
so in general I consider it valuable for Roc to maintain the invariant that you never need to add a type annotation to anything for any reason
among other things, this means that the editor feature of "highlight this to find out its type" works 100% of the time
I'd be very hesitant to give that up, especially when the parseI64 design is very simple and doesn't have this problem :big_smile:
Yea it’s just a bunch of copy and paste work tbh, it’ll be fine
did we settle on parse or to as the prefix?
parseI64 or toI64
toI64 seems better in every way :p
I like that it’s consistent with Num.toStr
yeah let's try Str.toI64 and see if people get confused in practice; can always try the longer parse if so!
I’m happy either way
I’ll carry on a bit later btw, it’s Art Basel weekend in Miami right now so stuff is pretty crazy and I have to do some running around
if anyone else wants they are welcome to continue no need to wait for me
I’m back home, I’ll pick this back up again in a sec
I haven’t forgotten about this. My b for not getting it done yesterday
tomorrow I should be good to go
Screen-Shot-2021-12-06-at-10.07.23-PM.png
we're cooking. so I was able to do it with the same low-level for all aliases
so it's just a matter of defining the builtins and mapping them to str_to_num in can builtins.rs
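The "one low-level, many thin builtins" structure can be sketched in plain Rust. This is not the code from the PR: `str_to_num`, the `str_to!` macro, and the generated `to_u8` / `to_i64` / `to_f64` functions are illustrative names showing how per-type entry points can share a single implementation.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the Roc error tag.
#[derive(Debug, PartialEq)]
pub struct InvalidNumStr;

/// The single shared implementation, generic over the target number type.
fn str_to_num<T: FromStr>(s: &str) -> Result<T, InvalidNumStr> {
    s.parse::<T>().map_err(|_| InvalidNumStr)
}

/// Generates one thin, concretely typed wrapper per number type.
macro_rules! str_to {
    ($($name:ident => $ty:ty),* $(,)?) => {
        $(
            pub fn $name(s: &str) -> Result<$ty, InvalidNumStr> {
                str_to_num::<$ty>(s)
            }
        )*
    };
}

str_to! {
    to_u8 => u8,
    to_i64 => i64,
    to_f64 => f64,
}
```

Each wrapper has a fully concrete return type, so nothing is left for a default like I64 to decide, while the parsing logic still lives in one place.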
are we happy with this PR? anyone got time to review?
https://github.com/rtfeldman/roc/pull/2116
I reviewed it but I couldn't approve because I started the PR.
merged! :raised_hands:
IT'S ALIVE!!!!
Awesome thank you
I’m gonna crawl back to wasm now :p
If anyone wants to explore why i128 and LLVM are not happy for the Dec/I128 functions feel free
Last updated: Jul 06 2025 at 12:14 UTC