Str.toNum · beginners · Zulip Chat Archive

the wording on the error type for this one is a bit tricky because (for example) if Num * becomes I8 then a string of "1.1" becomes an invalid input, and an error type of InvalidNum might look weird there ("huh? 1.1 is a valid number!")

Richard Feldman (Dec 01 2021 at 16:49):

and we can't really get more specific than that without making exhaustiveness checks weird - e.g. if it were like [ IntWithDecimalPoint, UnexpectedNonDigits ]* the first one wouldn't be possible if Num * unified to F64

Richard Feldman (Dec 01 2021 at 16:51):

fortunately, in practice I expect it to be extremely rare for anyone to want a more specific error than "it didn't work," so I think we're okay just having one error type :big_smile:

Lucas Rosa (Dec 01 2021 at 17:15):

maybe, Str -> Result (Num *) [ NotANumber ]* lol

Lucas Rosa (Dec 01 2021 at 17:15):

I love these tags

Lucas Rosa (Dec 01 2021 at 17:16):

I'm kinda joking, InvalidStr is fine

Richard Feldman (Dec 01 2021 at 17:38):

NaN is totally a valid tag, but of course if num == NaN won't type check if num is actually a number :big_smile:

Richard Feldman (Dec 01 2021 at 17:38):

:thinking: actually I wonder if we'll end up wanting to special-case error messages around the NaN tag in case people actually try to do that!

Lucas Rosa (Dec 01 2021 at 17:39):

example?

Lucas Rosa (Dec 01 2021 at 17:39):

like what kind of error message

Richard Feldman (Dec 01 2021 at 17:40):

oh just like a hint "In Roc, NaN is not a number type, but rather a tag like Ok, Err, True, False, etc."

Richard Feldman (Dec 01 2021 at 17:41):

amusingly, == NaN would always return false in languages where NaN is a number, so arguably the type mismatch Roc would give you is actually more helpful than having it "Just Work" :laughing:
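
To make the NaN point concrete, here is a minimal sketch in Rust (used only for illustration, not Roc code). Under IEEE 754, NaN compares unequal to everything, including itself, so an `==` check against NaN never succeeds. The function names are hypothetical.

```rust
/// Plain float equality, as `==` behaves under IEEE 754.
fn floats_equal(a: f64, b: f64) -> bool {
    a == b
}

/// The dedicated predicate is the reliable way to detect NaN.
fn is_nan(x: f64) -> bool {
    x.is_nan()
}
```

Calling `floats_equal(f64::NAN, f64::NAN)` yields `false`, while `is_nan(f64::NAN)` yields `true`, which is the silent failure a type mismatch would surface early.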

Anton (Dec 01 2021 at 18:56):

How about InvalidNumStr? InvalidStr makes me think I have an improperly decoded String or something.

Anton (Dec 01 2021 at 19:00):

The original ExpectedNum a is also pretty good or can't we make builtins with type variables in the errors yet?

Richard Feldman (Dec 01 2021 at 19:42):

InvalidNumStr sounds good! :thumbs_up:

Richard Feldman (Dec 01 2021 at 19:44):

I think we shouldn't actually do the ExpectedNum a thing because it wouldn't work with an Abilities design for numbers

Lucas Rosa (Dec 01 2021 at 20:49):

Richard Feldman said:

amusingly, == NaN would always return false in languages where NaN is a number, so arguably the type mismatch Roc would give you is actually more helpful than having it "Just Work" :laughing:

I see so this is more for giving a helping hand to people coming from a language like JavaScript for example

Lucas Rosa (Dec 01 2021 at 20:49):

cool idea, that kind of friendliness goes a long way

Lucas Rosa (Dec 01 2021 at 20:59):

Anton passed me the baton on this, I'll see if I can wrap it up within the hr or so

Lucas Rosa (Dec 01 2021 at 21:22):

how do I use a Dec again?

Lucas Rosa (Dec 01 2021 at 21:23):

I have to force it in the type signature right?

Folkert de Vries (Dec 01 2021 at 21:23):

yes

Lucas Rosa (Dec 01 2021 at 21:24):

ok, I almost have this done, I'm just running into an issue with the bitcode functions atm

Lucas Rosa (Dec 01 2021 at 21:24):

'Unrecognized builtin function: "roc_builtins.str.to_int.i64"'

Lucas Rosa (Dec 01 2021 at 21:33):

I see the issue

Folkert de Vries (Dec 01 2021 at 21:38):

you know Anton was working on this too?

Lucas Rosa (Dec 01 2021 at 21:41):

He told me to take over and that he'll pick up where I left off tomorrow

Lucas Rosa (Dec 01 2021 at 21:42):

I'm using his branch

Lucas Rosa (Dec 01 2021 at 21:43):

thread 'main' panicked at 'Found StructValue(StructValue { struct_value: Value { name: "call_builtin", address: 0x600001773e20, is_const: false, is_null: false, is_undef: false, llvm_value: " %call_builtin = call %\"num.NumParseResult(i64)\" @roc_builtins.str.to_int.i64(%str.RocStr %\"#arg1\"), !dbg !488", llvm_type: "%\"num.NumParseResult(i64)\" = type { i8, i64 }" } }) but expected PointerValue variant', /Users/rvcas/.cargo/git/checkouts/inkwell-85610d8ccb0c28f9/14b78d9/src/values/enums.rs:285:13

Lucas Rosa (Dec 01 2021 at 21:44):

I kinda have no clue what that means tbh

Folkert de Vries (Dec 01 2021 at 21:44):

set RUST_BACKTRACE=1 to figure out where that cast was called

Lucas Rosa (Dec 01 2021 at 21:44):

oh right

Lucas Rosa (Dec 01 2021 at 21:45):

Screen-Shot-2021-12-01-at-4.45.33-PM.png

Lucas Rosa (Dec 01 2021 at 21:46):

yo this M1 is so fast, I have zero fear of a full rebuild takes me max 48s lmao

Lucas Rosa (Dec 01 2021 at 21:50):

seems to be related to how things are returned from zig

Lucas Rosa (Dec 01 2021 at 21:52):

I probably need to look at fromUtf8C

Lucas Rosa (Dec 01 2021 at 21:54):

ok I know what to do, just a sec

Folkert de Vries (Dec 01 2021 at 21:54):

hmm for decimal it may be that zig returns via a pointer

Folkert de Vries (Dec 01 2021 at 21:54):

because the { errcode, value } struct is bigger than an i128
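
A rough way to check the size argument, sketched in Rust with a hypothetical C-layout mirror of the `{ errorcode, value }` struct (the field order here is an assumption taken from the LLVM type in the panic message above). The sketch only measures sizes; whether a given target then returns the struct through a pointer depends on its calling convention.

```rust
use std::mem::size_of;

/// Hypothetical mirror of the struct returned by the bitcode function.
#[allow(dead_code)]
#[repr(C)]
struct NumParseResult<T> {
    errorcode: u8,
    value: T,
}

/// Size of the result struct when the payload is a 64-bit integer.
fn result_size_i64() -> usize {
    size_of::<NumParseResult<i64>>()
}

/// Size of the result struct when the payload is a 128-bit integer (as for Dec).
fn result_size_i128() -> usize {
    size_of::<NumParseResult<i128>>()
}
```

With a 128-bit payload the struct is necessarily larger than 16 bytes, because the error code and its padding come on top of the value.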

Lucas Rosa (Dec 01 2021 at 22:07):

oh sorry, Int doesn't work yet

Lucas Rosa (Dec 01 2021 at 22:07):

I think I'm stuck, trying to mimic fromUtf8C didn't work :'(

Lucas Rosa (Dec 01 2021 at 22:10):

Folkert de Vries said:

hmm for decimal it may be that zig returns via a pointer

the above output was from running Str.toNum "1"

Folkert de Vries (Dec 01 2021 at 22:20):

so where is that invalid cast?

Folkert de Vries (Dec 01 2021 at 22:21):

also when dealing with records, keep in mind that we reorder fields

Folkert de Vries (Dec 01 2021 at 22:21):

and so on the zig side the fields must also be in the right order

Lucas Rosa (Dec 01 2021 at 22:22):

Folkert de Vries said:

so where is that invalid cast?

oh I'm dumb, I should have walked down the stack trace further

Lucas Rosa (Dec 01 2021 at 22:22):

let me do that again

Lucas Rosa (Dec 01 2021 at 22:23):

it's hard to tell to be honest, I just ran the stack trace again

Lucas Rosa (Dec 01 2021 at 22:26):

I have everything as is pushed, I didn't bother committing my attempt at mimicking what fromUtf8 does

Folkert de Vries (Dec 01 2021 at 22:39):

found the issue, can't fix it now. The problem is that the zig code returns a struct { errcode, value }, but the roc function needs to return a result

Folkert de Vries (Dec 01 2021 at 22:39):

we have some examples of that

Folkert de Vries (Dec 01 2021 at 22:40):

num_overflow_checked in can/src/builtins.rs is probably a good example

Lucas Rosa (Dec 01 2021 at 22:48):

ah I see, so my suspicions were correct but I was looking at a bad example. Thanks, I'll see if I can work with that

Lucas Rosa (Dec 02 2021 at 18:18):

I'm about to continue this

Lucas Rosa (Dec 02 2021 at 19:07):

@Folkert de Vries I tried your suggestion, I think I've made progress but it's not quite there yet

Lucas Rosa (Dec 02 2021 at 19:07):

I pushed those changes, I use an If Expr to check the returned record's "error code" field for a value greater than zero
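
The shape of that check, sketched in plain Rust rather than in the compiler's canonical IR. All names here are hypothetical stand-ins: `parse_i64_raw` plays the role of the bitcode function, and `to_result` plays the role of the generated if expression.

```rust
#[derive(Debug, PartialEq)]
enum ParseError {
    InvalidNumStr,
}

/// Hypothetical mirror of the `{ errorcode, value }` struct.
struct NumParseResult<T> {
    errorcode: u8,
    value: T,
}

/// Stand-in for the low-level parser: reports failure through an error code.
fn parse_i64_raw(s: &str) -> NumParseResult<i64> {
    match s.parse::<i64>() {
        Ok(v) => NumParseResult { errorcode: 0, value: v },
        Err(_) => NumParseResult { errorcode: 1, value: 0 },
    }
}

/// Wrap the raw struct into a Result: any error code above zero becomes Err.
fn to_result<T>(raw: NumParseResult<T>) -> Result<T, ParseError> {
    if raw.errorcode > 0 {
        Err(ParseError::InvalidNumStr)
    } else {
        Ok(raw.value)
    }
}
```

For example, `to_result(parse_i64_raw("1"))` gives `Ok(1)` and `to_result(parse_i64_raw("abc"))` gives `Err(ParseError::InvalidNumStr)`.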

Lucas Rosa (Dec 02 2021 at 19:07):

I might be doing something wrong with the vars tho, it's a little hard to tell for me

Folkert de Vries (Dec 02 2021 at 19:31):

got a test working, pushed, see bottom of gen_str.rs

Lucas Rosa (Dec 02 2021 at 19:41):

nice

Lucas Rosa (Dec 02 2021 at 19:41):

thanks

Lucas Rosa (Dec 02 2021 at 19:45):

cool, I see your commit, a few subtle but key changes

Lucas Rosa (Dec 02 2021 at 19:50):

I'm going to have to think about how to do Float

Lucas Rosa (Dec 02 2021 at 19:50):

Screen-Shot-2021-12-02-at-2.50.00-PM.png

Lucas Rosa (Dec 02 2021 at 19:51):

that layout makes sense because how would it even know the string contains a float in it before hand

Lucas Rosa (Dec 02 2021 at 19:53):

I'm not sure it would make sense to try the parseFloat if parseInt fails

Lucas Rosa (Dec 02 2021 at 19:53):

is there a run time cost to that?

Lucas Rosa (Dec 02 2021 at 19:56):

or does it make more sense to have 3 builtins for this instead ?

Folkert de Vries (Dec 02 2021 at 19:59):

float should work in an actual program

Folkert de Vries (Dec 02 2021 at 20:00):

@Richard Feldman thoughts here? I'm not sure there is anything good we can do in this case

Folkert de Vries (Dec 02 2021 at 20:00):

default to float?

Lucas Rosa (Dec 02 2021 at 20:01):

Folkert de Vries said:

float should work in an actual program

I see, cause it can infer it based on usage

Lucas Rosa (Dec 02 2021 at 20:01):

makes sense

Lucas Rosa (Dec 02 2021 at 20:01):

it would be nice for it to just work in the repl though

Folkert de Vries (Dec 02 2021 at 20:01):

but how could it?

Lucas Rosa (Dec 02 2021 at 20:06):

I know, that's the question :'(

Lucas Rosa (Dec 02 2021 at 20:07):

x = Str.toNum "1.0"

x + 2.0

Lucas Rosa (Dec 02 2021 at 20:07):

this should work tho like you said

Folkert de Vries (Dec 02 2021 at 20:09):

modulo error handling

Folkert de Vries (Dec 02 2021 at 20:09):

yes

Lucas Rosa (Dec 02 2021 at 20:10):

right right lol

Lucas Rosa (Dec 02 2021 at 20:10):

I can add a test case for that at least

Lucas Rosa (Dec 02 2021 at 20:15):

ok that passes now

Lucas Rosa (Dec 02 2021 at 20:20):

should we bother supporting dec right now? or should we take the WIP prefix off the PR

Folkert de Vries (Dec 02 2021 at 20:21):

dec has a fromstr already, so might as well implement that one?

Lucas Rosa (Dec 02 2021 at 20:21):

oh right

Lucas Rosa (Dec 02 2021 at 20:21):

I just need to switch the function that gets called in the build.rs match case then

Lucas Rosa (Dec 02 2021 at 20:22):

I have a failing test case written already so I'll go ahead and try to make it pass

Lucas Rosa (Dec 02 2021 at 20:35):

ok almost there: 'LLVM error: Did not get return value from bitcode function "roc_builtins.dec.from_str"'

Lucas Rosa (Dec 02 2021 at 20:36):

pushed the changes as well

Lucas Rosa (Dec 02 2021 at 20:36):

pub fn fromStr(arg: RocStr) callconv(.C) num_.NumParseResult(i128) {
    if (@call(.{ .modifier = always_inline }, RocDec.fromStr, .{arg})) |dec| {
        return .{ .errorcode = 0, .value = dec.num };
    } else {
        return .{ .errorcode = 1, .value = 0 };
    }
}

Lucas Rosa (Dec 02 2021 at 20:47):

not exactly sure what I'm missing here

Folkert de Vries (Dec 02 2021 at 20:58):

look at the LLVM IR

Lucas Rosa (Dec 02 2021 at 21:09):

how do I dump that from the test?

Lucas Rosa (Dec 02 2021 at 21:09):

is there a print statement somewhere I can uncomment or add?

Folkert de Vries (Dec 02 2021 at 21:11):

yes, in compiler/test_gen/src/helpers/llvm.rs

Folkert de Vries (Dec 02 2021 at 21:12):

line 242 I think

Lucas Rosa (Dec 02 2021 at 21:13):

right ok, that's what I thought, I'm going to take a break and come back to this in a bit

Richard Feldman (Dec 02 2021 at 22:01):

so I think this edge case means it can't be Num.fromStr and instead the Roc API needs to be like Str.toU8, Str.toF64, etc. :sweat_smile:

Richard Feldman (Dec 02 2021 at 22:02):

because other things depend on any Num * value being an integer

Richard Feldman (Dec 02 2021 at 22:02):

and if we do Num.fromStr "1.1" and that fails because it was expecting an int, that's super confusing

Richard Feldman (Dec 02 2021 at 22:03):

whereas Str.toF64 "1.1" will obviously succeed, and Str.toI64 "1.1" will obviously fail

Lucas Rosa (Dec 02 2021 at 22:03):

right

Richard Feldman (Dec 02 2021 at 22:03):

Str.toI64 "1.0" is a little less clear, but I think either design is fine; we could either allow it because it's an integer or give an error because it has a decimal point
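
For comparison, a sketch of the same per-type split using Rust's standard parsers. This is not Roc's implementation: `to_i64`, `to_f64`, and the `InvalidNumStr` struct are hypothetical names modeled on the ones discussed here, and Rust's choices (for example, its `f64` parser also accepts strings such as "inf") need not match what Roc ends up doing.

```rust
#[derive(Debug, PartialEq)]
struct InvalidNumStr;

/// Parse as a 64-bit integer; any non-integer text is rejected.
fn to_i64(s: &str) -> Result<i64, InvalidNumStr> {
    s.parse::<i64>().map_err(|_| InvalidNumStr)
}

/// Parse as a 64-bit float.
fn to_f64(s: &str) -> Result<f64, InvalidNumStr> {
    s.parse::<f64>().map_err(|_| InvalidNumStr)
}
```

Because the target type is part of the function name, the outcome is predictable: `to_f64("1.1")` succeeds and `to_i64("1.1")` fails. Rust happens to take the stricter option for `to_i64("1.0")` and rejects it because of the decimal point.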

Lucas Rosa (Dec 02 2021 at 22:04):

What if we had the tag be an argument?

Lucas Rosa (Dec 02 2021 at 22:04):

Str.toNum "1.0" F64

Richard Feldman (Dec 02 2021 at 22:05):

:thinking: what would the type of that function be?

Lucas Rosa (Dec 02 2021 at 22:05):

A tag union could be the signature for the second arg?

Richard Feldman (Dec 02 2021 at 22:06):

I mean what's the type of that Str.toNum function?

Lucas Rosa (Dec 02 2021 at 22:06):

And we fill it in with all available num types?

Richard Feldman (Dec 02 2021 at 22:06):

(like what's the return type specifically)

Lucas Rosa (Dec 02 2021 at 22:06):

Hm right

Richard Feldman (Dec 02 2021 at 22:06):

e.g. Str.toI64 : Str -> Result I64 [ InvalidI64Str ]* - we know it gives an I64

Richard Feldman (Dec 02 2021 at 22:06):

but yeah

Lucas Rosa (Dec 02 2021 at 22:07):

could it still not be Num?

Richard Feldman (Dec 02 2021 at 22:07):

I think if we try to have a more flexible function than that, we end up returning Num *

Richard Feldman (Dec 02 2021 at 22:07):

which has the edge case problem

Richard Feldman (Dec 02 2021 at 22:07):

like we could try to do Str.toNum : Str, a -> Result (Num a) [ InvalidNumStr ]*

Lucas Rosa (Dec 02 2021 at 22:07):

The return type isn’t exactly the issue it’s not knowing the type within the string for the layout without some usage for inference

Lucas Rosa (Dec 02 2021 at 22:07):

unless I misunderstood the question

Richard Feldman (Dec 02 2021 at 22:08):

but now you can actually pass an a value, which we don't want because that's supposed to be a phantom type in Num so there's no runtime overhead
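
A loose analogy for the phantom type idea, sketched in Rust. The `Num`, `Integer`, and `Fraction` types below are hypothetical and do not reflect how Roc represents numbers internally; the sketch only shows that a type parameter used purely as a marker adds nothing to the runtime representation.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Marker types: they exist only at the type level.
#[allow(dead_code)]
struct Integer;
#[allow(dead_code)]
struct Fraction;

/// A number tagged with a phantom `Kind`; no `Kind` value is ever stored.
#[allow(dead_code)]
struct Num<Kind> {
    bits: i64,
    _kind: PhantomData<Kind>,
}

fn num_size_integer() -> usize {
    size_of::<Num<Integer>>()
}

fn num_size_fraction() -> usize {
    size_of::<Num<Fraction>>()
}
```

Both sizes equal `size_of::<i64>()`, which is the "no runtime overhead" property; accepting an actual `a` value as a function argument would break it.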

Richard Feldman (Dec 02 2021 at 22:08):

oh I see

Lucas Rosa (Dec 02 2021 at 22:08):

yea so it works fine with usage allowing F64 to be inferred and then it hits the correct branch

Lucas Rosa (Dec 02 2021 at 22:09):

but just stand alone in the repl it defaults to Int I64

Richard Feldman (Dec 02 2021 at 22:09):

x = Str.toNum "1.1" F64 |> Result.withDefault 0

x + 999999999999

Richard Feldman (Dec 02 2021 at 22:09):

so there we're saying "hey parse this as an F64" but its return type is still Num *

Lucas Rosa (Dec 02 2021 at 22:09):

yea

Lucas Rosa (Dec 02 2021 at 22:10):

Then I can match on the layout of ARG_2

Richard Feldman (Dec 02 2021 at 22:10):

so when we add it to 9999999999, all the Num.add is going to know is that it's a Num * + Num *

Lucas Rosa (Dec 02 2021 at 22:10):

yep

Richard Feldman (Dec 02 2021 at 22:10):

that info won't make it to Num.add

Lucas Rosa (Dec 02 2021 at 22:10):

dam

Lucas Rosa (Dec 02 2021 at 22:10):

lol

Lucas Rosa (Dec 02 2021 at 22:11):

I guess separate functions makes the most sense? But realistically it’ll be fine as is in a real program

Richard Feldman (Dec 02 2021 at 22:11):

yeah I think separate functions is the way to go

Richard Feldman (Dec 02 2021 at 22:11):

but we can probably get some good code reuse behind the scenes

Lucas Rosa (Dec 02 2021 at 22:11):

Str.toInt and Str.toFloat then

Lucas Rosa (Dec 02 2021 at 22:12):

the reverse doesn’t have that problem of course

Lucas Rosa (Dec 02 2021 at 22:12):

Num.toStr works fine

Lucas Rosa (Dec 02 2021 at 22:12):

This probably explains why most systems langs have different functions for that

Richard Feldman (Dec 02 2021 at 22:13):

agreed! :100:

Lucas Rosa (Dec 02 2021 at 22:13):

Also Str.toDec of course

Lucas Rosa (Dec 02 2021 at 22:13):

I’ll adjust that today

Richard Feldman (Dec 02 2021 at 22:13):

I think Str.toNum still works right?

Richard Feldman (Dec 02 2021 at 22:13):

oops

Lucas Rosa (Dec 02 2021 at 22:13):

I know it’s been confusing me too

Richard Feldman (Dec 02 2021 at 22:13):

I mean Num.toStr still works

Lucas Rosa (Dec 02 2021 at 22:14):

100%

Richard Feldman (Dec 02 2021 at 22:14):

one thing we could do is use the word parse

Lucas Rosa (Dec 02 2021 at 22:14):

yea

Lucas Rosa (Dec 02 2021 at 22:15):

Num.toStr takes Num * as the arg so we concretely know the layout before hand

Richard Feldman (Dec 02 2021 at 22:15):

like Num.parseI64 or Str.parseI64 or something

Lucas Rosa (Dec 02 2021 at 22:16):

parse is cool, although I like the consistency with toStr

Lucas Rosa (Dec 02 2021 at 22:16):

Do we want I64 etc. or all Int *?

Lucas Rosa (Dec 02 2021 at 22:16):

So parseInt?

Richard Feldman (Dec 02 2021 at 22:17):

oh that's interesting

Lucas Rosa (Dec 02 2021 at 22:18):

Oh actually you’re right

Lucas Rosa (Dec 02 2021 at 22:18):

It has to be per type

Richard Feldman (Dec 02 2021 at 22:18):

so Int is probably fine, but if we did a parseFrac then it might need to be different between float vs dec because of NaN, Infinity, and -Infinity

Lucas Rosa (Dec 02 2021 at 22:19):

Cause again it defaults to I64

Richard Feldman (Dec 02 2021 at 22:19):

oh yeah that's another point

Lucas Rosa (Dec 02 2021 at 22:19):

:joy:

Lucas Rosa (Dec 02 2021 at 22:19):

Ima sit down to eat brb

Richard Feldman (Dec 02 2021 at 22:20):

yeah like what if you do parseInt on a number that's too big for I64, but it would have fit in I128?

Lucas Rosa (Dec 02 2021 at 22:30):

yea you right or even like if you want an I32 but it defaults to I64
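
The range problem is easy to reproduce with any fixed-width parser. A small Rust sketch, with `parses_as` as a hypothetical helper name:

```rust
use std::str::FromStr;

/// Returns true if `s` parses successfully as the requested type `T`.
fn parses_as<T: FromStr>(s: &str) -> bool {
    s.parse::<T>().is_ok()
}
```

The string "9223372036854775808" is one past the largest 64-bit signed integer, so `parses_as::<i64>` rejects it while `parses_as::<i128>` accepts it. Likewise "3000000000" fits an `i64` but not an `i32`. A single parseInt that silently defaults to one width has to get one of these cases wrong.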

Brian Carroll (Dec 03 2021 at 12:12):

If I'm following this correctly these are issues that arise only in the REPL because you don't have the full program to do type inference.
Could we say that the REPL input must have type annotations in certain cases? Do these issues go away then? Or is it too hard to make good error messages for that?

Brian Carroll (Dec 03 2021 at 12:15):

Like maybe in REPL mode it's an error to define a value whose type we can't resolve. You have to annotate it? Maybe that's too annoying, just an idea.

Folkert de Vries (Dec 03 2021 at 12:34):

there is not really a good place for them to go

Folkert de Vries (Dec 03 2021 at 12:35):

haskell has type applications for this sort of thing, so you can say identity @Word32 42 and that would provide all the required type info
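
Rust has a comparable mechanism in its explicit generic arguments (the "turbofish"). A minimal sketch, shown only as an analogy to the Haskell example; `identity` and `identity_u32_example` are illustrative names:

```rust
/// Generic identity function; `T` is normally inferred from usage.
fn identity<T>(x: T) -> T {
    x
}

/// Supplying `u32` explicitly pins the literal's type at the call site,
/// so no surrounding context is needed to infer it.
fn identity_u32_example() -> u32 {
    identity::<u32>(42)
}
```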

Richard Feldman (Dec 03 2021 at 13:39):

these are issues that arise only in the REPL because you don't have the full program to do type inference.

they can happen even in a full program, e.g.

x = 5

if x - 1 > 0 then

should the branch get taken? In order to decide that, we need to evaluate x - 1 where both x and 1 have the type Num *

Richard Feldman (Dec 03 2021 at 13:40):

so in general I consider it valuable for Roc to maintain the invariant that you never need to add a type annotation to anything for any reason

Richard Feldman (Dec 03 2021 at 13:40):

among other things, this means that the editor feature of "highlight this to find out its type" works 100% of the time

Richard Feldman (Dec 03 2021 at 13:41):

I'd be very hesitant to give that up, especially when the parseI64 design is very simple and doesn't have this problem :big_smile:

Lucas Rosa (Dec 03 2021 at 15:15):

Yea it’s just a bunch of copy and paste work tbh, it’ll be fine

Lucas Rosa (Dec 03 2021 at 15:15):

did we settle on parse or to as the prefix?

Lucas Rosa (Dec 03 2021 at 15:15):

parseI64 or toI64

Anton (Dec 03 2021 at 15:18):

toI64 seems better in every way :p

Lucas Rosa (Dec 03 2021 at 15:20):

I like that it’s consistent with Num.toStr

Richard Feldman (Dec 03 2021 at 15:24):

yeah let's try Str.toI64 and see if people get confused in practice; can always try the longer parse if so!

Lucas Rosa (Dec 03 2021 at 15:44):

I’m happy either way

Lucas Rosa (Dec 03 2021 at 16:39):

I’ll carry on a bit later btw, it’s Art Basel weekend in Miami right now so stuff is pretty crazy and I have to do some running around

Lucas Rosa (Dec 03 2021 at 16:40):

if anyone else wants they are welcome to continue no need to wait for me

Lucas Rosa (Dec 03 2021 at 20:22):

I’m back home, I’ll pick this back up again in a sec

Lucas Rosa (Dec 05 2021 at 04:27):

I haven’t forgotten about this. My b for not getting it done yesterday

Lucas Rosa (Dec 05 2021 at 04:27):

tomorrow I should be good to go

Lucas Rosa (Dec 07 2021 at 03:07):

Screen-Shot-2021-12-06-at-10.07.23-PM.png

Lucas Rosa (Dec 07 2021 at 03:08):

we're cooking. so I was able to do it with the same low-level for all aliases

Lucas Rosa (Dec 07 2021 at 03:09):

so it's just a matter of defining the builtins and mapping them to str_to_num in can builtins.rs

Lucas Rosa (Dec 08 2021 at 23:25):

are we happy with this PR? anyone got time to review?
https://github.com/rtfeldman/roc/pull/2116

Anton (Dec 09 2021 at 09:21):

I reviewed it but I couldn't approve because I started the PR.

Richard Feldman (Dec 09 2021 at 14:48):

merged! :raised_hands:

IT'S ALIVE!!!!

Lucas Rosa (Dec 09 2021 at 19:35):

Awesome thank you

Lucas Rosa (Dec 09 2021 at 19:36):

I’m gonna crawl back to wasm now :p

Lucas Rosa (Dec 09 2021 at 19:36):

If anyone wants to explore why i128 and LLVM are not happy for the Dec/I128 functions feel free

Last updated: Aug 17 2025 at 12:14 UTC