basic-cli: Arg.list as just a List Str · ideas

I do not know if different OS platforms require syscalls to retrieve program arguments, or if those syscalls could be expected to fail. On Linux, at least, afaik, program arguments (and environment variables) are loaded into the main thread's stack prior to handing off control to the program, and thus neither argument reading (nor env var reading) would be expected to fail. As such, needing to deal with a Task to just read what is essentially an in-memory array of strings is a bit of an annoyance.

Is there a way for a Roc platform to provide constant data to the Roc application? Granted, the ABI string (char*) and list (char**) formats are likely not immediately compatible with Roc (null terminated instead of length-encoded), but it seems like there could be value in Roc providing platforms with a way to intern a program-lifecycle List Str that stores Roc-formatted lengths, but otherwise points at/shares the underlying string data.

Env vars could likewise be exposed through a Dict Str Str and a List Str (the original "key=val" pairs), also sharing the original data.

These would, of course, need to be marked as shared so that compiled Roc code is unable to [attempt to] manipulate them in place.
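
A rough sketch of the "length-encoded views over shared data" idea above, in Python for illustration only. The function name `length_encoded_views` is hypothetical, and this is not how a Roc platform actually represents a List Str. It walks a buffer of NUL-terminated strings and produces read-only views that carry their own length and share the underlying bytes instead of copying them:

```python
def length_encoded_views(block):
    """Split a buffer of NUL-terminated strings into zero-copy views."""
    whole = memoryview(block)
    views = []
    start = 0
    for i, byte in enumerate(block):
        if byte == 0:  # a NUL terminator marks the end of one string
            # Slicing a memoryview shares the buffer; nothing is copied.
            views.append(whole[start:i])
            start = i + 1
    return views


# Stand-in for the argument block the OS hands to the process.
raw = b"prog\0--verbose\0input.txt\0"
args = length_encoded_views(raw)

for view in args:
    # Each view knows its length, and is read-only because `raw` is immutable.
    print(len(view), bytes(view), view.readonly)
```

The read-only flag plays the role of the "marked as shared" requirement: a consumer can read the strings but cannot mutate them in place.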

Brian Carroll (Dec 09 2022 at 09:26):

That's a good point, there's no failure mechanism (other than allocation failure, which we don't handle anywhere in the language anyway)
There is a way to mark strings as constants, but it involves inserting a few extra bytes of data in front of the characters. There's no room to do that, so we'd have to copy the strings to somewhere else where we made room. But that's OK.

Brian Carroll (Dec 09 2022 at 09:30):

I think it's really just a matter of the platform API design. I know Richard has been through a few iterations of it. One of the considerations I think was whether always passing a List Str to the app was a nice experience for all use cases.
List Str seems like a very low-level API. In most cases I assume you'd want argument parsing. Even in C programs you'd usually use a library to convert to some more usable data structure.

Brian Carroll (Dec 09 2022 at 09:35):

So while it avoids Task it has other issues. I suppose ideally you'd have a way to only provide parsed arguments without a Task, maybe using a Result instead. But I bet if you try to do it there are tradeoffs.
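
To make the Result-instead-of-Task idea concrete, here is a minimal sketch in Python (not Roc, and not basic-cli's API; `Ok`, `Err`, `parse_args` and the `--verbose` flag are all hypothetical). Parsing can fail, but it is a pure function of the argument strings, so a plain success-or-error value is enough:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ok:
    value: object


@dataclass(frozen=True)
class Err:
    message: str


def parse_args(argv):
    """Parse `[--verbose] <path>` into a dict, or describe what went wrong."""
    verbose = False
    paths = []
    for arg in argv:
        if arg == "--verbose":
            verbose = True
        elif arg.startswith("--"):
            return Err(f"unknown flag: {arg}")
        else:
            paths.append(arg)
    if len(paths) != 1:
        return Err(f"expected exactly one path, got {len(paths)}")
    return Ok({"verbose": verbose, "path": paths[0]})


print(parse_args(["--verbose", "input.txt"]))
print(parse_args(["--colour"]))
```

One tradeoff this makes visible: the platform (or a library) has to fix the shape of the parsed value somewhere, which a raw list of strings does not.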

Kevin Gillette (Dec 09 2022 at 10:00):

I think you're onto a good idea with passing arguments into main, though I'd imagine it'd be a record type with args and env fields, and perhaps pid, ppid, etc (though like you say, that's a tradeoff question for a given platform author).

If the application author doesn't care about args, or envs, they'd just define that main accepts an open record listing the fields they _do_ care about, if any.

Having Exit U8 as a _tag_ (rather than a Task) that main is defined as returning would also be, imo, nice, for tighter control, and it would be convenient if such a platform could do _something_ with other tags (i.e. print out the tag name or a generic failure message and then exit with status 1).
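
A small sketch of how a platform might interpret such a return value, written in Python rather than Roc. The `Exit` and `FileNotFound` classes and the `exit_status` function are hypothetical stand-ins for tags and platform glue; nothing here reflects an existing platform:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exit:
    code: int  # stands in for the U8 payload, so 0..255


@dataclass(frozen=True)
class FileNotFound:  # hypothetical application-defined tag
    path: str


def exit_status(result):
    """Map the value returned by main to (message, status)."""
    if isinstance(result, Exit):
        if not 0 <= result.code <= 255:
            raise ValueError("exit code must fit in a U8")
        return None, result.code
    # Any other tag: report its name and fall back to status 1.
    return f"program exited with {type(result).__name__}", 1


print(exit_status(Exit(3)))
print(exit_status(FileNotFound("input.txt")))
```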

Richard Feldman (Dec 09 2022 at 12:08):

Richard Feldman (Dec 09 2022 at 12:09):

Richard Feldman (Dec 09 2022 at 12:11):

and a Program type for main (instead of Task) to allow different ways of specifying things, e.g. "give me args as a List Str" vs "let me give you an arg parser and you translate directly from OS strings into the actual representation I want, without spending time creating the intermediate List Str that will get thrown away anyway"

Richard Feldman (Dec 09 2022 at 12:18):

regarding being able to access args as a constant: I've thought about that, and supposing that were the API, it would make accessing them more concise but would have 2 downsides:

With Task the answer is "it's the same as how any other task gets tested" (which is currently WIP, but I like the design of that API, although it's not implemented yet)

Richard Feldman (Dec 09 2022 at 12:22):

the other consideration here is that I think the inconvenience here is so small as to be not really worth the amount of discussion that's already happened here - if you're handling CLI args at the very beginning of the program (which I think is best), then we are literally talking about the difference between:

args = Args.list

args <- Args.list |> Task.await

Richard Feldman (Dec 09 2022 at 12:23):

so I think a fine answer is "this is not worth any nontrivial effort to change" but I wanted to talk through the testing consideration because I think it's relevant to other, similar API design considerations in the future :smiley:

Brendan Hansknecht (Dec 09 2022 at 16:02):

As an extra note, the overhead of loading the args through a task should be relatively minimal, so I would not worry too much about loading them later.

Going through Task is really just a matter of interacting with the platform; it does not necessarily mean you are causing an effect. You may just be requesting some data from the platform so that you don't have to store and pipe that data around your entire Roc program.

Kevin Gillette (Dec 09 2022 at 17:28):

args <- Args.list |> Task.await

Admittedly I was trying to cheat by making a reusable Advent of Code helper function that would fetch Args and read a file, so that main could be left with the job of handling the errors. It was my first foray into tasks, and while stumbling through using await + backpassing exclusively and also dealing with errors all within main, I failed to notice how trivial fetching Args actually is.

Excellent points. This is a "mentality of Roc/PFP" aspect that I haven't yet internalized.

Clearly the downsides of my proposal are stark and the benefits negligible. Consider me solidly convinced that what's there today is a better, more measured approach.

Brendan Hansknecht (Dec 09 2022 at 17:33):

One interesting aspect that I will be keen to follow in the future is where platforms end up settling. As the Roc ecosystem grows, I am sure a lot of alternative APIs will be tried. People will try different platforms, and ones with nicer APIs that work better will likely grow in popularity. The current CLI platform is more minimal than what it used to be because that helped to remove some confusion that beginners kept hitting. Eventually platforms will get optimized for experienced developers rather than beginners. On top of that, as tooling for the language matures, it will hopefully help to reduce the cognitive load. That will also enable more interesting platforms that, with today's compiler, might have error messages that are too confusing, for example.

Joshua Warner (Dec 09 2022 at 22:00):

<rant>
This is maybe getting too far into the details here - but FWIW, on Windows, there's not really any such thing as multiple arguments to a process. Technically, at the kernel level, a process gets a single string as an argument. It's up to the application to parse that string into multiple arguments if it wants.

It so happens that Microsoft provides some common functions to do that parsing - but not all applications use them, and you can technically implement your own parsing that's incompatible.

And, unfortunately, this happens in practice with decent regularity. Lots of (mostly legacy) applications implement parsing that's subtly incompatible with the platform convention. It's a nightmare, if you try to implement an API that can launch any program on the system based on a list of string arguments.

Now - this mostly doesn't affect the "receiving" side (parsing the command string you received into your arguments), since you almost always just want to use the platform parsing functions - but, for example, it can be important if you want to register your application to open a file type, because when launching apps based on the user double-clicking a file, IIRC Windows doesn't apply any escaping to the string before passing it to your process. (I might be misremembering this - but I do recall this was in tension with doing proper argument parsing. Maybe they did escape things, but in some strange slightly-incompatible-with-the-rest-of-the-system way? Anyway...)
</rant>
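
For a feel of the "sending" side described above, here is a small Python sketch. It uses `subprocess.list2cmdline`, an undocumented helper in CPython's standard library that joins an argument list into one string using the Microsoft C runtime quoting convention; the program name and arguments are made up for illustration. A receiver that parses with different rules would not recover the same list:

```python
import subprocess

# Hypothetical argument list a launcher wants to pass to a child process.
argv = ["notepad.exe", "my notes.txt", 'say "hi"']

# Naive joining loses the boundary inside "my notes.txt".
naive = " ".join(argv)

# Quoting per the MS C runtime convention keeps the boundaries recoverable,
# but only for receivers that parse with the same convention.
command_line = subprocess.list2cmdline(argv)

print(naive)
print(command_line)
```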

Richard Feldman (Dec 09 2022 at 22:01):

Kevin Gillette (Dec 10 2022 at 00:02):

Sadly the ssh protocol also takes the single string approach. Usually we don't notice because programmers and administrators usually don't put spaces in filenames, but it's definitely another one of those inexplicable design decisions that got locked in somehow
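
A short Python sketch of why the single-string approach bites once a filename contains a space. It uses the standard `shlex` module as a stand-in for the shell that re-splits the command on the remote end; the file name is made up:

```python
import shlex

# What the caller means: one command, one argument containing a space.
argv = ["cat", "my file.txt"]

naive = " ".join(argv)      # boundaries between arguments are lost
quoted = shlex.join(argv)   # POSIX shell quoting keeps them recoverable

# shlex.split approximates how a POSIX shell re-splits the single string.
print(shlex.split(naive))   # ['cat', 'my', 'file.txt']
print(shlex.split(quoted))  # ['cat', 'my file.txt']
```

Quoting before joining is what lets the receiving shell reconstruct the original list; without it the arguments are still readable, just wrong.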

Kevin Gillette (Dec 10 2022 at 00:04):

Not that it'd constitute a receiver-side failure to access arguments: the arguments might occasionally be _wrong_, but you can still access them every time without fail

Kevin Gillette (Dec 09 2022 at 09:10):