Stream: ideas

Topic: imports are global


view this post on Zulip Francesco Orsenigo (Feb 19 2022 at 12:01):

TL;DR: remove all import statements and instead define aliases and globally exposed types and values in a metafile.

I have had the luck of working for Webbhuset, where we had rather large, production Elm apps (about 200K lines of code?) which shared a lot of internally developed libraries.

THE PROBLEM

All our modules started with a long list of import statements, each of which was itself pretty long because we tried to give accurate and descriptive names to the libraries, avoiding name clashes with each other and with existing third-party Elm packages.
Because of the long module names we were aliasing all of them, and the aliases ended up being very inconsistent more often than not.

This meant that if I was working on three different modules and each referenced something inside a module like Cart.Something , that Cart. could be three different modules, each aliased to the same name in the modules that use them.

At that scale, the import system of Elm became needlessly repetitive and verbose and let inconsistency creep in, which in turn significantly increased the cognitive load of understanding what happens across modules.

Another big problem we had is that because Elm does not allow different versions of the same library to coexist, new library versions cannot be adopted incrementally, which for a large codebase is crucial.

Also I hate writing import Dict exposing (Dict) in every single module I create.

THE SOLUTION

Modules are no longer responsible for declaring aliases.
And if you don't have aliases, declaring your imports is not that important any more.

Instead, you have a metafile, not too unlike elm.json, that defines what modules are available, where they come from, which alias they have, and whether any of their types or values are made available without qualifiers.

It will look something like this: https://github.com/xarvh/squarepants/blob/main/modules.sp
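A minimal sketch of the lookup such a metafile implies, written in Python for illustration only. The `METAFILE` layout, its field names, the module paths and the `resolve` helper are all hypothetical; they are not the format used by the linked modules.sp file.

```python
# Hypothetical project-wide metafile: one place that says which modules exist,
# which alias each one gets, and which names are usable without a qualifier.
METAFILE = {
    "modules": {
        "Dict": {"path": "Core.Dict", "globals": ["Dict"]},
        "Cart": {"path": "Shop.Checkout.Cart", "globals": []},
        "Html": {"path": "Browser.Html", "globals": []},
    }
}


def resolve(name, metafile=METAFILE):
    """Map a name used in any module to a fully qualified name."""
    modules = metafile["modules"]
    if "." in name:
        # Qualified name: the part before the first dot is a global alias.
        alias, member = name.split(".", 1)
        if alias not in modules:
            raise LookupError(f"unknown module alias: {alias}")
        return f"{modules[alias]['path']}.{member}"
    # Unqualified name: look for a module that exposes it globally.
    for entry in modules.values():
        if name in entry["globals"]:
            return f"{entry['path']}.{name}"
    raise LookupError(f"unknown name: {name}")
```

With this table, `resolve("Cart.total")` gives `Shop.Checkout.Cart.total` in every module of the project, which is the consistency the proposal is after.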

Ok, but what if there is a self-contained part of the code that doesn't really work well with those aliases and modules?
Then you define a library, which has its own metafile and uses its own conventions.

If you want to turn a piece of your codebase into a library, you:

The library metafile will have a lot of unused stuff, but it can be cleaned up automatically and referencing modules that are not available should not be an error if they are not used.

The metafile allows you to give any alias to any library, which means that different versions of the same library can coexist, as long as you give them different aliases.

I'm implementing this system for my language, Squarepants, and so far it has been working very well. I really like that I don't have to think about imports when I create a new module or move an old one; usually all I have to do is change a few lines in the metafile.

The details of how the system works in Squarepants are here https://github.com/xarvh/squarepants/blob/main/NOTES/Modules.md

As an aside: I didn't know about Roc until a few months ago and it's very interesting to see that Roc and Squarepants independently developed very, very similar ideas on how to improve Elm; however there are some ideas in Roc I really like and I will likely be stealing for my language, so the minimum I can do is see if Roc is interested in any of the ideas behind Squarepants.

view this post on Zulip Johannes Maas (Feb 19 2022 at 13:40):

Just a quick note: I feel the editor might help with this by managing the imports for you. Imports are only really necessary, if your source of truth is text.

So I could imagine you wouldn't need an import frontmatter because the editor could just show you where the function or type comes from and if there is any ambiguity, it could ask you to clear it up.

I think Unison might do something like that, at least they have features that feel similar to this.

You seem to have thought this through, I just wanted to point to this other approach to the problem. :wink:

view this post on Zulip Martin Stewart (Feb 19 2022 at 13:41):

At my previous job we also encountered issues with long lists of imports and inconsistent import aliases in Elm. We use the IntelliJ Elm plugin, which automatically hides all the imports and makes long imports much less of an issue.

image.png

As for inconsistent import aliases, I wrote an elm-review rule that would let us specify what names we should use for certain imports (I guess this is sort of like what you are proposing) and automatically fix incorrect aliases.

Also I hate writing import Dict exposing (Dict) in every single module I create.

This has been less of an issue for us, again thanks to tooling. There's a hotkey in IntelliJ that adds the missing import, so it only takes a second.

view this post on Zulip Richard Feldman (Feb 19 2022 at 14:11):

this is something I'd never considered - thanks for sharing @Francesco Orsenigo!

supposing there were one big metafile defining all the imports for a 200K LoC project (or let's say 1M lines even), how would you avoid it becoming humongous and plagued by merge conflicts? :thinking:

view this post on Zulip jan kili (Feb 19 2022 at 14:37):

@Johannes Maas I recommend the section starting at 27:54 in this video for one convincing opinion on IDE-dependent coding conventions like import hiding. While I'm excited for Editor to revolutionize my workflows, I really hope that the underlying text is still treated as the source of truth without requiring an IDE to view it cleanly.

view this post on Zulip jan kili (Feb 19 2022 at 14:37):

https://youtu.be/FyCYva9DhsI

view this post on Zulip Richard Feldman (Feb 19 2022 at 14:55):

also, what would I do if I wanted to use two different DSLs in the same project? Like where I want to import a bunch of functions unqualified from both (and both DSLs have some functions with the same names) and I want to use one DSL in one module and the other DSL in a different module?

view this post on Zulip Johannes Maas (Feb 19 2022 at 15:17):

@JanCVanB Thanks for the link!

I had a look for a few minutes from the point you referenced, but I'm not sure which point you were addressing with this. They are talking about people blindly using IDE features such as auto import and import hiding.

You seem to be answering to the point of a non-textual source of truth, but I'm having trouble making the connection to the video. :sweat_smile:

view this post on Zulip Francesco Orsenigo (Feb 19 2022 at 16:23):

Richard Feldman said:

supposing there were one big metafile defining all the imports for a 200K LoC project (or let's say 1M lines even), how would you avoid it becoming humongous and plagued by merge conflicts? :thinking:

The complexity of the metafile is only linear with the number of modules and it's very declarative, so I don't think it would be a big problem, but maybe I am misunderstanding the problem you describe?
This said, if a project (and its metafile) becomes too large, my approach would be to isolate some of its parts into "libraries", as described in the OP.
The idea is that the metafile doesn't change very often, the merge conflicts you can have could be comparable to what you would get when renaming a module in Elm.
Most of the time you will only be adding stuff to the metafile as you add new modules.

also, what would I do if I wanted to use two different DSLs in the same project? Like where I want to import a bunch of functions unqualified from both (and both DSLs have some functions with the same names) and I want to use one DSL in one module and the other DSL in a different module?

In our case, we stopped using any "import all" statements, we even stopped using import Html exposing (..) in favour of just import Html, which meant that yes we were writing Html.div every single time instead of just div and it worked very well for us.
We did the same at the job I had afterwards and again we never regretted that.

I was worried myself that having to qualify Html. every single time would worsen the signal-to-noise ratio of our code, but it just didn't happen.

You could select a very short alias for the DSL modules (H.div) or you could expose the functions as a record and unpack it at the module root: { div, span, label, h1 } = Html.elementsAsRecord.

I considered a dedicated syntax to allow directly exposing symbols { div, span, label, h1 } = Html but it feels like a step backwards and so far I haven't felt a need for it; allowing doing this only for some modules via the record as in the Html.elementsAsRecord example above is much better IMHO.

view this post on Zulip Yorye Nathan (Feb 19 2022 at 16:50):

IMO having some kind of known noise in the beginning of a file isn't a big deal - IDE lovers will use its features to help them with it, while people that insist on editing Roc without the editor (why?) will just have to scroll a bit. That being said - it shouldn't be noisier than it has to be! I hate over-specific imports (as the posted video suggests against). It also becomes noise in diffs. I want to trust the compiler that unused imported Roc code will not make it into the executable, so my imports will usually be very few.

view this post on Zulip Yorye Nathan (Feb 19 2022 at 16:58):

Regarding name collisions, I think(?) most of the time it won't arise as an issue, but for when it does - I like the idea of fully qualifying the name every time with an option to specify an alias or unpack specific things

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:02):

Just a quick question. Why do the naming inconsistencies arise? If I import Foo.Bar.Baz, I would just use functions in it as Baz.*. It has a short name. Everyone who imports it would use the same name. Do inconsistencies just come from having a bunch of modules with the same name, or am I missing something about the problem?

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:05):

Also, not sure if we support it fully, but there are nested imports that would shorten the import list verbosity:

import Foo.{Bar.Baz, Fizz.{Buzz, Hop}}

Of course that could be formatted in whatever way is considered to look nicest, but it gives access to Foo.Bar.Baz, Foo.Fizz.Buzz and Foo.Fizz.Hop.
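To make the expansion concrete, here is a small Python sketch that turns a nested import spec into the full module paths it stands for. The helpers `split_top_level` and `expand` are invented for this illustration, and the sketch assumes a brace group only ever appears at the end of a path.

```python
def split_top_level(body):
    """Split on commas that are not nested inside braces."""
    parts, depth, current = [], 0, ""
    for ch in body:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        parts.append(current.strip())
    return parts


def expand(spec):
    """Expand 'Foo.{Bar.Baz, Fizz.{Buzz, Hop}}' into full module paths."""
    spec = spec.strip()
    brace = spec.find("{")
    if brace == -1:
        return [spec]  # plain path, nothing to expand
    # Everything before the first brace is a shared prefix, e.g. 'Foo.'
    prefix = spec[:brace]
    body = spec[brace + 1 : spec.rfind("}")]
    paths = []
    for part in split_top_level(body):
        paths.extend(prefix + path for path in expand(part))
    return paths
```

Calling `expand("Foo.{Bar.Baz, Fizz.{Buzz, Hop}}")` yields the three paths named above.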

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:10):

yeah Rust does that nested import style - I'm definitely open to it!

view this post on Zulip Yorye Nathan (Feb 19 2022 at 17:18):

how about this? import Foo.{Bar.Baz, *} which gives access to Foo.anything.specific as well as Baz (not fully qualified)

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:20):

I don't want to have "import all" as a concept in the language because it slows down loading :big_smile:

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:20):

it means we can't start canonicalizing the current module until we've parsed its dependencies

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:23):

Modules are no longer responsible for declaring aliases. And if you don't have aliases, declaring your imports is not that important any more.

so something we do pretty often at work is using versioned modules - e.g.

import Nri.Ui.Button.V10 as Button

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:24):

this is because sometimes we make a new version of something (e.g. a dropdown) that's used in a ton of places, but where the new API is backwards-incompatible with the old one

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:24):

and we don't want to be forced to go back and update all the old ones at once, since the old usages are still working fine

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:25):

so we introduce the new version in a different module with a different name (e.g. Button.V10 compared to the previous Button.V9), and then when we have time later go back and update the old usages to the latest version

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:25):

but there's a period of time where the new version and the old version are coexisting in the same code base

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:25):

which is to say, there are some files with this at the top:

import Nri.Ui.Button.V10 as Button

...but others with this:

import Nri.Ui.Button.V9 as Button

view this post on Zulip Yorye Nathan (Feb 19 2022 at 17:27):

is it a different case if I really depend on everything from Foo? I imagine some core module used everywhere and declaring a lot of types

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:29):

I think in this proposed design we'd have to do one of the following instead:

unless I'm missing something!

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:29):

is it a different case if I really depend on everything from Foo?

not from a compiler performance perspective, unfortunately!

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:29):

basically if every identifier in my file is explicitly named within my file (either in the imports or as declarations I made up myself)

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:30):

then I don't need to load or process any other files to begin the canonicalization phase

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:30):

but if there are possibly identifiers whose names are defined outside my file, then canonicalization is blocked on doing some processing on those other files too

view this post on Zulip Yorye Nathan (Feb 19 2022 at 17:30):

I see, makes sense

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:33):

Version modules just feel wrong to me. I get the use case, but I don't like it.

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:35):

Also, it feels like the issue is more one of ordering:
import Nri.Ui.V9.Button and import Nri.Ui.V10.Button. Now the change is just the import line, though it still isn't great to have the V* in the name.

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:37):

it's a separate topic, but I actually like the idea of having this be a language feature instead of a convention

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:37):

(versioned modules I mean)

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:37):

separate topic though!

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:38):

I guess if I was actually dealing with something like this, I would either try to use a slightly more descriptive name to distinguish the libraries or append the version to the name. Then when updating to the new version just use Button10.api

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:39):

yeah I guess the idea here would be to have a global alias like Button.V9 as Button9 or something

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:39):

which, to be fair, doesn't seem like the end of the world to me

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:43):

No, I wouldn't want a global alias, I would just rename the package.

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:44):

oh in this case they come from the same package

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:44):

(which is important because several of the modules share internal code within the package)

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:47):

I'm not sure I understand the difference here. I would probably do Nri.Ui.ButtonV10 instead of Nri.Ui.Button.V10

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:47):

oh, sure - that works

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:48):

but you'd still use an alias to remove the Nri.Ui. part

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:49):

Does that need an alias? I thought import Nri.Ui.ButtonV10 would enable me to use ButtonV10.* locally.

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:49):

not in Elm

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:49):

Oh....yikes.

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:49):

or in Roc, currently - although I honestly never considered adopting those semantics for imports :thinking:

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:50):

that's also how Haskell does it, which is I assume where Elm got it

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:50):

Interesting

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:50):

but now that you mention it, it's super common to have it work the way you described

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:50):

like to do import X.Y.Z as Z

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:50):

Now this conversation makes a lot more sense.

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:51):

I see the potential complication.

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:52):

example:

import Html.Styled.Events as Events
import Json.Decode as Decode
import Nri.LmsContext as LmsContext exposing (LmsContext)
import Nri.Password as Password exposing (Password)
import Nri.Ui.Button.V10 as Button
import Nri.Ui.ClickableText.V3 as ClickableText
import Nri.Ui.Logo.V1 as Logo
import Nri.Ui.Message.V3 as Message
import Nri.Ui.TextInput.V7 as TextInput
import Nri.Ui.UiIcon.V1 as UiIcons

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:52):

I can't believe I never noticed that pattern :joy:

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:53):

like if the version number was moved one position earlier, like you suggested, the as could be there by default
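A tiny Python sketch of the "last segment is the default alias" rule being discussed. `default_alias` is a made-up helper, not an existing Elm or Roc feature; it only shows why the position of the version segment matters.

```python
def default_alias(module_path):
    """Use the last dot-separated segment of the module path as its alias."""
    return module_path.rsplit(".", 1)[-1]
```

Under that rule `Nri.Ui.V10.Button` would be referred to as `Button` with no explicit `as`, while `Nri.Ui.Button.V10` would end up as `V10` and still need an alias.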

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:55):

Also, Roc doesn't currently work the Elm way. import pf.Task gives direct access to Task.*, but maybe that is just special for the platform?

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:56):

yeah it's special for the platform at the moment

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:56):

Ah, ok

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:56):

but this makes a pretty compelling case that it shouldn't be :big_smile:

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:57):

another interesting thought: if we had that import strategy, module names could use slashes instead of dots

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:57):

e.g. Nri/Ui/TextInput

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:57):

which is the actual directory path

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:57):

Just wondering, why does LmsContext need exposing but the others don't?

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:58):

the UI ones tend not to need to expose any types

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:58):

there's a type called LmsContext that we want in scope

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:58):

but for the others we're probably just gonna call a Button.view function

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:58):

but there's no Button type that needs exposing

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:58):

Otherwise you would have to write LmsContext.LmsContext?

view this post on Zulip Richard Feldman (Feb 19 2022 at 17:58):

right

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 17:58):

Ok.

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:02):

:thinking: what if the language supported "default imports" specified in either the app module or the platform module?

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:02):

since those are both sort of the "root module" that everything else gets compiled from in practice

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:02):

so you could say "all my modules get these imports, with these aliases, and exposing these things, by default"

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:03):

and say it once, in the app module or the platform module (depending on which you're working on)

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:03):

and have it apply to all the things they import

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:04):

an immediate downside that comes to mind compared to the status quo (which is shared by the "metadata file" approach, to be fair): you can no longer necessarily compile an individual interface module in isolation, because it might not work without its default imports

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:05):

also to be fair, I'm not sure how big of a deal that would be in practice

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:06):

oh wait - what does the original proposal imply for cyclic dependencies? 🤨

view this post on Zulip Richard Feldman (Feb 19 2022 at 18:06):

if everything is always imported, doesn't that imply that everything depends on everything else?

view this post on Zulip Anton (Feb 19 2022 at 18:40):

Another downside I thought about is that importing more than you need will require the autocomplete to search through more options.

view this post on Zulip Francesco Orsenigo (Feb 19 2022 at 19:21):

Brendan Hansknecht said:

Just a quick question. Why do the naming inconsistencies arise? [...]

A module uses import Foo.Bar.Baz as Baz.
However, another module requires both Foo.Bar.Baz and Meh.Baz, so it will probably use a different alias for both, in particular if the person who wrote one module was not aware of the alias conventions used in the other.
When you have a lot of modules dealing with the same domain (for example, ecommerce), it is really easy to end up with a lot of different modules all aliased to Cart.

Richard Feldman said:

so something we do pretty often at work is using versioned modules - e.g.

import Nri.Ui.Button.V10 as Button

Allowing local aliases, something like alias Button = Nri.Ui.Button.V10 is something I considered, but I feel like it's prone to abuse so for now I want to see if I can do without.

In this specific case, since we are already willing to pay the price of a module that's aware of its own version, my solution here would again be a record:

button = Nri.Ui.Button.V10.asRecord

Which would make available button.primary, button.secondary, button.inline or whatever else you want.

Yorye Nathan said:

is it a different case if I really depend on everything from Foo? I imagine some core module used everywhere and declaring a lot of types

Please check the notes and the example I linked.
You can declare variables and types to be globally visible (for example, not, Bool, Result from Core)

view this post on Zulip Francesco Orsenigo (Feb 19 2022 at 19:23):

Richard Feldman said:

an immediate downside that comes to mind compared to the status quo (which is shared by the "metadata file" approach, to be fair): you can no longer necessarily compile an individual interface module in isolation, because it might not work without its default imports

Indeed.
A single module can be compiled only with the metafile present.
However do notice that this is the case with Elm too.

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:27):

yeah it's the case in Roc too, although only if the module imports things from packages. If it doesn't import anything from packages it can be compiled on its own

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:27):

but I don't think that's necessarily a critical use case

view this post on Zulip Francesco Orsenigo (Feb 19 2022 at 19:29):

Richard Feldman said:

oh wait - what does the original proposal imply for cyclic dependencies? 🤨

Yes.
At least in my case, I decided that allowing cyclic dependencies between modules was a net improvement on the ergonomics of the language.
I don't think it gave me particular problems in the implementation.
I haven't yet implemented compiler caching, but I don't think it will be too big of a problem.

view this post on Zulip Francesco Orsenigo (Feb 19 2022 at 19:31):

Richard Feldman said:

if everything is always imported, doesn't that imply that everything depends on everything else?

Not really.
Everything is always aliased, but stuff is loaded only when it's actually referenced.
If the metafile references stuff that does not exist it won't even be an error.
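A rough Python sketch of that behaviour: aliases are always defined, but a module is only loaded once something reachable from the entry point references it. The `aliases` and `sources` tables and the `load_reachable` function are hypothetical stand-ins for the metafile and for already-parsed modules; this is not how Squarepants is actually implemented.

```python
def load_reachable(entry, aliases, sources):
    """Load only the modules actually referenced, starting from `entry`.

    aliases: alias -> module path (what the metafile declares)
    sources: module path -> list of qualified references used by that module
    """
    loaded = {}
    pending = [entry]
    while pending:
        module = pending.pop()
        if module in loaded:
            continue  # already loaded; this also makes cyclic references harmless
        if module not in sources:
            raise FileNotFoundError(f"{module} is referenced but does not exist")
        refs = sources[module]
        loaded[module] = refs
        for ref in refs:
            alias = ref.split(".", 1)[0]
            pending.append(aliases[alias])
    return set(loaded)


ALIASES = {
    "Cart": "Shop.Cart",
    "Price": "Shop.Price",
    "Ghost": "Shop.DoesNotExist",  # declared in the metafile, never referenced
}
SOURCES = {
    "Main": ["Cart.view"],
    "Shop.Cart": ["Price.format"],
    "Shop.Price": ["Cart.total"],  # Cart and Price reference each other
}
```

Here `load_reachable("Main", ALIASES, SOURCES)` loads three modules and never touches the `Ghost` entry, so its missing file is not reported.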

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 19:34):

I still feel that a lot of the aliasing past import Foo.Bar as Bar is really more a sign of naming or architectural problems rather than a programming language issue.
I used to program in Go on some pretty large projects and they have a lot of guidelines around package naming that really help to simplify code.

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 19:34):

Note: Go still has aliases, but they are local and used very sparingly.

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 19:35):

Their default import is essentially equivalent to import Foo.Bar as Bar without the explicit alias.

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 19:36):

They also ban cyclic dependencies.

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 19:38):

I think with default aliases and good package naming, naming doesn't tend to pop up as a major problem. Our compiler is a good example. Rust has aliases, but they are rarely used in our compiler.

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:38):

ok so I guess the idea is that the module dependency graph would get determined during canonicalization rather than after parsing the import headers like it does today

view this post on Zulip Francesco Orsenigo (Feb 19 2022 at 19:39):

Richard Feldman said:
Yes, that is what Squarepants is doing.

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:47):

hmm, so there's actually an interesting potential performance optimization here

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:47):

a very strange one haha

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:47):

so right now Symbol is a 64-bit integer

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:47):

and whenever we encounter a string identifier, we intern it in there

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:48):

it is used all over the place and if we could reduce it to a 32-bit integer, that would absolutely speed up compile times

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:48):

but I haven't been able to think of a way to do that

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:48):

this actually seems like it might be a way to do it, which might make up for other performance downsides it might have

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:48):

so the idea is this:

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:49):

let's suppose for simplicity's sake that the design here is that the app or platform module gets to have an imports section, but no other modules do

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:49):

and that root module's imports section applies to all the other modules

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:49):

but it only applies to them in terms of what's exposed

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:49):

and the actual dependency graph of the modules is determined by what's referenced from other modules during canonicalization

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:50):

ok, so each module gets its own intern table which maps a Symbol to a string

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:50):

the idea is this:

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:51):

since we now know every module in one place that's ever going to be loaded

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:52):

we can start by loading each of those modules into memory, all in parallel, and adding their exposes to the global symbol interns

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:52):

while also parsing them at the same time, I suppose

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:53):

so basically we massively parallelize just the parsing step, and then canonicalize just each module's exposes

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:53):

now we can have one Symbol in this global symbol interns table for each module, in a 32-bit integer, and we can just drop the part of the Symbol that records which module it's from

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:54):

instead, we can just write down out-of-band which modules map to which ranges in Symbol (e.g. "if it's between 12 and 17, it's from the Foo module")

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:54):

which in turn can be in a sorted tree or something for fast lookups (side detail, not really relevant to the main point here)
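One possible shape for that out-of-band range lookup, sketched in Python with a sorted list and binary search instead of a tree. The ranges, the module names other than Foo, and `module_of` are invented for the example; the Roc compiler does not necessarily work this way.

```python
from bisect import bisect_right

# Hypothetical layout: each module owns a contiguous range of symbol ids.
# RANGE_STARTS[i] is the first id owned by MODULE_NAMES[i].
RANGE_STARTS = [0, 12, 18]
MODULE_NAMES = ["Bar", "Foo", "Baz"]


def module_of(symbol):
    """Find the owning module by binary search over the range starts."""
    index = bisect_right(RANGE_STARTS, symbol) - 1
    if index < 0:
        raise ValueError(f"symbol {symbol} is below every range")
    return MODULE_NAMES[index]
```

With this table, ids 12 through 17 map to Foo, so the module no longer has to be stored inside the symbol itself.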

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:54):

so then, we have this global symbol table, and we can just clone it for each of the modules when we go to canonicalize them

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:55):

actually, even better - don't clone it, just share it immutably

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:55):

and have 2 symbol tables, one for external lookups and one for internal lookups

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:55):

(internal to the module)

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:55):

and have the internal module have its first Symbol be 1 greater than the highest Symbol number of the globally shared one
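A small Python sketch of the two-table idea: one shared, read-only table for exposed names and one private table per module whose ids start right after the highest shared id. `build_shared` and `ModuleInterner` are hypothetical names, and the sketch ignores the 32-bit width itself; it only shows the numbering scheme.

```python
from types import MappingProxyType


def build_shared(exposes_by_module):
    """Assign consecutive ids to every exposed name; return a read-only view."""
    table = {}
    for module, names in exposes_by_module.items():
        for name in names:
            table[f"{module}.{name}"] = len(table)
    return MappingProxyType(table)


class ModuleInterner:
    """Per-module interner layered on top of the shared, immutable table."""

    def __init__(self, shared):
        self.shared = shared
        self.local = {}
        # First local id is one greater than the highest shared id.
        self.next_id = max(shared.values(), default=-1) + 1

    def intern(self, name):
        if name in self.shared:
            return self.shared[name]
        if name not in self.local:
            self.local[name] = self.next_id
            self.next_id += 1
        return self.local[name]
```

Note that in this sketch two different modules can hand out the same local id for different names; that is only safe on the assumption that local symbols never leave their module.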

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:58):

so basically what this would mean is:

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:59):

which in turn means...I think this would mean we could have a 32-bit Symbol, which would probably be such a big performance savings that it would justify less parallelism in other places :astonished:

view this post on Zulip Richard Feldman (Feb 19 2022 at 19:59):

although, now that I say that out loud...is there some way we could do that today? :thinking:

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:00):

I guess the main difference would be that threads would get blocked

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:00):

like I open the root module, then parse its imports, then load its dependencies and parse their imports, before I can find their exposes...and I have to do that whole procedure across the whole import graph

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:01):

and I can't start working on (for example) canonicalizing the root module until I've completed that for every other module, but I don't necessarily discover them until I'm done with I/O from the previous module

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:01):

so I could see a potential performance cost there

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:01):

very interesting!

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:05):

I guess in a similar vein, this would probably speed up clean builds

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:05):

of large projects in particular

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:05):

because it would literally be able to kick off all the async I/O at once

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:05):

and not be blocked on I/O to discover more dependencies
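To put a number on the difference, here is a Python sketch that counts how many sequential rounds of file reads are needed in each scheme. It is a simplified model with made-up function names: a "round" is one batch of reads that can all be issued in parallel, and nothing here measures real I/O.

```python
def io_rounds_with_discovery(root, imports):
    """Rounds needed when dependencies are only found by reading import headers.

    imports: module -> list of modules it imports
    """
    seen = {root}
    frontier = [root]
    rounds = 0
    while frontier:
        rounds += 1  # read every module discovered by the previous round
        next_frontier = []
        for module in frontier:
            for dep in imports.get(module, []):
                if dep not in seen:
                    seen.add(dep)
                    next_frontier.append(dep)
        frontier = next_frontier
    return rounds


def io_rounds_up_front(modules):
    """Rounds needed when the full module list is known before reading anything."""
    return 1 if modules else 0
```

For an app that imports Foo, where Foo imports Bar, discovery takes three rounds, while a project-wide module list lets all three reads start in a single round.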

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:08):

this is such a radical idea, I'm finding it very challenging to think through all the implications :stuck_out_tongue:

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 20:13):

Would this lead to always parsing every file? Currently if I never import a file, it is never looked at.

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 20:14):

Also, would each library I depend on have its own imports declaration as well, just like the app and platform?

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:14):

yeah so if the idea is "the root module's imports is the only imports" then the same amount of parsing happens

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:14):

if the entire project doesn't use a module, it doesn't get parsed; otherwise, it does

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:14):

which is already true today

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:15):

the difference would be how and when those dependencies get discovered

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 20:17):

So parse imports/aliases in root, but don't actually go and parse the related files until I see a call using one of those imports/aliases?

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:17):

no, parse them all anyway

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:17):

but that's what will happen anyway

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:17):

so for example

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:18):

let's say I have an app module which imports Foo, and Foo imports Bar

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:18):

the 3 modules that will get parsed when I roc build MyApp.roc are the app module, Foo, and Bar

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:18):

now let's say I have the app module declare "the imports for this whole project are Foo and Bar"

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:19):

so the same 3 modules end up getting parsed
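
A minimal Python sketch of the app/Foo/Bar example, tying it back to the earlier point about blocked I/O. It is only an illustration, not how the Roc compiler loads files: the function names and the dictionary layout are hypothetical. Both strategies read the same three modules; what changes is how many sequential rounds of file reads are needed before every module is known.

```python
# Hypothetical project: each module maps to the modules it imports.
PER_MODULE_IMPORTS = {
    "App": ["Foo"],
    "Foo": ["Bar"],
    "Bar": [],
}


def discovery_rounds(root, imports_of):
    """Batches of file reads when imports are discovered module by module.

    Modules inside one batch can be read in parallel, but a batch cannot
    start until the previous batch has been read and its imports parsed.
    """
    seen = {root}
    batch = [root]
    rounds = []
    while batch:
        rounds.append(batch)
        next_batch = []
        for module in batch:
            for dep in imports_of[module]:
                if dep not in seen:
                    seen.add(dep)
                    next_batch.append(dep)
        batch = next_batch
    return rounds


def global_rounds(root, declared_imports):
    """Batches of file reads when the root declares every import up front.

    Reading the root reveals the full module list, so all remaining
    reads can be started at once.
    """
    return [[root], list(declared_imports)]


per_module = discovery_rounds("App", PER_MODULE_IMPORTS)
all_in_root = global_rounds("App", ["Foo", "Bar"])
```

Here `per_module` has three sequential rounds (App, then Foo, then Bar), while `all_in_root` has two (App, then Foo and Bar together). The gap grows with the depth of the import graph.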

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:19):

the difference would be that

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 20:19):

I am thinking specifically of the library case. If I depend on library Foo, but only on the specific submodule Foo.Bar.Baz. Baz may be standalone, or only import a small subset of the library, not all of it. So I shouldn't have to pay the cost for parsing the entire library.

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:19):

oh I think you still wouldn't

view this post on Zulip Brendan Hansknecht (Feb 19 2022 at 20:20):

But if we read the library root aliases and then parse all of those files, it would parse way more files.

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:20):

wait, is that true? :thinking:

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:20):

yeah I see it now

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:20):

oh

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:21):

yeah so that would negatively impact cold builds

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:21):

for sure

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:23):

actually, I think that's probably not a downside in practice

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:24):

because if I'm downloading a package from the package repo (the most common case), it can ship with a pre-generated metadata file that has the dependency graph of all the modules in it

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:24):

so we could read from that and not have to parse everything to discover which modules expose what, and which ones depend on which others
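
As a rough sketch of how such a metadata file could be used, assume it stores, for each module in the package, the list of modules it depends on. The format, the module names other than Foo.Bar.Baz, and the function name below are all hypothetical. Starting from the modules the project actually references, a simple graph walk yields the set that needs parsing, without opening any other file in the package.

```python
# Hypothetical pre-generated metadata for a package "Foo":
# module name -> modules it depends on inside the package.
PACKAGE_GRAPH = {
    "Foo.Bar.Baz": ["Foo.Util"],
    "Foo.Util": [],
    "Foo.Http": ["Foo.Util", "Foo.Json"],
    "Foo.Json": [],
}


def modules_to_parse(entry_points, graph):
    """Return every module reachable from the entry points in the graph."""
    needed = set()
    stack = list(entry_points)
    while stack:
        module = stack.pop()
        if module in needed:
            continue
        needed.add(module)
        stack.extend(graph[module])
    return needed


# The project only uses Foo.Bar.Baz, so Foo.Http and Foo.Json are skipped.
needed = modules_to_parse(["Foo.Bar.Baz"], PACKAGE_GRAPH)
```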

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:27):

and if I'm developing a package locally rather than having installed it from the package repository, that would probably be for one of three reasons:

  1. I'm developing the package in isolation, so I'm probably going to want to parse everything anyway in order to run tests etc.
  2. I've split part of my project up into a separate local package for some reason, in which case I'm using everything in it anyway and it's fine if everything gets parsed because anything that wouldn't get parsed would be unused code in my project and could just be deleted.
  3. I've vendored someone else's package, in which case the downside will indeed manifest. But maybe I could check in that cache manifest to mitigate it? (Or just be okay with it, because this would only be on the first build, and presumably we'd generate and then cache that manifest for subsequent builds anyway.)

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:27):

so I think in practice that shouldn't be a downside except maybe in the case of vendoring? :thinking:

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:28):

btw I swear I'm not just pursuing this idea because it would probably make our extremely complicated file loading code a lot simpler :laughing:

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:29):

it seems to have a lot of non-obvious benefits!

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:34):

for example, I think this would improve autocomplete

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:37):

because as conventions emerge (e.g. import Json.Decode as JD is something I've seen people do in Elm a lot), the first time I write JD. in a module, it's kind of tricky for the autocomplete to know to suggest "add import Json.Decode as JD" as part of its suggestions

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:37):

for that to work, it would have to keep an index of common as usages, which is further complicated by the fact that sometimes you might have as JD in multiple places used in different ways

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:38):

like in the earlier example, Button.V10 as Button and Button.V9 as Button might lead autocomplete to not know which one to suggest (although in that very specific case, a first-class concept of versioned modules would help)

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:38):

at any rate, the point is, if there's just one Json.Decode as JD in the root module, then there's no question

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:38):

even if you open a brand new module and type JD., autocomplete immediately knows exactly what to do
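
To make the autocomplete argument concrete, here is a small Python sketch. The data structures and function names are hypothetical, and the page module names are made up; only the Button.V9/Button.V10 and Json.Decode as JD aliases come from the discussion. With per-module imports, a tool has to index every file's aliases and may find several candidates for one alias; with a single alias table in the root, the lookup has at most one answer.

```python
# Hypothetical per-module alias tables: module -> {alias: real module}.
PER_MODULE_ALIASES = {
    "Page.Cart": {"Button": "Button.V9"},
    "Page.Home": {"Button": "Button.V10", "JD": "Json.Decode"},
}

# Hypothetical single alias table declared once in the root module.
GLOBAL_ALIASES = {"Button": "Button.V10", "JD": "Json.Decode"}


def candidates_from_index(alias, aliases_by_module):
    """Every module that some file in the project aliases to `alias`."""
    found = set()
    for aliases in aliases_by_module.values():
        if alias in aliases:
            found.add(aliases[alias])
    return sorted(found)


def resolve_global(alias, global_aliases):
    """The one module `alias` refers to, or None if it is not declared."""
    return global_aliases.get(alias)


ambiguous = candidates_from_index("Button", PER_MODULE_ALIASES)
unambiguous = resolve_global("JD", GLOBAL_ALIASES)
```

In this sketch `ambiguous` contains both Button versions, so a suggestion would be a guess, while `unambiguous` is simply Json.Decode.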

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:39):

which is cool

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:39):

oh and I forgot to mention - someone asked earlier about collapsed imports outside the editor; the classic example of where that comes up is GitHub

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:39):

not having import lines in individual modules would on average make PR diffs smaller

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:40):

I don't personally give that benefit much weight, but it's real

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:51):

from a build perspective, this has similarities to a build tool like Bazel: you declare everything that's going to be worked on in one place

view this post on Zulip Francesco Orsenigo (Feb 19 2022 at 20:51):

The idea is that the metafile acts as a single source of truth for imports, it's the only thing that autocomplete (or a human) needs to read to figure out who's who.

view this post on Zulip Richard Feldman (Feb 19 2022 at 20:52):

right

view this post on Zulip jan kili (Feb 19 2022 at 23:16):

Richard, is your improvement idea identical to, compatible with, or incompatible with Francesco's metafile idea?

view this post on Zulip Richard Feldman (Feb 19 2022 at 23:32):

I'd say it's identical in terms of its implications

view this post on Zulip Richard Feldman (Feb 19 2022 at 23:33):

(unless I've missed something!)

view this post on Zulip Richard Feldman (Feb 19 2022 at 23:34):

it's more of a concrete idea for how it would probably look in Roc - one of the design goals is for Roc to be useful as a scripting language, which means it has to be possible to implement any application as a single .roc file

view this post on Zulip Richard Feldman (Feb 19 2022 at 23:35):

so there wouldn't be a separate metadata file; it would be defined in the app module

view this post on Zulip jan kili (Feb 19 2022 at 23:36):

Could we allow imports to be defined somewhere else in the module (bottom, for example), so that we don't have to scroll past a screen of imports to get to the real logic?

view this post on Zulip Richard Feldman (Feb 19 2022 at 23:37):

possibly, but I think in an example with that many imports, the app module will probably just be one line of code - like main = MyApp.main or something like that

view this post on Zulip jan kili (Feb 19 2022 at 23:38):

True, I'm conflicted

view this post on Zulip jan kili (Feb 19 2022 at 23:39):

Incrementalism! Let's see if it proves awkward :)

view this post on Zulip Richard Feldman (Feb 20 2022 at 02:44):

here's what elm-spa-example would look like if all of its imports were moved into Main.elm https://github.com/rtfeldman/elm-spa-example/compare/imports-idea

view this post on Zulip Richard Feldman (Feb 20 2022 at 02:45):

37 additions and 383 deletions

view this post on Zulip Richard Feldman (Feb 20 2022 at 02:45):

and 2 of those are blank lines inserted by elm-format since I guess something changed in elm-format since the last time I ran it on that code base :big_smile:

view this post on Zulip jan kili (Feb 20 2022 at 03:17):

If Roc imports would otherwise be similar in volume and file distribution, then YES PLEASE!

view this post on Zulip Tommy Graves (Feb 20 2022 at 03:42):

I've had lots of occasions in the past where I've wanted to map out a graph of all the dependencies between files in an application, which is generally a pretty easy thing to do in languages with imports at the top of each file/module. I've also sometimes used linters or other static analysis tools to artificially restrict what imports certain modules have access to, in order to enforce some patterns around decoupling code (for example). How would either of those work if all the imports are in the app module? I guess these are pretty easily solved once and for all by just building a tool that outputs the list of imports for each file and then consuming that output in a dependency visualization tool or in static analysis, etc.?

What about for testing? Do you already have to build the whole application to test a single module anyway?

view this post on Zulip Richard Feldman (Feb 20 2022 at 03:45):

map out a graph of all the dependencies between files in an application, which is generally a pretty easy thing to do in languages with imports at the top of each file/module

we'd presumably want to offer this out of the box in the editor anyway, although it would make it harder for a third-party author to write their own tool (e.g. just using regexes on imports)

view this post on Zulip Richard Feldman (Feb 20 2022 at 03:46):

not impossible, but definitely harder

view this post on Zulip Richard Feldman (Feb 20 2022 at 03:46):

I've also sometimes used linters or other static analysis tools to artificially restrict what imports certain modules have access to, in order to enforce some patterns around decoupling code (for example)

I think the same thing could be done based on usage - e.g. the linter rule is no longer "module X can't import module Y, or else error" but rather "module X can't reference anything from module Y, or else error"
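
A usage-based rule of that kind might look roughly like the Python sketch below. It is deliberately crude and everything in it is an assumption: it finds qualified references with a regular expression instead of a real parser (so it would also match text inside strings and comments), and the alias table, the rule format, and the module names are hypothetical. The point is only that the check keys on what a module references, resolved through the project-wide aliases, rather than on an import line.

```python
import re

# Matches qualified references such as `JD.string` or `Json.Decode.Value`
# and captures the qualifier part (with its trailing dot).
QUALIFIED_REF = re.compile(r"\b((?:[A-Z]\w*\.)+)[A-Za-z_]\w*")


def referenced_qualifiers(source):
    """Qualifiers (module names or aliases) referenced in the source text."""
    return {m.group(1).rstrip(".") for m in QUALIFIED_REF.finditer(source)}


def violations(module_name, source, aliases, forbidden):
    """Forbidden modules that `module_name` references, sorted by name.

    `aliases` maps project-wide aliases to real module names;
    `forbidden` maps a module name to the set of modules it must not use.
    """
    used = {aliases.get(q, q) for q in referenced_qualifiers(source)}
    return sorted(used & forbidden.get(module_name, set()))


# Hypothetical project data.
ALIASES = {"JD": "Json.Decode"}
FORBIDDEN = {"Ui.Button": {"Json.Decode"}}
SOURCE = 'decoder = JD.field "label" JD.string'

found = violations("Ui.Button", SOURCE, ALIASES, FORBIDDEN)
```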

view this post on Zulip Richard Feldman (Feb 20 2022 at 03:47):

I guess these are pretty easily solved once and for all by just building a tool that outputs the list of imports for each file and then consuming that output in a dependency visualization tool or in static analysis, etc.?

that's definitely doable!

view this post on Zulip Richard Feldman (Feb 20 2022 at 03:48):

What about for testing? Do you already have to build the whole application to test a single module anyway?

I think for running one module's worth of tests, we can lean on build caching/incremental compilation

view this post on Zulip Richard Feldman (Feb 20 2022 at 03:48):

like we'll want to cache a dependency graph anyway once we've determined it from scratch on the first build

view this post on Zulip Richard Feldman (Feb 20 2022 at 03:48):

so as long as your root module (with all the imports in it) didn't change since the last time you ran the test, we can just load up the cached build graph and make sure we only rebuild the modules depended on by your test
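
One way to picture that caching step is the Python sketch below. It assumes a cache keyed by a hash of the root module's source; the cache layout, the function names, and the example modules are hypothetical and not taken from any actual compiler. The dependency graph is recomputed only when the root module changes, and a test run touches only the modules reachable from the test.

```python
import hashlib


def fingerprint(source):
    """Stable fingerprint of the root module's source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def reachable(start, graph):
    """Every module reachable from `start`, including `start` itself."""
    needed, stack = set(), [start]
    while stack:
        module = stack.pop()
        if module not in needed:
            needed.add(module)
            stack.extend(graph.get(module, []))
    return needed


def modules_to_rebuild(test_module, root_source, cache, build_graph):
    """Modules a test depends on, reusing the cached graph when possible.

    `build_graph` is the expensive from-scratch step; it is called only
    when the root module's fingerprint differs from the cached one.
    """
    key = fingerprint(root_source)
    if cache.get("root_fingerprint") != key:
        cache["root_fingerprint"] = key
        cache["graph"] = build_graph()
    return reachable(test_module, cache["graph"])
```

For example, with a graph where CartTest depends on Cart and Cart depends on Money, running the CartTest tests twice against an unchanged root would call `build_graph` once and return the same three modules both times.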

view this post on Zulip Tommy Graves (Feb 20 2022 at 04:02):

That all makes sense! For what it's worth I think not having to specify imports at the top of each file is pretty great for productivity. At least in my experience even the best auto-importing tools make mistakes or cause frustrations from time to time. I was optimistic that the Roc editor would actually provide a world-best experience with this so it wouldn't be an issue, but not even having the opportunity to worry about it sounds pretty great. And it seems like we'll be able to avoid the challenges you run into with similar approaches in other languages -- e.g. in Rails it's almost impossible (or in some cases actually impossible -- like some_variable.constantize.some_method() lol) to determine the dependency graph of a file statically, whereas in Roc it won't be difficult for the compiler, parser, and associated tooling to do that.

view this post on Zulip Francesco Orsenigo (Feb 20 2022 at 05:10):

I assume that the 'app module' is the one that defines the main value.
Wouldn't that force libraries to pick a main module rather than letting it be a collection of peer modules?
What if you want to have different main and/or platforms in the same codebase?
Wouldn't that force you to maintain duplicated metadata?

I haven't yet gotten there, but the idea for Squarepants was to have the platform define a default metafile, so that single modules can be run against that.

view this post on Zulip Richard Feldman (Feb 20 2022 at 13:31):

so there are three relevant "root" modules here:

so the idea is that all three of them take the place of something like elm.json (depending on what you're building), and all three can serve as the equivalent of a metafile.

view this post on Zulip Francesco Orsenigo (Feb 20 2022 at 17:39):

Ok. It makes sense. Honestly, for Squarepants I was planning to dump most of that information in the metafile, but I'm not there yet and my requirements are slightly different than Roc's.
Whatever Roc ends up doing, glad I got your wheels spinning.

view this post on Zulip Richard Feldman (Feb 20 2022 at 17:59):

yeah definitely thank you again for sharing it!


Last updated: Jun 16 2026 at 16:19 UTC