Stream: contributing

Topic: x86 dev backend


Brendan Hansknecht (Jul 13 2022 at 05:16):

I just revamped all of the issues around the x86 dev backend and plan to be working on it more actively. I want to get it to a similar state as the dev wasm backend. It should be usable for any roc program that just uses common types (List, Str, Numbers, Closures, Unions, Structs). There is definitely a large amount of work to be done especially around builtins. That being said, there are fewer large new language constructs than I expected. So that is nice.

Generally speaking, the dev backend is not the easiest piece of software to contribute to. It directly generates machine instructions in the form of a list of bytes. That being said, there are a lot of online tools and documentation. I am also willing to help, chat with, or pair with anyone who wants to work on the dev backend. In general, I advise people to give it a shot if they are at all interested. It's a great way to learn.
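To make "a list of bytes" concrete, here is a minimal Rust sketch of what emitting a few x86-64 instructions into a byte buffer can look like. The `Reg` enum and all function names are made up for this illustration and are not taken from the Roc code base; only the instruction encodings themselves are standard x86-64.

```rust
/// Hypothetical register type for this sketch. The discriminants are the
/// standard x86-64 register numbers used in instruction encodings.
#[derive(Clone, Copy)]
pub enum Reg {
    Rax = 0,
    Rcx = 1,
    Rdx = 2,
    Rbx = 3,
    R8 = 8,
    R9 = 9,
}

/// Append `mov dst, imm64`: REX.W prefix, opcode B8+rd, then the
/// 8-byte immediate in little-endian order.
pub fn mov_reg64_imm64(buf: &mut Vec<u8>, dst: Reg, imm: i64) {
    let dst = dst as u8;
    let rex = 0x48 | (dst >> 3); // REX.W, plus REX.B when dst is R8..R15
    buf.push(rex);
    buf.push(0xB8 | (dst & 0b111));
    buf.extend_from_slice(&imm.to_le_bytes());
}

/// Append `add dst, src`: REX.W prefix, opcode 01 /r, and a
/// register-direct ModRM byte (mod = 11, reg = src, rm = dst).
pub fn add_reg64_reg64(buf: &mut Vec<u8>, dst: Reg, src: Reg) {
    let (dst, src) = (dst as u8, src as u8);
    let rex = 0x48 | ((src >> 3) << 2) | (dst >> 3); // REX.W | REX.R | REX.B
    let modrm = 0xC0 | ((src & 0b111) << 3) | (dst & 0b111);
    buf.extend_from_slice(&[rex, 0x01, modrm]);
}

/// Append a near `ret`.
pub fn ret(buf: &mut Vec<u8>) {
    buf.push(0xC3);
}

/// Bytes for a tiny function body: rax = 40; rcx = 2; rax += rcx; ret.
pub fn example_function() -> Vec<u8> {
    let mut buf = Vec::new();
    mov_reg64_imm64(&mut buf, Reg::Rax, 40);
    mov_reg64_imm64(&mut buf, Reg::Rcx, 2);
    add_reg64_reg64(&mut buf, Reg::Rax, Reg::Rcx);
    ret(&mut buf);
    buf
}
```

Pasting the resulting bytes into an online disassembler is a quick way to check that an encoding matches the instruction you meant to emit.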

Probably the best place for new contributors to start is adding a Num builtin. The list of needed builtins can be found in issue #3513. If you want to work on a task, just convert the task into an issue and assign it to yourself. Feel free to throw up a WIP PR or just implement part of the builtin if supporting all types is larger than the scope you want to start with.

For general information on the dev backend as a whole, look to the readme. For information on how to figure out x86 assembly, look at this comment. Also, here is an example PR that had to deal with assembly generation for a Num builtin.

As a side note, some of the builtins are actually super easy to implement because we just wrap Zig builtins. Adding these functions should be easy and is a great way to contribute even if you have zero interest in anything assembly.

To look through all of the open issues, there is now a GitHub project for the x86 dev backend MVP. If you have any interest in contributing, please reach out.

Qqwy / Marten (Jul 13 2022 at 10:55):

As a new contributor I think I'm missing something.

Can someone explain how this relates to the current x86 binary creation? Is the idea that there will be an alternative to the bitcode written in Zig?

Folkert de Vries (Jul 13 2022 at 11:06):

no, zig is the magic sauce that makes this work

Folkert de Vries (Jul 13 2022 at 11:07):

zig conveniently makes several formats available: wasm32, llvm IR, and various kinds of assembly (zig makes cross-compilation very easy)

Folkert de Vries (Jul 13 2022 at 11:07):

"all" we need to do is hook up the zig builtins based on the user program

Folkert de Vries (Jul 13 2022 at 11:07):

also for speed, some things (e.g. addition on integers) are implemented directly in assembly/wasm/llvm

Qqwy / Marten (Jul 13 2022 at 11:35):

Thanks! What does the 'x86 dev backend' then mean exactly?

Folkert de Vries (Jul 13 2022 at 11:49):

it is where we go from roc code to x86 assembly

Folkert de Vries (Jul 13 2022 at 11:49):

much like we have an "llvm" backend which goes to llvm (and from there to a binary, potentially x86 assembly), and wasm

Folkert de Vries (Jul 13 2022 at 11:50):

and even wasm via llvm

Anton (Jul 13 2022 at 12:10):

The dev backend is meant to quickly build binaries for a pleasant development experience; the llvm backend is used to produce the fastest possible binary for use in production.

Richard Feldman (Jul 13 2022 at 13:12):

yeah, in general llvm runs slowly (increases compile times) but produces optimized binaries that run really fast

Richard Feldman (Jul 13 2022 at 13:13):

the goal of the dev backends is to do the reverse: improve compile times at the cost of not running as fast

Qqwy / Marten (Jul 13 2022 at 13:13):

Ah! I see

Qqwy / Marten (Jul 13 2022 at 13:13):

Thanks!

Richard Feldman (Jul 13 2022 at 13:14):

once all the backends are at feature parity, the idea is to make it so that we only use LLVM when you pass --optimize to the roc compiler

Richard Feldman (Jul 13 2022 at 13:14):

that way everyone gets the fastest builds by default!
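The plan described in the last few messages can be sketched as a small backend-selection function. This is a hypothetical illustration of the stated goal (dev backend by default, LLVM only with `--optimize`), assuming feature parity has been reached; the `Backend` enum and `choose_backend` function are invented names, not the compiler's actual code.

```rust
/// Hypothetical backend choice for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    /// Fast to compile, slower generated code.
    Dev,
    /// Slow to compile, optimized generated code.
    Llvm,
}

/// Pick LLVM only when `--optimize` appears among the CLI arguments;
/// otherwise default to the dev backend.
pub fn choose_backend(args: &[&str]) -> Backend {
    if args.iter().any(|arg| *arg == "--optimize") {
        Backend::Llvm
    } else {
        Backend::Dev
    }
}
```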

Qqwy / Marten (Jul 13 2022 at 13:25):

Does the compilation pipeline look a bit like this then? Or are there steps missing?

Folkert de Vries (Jul 13 2022 at 13:26):

roughly. We have a parse ast, canonical ast and then mono IR

Ayaz Hafiz (Jul 13 2022 at 13:27):

the Zig bitcode is read by the backends, we don't lower mono IR to Zig

Folkert de Vries (Jul 13 2022 at 13:28):

yeah the code generated by zig and roc is merged. it depends a bit on the backend where that happens

Folkert de Vries (Jul 13 2022 at 13:28):

or how it happens

Qqwy / Marten (Jul 13 2022 at 13:38):

So maybe more like this?

Ayaz Hafiz (Jul 13 2022 at 13:39):

yeah I think so

Ayaz Hafiz (Jul 13 2022 at 13:40):

there is also an alias analysis pass, but the code generators don't directly operate over it, right @Folkert de Vries?

Folkert de Vries (Jul 13 2022 at 13:49):

correct

Richard Feldman (Jul 13 2022 at 13:52):

to expand on "Zig bitcode" - we use the term bitcode because of LLVM bitcode

Richard Feldman (Jul 13 2022 at 13:52):

originally, when we only had the LLVM backend, what we'd do is have Zig generate an LLVM bitcode representation of all the builtins

Richard Feldman (Jul 13 2022 at 13:52):

then we would load that bitcode file in at the very beginning of code generation, and use it as our starting point for code generation

Richard Feldman (Jul 13 2022 at 13:53):

so it's as if the Roc programmer had written all those functions (except of course they didn't) - they're all available right there, and LLVM has no idea that they came from a file instead of from the user's program

Richard Feldman (Jul 13 2022 at 13:53):

importantly, this means that it optimizes everything together as a whole

Richard Feldman (Jul 13 2022 at 13:53):

as far as LLVM is concerned, there's no boundary between userspace code and builtins, so it is totally unrestricted in the optimizations it can do

Richard Feldman (Jul 13 2022 at 13:54):

for the non-LLVM backends, naturally there isn't LLVM bitcode (because they aren't using LLVM) but the directory is still named bitcode/ because that's what it originally did :big_smile:

Qqwy / Marten (Jul 13 2022 at 13:55):

Cool!

Richard Feldman (Jul 13 2022 at 13:55):

for the other backends, we generate some other binary file (e.g. wasm binary file, x86 binary object file, arm binary object file, etc.)

Richard Feldman (Jul 13 2022 at 13:55):

instead of LLVM bitcode

Richard Feldman (Jul 13 2022 at 13:55):

and then we link it in
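As a rough summary of the explanation above, the way the Zig builtins reach each backend could be modeled like this. All type and variant names here are hypothetical and chosen for this sketch; as noted earlier in the thread, where and how the merge happens depends on the backend, so treat the mapping as a simplification.

```rust
/// Hypothetical list of backends mentioned in the thread.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Llvm,
    DevX86_64,
    DevAarch64,
    DevWasm,
}

/// What Zig produces for the builtins, per backend.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuiltinsArtifact {
    LlvmBitcode,
    ObjectFile,
    WasmBinary,
}

/// When the builtins are combined with the user's program.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Merge {
    /// Loaded at the start of code generation, so LLVM optimizes
    /// builtins and user code together as one module.
    LoadedBeforeCodegen,
    /// Generated separately and linked in afterwards.
    LinkedIn,
}

/// Simplified mapping from backend to builtins handling.
pub fn builtins_plan(backend: Backend) -> (BuiltinsArtifact, Merge) {
    match backend {
        Backend::Llvm => (BuiltinsArtifact::LlvmBitcode, Merge::LoadedBeforeCodegen),
        Backend::DevX86_64 | Backend::DevAarch64 => {
            (BuiltinsArtifact::ObjectFile, Merge::LinkedIn)
        }
        Backend::DevWasm => (BuiltinsArtifact::WasmBinary, Merge::LinkedIn),
    }
}
```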

Richard Feldman (Jul 13 2022 at 13:56):

also, regarding "other? assembly" - right now we have dev backends that compile to 64-bit x86 instructions, to 64-bit ARM instructions, and to wasm instructions

Richard Feldman (Jul 13 2022 at 13:57):

and I believe that's it - although @Brendan Hansknecht would know if there are others I'm missing, since he made all the non-wasm dev backends, and @Brian Carroll made the wasm one!

Richard Feldman (Jul 13 2022 at 13:57):

I think WASM is the only 32-bit dev backend we support at the moment

Brian Carroll (Jul 13 2022 at 14:16):

Yep, all of that is correct! We currently have 3 dev backends: x86_64, aarch64, and wasm.
The Wasm generator is in a totally separate crate because the instruction set is so different from CPUs.
The CPU backends are all in the same crate that's structured to be extensible for 64 and 32 bit targets. But it probably makes sense to focus on x86_64, then aarch64, then maybe other stuff.

Brendan Hansknecht (Jul 13 2022 at 14:51):

@Qqwy / Marten For the why of the dev backend, look at #compiler development > compile fast. Specifically this message. Essentially, when looking at backend time only, the x86 dev backend is more than 100x faster than an llvm dev build. When looking at a full build of the quicksort example, it led to about 2.5x faster compile times for the entire build.
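The two figures above can be related with Amdahl's law. The sketch below is only a back-of-the-envelope estimate: it assumes that the backend is the only part of the build that changed, and it treats "more than 100x" as exactly 100x. The function names are invented for this illustration.

```rust
/// Overall speedup when a fraction `p` of the original time is sped up
/// by a factor `s` and the rest is unchanged (Amdahl's law).
pub fn overall_speedup(p: f64, s: f64) -> f64 {
    1.0 / ((1.0 - p) + p / s)
}

/// Invert Amdahl's law: given the backend speedup `s` and the observed
/// overall speedup, estimate the fraction of the original build time
/// that was spent in the backend.
pub fn backend_fraction(s: f64, overall: f64) -> f64 {
    (1.0 - 1.0 / overall) / (1.0 - 1.0 / s)
}
```

Under these assumptions, `backend_fraction(100.0, 2.5)` comes out at roughly 0.6, meaning the LLVM backend would have accounted for about 60% of the original quicksort build. It also suggests that even an infinitely fast backend would top out near 2.5x for that example, since the remaining stages then dominate the build time.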


Last updated: Jul 05 2025 at 12:14 UTC