Rebel Architecture

Read https://github.com/chenglou/intro-to-reason-compilation before the rest of this document to understand the OCaml compiler jargon used here.

Jenga Jargon

Target: Any artifact that will be built by jenga.

Dep: Any source file or artifact that is used to build targets.

Rule: A rule is a triplet of an action, deps, and targets. Every build needs to have a default rule, which is the starting point from which jenga generates the build graph.

Note: A target can only be generated by a single rule.

Scheme: A scheme is a list of rules. Schemes are composable.
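The jargon above can be modelled with a few plain OCaml types. This is only a sketch to make the definitions concrete: the `rule` record, `compose`, and `duplicate_targets` are hypothetical names and are not jenga's actual API.

```ocaml
(* Hypothetical model of the jargon above; these are NOT jenga's real types. *)
type rule = {
  action : string;        (* what to run, kept as an opaque description *)
  deps : string list;     (* source files or artifacts the action reads *)
  targets : string list;  (* artifacts the action produces *)
}

(* A scheme is a list of rules, so composing two schemes is concatenation. *)
type scheme = rule list

let compose (a : scheme) (b : scheme) : scheme = a @ b

(* Targets claimed by more than one rule. An empty result means the
   "a target can only be generated by a single rule" constraint holds. *)
let duplicate_targets (s : scheme) : string list =
  let all = List.concat (List.map (fun r -> r.targets) s) in
  let sorted = List.sort compare all in
  let rec dups = function
    | a :: (b :: _ as rest) ->
        if a = b then a :: dups (List.filter (( <> ) a) rest) else dups rest
    | _ -> []
  in
  dups sorted
```

Composing two schemes whose rules both list the same target would make `duplicate_targets` return that target, which is the situation the note above rules out.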

More jenga things

A jenga process starts running from the root_dir and expects a scheme with a default rule. Jenga then looks at the directories of the default rule's targets and starts constructing schemes for those directories. This continues until schemes for all the target directories have been generated.

Jenga will not compute any target that the default targets do not depend on, either directly or indirectly.
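The "only what the default targets depend on" behaviour is a reachability question over the dependency graph. The sketch below illustrates the idea and is not how jenga is implemented; `deps_of` and the `example_deps` table are hypothetical.

```ocaml
module S = Set.Make (String)

(* [deps_of t] is a hypothetical lookup returning the direct deps of [t].
   The result is every target reachable from the default targets. *)
let reachable (deps_of : string -> string list) (defaults : string list) : S.t =
  let rec visit seen t =
    if S.mem t seen then seen
    else List.fold_left visit (S.add t seen) (deps_of t)
  in
  List.fold_left visit S.empty defaults

(* Made-up graph: docs.html is declared but nothing reachable needs it. *)
let example_deps = function
  | "app.out" -> [ "foo.cmo"; "lib.cma" ]
  | "foo.cmo" -> [ "foo.ml" ]
  | "docs.html" -> [ "foo.ml" ]
  | _ -> []
```

With `app.out` as the only default target, `docs.html` is not in the reachable set, so under this model it would never be built.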

Overview of Native backend Compilation

Compiling a rebel project has two parts: compiling third-party packages and compiling first-party sources. Each third-party package is built almost the same way as the first-party sources. In the first step, we generate a file that namespaces all the source files to avoid name clashes with third-party sources. In the next step, we compile all the source files; the order of compilation is determined by each file's first-party dependencies. In the last step, we generate an executable if we are compiling the top-level source files; otherwise we generate a cma artifact for the package, which will be used to generate the final executable.

In-depth Look at Compilation Steps

Step 1: Module Aliasing

We first generate a module alias file for the entire package from all the source files. The module alias file takes the current library foo's first-party sources, e.g. A.re and B.re, and turns them into a foo.ml file whose content is:

module A = Foo__A;
module B = Foo__B;

We'll then compile this file into foo.cmi/cmo/cmt, and have it opened by default when compiling A.re and B.re (into Foo__A and Foo__B respectively) later. The effect is that, inside A.re, we can refer to B instead of Foo__B thanks to the pre-opened foo.ml. But when these files are used by other libraries (which aren't compiled with foo.ml pre-opened, of course), they won't see module A or B, only Foo__A and Foo__B, so in practice they simply won't see them. This effectively means we've implemented namespacing!

Note that we're generating an ml file rather than re, because rebel theoretically works on pure OCaml projects too, with no dependency on Reason.

Note also that Foo__A and Foo__B's compiled cmis can't be found at the moment this module alias file is compiled. This is normal, since the module alias file is the first thing that's compiled (so that we can open it during compilation of A.re and B.re into Foo__A and Foo__B). Think of this as a forward declaration.
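As a rough illustration of the aliasing step, the function below builds the text of the alias file from a library name and its source files. It is a sketch, not rebel's implementation: `module_name` and `alias_file` are hypothetical helpers, and the output uses plain OCaml syntax (no trailing semicolons) on the assumption that the generated file is a .ml file.

```ocaml
(* "A.re" -> "A": drop the directory and extension, then capitalise,
   since OCaml module names start with an uppercase letter. *)
let module_name (file : string) : string =
  String.capitalize_ascii (Filename.remove_extension (Filename.basename file))

(* One alias line per source file, e.g. "module A = Foo__A". *)
let alias_file ~(lib : string) (sources : string list) : string =
  let prefix = String.capitalize_ascii lib ^ "__" in
  sources
  |> List.map (fun file ->
         let m = module_name file in
         Printf.sprintf "module %s = %s%s\n" m prefix m)
  |> String.concat ""
```

Calling `alias_file ~lib:"foo" ["A.re"; "B.re"]` yields one alias line for A and one for B, each pointing at its `Foo__`-prefixed counterpart.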

We also declare the cma artifacts of third-party npm packages as deps here, so that all the third-party sources are compiled before the first-party source files are compiled.

Step 2: Compile sources

Since all the third-party packages are already compiled by this point, compiling a source file only requires compiling its first-party dependencies. Therefore we declare the first-party cmi artifacts as deps for this file, along with the module alias artifacts. Although it may seem that we are underspecifying deps by listing only cmi artifacts, this is sufficient: the next step specifies the cmo artifacts as deps, which makes sure all necessary artifacts are built.

For example, compiling A.re will emit Foo__A.cmi, Foo__A.cmo, and Foo__A.cmt as build artifacts. But we can still refer to A as A in the source files because we compile with the module alias file generated in the first step opened.
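The targets and deps of one compile rule can be written down as simple name computations. This sketch assumes the cmi/cmo/cmt artifact set and the `Foo__` prefix described above. The helper names are hypothetical, and which module alias artifacts are actually listed as deps is also an assumption.

```ocaml
(* "A.re" compiled inside library "foo" is emitted under the name "Foo__A". *)
let namespaced ~(lib : string) (file : string) : string =
  String.capitalize_ascii lib ^ "__"
  ^ String.capitalize_ascii (Filename.remove_extension (Filename.basename file))

(* Artifacts emitted when one source file is compiled. *)
let compile_targets ~lib file =
  List.map (fun ext -> namespaced ~lib file ^ ext) [ ".cmi"; ".cmo"; ".cmt" ]

(* Assumption: all three compiled forms of the alias file are listed as deps. *)
let alias_artifacts ~lib = [ lib ^ ".cmi"; lib ^ ".cmo"; lib ^ ".cmt" ]

(* Deps for that compilation: the source itself, the module alias artifacts,
   and only the .cmi of each first-party dependency. *)
let compile_deps ~lib file ~(first_party_deps : string list) =
  (file :: alias_artifacts ~lib)
  @ List.map (fun dep -> namespaced ~lib dep ^ ".cmi") first_party_deps
```

For A.re depending on B.re in library foo, the deps contain Foo__B.cmi but not Foo__B.cmo, which mirrors the "cmi is sufficient here" argument above.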

Step 3: Linking

If the source files belong to a third-party package, then we generate a cma artifact. A cma is a library file for the current library that bundles up all of the lib's first-party compiled sources. This way, it's much easier, at the end, at the top level, to include each library's cma file to compile the final executable than to tediously pass every single source file from every lib in order.

There's a caveat though. We said we're only bundling up the current lib's first-party code. We could have bundled up its third-party deps' cmas too, but then we might run into a duplicate artifact problem caused by e.g. libraries A and B both requiring and bundling C. So we can only bundle first-party code, and then, at the top, take all the transitive dependencies' (libraries') cmas, figure out their relative order, and pass them in that order to the top-level linking step. Still tedious, but at least we're not passing individual source files in order.
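Figuring out the relative order of the transitive libraries amounts to a topological sort in which each library appears once. The sketch below uses a depth-first post-order traversal. `deps_of` is a hypothetical lookup from a library name to the libraries it requires, the graph is assumed to be acyclic, and this is not necessarily the algorithm rebel uses.

```ocaml
(* Order libraries so that every library comes after all of its dependencies.
   Assumes the dependency graph has no cycles. *)
let link_order (deps_of : string -> string list) (roots : string list)
    : string list =
  (* [ordered] is accumulated in reverse. The [List.mem] check keeps a
     library shared by several dependents from being listed twice. *)
  let rec visit ordered lib =
    if List.mem lib ordered then ordered
    else
      let ordered = List.fold_left visit ordered (deps_of lib) in
      lib :: ordered
  in
  List.rev (List.fold_left visit [] roots)

(* The caveat's example: libraries a and b both require c. *)
let example_libs = function
  | "a" | "b" -> [ "c" ]
  | _ -> []

(* c comes first and appears only once. *)
let example_cmas =
  List.map (fun lib -> lib ^ ".cma") (link_order example_libs [ "a"; "b" ])
```

Because each cma holds only its own library's first-party code, listing c.cma once ahead of a.cma and b.cma avoids the duplicate artifact problem described above.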

If the source files belong to the top-level source, then we generate the app.out binary executable by passing the topologically sorted first-party cmo artifacts and the cma artifacts of third-party npm packages.