Let's Build a Cargo Compatible Build Tool - Part 1


This is the first in an educational multipart series about how build tools like Cargo work under the hood, by building our own from the ground up.


Have you ever wondered how a build tool like Cargo works? Have you ever wondered about all the things it takes care of for you and the complexity that underlies it? We'll answer those questions and more as we build Freight from the ground up. This will be a long series, built over time, with the goal of being educational. We'll go through most of the commits I make over in the main repository and iterate on the code as we go along. We're not necessarily concerned with "the perfect code" as it's expected that we'll throw most of it away over time. We're concerned with getting it working now, acknowledging the shortcomings, and then coming back to fix them later. Freight, the tool we will build, is not meant to be a replacement for Cargo. Fracturing the ecosystem is explicitly not a goal. We want to understand how Cargo works, or any build/packaging tool for that matter, regardless of language. We are of course going to use Rust for this series since we’re building a Cargo compatible tool, but a lot of the same underlying principles apply elsewhere, especially when we get to resolving version dependencies.

We’re also going to set some constraints for ourselves:

  1. Freight must be able to build itself at every commit
  2. We will not use Cargo at all to build Freight

Back when Rust was at 1.0 a lot of the libraries we use these days like hyper, serde, or clap, either didn’t really exist or were in their early stages. Most of the time we only had access to the standard library, and that itself was smaller compared to today. We aren’t going all the way back to 2015 for this series, but the above constraints mean that until we can get Freight to pull dependencies from crates.io we’re going to need to build a lot of things by hand. That gives us an opportunity to explore just how good std is and learn about all kinds of things like parsing, rolling our own HTTP requests, and more, just to get to that point.

Come along, there’s a lot to learn and build.


Setting up our repo

A build tool itself needs something to build it. We’re not going to implement a compiler, and we’re not going to limit ourselves when it comes to other programs we can use. If you want to follow along, you’ll need just, git, and rustc at 1.70 or greater somewhere on your $PATH. We’ll be running all of this on Linux. We’ll try to make sure it also works on Windows and macOS, but that's less of a guarantee in these early days while we're still adding functionality. Optionally, if you use nix and direnv, you can use the flake.nix and .envrc below to get these automatically.

Let’s first create a folder called freight somewhere on our computer

❯ mkdir freight

We’ll then go inside the directory and initialize an empty git repo

❯ cd freight

❯ git init
Initialized empty Git repository in /home/michael/freight/.git/

If you want to use nix with flakes and direnv then also add these files. It’s not necessary for local development, but note that when we set up our CI we’ll use these files for it. Here’s the .envrc file:

use flake

and for our flake.nix file this will set us up with all of the tools we need:

{
  description = "Learn how Cargo and other build tools/package managers work under the hood by building one";

  inputs = {
    nixpkgs.url      = "github:NixOS/nixpkgs/nixos-unstable";
    rust-overlay.url = "github:oxalica/rust-overlay";
    flake-utils.url  = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, rust-overlay, flake-utils, ... }:
    flake-utils.lib.eachDefaultSystem (system:
      let
        overlays = [
          (import rust-overlay)
        ];
        pkgs = import nixpkgs {
          inherit system overlays;
        };
      in
      with pkgs;
      {
        devShells.default = mkShell {
          buildInputs = [
            git
            just
            (rust-bin.stable."1.70.0".default.override {
              extensions = [ "rust-src" "rust-analyzer" ];
              targets = ["x86_64-unknown-linux-gnu"];
            })
          ];
          RUST_SRC_PATH = "${rust-bin.stable."1.70.0".default}/lib/rustlib/src/rust/library";
        };
      }
    );
}

If you're using nix-direnv you'll need to run 'direnv allow' in order for it to load the flake environment for you. Now let’s also set up the .gitignore so we don’t accidentally commit files we don’t mean to.

/target
.direnv

Let's also add our main.rs and lib.rs files. We want both: main.rs will act as the CLI binary and do things like parse arguments to run subcommands, while lib.rs will contain all of the code used to actually build our programs and will be imported as a library. The other benefit of this is that if we did eventually publish the code to crates.io, people could use the freight library in their own code without needing to shell out to our binary. Let's get them set up by first creating the src directory and then the files inside of it:

❯ mkdir src

❯ touch src/lib.rs

❯ echo "fn main() {}" > src/main.rs

With this we're all set to begin.


Who builds the build tool?

One of the first things we need to do is figure out what series of commands we need to run in order to actually build freight. Normally with Rust code we'd do a lot of this setup with cargo init and then we could just call cargo build or cargo run. Cargo would figure out from the config file it generated which rustc commands it needed to invoke, and in what order, and then build the program for you and run it. We're starting from scratch though, so let's see what we can do with just rustc and try to compile lib.rs:

❯ rustc src/lib.rs
error[E0601]: `main` function not found in crate `lib`
  |
  = note: consider adding a `main` function to `src/lib.rs`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0601`.

Weird, shouldn't rustc know that lib.rs is a library and not a binary? Surprisingly, no! Treating src/lib.rs as a library is actually a convention set by cargo. We need to somehow tell rustc this is a library, not a binary. Let's try this again with the --crate-type flag set to lib:

❯ rustc --crate-type=lib src/lib.rs

❯ ls
liblib.rlib  src

This works! lib creates an rlib file which is a Rust library (thus rlib) that can be linked into other Rust programs. If we said that this was cdylib instead we'd create a '.so' file on Linux or the equivalent dynamic library file type on other platforms like so:

❯ rustc --crate-type=cdylib src/lib.rs

❯ ls
liblib.rlib  liblib.so  src
💡
Quick Recap
- rustc defaults to building a binary with a given Rust file as input
- rustc can build other types of files like dynamic libraries or Rust libraries with the --crate-type flag

We know now that for the lib.rs file we want to use '--crate-type=lib' and for the main.rs file we can omit the flag or set it to '--crate-type=bin'. Let's clean up our artifacts from building lib.rs:

❯ rm liblib.rlib liblib.so

Now you might be wondering why it defaulted to liblib as a name. rustc will just use whatever the file name is and tack on lib to the front if it's a library. If we want this to be the library called freight we need to change the invocation a bit:

❯ rustc --crate-type=lib --crate-name=freight src/lib.rs

❯ ls
libfreight.rlib  src
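The naming rule described above can be sketched in a few lines of Rust. This is only an illustration of what we observed in the ls output on Linux, not how rustc is implemented, and default_artifact_name is a hypothetical helper that is not part of Freight:

```rust
use std::path::Path;

/// Hypothetical helper (not part of Freight): guess the file name rustc
/// emits on Linux when no `--crate-name` is given.
fn default_artifact_name(source: &Path, crate_type: &str) -> Option<String> {
    // Without `--crate-name`, the name falls back to the source file's stem,
    // e.g. "lib" for src/lib.rs and "main" for src/main.rs.
    let stem = source.file_stem()?.to_str()?;
    let name = match crate_type {
        // Executables keep the bare name.
        "bin" => stem.to_string(),
        // Rust libraries get a `lib` prefix and an `.rlib` extension.
        "lib" | "rlib" => format!("lib{stem}.rlib"),
        // Assumption: Linux naming. Other platforms use other extensions.
        "cdylib" => format!("lib{stem}.so"),
        // Crate types we haven't looked at yet are left out on purpose.
        _ => return None,
    };
    Some(name)
}
```

Passing src/lib.rs with "lib" gives liblib.rlib, which is exactly the odd name we saw before adding --crate-name=freight.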

Great, we now actually have the library using the correct name. Let's get the binary in order as well. We'll compile src/main.rs with what we've learned above:

❯ rustc --crate-type=bin --crate-name=freight src/main.rs

❯ ls
freight  libfreight.rlib  src

We now have an executable! It doesn't do anything yet, but so far rustc hasn't complained which is great. Let's print out something to make sure it works. Open up main.rs in your favorite editor and add the following 'println!' statement:

fn main() {
    println!("Hello from freight!");
}

Then compile and run the code:

❯ rustc --crate-type=bin --crate-name=freight src/main.rs

❯ ./freight
Hello from freight!

Awesome. I think we have this mostly figured out then. Now we just need to link the library to the binary. Let's think about the order of operations here. We can have a library on its own, but a binary that uses another library or dependency can't be built until that library or dependency is built. That means we'll need to first build the library and then link it. Let's try that. First let's add the library to main.rs:

use freight; // This line is new!

fn main() {
    println!("Hello from freight!");
}

Now let's build the library then our binary:

❯ rustc --crate-type=lib --crate-name=freight src/lib.rs

❯ rustc --crate-type=bin --crate-name=freight --extern=freight src/main.rs
error[E0432]: unresolved import `freight`
 --> src/main.rs:1:5
  |
1 | use freight;
  |     ^^^^^^^ no `freight` in the root

error: aborting due to previous error

For more information about this error, try `rustc --explain E0432`.

We've told rustc to use an external library called freight, but it doesn't seem to work. Let's try it again but we'll list this directory as the spot for it to look up the crate.

❯ rustc --crate-type=bin --crate-name=freight --extern=freight -L . src/main.rs
error[E0432]: unresolved import `freight`
 --> src/main.rs:1:5
  |
1 | use freight;
  |     ^^^^^^^ no `freight` in the root

error: aborting due to previous error

For more information about this error, try `rustc --explain E0432`.

We're having the same problem! The library is definitely there, so why isn't it working? The problem here is editions, and it's impossible to tell that from the error message. Let's change main.rs to this temporarily:

extern crate freight;

fn main() {
    println!("Hello from freight!");
}

and then run the same command as before:

❯ rustc --crate-type=bin --crate-name=freight --extern=freight -L . src/main.rs

❯ ./freight
Hello from freight!

This time it worked, and that's because in the 2015 edition of Rust we had to declare every single external crate that we wanted to use. This was quite cumbersome, because writing something like 'use serde::Deserialize;' already implied you were using the crate serde, unless serde was a module of your own that you had declared with 'mod serde;'. Because of this, the 2018 edition removed the need to write 'extern crate serde;'. Tools like cargo would make sure to link the libraries and all would be fine. We have a lot of great improvements in the 2018 and 2021 editions of Rust so we want to make sure we use one of those instead. rustc defaults to the 2015 edition for compatibility reasons, but it's 2023 so let's live a little and use the 2021 edition in our code. Let's reset everything and try again. First we return main.rs to this again:

use freight;

fn main() {
    println!("Hello from freight!");
}

and then we run the compile code with the edition option now:

❯ rustc --crate-type=lib --crate-name=freight --edition=2021 src/lib.rs

❯ rustc --crate-type=bin --crate-name=freight --edition=2021 --extern=freight -L . src/main.rs
warning: unused import: `freight`
 --> src/main.rs:1:5
  |
1 | use freight;
  |     ^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: 1 warning emitted

❯ ./freight
Hello from freight!

The warning is expected since we do nothing with the crate, but this indicates we've successfully linked it! Now personally if I had to write this out every time just to compile our build tool, I would be exhausted. Instead let's write out a justfile so we can use just to do it for us. just is a great alternative to make for things like this, especially for newer projects. So let's write one:

run: build
  ./freight
build:
  # Build crate dependencies
  rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight
  # Create the executable
  rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight -L . --extern freight

First we write out the run recipe. It's also our default, since just runs the first recipe in the file when none is named. It depends on the build recipe, so build runs first. Once build is done, just executes the commands listed under run, which in this case runs the freight binary. We then have the build recipe. It has some comments to say what each part does, followed by the commands we've been building up. So let's just run it now:

❯ just
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight -L . --extern freight
warning: unused import: `freight`
 --> src/main.rs:1:5
  |
1 | use freight;
  |     ^^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: 1 warning emitted

./freight
Hello from freight!

As we can see it prints out each line it's doing as well as the output from the command it runs. We've automated all of that work so that we don't have to keep doing it over and over again. Now we just need to handle the fact that we're dumping all of these build artifacts into our top level directory. Let's instead output them into the target folder. We'll adjust the justfile to look like this instead:

run: build
  ./target/bootstrap/freight
build:
  mkdir -p target/bootstrap
  # Build crate dependencies
  rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight \
    --out-dir=target/bootstrap
  # Create the executable
  rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight -L target/bootstrap \
    --extern freight --out-dir=target/bootstrap

We'll get to why we're calling it bootstrap later, but for now we've set rustc to point to that directory. If you run just again you should get the same output as earlier. This time, however, it is using the binary found in ./target/bootstrap instead. Now is also a good time to clean up any build artifacts from earlier so as not to pollute the top level directory if you've not done so already:

❯ rm freight libfreight.rlib

We now have a way to build and run our code, no Cargo needed. Now let's make it a bit more useful by having it compile itself.


Bootstrapping One's Build

With our justfile we have the first piece of our bootstrapping process. We want to build our code by hand, use that build to build itself, and then run the resulting binary to show that it built itself successfully. This is how rustc works as well, since it's a fully bootstrapped compiler: it uses a previous known working version of itself to build itself. There's a talk from last year's RustConf that goes into way more depth if you're curious about it.

🤔
Did you know that Rust was originally written in OCaml? A fun quirk of this is that Rust defines '\n' as '\n' and it only knows that '\n' is '0x0a' because of the OCaml compiler which had defined '\n' as '0x0a'. The Rust compiler has a completely benign trusting trust attack in it!

For our purposes though we'll want to use the justfile to create a binary, then use that binary to build itself, and then we run the new one. We'll use bootstrapping stages for now, so that we can visually see we have compiled a new version, but eventually we'll remove these once we get to argument parsing and run different commands to show this.

Let’s first fill out our main.rs file a bit:

use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    #[cfg(not(stage1))]
    {
        // We'll fill this part in later
    }
    #[cfg(stage1)]
    {
        println!("Bootstrapped successfully!");
    }
    Ok(())
}

As you can see we have two code blocks, each annotated with a cfg attribute. When we call rustc we can pass in values with the --cfg flag. When it starts compiling the code, if it sees a cfg attribute or the cfg! macro it will check whether the condition inside of it evaluates to true. The cfg! macro produces a boolean of true or false that you can use in the code with an if statement. The attribute either compiles in the code or function it’s attached to if the condition is true, or leaves it out completely if false. In our code above, if we pass the flag ‘--cfg stage1’ to rustc then only the println! statement will actually be compiled in; everything we put in the other block will not show up in the final binary. This also applies in reverse if the flag is not set. With this we can put the code to compile itself into the first block, then when we run that binary and it builds a second, different one, we can run that one to print out that it bootstrapped. Long term though we’ll move to building one binary via the justfile, building a new binary with the one we just made, and then running that new binary to make sure it all worked out just fine.
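To make the difference between the attribute and the macro concrete, here is a minimal sketch. The banner and stage_name functions are made up for illustration and are not part of Freight. The only thing assumed from our setup is the stage1 name we pass via --cfg:

```rust
// Attribute form: only one of these two functions is compiled into the
// binary, depending on whether `--cfg stage1` was passed to rustc.
#[cfg(stage1)]
fn banner() -> &'static str {
    "Bootstrapped successfully!"
}

#[cfg(not(stage1))]
fn banner() -> &'static str {
    "stage1 has not been built yet"
}

// Macro form: `cfg!` expands to a plain `true` or `false`, so both branches
// are compiled and type-checked and the choice is an ordinary `if`.
fn stage_name() -> &'static str {
    if cfg!(stage1) {
        "stage1"
    } else {
        "stage0"
    }
}
```

Compiled without any --cfg flag, stage_name returns "stage0" and the second banner is the one that exists. Compiled with '--cfg stage1', both flip.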

What we want to do now is abstract over rustc with Rust code. We want to produce a nice interface over us shelling out to it via ‘std::process::Command’. It should feel like Rust code even if we’re not running Rust functions. Let’s get started by opening up lib.rs and creating a struct to hold all of the options and flags we’ll possibly want to set to compile our binary:

use std::path::PathBuf;

pub struct Rustc {
    edition: Edition,
    crate_type: CrateType,
    crate_name: String,
    out_dir: PathBuf,
    lib_dir: PathBuf,
    cfg: Vec<String>,
    externs: Vec<String>,
}

pub enum Edition {
    E2015,
    E2018,
    E2021,
}

pub enum CrateType {
    Bin,
    Lib,
    RLib,
    DyLib,
    CDyLib,
    StaticLib,
    ProcMacro,
}

We have added Edition and CrateType types to represent the possible options for the ‘--edition’ and ‘--crate-type’ flags that we saw earlier in our justfile. It’s much easier to work with enums rather than strings here. However, when we shell out to rustc we’ll need to convert these to strings somehow. Let’s implement ‘Display’ for these types, which also gives us ‘ToString’ for free:

use std::path::PathBuf;
use std::fmt;           // This is new
use std::fmt::Display;  // This is also new

// struct Rustc Code abridged here

pub enum Edition {
    E2015,
    E2018,
    E2021,
}

impl Display for Edition {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let edition = match self {
            Self::E2015 => "2015",
            Self::E2018 => "2018",
            Self::E2021 => "2021",
        };
        write!(f, "{edition}")
    }
}

pub enum CrateType {
    Bin,
    Lib,
    RLib,
    DyLib,
    CDyLib,
    StaticLib,
    ProcMacro,
}

impl Display for CrateType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let crate_type = match self {
            Self::Bin => "bin",
            Self::Lib => "lib",
            Self::RLib => "rlib",
            Self::DyLib => "dylib",
            Self::CDyLib => "cdylib",
            Self::StaticLib => "staticlib",
            Self::ProcMacro => "proc-macro",
        };
        write!(f, "{crate_type}")
    }
}

These translate the types to the strings that rustc expects, but let us utilize types that are easier to work with and use for things like pattern matching when writing Rust code.
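As a quick illustration of what implementing Display buys us, here is a trimmed-down, standalone copy of the Edition enum. The edition_flag helper is hypothetical and only there to show that format! and to_string both go through the same impl:

```rust
use std::fmt;

// Trimmed-down copy of the enum above so this snippet stands alone.
enum Edition {
    E2018,
    E2021,
}

impl fmt::Display for Edition {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let edition = match self {
            Self::E2018 => "2018",
            Self::E2021 => "2021",
        };
        write!(f, "{edition}")
    }
}

/// Hypothetical helper: `format!` uses the `Display` impl directly.
fn edition_flag(edition: &Edition) -> String {
    format!("--edition={edition}")
}
```

Because the standard library implements ToString for every type that implements Display, Edition::E2021.to_string() works without any extra code on our side.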

Great, now let’s write out the code that lets us translate the Rustc struct into a rustc command that’s executed:

use std::error::Error;     // This is new
use std::fmt;
use std::fmt::Display;
use std::path::PathBuf;
use std::process::Command; // This is new

pub struct Rustc {
    edition: Edition,
    crate_type: CrateType,
    crate_name: String,
    out_dir: PathBuf,
    lib_dir: PathBuf,
    cfg: Vec<String>,
    externs: Vec<String>,
}

impl Rustc {
    pub fn run(self, path: &str) -> Result<(), Box<dyn Error>> {
        Command::new("rustc")
            .arg(path)
            .arg("--edition")
            .arg(self.edition.to_string())
            .arg("--crate-type")
            .arg(self.crate_type.to_string())
            .arg("--crate-name")
            .arg(self.crate_name)
            .arg("--out-dir")
            .arg(self.out_dir)
            .arg("-L")
            .arg(self.lib_dir)
            .args(
                self.externs
                    .into_iter()
                    .map(|r#extern| ["--extern".into(), r#extern])
                    .flatten(),
            )
            .args(
                self.cfg
                    .into_iter()
                    .map(|cfg| ["--cfg".into(), cfg])
                    .flatten(),
            )
            .spawn()?
            .wait()?;


        Ok(())
    }
}

// Rest of the file abridged

With the code above we can spawn a new invocation of rustc, have its output dumped to the screen (it inherits the parent’s stdout and stderr by default), and wait for it to complete before we execute anything else. One shortcoming to come back to later: wait() only returns an error if waiting itself fails, so run still returns Ok(()) when rustc exits with a failing status. Most of this is pretty straightforward in that we state a specific rustc flag and then the next argument is the input for that flag. Where it’s a little different is for ‘--extern’ and ‘--cfg’. These flags can be passed multiple times. You can have many dependencies for a program and each one of those will need its own ‘--extern’ so that it gets linked in. The same goes for ‘--cfg’ as you could specify many config flags for a program. In our case, for a while at least, it will be just one, but we should plan ahead here. The other benefit is that if the collection passed to args is empty, then it won’t affect the command being run. It’s not going to output blank arguments or anything weird that could mess up the command. What the code does is that given an array like this for externs:

["freight", "serde", "clap"]

It turns it to an array of arrays like this:

[["--extern", "freight"], ["--extern", "serde"], ["--extern", "clap"]]

Which then flattens out to just this, which is the form that ‘args()’ is expecting:

["--extern", "freight", "--extern", "serde", "--extern", "clap"]

The same thing happens with ‘--cfg’ as well!
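The map-then-flatten step can be tried out on its own. The sketch below uses a hypothetical flag_pairs helper (not part of Freight) and flat_map, which does the map and the flatten in one call. It behaves the same as the two separate calls in run:

```rust
/// Hypothetical helper: put `flag` in front of every value, producing the
/// flat argument list that `Command::args` expects.
fn flag_pairs(flag: &str, values: Vec<String>) -> Vec<String> {
    values
        .into_iter()
        // Each value becomes a two-element array, and `flat_map` splices
        // those arrays into one continuous sequence.
        .flat_map(|value| [flag.to_string(), value])
        .collect()
}
```

Calling flag_pairs("--extern", ...) with freight and serde yields --extern, freight, --extern, serde in that order, and an empty input yields an empty list, so nothing extra ends up on the command line.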

Lastly we need to actually have a way to populate options for Rustc. We’ll be using the builder pattern in Rust for this: we create a struct to set all the options we want, using functions as safeguards against improper configuration, and then call a final function to create the struct we actually want with all of the fields set. This isn’t the most robust implementation of that, but it’s good enough for what we want to accomplish with this code and we can always make it more flexible and easy to use later on. Right now our goal is to just get things to compile.

We have no way to actually make a Rustc struct. Let's fix that by adding a builder function and a RustcBuilder type.

impl Rustc {
    pub fn builder() -> RustcBuilder {
        RustcBuilder {
            ..Default::default()
        }
    }
    // Abridged impl
}

#[derive(Default)]
pub struct RustcBuilder {
    edition: Option<Edition>,
    crate_type: Option<CrateType>,
    crate_name: Option<String>,
    out_dir: Option<PathBuf>,
    lib_dir: Option<PathBuf>,
    cfg: Vec<String>,
    externs: Vec<String>,
}

With the builder pattern we start with a struct that has all of the same fields, but now they all start out empty. We could even provide defaults that the user could override. In our case we want the user to set all the fields that are required. Now we just have to write out all the functions. We don't want the user to set the values directly, so that we can enforce the invariants we want at runtime, such as "a crate name must be set" or "a crate type must be set". It's mostly a lot of boilerplate where we take in a value, set the field to that value, and return self. This lets us have an API where all the function calls are chained together, which we'll see when we finally use the API.

#[derive(Default)]
pub struct RustcBuilder {
    edition: Option<Edition>,
    crate_type: Option<CrateType>,
    crate_name: Option<String>,
    out_dir: Option<PathBuf>,
    lib_dir: Option<PathBuf>,
    cfg: Vec<String>,
    externs: Vec<String>,
}

impl RustcBuilder {
    pub fn edition(mut self, edition: Edition) -> Self {
        self.edition = Some(edition);
        self
    }
    pub fn out_dir(mut self, out_dir: PathBuf) -> Self {
        self.out_dir = Some(out_dir);
        self
    }
    pub fn lib_dir(mut self, lib_dir: PathBuf) -> Self {
        self.lib_dir = Some(lib_dir);
        self
    }
    pub fn crate_name(mut self, crate_name: String) -> Self {
        self.crate_name = Some(crate_name);
        self
    }
    pub fn crate_type(mut self, crate_type: CrateType) -> Self {
        self.crate_type = Some(crate_type);
        self
    }
    pub fn cfg(mut self, cfg: String) -> Self {
        self.cfg.push(cfg);
        self
    }
    pub fn externs(mut self, r#extern: String) -> Self {
        self.externs.push(r#extern);
        self
    }

    pub fn done(self) -> Rustc {
        Rustc {
            edition: self.edition.unwrap_or(Edition::E2015),
            crate_type: self.crate_type.expect("Crate type given"),
            crate_name: self.crate_name.expect("Crate name given"),
            out_dir: self.out_dir.expect("Out dir given"),
            lib_dir: self.lib_dir.expect("Lib dir given"),
            cfg: self.cfg,
            externs: self.externs,
        }
    }
}

The important bit here is that the 'done' function creates a Rustc value and panics if a required field was never set! The one exception is the edition, which falls back to 2015, the same default rustc itself uses. With that we can now open up our main.rs file again and actually call the code we just wrote:

#[cfg(not(stage1))]
use freight::CrateType;
#[cfg(not(stage1))]
use freight::Edition;
#[cfg(not(stage1))]
use freight::Rustc;
#[cfg(not(stage1))]
use std::fs;
#[cfg(not(stage1))]
use std::path::PathBuf;
#[cfg(not(stage1))]
const BOOTSTRAP_STAGE1: &str = "bootstrap_stage1";

use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    #[cfg(not(stage1))]
    {
        let target_dir = PathBuf::from("target");
        let bootstrap_dir = target_dir.join(BOOTSTRAP_STAGE1);
        fs::create_dir_all(&bootstrap_dir)?;
        Rustc::builder()
            .edition(Edition::E2021)
            .crate_type(CrateType::Lib)
            .crate_name("freight".into())
            .out_dir(bootstrap_dir.clone())
            .lib_dir(bootstrap_dir.clone())
            .cfg("stage1".into())
            .done()
            .run("src/lib.rs")?;
        Rustc::builder()
            .edition(Edition::E2021)
            .crate_type(CrateType::Bin)
            .crate_name("freight_stage1".into())
            .out_dir(bootstrap_dir.clone())
            .lib_dir(bootstrap_dir)
            .cfg("stage1".into())
            .externs("freight".into())
            .done()
            .run("src/main.rs")?;

        println!("Completed Stage1 Build");
    }
    #[cfg(stage1)]
    {
        println!("Bootstrapped successfully!");
    }
    Ok(())
}

As you can see, we now have it set to compile and run. Of course, it's important to note that we've hardcoded quite a few things and made some assumptions about the project, such as "we have a 'src/main.rs' and 'src/lib.rs' file". This is okay though. For now freight is only going to be able to compile itself, it's sure not going to be able to compile any other project. It's important that at every commit freight can build itself. In a sense we want things to be reproducible. Further down the line, we'll want to be able to compile the current commit with the previous commit to provide a chain of continuity. cargo 1.69, for instance, can compile cargo 1.70. Eventually we want something similar as well.

Now the above won't work just yet because we want to separate things out a bit. We need to adjust our justfile to have multiple bootstrap stages and also for us to run the proper binaries in the right order to make sure things work out. So let's do that and change things up a bit:

run: build
  ./target/bootstrap_stage0/freight_stage0
  ./target/bootstrap_stage1/freight_stage1
build:
  mkdir -p target/bootstrap_stage0
  # Build crate dependencies
  rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight \
    --out-dir=target/bootstrap_stage0
  # Create the executable
  rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight_stage0 \
    --out-dir=target/bootstrap_stage0 -L target/bootstrap_stage0 --extern freight

With this we can now bootstrap freight for the first time. If we run it with just we should see output like this:

❯ just
mkdir -p target/bootstrap_stage0
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight --out-dir=target/bootstrap_stage0
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight_stage0 --out-dir=target/bootstrap_stage0 -L target/bootstrap_stage0 --extern freight
./target/bootstrap_stage0/freight_stage0
Completed Stage1 Build
./target/bootstrap_stage1/freight_stage1
Bootstrapped successfully!

We did it! We'll commit the code and we can now move forward! The first crack at the builder pattern wasn't the most ergonomic here. We require owned types like String or PathBuf to be passed in to the functions, when we would be fine with things like &str being used for the functions that take a String or a PathBuf. We can do this by changing these functions to take any type that implements Into<String> or Into<PathBuf>.

Let's open up lib.rs again and change the function signatures for RustcBuilder:

    pub fn out_dir(mut self, out_dir: impl Into<PathBuf>) -> Self {
        self.out_dir = Some(out_dir.into());
        self
    }
    pub fn lib_dir(mut self, lib_dir: impl Into<PathBuf>) -> Self {
        self.lib_dir = Some(lib_dir.into());
        self
    }
    pub fn crate_name(mut self, crate_name: impl Into<String>) -> Self {
        self.crate_name = Some(crate_name.into());
        self
    }
    pub fn crate_type(mut self, crate_type: CrateType) -> Self {
        self.crate_type = Some(crate_type);
        self
    }
    pub fn cfg(mut self, cfg: impl Into<String>) -> Self {
        self.cfg.push(cfg.into());
        self
    }
    pub fn externs(mut self, r#extern: impl Into<String>) -> Self {
        self.externs.push(r#extern.into());
        self
    }

As you can see, these functions now take in a generic type that implements the trait we need, and we call into inside of the RustcBuilder functions. This means we can now clean up the calls over in main.rs and remove the into calls there:

        Rustc::builder()
            .edition(Edition::E2021)
            .crate_type(CrateType::Lib)
            .crate_name("freight")            // into() removed
            .out_dir(bootstrap_dir.clone())
            .lib_dir(bootstrap_dir.clone())
            .cfg("stage1")                    // into() removed
            .done()
            .run("src/lib.rs")?;
        Rustc::builder()
            .edition(Edition::E2021)
            .crate_type(CrateType::Bin)
            .crate_name("freight_stage1")    // into() removed
            .out_dir(bootstrap_dir.clone())
            .lib_dir(bootstrap_dir)
            .cfg("stage1")                   // into() removed
            .externs("freight")              // into() removed
            .done()
            .run("src/main.rs")?;
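To see which argument types the new signatures accept, here is a small standalone sketch. The as_path_buf and as_string functions are hypothetical stand-ins for builder methods like out_dir and crate_name. They are not part of Freight:

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for a builder method such as `out_dir`.
fn as_path_buf(path: impl Into<PathBuf>) -> PathBuf {
    // The caller hands us anything convertible, and we convert once here.
    path.into()
}

/// Hypothetical stand-in for a builder method such as `crate_name`.
fn as_string(name: impl Into<String>) -> String {
    name.into()
}
```

A string literal, an owned String, and a PathBuf can all be passed to as_path_buf, and the caller no longer needs to write into() at each call site. The conversion still happens, it just moved inside the function.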

If we call 'just' again we should see everything works exactly the same:

❯ just
mkdir -p target/bootstrap_stage0
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight --out-dir=target/bootstrap_stage0
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight_stage0 --out-dir=target/bootstrap_stage0 -L target/bootstrap_stage0 --extern freight
./target/bootstrap_stage0/freight_stage0
Completed Stage1 Build
./target/bootstrap_stage1/freight_stage1
Bootstrapped successfully!

Let's commit this code and keep doing a few more bits of cleanup. In main.rs we have quite a few '#[cfg(not(stage1))]' attributes with the imports. This makes it hard to read since most of those imports are only needed for the first code block. We can fix this by moving them into the code block with the same attribute in main.rs:

use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    #[cfg(not(stage1))]
    {
        use freight::CrateType;
        use freight::Edition;
        use freight::Rustc;
        use std::fs;
        use std::path::PathBuf;
        const BOOTSTRAP_STAGE1: &str = "bootstrap_stage1";
        // abridged code
    }
    #[cfg(stage1)]
    {
        println!("Bootstrapped successfully!");
    }

    Ok(())
}

With this we still won't get any warnings about unused code. Let's commit this and move on to our next goal, which is to get some basic arg parsing in! We can't use crates like clap so we're going to have to do this the old-fashioned way, by hand and with match statements.

We want to print out a nice and helpful message if someone runs 'freight help'. First let's adjust the justfile so that our stage1 build calls help:

  ./target/bootstrap_stage1/freight_stage1 help

and now in main.rs we want to change our stage1 block to be more than just a 'println!' statement, to this:

    #[cfg(stage1)]
    {
        use std::env;
        use std::process;

        const HELP: &str = "\
          Alternative for Cargo\n\n\
          Usage: freight [COMMAND] [OPTIONS]\n\n\
          Commands:\n    \
              help    Print out this message
        ";

        let mut args = env::args().skip(1);
        match args.next().as_ref().map(String::as_str) {
            Some("help") => println!("{HELP}"),
            _ => {
                println!("Unsupported command");
                println!("{HELP}");

                process::exit(1);
            }
        }
    }

We create a constant for the help message that we will expand over time as we add commands such as build and test, but for now we only have help. As for the args, we skip over the first one because that's just the name of the executable being run; we only care about what comes after. Since args is an Iterator, we call next to get the next argument and then match on it. If we get a value of help we print out the help. For anything else, including no argument at all, we still print out the help, but we exit with a failure code so that the shell, and by extension the user, knows it did not work. If we call just again we can see this works!

❯ just
mkdir -p target/bootstrap_stage0
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight --out-dir=target/bootstrap_stage0
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight_stage0 --out-dir=target/bootstrap_stage0 -L target/bootstrap_stage0 --extern freight
./target/bootstrap_stage0/freight_stage0
Completed Stage1 Build
./target/bootstrap_stage1/freight_stage1 help
Alternative for Cargo

Usage: freight [COMMAND] [OPTIONS]

Commands:
    help    Print out this message

It's a bit fragile, for instance if there are no args freight says "Unsupported command", but for our purposes this is fine. freight is nowhere near being usable for much and we can always come back and clean things up like we did earlier. Let's commit this code and move on to the penultimate part: making it compile itself with a build command and no stages.
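As an aside on that fragility, one way to tell "no command" apart from "unknown command" is to pull the match into a small function that returns an enum. This is only a sketch of the idea: the Cmd enum and the parse function are hypothetical names for illustration and are not part of freight.

```rust
use std::env;

// Hypothetical enum: one variant per outcome of looking at the first argument.
#[derive(Debug, PartialEq)]
enum Cmd {
    Help,
    Missing,
    Unknown(String),
}

// Takes the first argument (if any) and classifies it.
fn parse(arg: Option<&str>) -> Cmd {
    match arg {
        Some("help") => Cmd::Help,
        Some(other) => Cmd::Unknown(other.to_string()),
        None => Cmd::Missing,
    }
}

fn main() {
    // `nth(1)` skips the executable name and takes the first real argument.
    let first = env::args().nth(1);
    match parse(first.as_deref()) {
        Cmd::Help => println!("help requested"),
        Cmd::Missing => println!("no command given"),
        Cmd::Unknown(name) => println!("unsupported command: {name}"),
    }
}
```

Keeping the classification separate from the printing also makes it possible to test the parsing without spawning a process.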

First off let's change the justfile again. We're going to remove all of the '_stage1' and '_stage0' suffixes and have the final binary be run from a different directory.

run: build
  ./target/bootstrap/freight build
  ./target/debug/freight help
build:
  rm -rf target
  mkdir -p target/bootstrap
  # Build crate dependencies
  rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight \
    --out-dir=target/bootstrap
  # Create the executable
  rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight \
    --out-dir=target/bootstrap -L target/bootstrap --extern freight

With this any bootstrapping things will be in the bootstrap folder and contained there, while everything else will go in 'target/debug' like you'd expect when running cargo. Right now we're not too concerned with following cargo 1:1 here, just trying to get it to work and we can separate things onto the proper paths as time goes on. Let's change main.rs and make it a lot smaller:

use std::env;          // This is new
use std::error::Error;
use std::process;      // This is new

fn main() -> Result<(), Box<dyn Error>> {
    // We've updated the help string here to have a build command
    const HELP: &str = "\
          Alternative for Cargo\n\n\
          Usage: freight [COMMAND] [OPTIONS]\n\n\
          Commands:\n    \
              build    Build a Freight or Cargo project\n    \
              help     Print out this message
        ";

    let mut args = env::args().skip(1);
    match args.next().as_ref().map(String::as_str) {
        Some("build") => freight::build()?, // This is new
        Some("help") => println!("{HELP}"),
        _ => {
            println!("Unsupported command");
            println!("{HELP}");

            process::exit(1);
        }
    }

    // main returns a Result, so it has to end with a value of that type
    Ok(())
}

Mostly we cut out the attributes and all of the Rustc code. The main thing here is that we call the build function from our library, which we haven't written yet. We should do that! Open up lib.rs and add these imports:

use std::env;              // This is new
use std::error::Error;
use std::fmt;
use std::fmt::Display;
use std::fs;               // This is new
use std::path::PathBuf;
use std::process::Command;

Rather than writing out 'Result<(), Box<dyn Error>>' all the time it would be more convenient to write 'Result<()>' and imply the error type. Over time we'll add better error handling, but that day is not today. Let's add a type alias just below the imports. Note we need to specify the full path to Result in std, so that we don't have type name conflicts.

use std::env;
use std::error::Error;
use std::fmt;
use std::fmt::Display;
use std::fs;
use std::path::PathBuf;
use std::process::Command;

pub type Result<T> = std::result::Result<T, Box<dyn Error>>;
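To show what the alias buys us, here is a small self-contained sketch. The edition_year function is a hypothetical helper made up for this example; the point is that both a standard library error and a plain string can flow into the same 'Box<dyn Error>' through ? and into:

```rust
use std::error::Error;

type Result<T> = std::result::Result<T, Box<dyn Error>>;

// Hypothetical helper: parse an edition year and check it against
// the editions this sketch knows about.
fn edition_year(s: &str) -> Result<u32> {
    // A `ParseIntError` is converted into `Box<dyn Error>` by `?`.
    let year: u32 = s.trim().parse()?;
    match year {
        2015 | 2018 | 2021 => Ok(year),
        // A `String` converts into `Box<dyn Error>` via `into`.
        _ => Err(format!("unknown edition: {year}").into()),
    }
}

fn main() -> Result<()> {
    println!("{}", edition_year("2021")?);
    if let Err(e) = edition_year("1999") {
        println!("error: {e}");
    }
    Ok(())
}
```

Boxing every error loses the concrete type, which is the "better error handling" we are deferring, but it keeps signatures short while the code is still changing a lot.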

We can now change the run function to look like this instead.

    pub fn run(self, path: &str) -> Result<()> {

Now we make a lot of assumptions about where freight is run, namely that it's in the top level directory of a project and nowhere else. If you called it in a subfolder nothing would work. We can partly fix that by finding the root of our project and deducing paths based on where the root is. We just need some way to find out what that is. Cargo does this by looking upwards for a Cargo.toml file, but we don't have that. What we do have is a '.git' folder for our repo that only exists in the root, so we can use that to find the root directory. Let's create a new function root_dir that figures out where the root of our project is, so we can then work out where things like main.rs and lib.rs are. We'll put it just above the Rustc struct:

fn root_dir() -> Result<PathBuf> {
    let current_dir = env::current_dir()?;
    for ancestor in current_dir.ancestors() {
        if ancestor.join(".git").exists() {
            return Ok(ancestor.into());
        }
    }
    Err("No root dir".into())
}

pub struct Rustc {
    edition: Edition,
    crate_type: CrateType,
    crate_name: String,
    out_dir: PathBuf,
    lib_dir: PathBuf,
    cfg: Vec<String>,
    externs: Vec<String>,
}
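If you want to poke at the upward search on its own, the following sketch generalizes it so the starting directory and the marker name are parameters. find_root and the '.demo_marker' name are hypothetical and exist only for this example; freight's root_dir above is the real thing.

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Hypothetical generalization of `root_dir`: start directory and marker are parameters.
fn find_root(start: &Path, marker: &str) -> Option<PathBuf> {
    // `ancestors` yields `start` itself first, then each parent in turn.
    start
        .ancestors()
        .find(|dir| dir.join(marker).exists())
        .map(Path::to_path_buf)
}

fn main() -> std::io::Result<()> {
    // Build a throwaway tree: <tmp>/freight_root_demo_<pid>/{.demo_marker, src/nested}
    let root = std::env::temp_dir().join(format!("freight_root_demo_{}", std::process::id()));
    let nested = root.join("src").join("nested");
    fs::create_dir_all(&nested)?;
    fs::create_dir_all(root.join(".demo_marker"))?;

    // Searching from two levels down still finds the directory holding the marker.
    println!("{:?}", find_root(&nested, ".demo_marker"));

    fs::remove_dir_all(&root)?;
    Ok(())
}
```

Taking the start path as an argument, rather than reading the current directory inside the function, is what makes the search easy to exercise against a temporary directory.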

We ask what directory we're in and then iterate through its ancestors, which include the current directory itself, checking whether a '.git' folder exists in each one. If it does, we return that path. If it's never found then we assume we're not in a freight project and return an error. All that's left is to actually write the build function. We'll put it just below our type alias for Result:

pub type Result<T> = std::result::Result<T, Box<dyn Error>>;

pub fn build() -> Result<()> {
    let root_dir = root_dir()?;
    // TODO: Get this from a config file
    let crate_name = root_dir
        .file_name()
        .ok_or::<Box<dyn Error>>("Freight run in directory without a name".into())?;

    let lib_rs = root_dir.join("src").join("lib.rs");
    let main_rs = root_dir.join("src").join("main.rs");
    let target = root_dir.join("target");
    let target_debug = target.join("debug");
    fs::create_dir_all(&target_debug)?;

This is the opening of our function. We get the root_dir and the name of the crate by assuming it's the directory name. This will change when we actually have a config file we can use. We now use the root directory to create all of the paths that we'll need, such as where lib.rs and main.rs are as well as target/debug. Note we're using 'join' here to be platform agnostic and not assume what the path separator is, even if '/' works on most platforms these days. Next up we want to create two closures we can call to build the binary and the library:

    let lib_compile = || -> Result<()> {
        println!("Compiling lib.rs");
        Rustc::builder()
            .edition(Edition::E2021)
            .crate_type(CrateType::Lib)
            .crate_name(crate_name.to_str().unwrap())
            .out_dir(target_debug.clone())
            .lib_dir(target_debug.clone())
            .done()
            .run(lib_rs.to_str().unwrap())?;
        println!("Compiling lib.rs -- Done");
        Ok(())
    };

    let bin_compile = |externs: Vec<&str>| -> Result<()> {
        println!("Compiling main.rs");
        let mut builder = Rustc::builder()
            .edition(Edition::E2021)
            .crate_type(CrateType::Bin)
            .crate_name(crate_name.to_str().unwrap())
            .out_dir(target_debug.clone())
            .lib_dir(target_debug.clone());

        for ex in externs {
            builder = builder.externs(ex);
        }

        builder.done().run(main_rs.to_str().unwrap())?;
        println!("Compiling main.rs -- Done");
        Ok(())
    };

There are a few things worth noting here. For one, you can actually use ? in a closure. If you did not know this before, you just need to specify the return type to be a Result, which I've used to great effect before to make a poor man's try/catch in Rust. We have also added a bit of logging to know when everything is compiled. Additionally, we no longer call Rustc with purely hard-coded values; they are generated programmatically (with a few hard-coded assumptions), which is a big step up from where we were earlier. Now we just need to write a bit of logic to know when to call each closure:
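A minimal sketch of that decision, assuming we choose what to compile based on which of src/lib.rs and src/main.rs exist. The function compile_what_exists, its parameters, and the error message are hypothetical names for illustration; inside build the same match could be written directly against 'lib_rs.exists()' and 'main_rs.exists()':

```rust
use std::cell::RefCell;
use std::error::Error;

type Result<T> = std::result::Result<T, Box<dyn Error>>;

// Hypothetical helper: pick which closures to call from which sources exist.
fn compile_what_exists(
    crate_name: &str,
    has_lib: bool,
    has_main: bool,
    lib_compile: impl Fn() -> Result<()>,
    bin_compile: impl Fn(Vec<&str>) -> Result<()>,
) -> Result<()> {
    match (has_lib, has_main) {
        // Both exist: build the library first, then link it into the binary.
        (true, true) => {
            lib_compile()?;
            bin_compile(vec![crate_name])?;
        }
        (true, false) => lib_compile()?,
        (false, true) => bin_compile(vec![])?,
        (false, false) => return Err("There is nothing to compile".into()),
    }
    Ok(())
}

fn main() -> Result<()> {
    // Stand-in closures that record calls instead of invoking rustc.
    let log = RefCell::new(Vec::new());
    let lib_compile = || -> Result<()> {
        log.borrow_mut().push(String::from("lib"));
        Ok(())
    };
    let bin_compile = |externs: Vec<&str>| -> Result<()> {
        log.borrow_mut().push(format!("bin {externs:?}"));
        Ok(())
    };

    compile_what_exists("freight", true, true, lib_compile, bin_compile)?;
    println!("{:?}", log.borrow());
    Ok(())
}
```

Matching on a tuple of booleans makes the compiler check that all four combinations are handled, including the case where there is nothing to build.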

We can now handle any Rust code with freight (so long as it's a program with no dependencies that requires Rust 2021 and depends on no other Cargo features). Let's fire up the just command again and watch it work its magic!

❯ just
rm -rf target
mkdir -p target/bootstrap
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight --out-dir=target/bootstrap
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight --out-dir=target/bootstrap -L target/bootstrap --extern freight
./target/bootstrap/freight build
Compiling lib.rs
Compiling lib.rs -- Done
Compiling main.rs
Compiling main.rs -- Done
./target/debug/freight help
Alternative for Cargo

Usage: freight [COMMAND] [OPTIONS]

Commands:
    build    Build a Freight or Cargo project
    help     Print out this message

IT WORKS! With this we can be reasonably confident that freight is doing what we expect it to at this stage. Let's commit this code and add one more thing: CI. I'm using GitHub and thus GitHub Actions here, so you might need to adapt this to your needs, especially if you don't want to use nix (which is valid). The following is the file '.github/workflows/ci.yaml' which is relative to the project root:

name: CI
on: push
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3

    - name: "Setup environment"
      uses: JRMurr/direnv-nix-action@v4.1.0

    - name: "Build and run Freight"
      run: "just run"
      shell: bash

It's pretty simple: we check out the project, set up the environment using our nix flake to get just and Rust, and then we call just run. If that works then we say that CI has passed. Let's commit the file and push everything to GitHub. We should see that CI will pass since we've made sure it works locally at every step up to this point. We will still run these tests locally, but we also now have CI to verify that and let everyone know it's working as well. In our case we want every commit on the main branch to pass CI every time and to be buildable and bootstrappable. With that we're done with part 1.


Conclusion

There's still more to come, such as adding configuration support and adding tests, but this post is long enough as it is and this is a pretty good stopping point now that freight can build itself. Do expect to see more in the future. Also if you liked what you read and want to work with me, I'm available for hire or contracting. It should also be noted that for freight I will not be accepting external contributions or issues for the time being, as this is something I want to craft specifically for this series. If you want to read something similar, I've also done a deep dive on how async executors work in Rust by building a small executor called whorl.