Let's Build a Cargo Compatible Build Tool - Part 2


This is the second post in an educational multipart series about how build tools like Cargo work under the hood, by building our own from the ground up. You can find part 1 here.


Welcome back! Last time we managed to get freight to build itself a bit more dynamically rather than through hardcoded values, and we got CI set up so that all of the code would be tested and pass. Today we're going to add some basic configuration for Freight as well as some logging and a few small housekeeping things. This post will probably be a bit shorter than the previous one. In part 3 we'll be adding testing and that's going to be a big one. Let's get to it.

Hooking into Git

Git has some really nice features and one of them is githooks. If you're unfamiliar with the feature, you can add scripts to git that will be executed at various points in git's lifecycle. For instance, you can have a script that runs after you check something out or just before you push. This lets you run anything you want. It's just a script after all. As such, git doesn't install hooks from a repo you clone into your .git folder. Security-wise it would be bad if people could just run arbitrary code on your machine. We therefore can't use hooks as our CI, but we can use them ourselves to make sure, before we commit anything, that everything would pass CI. CI in this case is there to show the world we haven't broken anything. With githooks we can do all the testing locally, work on one branch (for now), and push directly without worrying that we broke anything.

Since hooks aren't shipped around in the .git folder and we want to keep track of things, we're going to create a folder to hold all of our githooks, and a small script to symlink the files into the appropriate location so that they are always set to point to what we currently have checked out and edited.

Let's get started with that then by running this command in the top level of freight:

❯ mkdir hooks
# We should now have a layout that looks something like this:
❯ ls
flake.lock  flake.nix  hooks  justfile  LICENSE  README.md  src  target

Let's now create two files in hooks: link.sh and pre-commit.sh. We also want to make sure that we mark them as executable on Unix.

❯ touch hooks/link.sh hooks/pre-commit.sh

❯ chmod +x hooks/link.sh

❯ chmod +x hooks/pre-commit.sh

The link.sh file will be what symlinks all of our hooks to the correct directory. It's a very small script. I like having this script because for whatever reason, knowing the correct way to run ln with the right paths almost never works out the first time for me.

#!/bin/sh
ln -s "$(pwd)/hooks/pre-commit.sh" "$(pwd)/.git/hooks/pre-commit"

You can of course change this to fit your system/needs. The important part is that git looks for executable hooks in .git/hooks and with the correct name for the lifecycle part, such as pre-commit or post-checkout with no suffix. Git doesn't care what program or file type it is so long as it has the right name and it's executable. You could have this be a compiled Rust program or a Ruby script. It's all the same to git. In our case we'll just be using sh. In order to be more cross platform compatible we might want to consider something else, but that's a hard problem in and of itself.
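If we ever did want something more portable than sh, one option would be to do the linking from Rust itself. Here's a rough sketch of what link.sh does, written as a function. It's not part of freight: the name link_pre_commit is made up for illustration, it's Unix-only as written, and unlike a bare ln -s it replaces an existing link so it can be re-run.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not in freight): symlink `<repo>/hooks/pre-commit.sh`
/// to `<repo>/.git/hooks/pre-commit`, the way link.sh does.
#[cfg(unix)]
pub fn link_pre_commit(repo_root: &Path) -> io::Result<()> {
    // Resolve to an absolute path, like `$(pwd)` in link.sh, so the link
    // still resolves when git follows it from inside .git/hooks.
    let root = fs::canonicalize(repo_root)?;
    let source = root.join("hooks").join("pre-commit.sh");
    let link = root.join(".git").join("hooks").join("pre-commit");

    // Remove a link left over from a previous run so we can recreate it.
    if link.symlink_metadata().is_ok() {
        fs::remove_file(&link)?;
    }
    std::os::unix::fs::symlink(&source, &link)
}
```

The script still has to be marked executable, exactly as with the sh version.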

Now let's get our actual pre-commit hook setup by editing hooks/pre-commit.sh to look like this:

#!/bin/sh
just run

Our justfile from before handles most of the dirty work; we're just using this script to execute it for us every time we want to commit our changes. We can see this in action by running link.sh and then adding and committing all of the files.

❯ ./hooks/link.sh

❯ git add .

❯ git commit
rm -rf target
mkdir -p target/bootstrap
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight --out-dir=target/bootstrap
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight --out-dir=target/bootstrap -L target/bootstrap --extern freight
./target/bootstrap/freight build
Compiling lib.rs
Compiling lib.rs -- Done
Compiling main.rs
Compiling main.rs -- Done
./target/debug/freight help
Alternative for Cargo

Usage: freight [COMMAND] [OPTIONS]

Commands:
    build    Build a Freight or Cargo project
    help     Print out this message

[main 8db3b4d] Add pre-commit hook to Freight
 2 files changed, 4 insertions(+)
 create mode 100755 hooks/link.sh
 create mode 100755 hooks/pre-commit.sh
Add pre-commit hook to Freight · mgattozzi/freight@8db3b4d
In order to make sure I always have a working version of freight locally before pushing I've added a local check so that on every commit it will run the justfile like in CI to make sure it work…

The new code runs our justfile now before any commit and if it fails it won't let us run a commit at all! Now we can be more confident that we won't commit broken code by accident. We can now start working on configuring our code with just a bit more confidence.

Configs, Configs, Configs, why aren't programmers Profigs?

Up to this point freight has made a few assumptions about things like the name of the program based solely on the name of the folder. This might not work, though, if say someone uses git worktrees and gives the worktree's folder a different name. It would be better if we instead had some kind of configuration file. We'll be making a Freight.toml file, and it will eventually resemble a Cargo.toml file. For now we'll start small, and as we need more features and configuration options we'll add them to the file and improve the parsing. Let's get started then by creating Freight.toml in the root and filling it with the following:

name = "freight"
edition = "2021"

We want to be able to specify the name and edition for our code just like with cargo, and this is the first step towards that. Before we can start passing this to our code to use when compiling, however, we need to be able to parse this file. Let's create a new file, src/config.rs. It's going to house all of our parsing code for our configuration. Now you might be thinking that we need to change the justfile to handle the fact that we have a new file, but we do not! Once we list it as a module in src/lib.rs, rustc will know how to take care of the rest and will look for the file just like you'd expect when using cargo and Rust modules. It's a compiler feature, not a cargo feature!

Let's open up our new file and start adding some imports:

use super::Edition;
use super::Result;
use std::error::Error;
use std::fs;
use std::path::Path;

You'll notice that we're using super here, which means "let me use the parent module's code". Since the parent of config is the crate root, we could have also used crate here and it would have worked as well. We also have a few std imports that we will need, but overall it will be a small file.

Below that let's add our new Manifest type which will hold all of the configuration options for us when parsed:

pub struct Manifest {
    pub crate_name: String,
    pub edition: Edition,
}

Since the crate name can be anything it makes sense for it to be a String, but when it comes to editions we have a type for that already! We can just parse it into that type directly so that we can use it where we would expect to in our code. Now let's actually write the parsing code and look at it in chunks. Just below our Manifest type we'll start with this:

impl Manifest {
    pub fn parse_from_file(path: impl AsRef<Path>) -> Result<Self> {
        let path = path.as_ref();
        let mut crate_name = None;
        let mut edition = None;

We're using a function to parse a file from a given path. In the future we might want to have the parsing into a type code be in a separate function so that users could either parse a blob of bytes they have directly, or open a file to do it via this method, all while sharing the parsing code. That day is not today and so we'll be sticking it all in one function.

We accept anything that we can call .as_ref() on to get a path, which gives us some flexibility with what can be passed in. We are also returning the Result type alias we made in the previous post, which is just Result<T, Box<dyn std::error::Error>>. With this as our function signature, we call .as_ref() on path so that we have a &Path stored in path, and create two new variables that we set to None. We will want to set these to Some(value) as we parse the file and then set the fields of the same name in Manifest to them. If they are still None we will return an error, as these fields must be set.
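To make the impl AsRef<Path> part a bit more concrete, here's a tiny sketch. The manifest_path and all_spellings functions are made up for this example and aren't in freight; they just show that one signature accepts a &str, a String, a &Path, or a PathBuf.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: where the manifest would live for a given root.
pub fn manifest_path(root: impl AsRef<Path>) -> PathBuf {
    // `.as_ref()` gives us a `&Path` no matter which type was passed in.
    root.as_ref().join("Freight.toml")
}

/// Calls the same function with four different argument types.
pub fn all_spellings() -> [PathBuf; 4] {
    [
        manifest_path("project"),
        manifest_path(String::from("project")),
        manifest_path(Path::new("project")),
        manifest_path(PathBuf::from("project")),
    ]
}
```

All four calls produce the same path, which is the flexibility the signature buys us.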

We can now add our parsing code which looks like this:

        for line in fs::read_to_string(path)?.lines() {
            let mut split = line.split('=');
            let field = split.next().unwrap().trim();
            let value = split.next().unwrap().trim();

            match field {
                "name" => crate_name = Some(value.replace('"', "")),
                "edition" => {
                    edition = Some(match value.replace('"', "").parse()? {
                        2015 => Edition::E2015,
                        2018 => Edition::E2018,
                        2021 => Edition::E2021,
                        edition => return Err(format!("Edition {edition} is unsupported").into()),
                    })
                }
                field => return Err(format!("Field {field} is unsupported").into()),
            }
        }

It's not necessarily the most efficient code, but it does what we need. We read the file into a String from the path we hand it and then iterate over each line one at a time. In TOML, keys and values are written as key = value. We aren't doing tables, arrays, or inline tables here, just a key and a string value. We can therefore make a few assumptions:

  1. We can call split('=') on every line to split it into a key and a value
  2. The first part (the key) will be a string
  3. The second part (the value) will be a string

We call trim on them since key=value or key = value or even key =value would be possibilities and we need to get rid of the possible extra spaces that would be there.

We now can actually figure out what to do. We can match on the field and run parsing code specific to that field. For name it's as simple as removing all of the " at the beginning and end of the word. Remember the file has "freight" in it and we want what's inside the " not the whole thing. We call replace and assign it to crate_name. The more complicated part is edition.

We also want to remove the " here, but now we also parse it into a number. "Why not a number then for that field?" Well, cargo uses a string and so will we. We parse into a number and match on it. We set edition to the proper Edition, and if there's a number we don't expect we return an error. We will also return an error for any key that we do not handle. Something like about or author would be an error for freight at this point in time.
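One thing to be aware of with the code above is that the second unwrap will panic on a blank line or on any line without an =. That's fine for our two line file, but if we wanted to be more forgiving, a sketch like the following would do it. Note that split_key_value is a hypothetical helper and not what freight uses; it relies on str::split_once and only trims the surrounding quotes rather than replacing every " in the value.

```rust
/// Hypothetical helper (not in freight): split one `key = "value"` line
/// into its trimmed key and its unquoted value.
/// Returns `None` for lines with no `=`, such as blank lines.
pub fn split_key_value(line: &str) -> Option<(&str, &str)> {
    // `split_once` splits at the first `=` only.
    let (key, value) = line.split_once('=')?;
    // Trim whitespace first, then strip the quotes around the value.
    Some((key.trim(), value.trim().trim_matches('"')))
}
```

The parsing loop could then skip a line when this returns None instead of panicking.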

We finally want to wrap up our code with actually returning a Manifest type:

        Ok(Self {
            crate_name: crate_name.ok_or::<Box<dyn Error>>("name is a required field".into())?,
            edition: edition.ok_or::<Box<dyn Error>>("edition is a required field".into())?,
        })
    }
}

We turn a Some value into an Ok or a None into an Err here and either return the error, or assign the value to the field and if all goes well we return the Manifest type with the parsed data. Now we just need to hook it into our lib.rs file. Let's open that up now and add this to the top of our file:

mod config; // This is new

use config::Manifest; // This is also new
use std::env;
use std::error::Error;
use std::fmt;

We now want to replace these lines in fn build:

    // TODO: Get this from a config file
    let crate_name = root_dir
        .file_name()
        .ok_or::<Box<dyn Error>>("Freight run in directory without a name".into())?;

With this:

    let manifest = Manifest::parse_from_file(root_dir.join("Freight.toml"))?;

With this we will now actually get a Manifest type and can use that for our crate names and editions. Now we can adjust our lib_compile and bin_compile closures in fn build. They both have a call to .edition() and .crate_name() in them so we can change these calls from these:

            .edition(Edition::E2021)
            // and this
            .crate_name(crate_name.to_str().unwrap())

Into this instead:

            .edition(manifest.edition)
            // and this
            .crate_name(&manifest.crate_name)

We also need to modify our call to bin_compile if the lib.rs file exists to this:

    match (lib_rs.exists(), main_rs.exists()) {
        (true, true) => {
            lib_compile()?;
            // The args to `vec!` changed here
            bin_compile(vec![&manifest.crate_name])?; 
        }

This way it uses the name provided from the config file rather than the crate_name variable we had from earlier. Two more changes before we're done here!

First, in fn root_dir we need to change the default. We were originally looking for .git. This was fine for freight since we are using git. However, if someone wanted to use mercurial or even darcs or pijul this would not work. We should now look for Freight.toml instead so that we can be VCS agnostic. It's a one line change:

    for ancestor in current_dir.ancestors() {
        // We switched .git to Freight.toml
        if ancestor.join("Freight.toml").exists() {
            return Ok(ancestor.into());
        }
    }
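The same search can be written as a small standalone function, which makes it easy to try out on its own. This is just a sketch: find_root and its marker parameter are names made up for this example, while freight's root_dir hardcodes Freight.toml and starts from the current directory.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: walk from `start` up through its parents and return
/// the first directory that contains a file or folder named `marker`.
pub fn find_root(start: &Path, marker: &str) -> Option<PathBuf> {
    start
        .ancestors() // yields `start`, then its parent, and so on
        .find(|dir| dir.join(marker).exists())
        .map(Path::to_path_buf)
}
```

Passing ".git" as the marker gives the old behavior and "Freight.toml" gives the new one.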
	

One last thing we need to do is derive Clone and Copy for Edition so that our code can compile. Notice we did not pass a reference from the edition field in Manifest to lib_compile or bin_compile in our changes above. We need it to be Copy for this to work and since it's an enum with no extra fields this is fine. With that we just need to add this:

#[derive(Clone, Copy)] // This is new
pub enum Edition {
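If it's not obvious why Copy matters here, this stripped down sketch shows the same shape of code. The types mirror the ones in the post, but takes_edition and use_twice are made up for illustration, and the Debug and PartialEq derives are only there so the result is easy to check. Passing manifest.edition by value out of a borrowed Manifest only compiles because Edition is Copy; remove the derive and the compiler complains about moving out of a borrow.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Edition {
    E2015,
    E2018,
    E2021,
}

pub struct Manifest {
    pub crate_name: String,
    pub edition: Edition,
}

/// Stand-in for a builder method that takes the edition by value.
fn takes_edition(edition: Edition) -> Edition {
    edition
}

/// Uses the field twice through a shared reference.
/// Each use copies the value; nothing is moved out of `manifest`.
pub fn use_twice(manifest: &Manifest) -> (Edition, Edition) {
    (
        takes_edition(manifest.edition),
        takes_edition(manifest.edition),
    )
}
```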

We can now run the justfile to make sure everything works before committing, or we can git add . for all these changes and run git commit with the githooks installed from earlier.

❯ just run
rm -rf target
mkdir -p target/bootstrap
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight --out-dir=target/bootstrap
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight --out-dir=target/bootstrap -L target/bootstrap --extern freight
./target/bootstrap/freight build
Compiling lib.rs
Compiling lib.rs -- Done
Compiling main.rs
Compiling main.rs -- Done
./target/debug/freight help
Alternative for Cargo

Usage: freight [COMMAND] [OPTIONS]

Commands:
    build    Build a Freight or Cargo project
    help     Print out this message

The output is what we expect so let's make the commit and move onto the next part.

Implement basic configuration w/Freight.toml file · mgattozzi/freight@b4d4c32
This commit adds the beginning of a configuration file for Freight. Eventually we'll make it more robust with parsing when we can add serde and the like as a dependency but we're a long way…

We now have an opportunity to clean up our parsing code for our configuration file. Up to this point we parsed the edition from a string in a match statement right where we needed it. While this is fine, it would be better to encapsulate this logic in case we ever need to use this functionality elsewhere or users of our library want to do so themselves. We can use the FromStr trait in the standard library for this! It will allow us to call Edition::from_str on a string when we want to turn it into an Edition. Let's add this by first importing the trait into src/lib.rs:

mod config;

use config::Manifest;
use std::env;
use std::error::Error;
use std::fmt;
use std::fmt::Display;
use std::fs;
use std::path::PathBuf;
use std::process::Command;
use std::str::FromStr; // This is new

While we're at it, let's make our errors a tad bit easier to work with and type out.

use std::str::FromStr;

pub type Result<T> = std::result::Result<T, BoxError>; // This is new
pub type BoxError = Box<dyn Error>;                    // This is new

We're creating two aliases:

  1. The first is an alias for the actual Result type that takes a type T and always uses our BoxError type for the error. Note that we have to use the full path std::result::Result here to avoid a name conflict with the alias we're defining, otherwise Rust wouldn't know which Result to use. It's also interesting to note that std::result::Result is part of the prelude, so there's effectively an implicit use std::result::Result; statement for every single Rust file. We just happen to be shadowing that with our alias here.
  2. The second alias is for BoxError which is just a Box<dyn std::error::Error>. This will let us use this instead of having to type it out all the time which can get tedious.

Aliases are really nice when we want to avoid typing out long type names that we need in many places, but we also don't want to make a new wrapper type to hold it. You'll find the Result alias is used in many different libraries as well where they use their own error type rather than BoxError. It's a pattern worth knowing about if you ever need to dive into someone else's code or docs.
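Here's a small, self-contained sketch of the two aliases in use. The parse_port function is made up purely for illustration and has nothing to do with freight; it's there to show that both a real error type (via ?) and a plain string (via .into()) end up as a BoxError.

```rust
use std::error::Error;

pub type Result<T> = std::result::Result<T, BoxError>;
pub type BoxError = Box<dyn Error>;

/// Hypothetical example function that returns our aliased Result.
pub fn parse_port(input: &str) -> Result<u16> {
    // `?` converts the ParseIntError into a BoxError for us.
    let port: u16 = input.trim().parse()?;
    if port == 0 {
        // A &str can be turned into a BoxError with `.into()`.
        return Err("port 0 is reserved".into());
    }
    Ok(port)
}
```

Callers only ever see Result<u16>, which is the whole point of the alias.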

From here we want to actually implement the FromStr trait. We'll put it below our Display impl for Edition:

impl Display for Edition {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let edition = match self {
            Self::E2015 => "2015",
            Self::E2018 => "2018",
            Self::E2021 => "2021",
        };
        write!(f, "{edition}")
    }
}

impl FromStr for Edition {
    type Err = BoxError;
    fn from_str(input: &str) -> Result<Self> {
        match input {
            "2015" => Ok(Self::E2015),
            "2018" => Ok(Self::E2018),
            "2021" => Ok(Self::E2021),
            edition => Err(format!("Edition {edition} is not supported").into()),
        }
    }
}

The trait requires us to specify an error type in case parsing fails. In this case we just say that we will use the BoxError alias we specified since it's what we use everywhere. The actual function returns our new Result alias with type Self which just means "the name of the type this was implemented for".
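A nice side effect of implementing FromStr is that str::parse starts working for our type too, since parse is defined in terms of FromStr. Here's a self-contained sketch; to keep it short it uses String as the error type instead of BoxError, and unquoted_edition is a helper name made up for the example.

```rust
use std::str::FromStr;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Edition {
    E2015,
    E2018,
    E2021,
}

impl FromStr for Edition {
    // Simplified for the sketch; the post uses BoxError here.
    type Err = String;

    fn from_str(input: &str) -> Result<Self, Self::Err> {
        match input {
            "2015" => Ok(Self::E2015),
            "2018" => Ok(Self::E2018),
            "2021" => Ok(Self::E2021),
            edition => Err(format!("Edition {edition} is not supported")),
        }
    }
}

/// Hypothetical helper: strip whitespace and quotes from a raw TOML value,
/// then let `parse` pick the FromStr impl from the return type.
pub fn unquoted_edition(raw: &str) -> Result<Edition, String> {
    raw.trim().trim_matches('"').parse()
}
```

Both Edition::from_str("2021") and "2021".parse::<Edition>() go through the same impl.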

The actual body is just a match statement like we had done before in src/config.rs. With this we can actually modify our config parsing code. Before we had this code in src/config.rs:

            match field {
                "name" => crate_name = Some(value.replace('"', "")),
                "edition" => {
                    edition = Some(match value.replace('"', "").parse()? {
                        2015 => Edition::E2015,
                        2018 => Edition::E2018,
                        2021 => Edition::E2021,
                        edition => return Err(format!("Edition {edition} is unsupported").into()),
                    })
                }
                field => return Err(format!("Field {field} is unsupported").into()),
            }

Now we can condense it to this instead:

            match field {
                "name" => crate_name = Some(value.replace('"', "")),
                "edition" => edition = Some(Edition::from_str(&value.replace('"', ""))?),
                field => return Err(format!("Field {field} is unsupported").into()),
            }
        }

Much more compact and easier to reason about. However, we can't forget that we need to import the trait so that we can use it:

use super::Edition;
use super::Result;
use std::fs;
use std::path::Path;
use std::str::FromStr; // This is new

While we're here we can also use our new error alias instead:

use super::BoxError; // This is new
use super::Edition;
use super::Result;
use std::fs;
use std::path::Path;
use std::str::FromStr;

For the value we return from parse_from_file we can now write this instead:

        Ok(Self {
            crate_name: crate_name.ok_or::<BoxError>("name is a required field".into())?,
            edition: edition.ok_or::<BoxError>("edition is a required field".into())?,
        })

It's not much, but it's fewer angle brackets to parse, which is always nice. Let's run the code to make sure it works and if it does commit it!

❯ just
rm -rf target
mkdir -p target/bootstrap
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight --out-dir=target/bootstrap
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight --out-dir=target/bootstrap -L target/bootstrap --extern freight
./target/bootstrap/freight build
Compiling lib.rs
Compiling lib.rs -- Done
Compiling main.rs
Compiling main.rs -- Done
./target/debug/freight help
Alternative for Cargo

Usage: freight [COMMAND] [OPTIONS]

Commands:
    build    Build a Freight or Cargo project
    help     Print out this message

It works as expected! Let's commit it.

Add FromStr impl for Edition and BoxError alias · mgattozzi/freight@5966b05
This commit moves the parsing code we did for Edition in the Manifest type to a FromStr impl so that we can instead use this in other places as well if need be and have the parsing code for Manifes…

Finally we're almost done for today. Up to this point we haven't done much in the way of logging. We've printed things out, but it would be better if we used something else to control how we log data. We're going to create a rudimentary Logger type that takes a lock to stdout and writes to it any data that we need. Let's first add a new logger module to lib.rs:

mod config;
mod logger; // This is new

Now we can create a new file at src/logger.rs that will hold our Logger type in it. We'll start with our import statements:

use std::io;
use std::io::Write;

It'll be a small file so the imports here aren't too large. Let's create our new Logger type:

pub struct Logger {
    out: io::StdoutLock<'static>,
}

In the future we might add an err field for stderr, but for now this is sufficient for our use cases. With that we can actually implement our new type. First up, a new function to make a Logger:

impl Logger {
    pub fn new() -> Self {
        Self {
            out: io::stdout().lock(),
        }
    }

We now want to make functions to encapsulate what we're doing. Right now we only log when we build a crate or a binary. We print that out without a newline and then when we're done we print out done next to it with a newline. In the future we might compile many crates at once and this method would cause a whole mess, but it works for our use cases for now! Let's get those three methods in then. One for crates, one for binaries, and one to say it's all done:

    pub fn compiling_crate(&mut self, crate_name: &str) {
        let _ = self
            .out
            .write_all(format!("Compiling crate {crate_name}...").as_bytes());
    }
    pub fn compiling_bin(&mut self, crate_name: &str) {
        let _ = self
            .out
            .write_all(format!("Compiling bin {crate_name}...").as_bytes());
    }
    pub fn done_compiling(&mut self) {
        let _ = self.out.write_all(b"Done\n");
    }
}

Now we're throwing out the error here if one happens, but it's not something we're too concerned with at this point in time. We can worry about io failing in the future. With this we can now utilize it in our code.
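If we did want to deal with io errors, and make the logger testable while we're at it, one possible direction is to make it generic over anything that implements Write and to hand the io::Result back to the caller. This is only a sketch and not what freight does: the generic parameter, the explicit flush, and into_inner are all additions made up for this example. The flush is there because Rust's stdout is line buffered, so a partial line may otherwise not show up until the newline is written.

```rust
use std::io::{self, Write};

/// Sketch of a Logger that writes to any `Write` implementor.
pub struct Logger<W: Write> {
    out: W,
}

impl<W: Write> Logger<W> {
    pub fn new(out: W) -> Self {
        Self { out }
    }

    pub fn compiling_crate(&mut self, crate_name: &str) -> io::Result<()> {
        write!(self.out, "Compiling crate {crate_name}...")?;
        // Push the partial line out now instead of waiting for the newline.
        self.out.flush()
    }

    pub fn done_compiling(&mut self) -> io::Result<()> {
        self.out.write_all(b"Done\n")
    }

    /// Hand back the writer, e.g. to inspect a Vec<u8> in a test.
    pub fn into_inner(self) -> W {
        self.out
    }
}
```

With this shape, Logger::new(io::stdout().lock()) behaves like the version in the post, while Logger::new(Vec::new()) captures the output in memory.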

Let's open lib.rs again and create a new Logger type by importing it and calling our new function:

use config::Manifest;
use logger::Logger; // This is new
use std::env;
use std::error::Error;
use std::fmt;
use std::fmt::Display;
use std::fs;
use std::path::PathBuf;
use std::process::Command;
use std::str::FromStr;

pub type Result<T> = std::result::Result<T, BoxError>;
pub type BoxError = Box<dyn Error>;

pub fn build() -> Result<()> {
    let mut logger = Logger::new(); // This is new

We can now modify our lib_compile closure to use the new Logger:

    // The function sig is new
    let lib_compile = |logger: &mut Logger| -> Result<()> {
        logger.compiling_crate(&manifest.crate_name); // This is new
        Rustc::builder()
            .edition(manifest.edition)
            .crate_type(CrateType::Lib)
            .crate_name(&manifest.crate_name)
            .out_dir(target_debug.clone())
            .lib_dir(target_debug.clone())
            .done()
            .run(lib_rs.to_str().unwrap())?;
        logger.done_compiling(); // This is also new
        Ok(())
    };

We then can do a similar thing with bin_compile:

    // The function sig is new
    let bin_compile = |logger: &mut Logger, externs: Vec<&str>| -> Result<()> {
        logger.compiling_bin(&manifest.crate_name); // This is new
        let mut builder = Rustc::builder()
            .edition(manifest.edition)
            .crate_type(CrateType::Bin)
            .crate_name(&manifest.crate_name)
            .out_dir(target_debug.clone())
            .lib_dir(target_debug.clone());
        for ex in externs {
            builder = builder.externs(ex);
        }

        builder.done().run(main_rs.to_str().unwrap())?;
        logger.done_compiling(); // This is new
        Ok(())
    };

Finally we need to get our match statement at the end fixed up a bit to use the new type:

    match (lib_rs.exists(), main_rs.exists()) {
        (true, true) => {
            lib_compile(&mut logger)?; // This is updated
            bin_compile(&mut logger, vec![&manifest.crate_name])?; // This is updated
        }
        (true, false) => {
            lib_compile(&mut logger)?; // This is updated
        }
        (false, true) => {
            bin_compile(&mut logger, vec![])?; // This is updated
        }
        (false, false) => return Err("There is nothing to compile".into()),
    }
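You might wonder why we pass the logger in as an argument instead of letting the closures capture it like they do with manifest. Both closures need mutable access, and two closures can't both hold a mutable borrow of the same variable at the same time, so capturing logger in each would be rejected by the borrow checker. Here's a cut down sketch of the pattern, with a stand-in Logger that records lines in a Vec rather than writing to stdout (an assumption made so the example is easy to check):

```rust
/// Stand-in for the real Logger; it just remembers what it was asked to log.
pub struct Logger {
    pub lines: Vec<String>,
}

/// Mirrors the shape of `build`: two closures share one logger by taking
/// it as a `&mut` parameter instead of capturing it.
pub fn build_both() -> Vec<String> {
    let mut logger = Logger { lines: Vec::new() };
    let name = String::from("freight");

    // `name` is only read, so both closures can capture it by shared reference.
    let lib_compile =
        |logger: &mut Logger| logger.lines.push(format!("Compiling crate {name}"));
    let bin_compile =
        |logger: &mut Logger| logger.lines.push(format!("Compiling bin {name}"));

    // Each call borrows `logger` mutably only for the duration of the call.
    lib_compile(&mut logger);
    bin_compile(&mut logger);
    logger.lines
}
```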
	

Let's give it a whirl and see what happens:

❯ just
rm -rf target
mkdir -p target/bootstrap
# Build crate dependencies
rustc src/lib.rs --edition 2021 --crate-type=lib --crate-name=freight --out-dir=target/bootstrap
# Create the executable
rustc src/main.rs --edition 2021 --crate-type=bin --crate-name=freight --out-dir=target/bootstrap -L target/bootstrap --extern freight
./target/bootstrap/freight build
Compiling crate freight...Done
Compiling bin freight...Done
./target/debug/freight help
Alternative for Cargo

Usage: freight [COMMAND] [OPTIONS]

Commands:
    build    Build a Freight or Cargo project
    help     Print out this message

We've slightly changed the output to use ... instead of ` -- ` and we can see this is a success. We now have a way to log things that we can expand on in the future.

Create the Logger type to log data to the terminal · mgattozzi/freight@2c1a862
cargo has some fancy ways to print out things to the terminal and show status. Eventually we'll get there as we add more things like dependencies etc. However, we have to start somewhere with o…

Conclusion

Overall we've made great progress in making the tool less fragile and more configurable. In our next post we'll take this a step further and introduce testing with Rust's built-in testing framework so that we can make sure things work the way we expect them to, and not just rely on "if it compiles it works". We can get away with that given the current size of freight, but not for much longer. Also if you liked what you read and want to work with me, I'm available for hire or contracting.