FFI with Haskell and Rust

I've had an idea percolating for a while. I love using Haskell for work.
It's functional, expressive, and easy to reason about. It's also
strongly typed and deals with immutable data (for the most part) which are
big pluses for writing good code. However, it's garbage collected,
and while it's not likely to run into performance issues for small
projects, on larger ones it can get in the way. Commonly with Haskell,
if one really needed speed, the FFI would be used with C code in order
to get the performance needed. The only problem with that is that C is
unsafe with undefined behavior being the norm. This is where Rust is
a great alternative C for those performance critical functions needed in
some Haskell problems. If you've ever used Haskell before you know that
the documentation is either outdated, unrelated to what you actually
want, or just plain missing. I spent hours digging through old Haskell and
Rust blog posts, stack overflow posts from 2009, GHC options, and god
knows how many compiler errors. Rather than having you dig through the depths
of the archaeological nightmare that is the Internet, I've decided to
write this for anyone trying to use Rust in Haskell. Let me be clear,
this article is not covering how to use Haskell in Rust. That's for some
other time.

Let's get started then.


This articles assumes you have the following installed:

  • rustc - at least 1.12 stable
  • cargo
  • cabal
  • GHC

Why not stack? It kept yelling at me for weird things. I like it but
this is a simple example and cabal will suffice. Besides stack is built
on top of cabal so you should already have it anyways.

Setting up our Rust Code

First let's setup our project directory. Initialize it with the
following command:

cargo new haskellrs

This well setup a folder haskellrs that's laid out like this:

├── Cargo.toml
└── src
    └── lib.rs

Before we start setting up our Haskell code as well let's get our Rust
code straightened out. First we need to modify our Cargo.toml file. When
you open it up it should look something like this:

name = "haskellrs"
version = "0.1.0"
authors = ["Michael Gattozzi <mgattozzi@gmail.com>"]


We need to modify it to look like this:

name = "haskellrs"
version = "0.1.0"
authors = ["Michael Gattozzi <mgattozzi@gmail.com>"]

name = "rusty"
crate-type = ["staticlib"]


If you've done FFI before you might have noticed I chose to statically
link this file rather than dynamically link it. I have yet to get an
example working with dylib or the new cdylib format for Rust and would love to
know how if you've done it before. The three new lines added let Rust
know a few things:

  1. The output file will be a .a file called librusty.a (The common
    convention for C code that's a library to start with lib and then the
    name of the project).
  2. This file can get statically linked into other programs because we've
    defined it to be a static lib.
  3. It knows how to setup the file differently with it's symbols for
    other programs. Rust normally uses .rlib files to link Rust code to
    other Rust code when you pull in external dependencies with Cargo. We
    don't want that since Haskell would have no idea how to understand that
    file format.

Okay so our project is setup! Let's start writing some Rust! Open up
your src/lib.rs file. It should look like this:

mod tests {
    fn it_works() {

We don't need this for our purposes so feel free to delete it. Instead
we're going to write two simple functions so that we can see how this
works. One with CStr and one with i32. First up we're going to write
a function called double_input that takes an i32 and doubles it. It
should look like this:

pub extern fn double_input(x: i32) -> i32 {
    2 * x

Let's go through each line:

  1. The #[no_mangle] tells the Rust compiler not to do anything weird
    with the symbols of this function when compiled because we need to be
    able to call it from other languages. This is needed if you plan on
    doing any FFI. Not doing so means you won't be able to reference it in
    other languages
  2. Now our actual function header. pub makes it available to use
    elsewhere. extern means this is externally available outside our
    library and tells the compiler to follow the C calling convention when
    compiling. If you don't know what that means, it has to do with how code
    leaves values available for functions that called it on the CPU level.
    You can find more information about it
    here if
    you are interested on learning more. We need to do this so Haskell knows
    how to treat the Rust code. For all intents and purposes Haskell thinks
    it's calling C compiled code and not Rust. Now we finish with the rest
    of the function. fn tells us this is a function we are declaring,
    double_input is the function name (x: i32) means it has an input x of
    type i32 and -> i32 means we are returning an i32.
  3. Our next line is simple take x and multiply it by 2. Since it's the
    last line in the Rust function by omitting the ; at the end we're
    telling the compiler return the value of the expression, much like how
    the last expression of a do block in Haskell returns a value.

Pretty simple right? The only difference between FFI code and Rust only code
is the extern and #[no_mangle] statements put into the function

Okay now let's write a function that prints Strings passed to it. Since
we're using FFI we need to use CStr so that both languages know how
to talk to the other. At the top of lib.rs add the following line:

use std::ffi::CStr;
use std::os::raw::c_char;

Now we're going to write our printing function:

pub extern fn print_string(x: *const c_char) {
  unsafe {
    let cstring = CStr::from_ptr(x);
    if let Ok(input) = cstring.to_str() {
      println!("{}", input);
    } else {
      panic!("Unable to print input");

Like before we write out our #[no_mangle] and pub extern so that the
function will be exported. We've also stated that our input is
a pointer to what we'll turn into a CStr. Because a CStr isn't like a Rust
String we need to turn it into one if we want it to be able to print. This is
why we use the to_str() function. However, this has the possibility of failure. If
the program making the CStr fails in making it properly then we won't
be able to turn it into a Rust String. This is why to_str() returns
a Result type. If it's translated fine we'll print out the input string
and if not we panic and cause the program to abort because it failed in
some manner. We also have to wrap it in an unsafe because we're messing
around with raw pointers. While this is classified as unsafe by Rust we
can be pretty sure that the Haskell code will generate a correct CString
for us to use so wrapping it in unsafe here is alright.

All right so we have our two functions we want to use in Haskell setup.
I want to talk about what we're going to be doing with the Haskell code
and some problems I ran into before actually writing this code.

A small digression

While we could do this in one file we're going to use two of them. One
containing the foreign functions imported from Rust and our main file that will
call them from the other module. Why? Because there's no documentation
about using foreign functions imported in one module to be used in
another and it was the most frustrating things to have to deal with. To
digress for a bit, I found the documentation for Rust FFI was alright but
not the best. I could figure out what I needed but it's out of date since
it lacks any examples with dylib/cdylib and using that in other
languages. It also assumes I only want to write and use it in C code. I don't
want to use C. That's why I use Rust. I want that embedded in other languages
not the other way around.

Haskell's documentation however was messy, disorganized, scattered across
various web pages and sites, and could use a lot of love in getting fixed up.
There's no good tutorials on dynamic vs statically linking libraries and how to
do it. It just said cabal and GHC manuals covered it without linking the relevant
pages. That's honestly pretty frustrating, it felt like saying that's an exercise
left to the reader with no basic knowledge to get started. Linking to it would have
saved me many hours of digging through the documentation. There's no well written
explanations of doing FFI. The information on how to do it is there, it's just
not good or easy to parse without time and effort. This is one of the times where,
no the types really can't speak for themselves, because it's not types
I'm dealing with. It's frustrating because I really do like Haskell as a language
but the lack of good documentation really hurts the language.

Alright. Enough about bad documentation. Let's actually code what
I spent forever figuring it out so you don't have to be as frustrated as
I was, and so I can contribute rather than just being angry about it.

Setting up our Haskell code

We're going to create two files in our src directory. Main.hs and FLib.hs. Here's what FLib.hs
should look like:

module FLib where

import Foreign.C.Types
import Foreign.C.String

foreign import ccall "double_input" doubleInput :: CInt -> CInt
foreign import ccall unsafe "print_string" printString :: CString -> IO ()

Here's what Main.hs looks like:

import FLib
import Foreign.C.String (newCString)

main :: IO ()
main = do
  putStrLn $ show $ doubleInput 3 --This will print 6
  -- Technically newCString doesn't need the \0 but I found that
  -- Rust was eating up the last character regardless of what it was
  -- so to make sure it works I just null terminated it to get the right
  -- output.
  str <- newCString "Hello World\0"
  printString str

Now we need to setup our cabal file so that this actually can work:

name:                haskellrs
build-type:          Simple
cabal-version:       >=1.10

executable haskellrs-exec
  main-is:             Main.hs
  hs-source-dirs:      src
  build-depends:       base >= 4.7 && < 5
  default-language:    Haskell2010
  other-modules:       FLib
  extra-libraries:     rusty, pthread
  extra-lib-dirs:      target/release

  hs-source-dirs:      src
  exposed-modules:     FLib
  other-extensions:    ForeignFunctionInterface
  build-depends:       base >= 4.7 && < 5
  default-language:    Haskell2010
  extra-libraries:     rusty, pthread
  extra-lib-dirs:      target/release

Pretty simple file right? Before we dive into the Haskell code let's get
this example working so you can see for yourself:

cargo build --release
cabal run

You should see the following printed out on your screen:

Hello World

You just ran Rust code inside of Haskell! Let's dissect our Haskell
code, so that we can understand how all of this works.

If we take a look at FLib we have our statements for importing code:

foreign import ccall "double_input" doubleInput :: CInt -> CInt
foreign import ccall unsafe "print_string" printString :: CString -> IO ()

foreign import ccall tells us that the function we're importing
follows the C calling convention and Haskell should treat it like that
in it's code. Anything in quotes is the name of the actual function
we're importing. The stuff after the quotes is the function header and
what we'll call the name in Haskell. For example our Rust function double_input
is being imported into Haskell as doubleInput and it takes a CInt and
returns a CInt.

Take a look at the second line though. Notice how I put unsafe? We need
this because of how CStrings work for one in Haskell (I couldn't get
anything to print if I didn't put unsafe) but two we are doing an IO
action. Unlike doubleInput this function isn't pure and we have side
effects (printing out a line) and so we need this to be in the IO monad
for it to work. Even if your input might be a CInt for another function
you write, if you implement any kind of IO with that function import it
with an unsafe call so that it's enforced that way in Haskell. Unlike
Rust, Haskell needs to keep track of it's IO actions so that it knows
how to execute things each time.

General rule of thumb:

  • Import pure functions as if they were safe
  • Import impure functions as unsafe

Okay now let's look at our Main.hs file

import FLib
import Foreign.C.String (newCString)

main :: IO ()
main = do
  putStrLn $ show $ doubleInput 3
  str <- newCString "Hello World\0"
  printString str

This is the code that actually executes our Rust Functions. First we
import our library that contains our Rust code and we import a function
to make CStrings.

Now we setup our main function. If you've not used Haskell before then
the syntax might look different but we'll go through each line so you
can understand what's going on:

main :: IO ()

This is our function declaration. Every executable has a main function
with a return type of of IO (). Essentially we're saying this program
has side effects (because we are running a program on a computer) and it
returns the unit type () upon completion wrapped up in the IO monad.
If that was complete gibberish then just know that IO is there to tell
the compiler we're doing stuff that affects the computer and it needs to
properly keep track of it.

main = do
  putStrLn $ show $ doubleInput 3

We've started our do block which is saying that all the actions after
this must be in a monad and the last statement returns the value of the
expected monad, in this case we're in the IO monad and the last
statement must return IO (). Our next line contains $. It's a way of
saying do everything to the right of me first. That line is equivalent
to writing:

(putStrLn (show (doubleInput 3)))

I just find that the $ is a bit cleaner to read. So what's happening
here is we're doubling the input 3. However we want to display this in
the terminal, so show converts the CInt into a String and
putStrLn actually prints it out to the console!

Next up we have this line:

str <- newCString "Hello World\0"

What we're saying is store the value of newCString into str. Why are we
doing this though? Couldn't we just do what we did with the other line?
Well no, see newCString returns an IO CString and what we want is
the value inside the IO which is a type CString. Our function
printString expects a CString not an IO CString. By doing this
step we can actually then use str as a CString!

That's why this line works:

  printString str

It then prints our string and returns an IO () like the do block
expects and so our code compiles! I've been hiding some of the magic of
how this works in the cabal file that contains some options and things
to get this working. Let's take a look at it again:

name:                haskellrs
build-type:          Simple
cabal-version:       >=1.10

executable haskellrs-exec
  main-is:             Main.hs
  hs-source-dirs:      src
  build-depends:       base >= 4.7 && < 5
  default-language:    Haskell2010
  other-modules:       FLib
  extra-libraries:     rusty, pthread
  extra-lib-dirs:      target/release

  hs-source-dirs:      src
  exposed-modules:     FLib
  other-extensions:    ForeignFunctionInterface
  build-depends:       base >= 4.7 && < 5
  default-language:    Haskell2010
  extra-libraries:     rusty, pthread
  extra-lib-dirs:      target/release

It's fairly straight forward. We've created an executable that imports
our FLib library (this is why it's listed under other-modules) and it
imports our rust library (remember we called it rusty). We've also stated
where rusty was located with the extra-lib-dirs that way Haskell could
link it in properly. Remember this means you've compiled the Rust
Library first otherwise the target/release folder won't exist at all. Now
we've also imported this library pthread. It was necessary to get the CString
example working. Our library entry is also pretty much the same but take
a look under other-extensions. It uses ForeignFunctionInterface. If we
didn't put this here we would have to put {-# LANGUAGE ForeignFunctionInterface #-} at the top of every Haskell file that did
an import call. By putting it here it implicitly does that for every
file in the library. Together these allow Haskell to call our Rust code
as expected. Seems like a lot but overall there aren't that many lines
of code needed to get it working!


If you want to take a look at the code I've put it up in a repository
here. This post covered how to
setup a project to use Rust inside of Haskell code. While this was
fairly simple to setup and do in terms of the overall code it still took
some work. It also covered the proper ways to call the code inside of
Rust. I've started a project to automate the process in
Rust and Haskell projects so the need to write all the boiler plate
yourself is unnecessary. It'll also provide mappings between the various
types in Rust and Haskell to make it easier to reason about. This will
take some time to get right but making this a seamless experience for
users would be great. I think Rust and Haskell compliment
each other well and making it easy to do that will be a boon to both

Thanks to Caleb Jones for making some minor
corrections to this article.