Rust, Haskell, FFI, and a Whole Lot of Linking


Back in 2016 I had an idea: what if I put Rust code inside Haskell? Not long after that I had another idea: what if I put Haskell code inside Rust? With that, the library curryrs (pronounced "couriers") was born, with the goal of making FFI between the two languages easy and letting the user spend less time mucking around with types and more time actually coding. However, this all broke when cargo changed its behavior regarding rpaths for dynamic linking some time around 2017. Like many of my projects, I let it languish because I just couldn't fix it at the time; I was still relatively junior as a coder at my job and was finishing up my CS degree. I attempted it again in 2018 and, for multiple reasons, failed to get it working again. curryrs languished for all of 2019, and by then I had archived the code. A lot of the failure came down to not understanding linking and not knowing enough about how computers work.

Then this year, in the midst of COVID-19, I got a brain itch. This is mostly what causes me to try wild ideas that I work on obsessively until I solve them, like when I put the Haskell runtime in Rust. Working on that edge, with really no documentation, cryptic clues, and no idea whether it can even be done, has driven me to do things just to prove I can. It reminds me of this line:

"As a systems hacker, you must be prepared to do savage things, unspeakable things, to kill runaway threads with your bare hands, to write directly to network ports using telnet and an old copy of an RFC that you found in the Vatican." That's from James Mickens' "The Night Watch", which is well worth your time to read.

The thing is, this is due to having ADHD, so this brain itch only comes around once in a while and I can't control when it happens. This is a huge reason I can't maintain projects as well as I wish I could. For once, though, I got the itch. Could I finally fix the code up to link properly and actually get it into a state where it would be usable today, rather than bitrot for all eternity in GitHub's Glacier Code Vault? Spoiler alert: yes.

I want to cover the actual solution a bit, some lessons learned around linking, and possible future directions for the library, including how it can act as both a library and a resource for doing FFI between the two, considering that the GHC docs all use C and the Rust docs also only really talk about C.

I live alone with my dog, so my brain at this point has melted into a pile of memetic mush and this is the only way I can communicate now.

Times are bad, so here's a picture of my dog Bukka

One of the things fundamental to my earlier failures was thinking I had to dynamically link the GHC libs and Rust libs together. I thought .so/.dll/.dylib files were the only things GHC distributed (that wasn't actually the case), and it confused me: why wouldn't it run? Why did I have to use LD_LIBRARY_PATH? That is not at all an ergonomic solution, especially given how many GHC libs you have to link in just to get the runtime to run. So I tried everything I could think of, to no avail, and let it die in 2017.

Then in 2018 I figured, well, why not static linking? Which, oof, was its own trip: modifying GHC makefiles probably older than I am to make all of its libraries static, then trying to see if I could just distribute those artifacts somehow, but oh wait, there are multiple operating systems and platforms, so I'll just ship the modified repo with the Rust library and have people build the patched libs, but oh wait has a file size limit of 10 MB, so uh, what if I made a Haskell minifier, and it was at that point that I decided maybe I should just let this project bitrot.

The other day, when I fixed this, I again tried to get dynamic linking to work, to no avail. What I noticed this time was that the GHC libs ship with both static (.a files) and dynamic versions. From this ancient blog post it seemed that GHC libs would statically link by default, so I should just be able to use that fact and statically link the libs in via the file. Haskell in Rust was the primary breakage due to dynamic linking, so this was a godsend to figure out. I didn't want to go the obsessive mad scientist route again. A couple of hours later I had brought the code into Rust 2018 form (fairly effortless given the age of the code, which used try!() and predated ?) and had it statically linking, and it worked again! If you're curious about the exact changes, I recommend looking at the commit here, particularly the changes around There was a lot of reformatting because, well, rustfmt really didn't work well when the library was made and cargo fmt just wasn't a thing, and just typing this sentence alone makes me feel old.
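For anyone who hasn't told cargo to link a static archive before, here is a minimal sketch of the general idea, assuming it is done from a Cargo build script. This is not the actual curryrs code: the directory and the library names (HSrts, HSbase) are hypothetical placeholders, and the helper functions are made up for illustration. The only real API involved is the pair of `cargo:rustc-link-search` and `cargo:rustc-link-lib` directives that a build script prints to stdout.

```rust
// build.rs-style sketch. The path and library names below are hypothetical
// placeholders, not the real values curryrs uses.

/// Directive telling rustc which directory to search for native libraries.
fn link_search_directive(dir: &str) -> String {
    format!("cargo:rustc-link-search=native={}", dir)
}

/// Directive asking rustc to link `lib<name>.a` statically.
fn static_link_directive(name: &str) -> String {
    format!("cargo:rustc-link-lib=static={}", name)
}

/// All directives for one directory of static archives:
/// the search path first, then one line per library.
fn directives(dir: &str, libs: &[&str]) -> Vec<String> {
    let mut out = vec![link_search_directive(dir)];
    out.extend(libs.iter().map(|l| static_link_directive(l)));
    out
}

fn main() {
    // Assumed location of GHC's `.a` files; in practice you would have to
    // find out where your own GHC install keeps its libraries.
    let ghc_lib_dir = "/path/to/ghc/lib";
    for line in directives(ghc_lib_dir, &["HSrts", "HSbase"]) {
        // Cargo reads these lines from the build script's stdout.
        println!("{}", line);
    }
}
```

The nice part of this shape is that nothing has to be on LD_LIBRARY_PATH at run time, because the archives are copied into the final binary at link time.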

Statically linking the code via looks much better than my code crime of dynamically creating a .cargo/config with it, then requiring you to recompile twice to get it to work with dynamic linking. I discovered the .a files after this, but not before I tweeted about it.

Pictured here are crimes against linkers and cargo

Like I said I get a lot of ideas that I want to try out. I just never said they were good ideas.

I also ran into a fun OSX-specific bug this time around, because it couldn't dynamically link for reasons I cannot fathom. It continues the trend of making me want to chuck this laptop/OS into the sea for giving me yet more trouble than I ever had using Arch Linux and the testing repos on my main dev machine.

The libiconv error in question

The real fun part was being able to figure out this Rust linker error for a Haskell/Rust FFI library because of a bug report on this particular Haskell stack GitHub issue. The serendipity of it all is not lost on me.

This was a mess of a night, but it finally worked. So what did I take away from this about linking?

Don't dynamic link.

No, just really don't.

Okay, you can for specific reasons, but if your libs do not exist in /usr/local/lib or /usr/lib then just statically link and save yourself the trouble, a few extra kilobytes on disk be damned. Unless you're targeting a phone or some other resource-constrained system, dynamic linking will just make you sad, and you'll have to set LD_LIBRARY_PATH in order to run your lib. There's just nothing you can do about it. I spent years trying to do this, so just don't be me.
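That rule of thumb is simple enough to write down as code. The sketch below is only an illustration of the heuristic from this post, not something curryrs does: the `LinkKind` type and `suggested_link_kind` function are made up for the example, and the two directories are just the ones named above. A real dynamic loader has a longer, system-specific search path.

```rust
use std::path::Path;

/// Hypothetical type for this example: which kind of linking to prefer.
#[derive(Debug, PartialEq)]
enum LinkKind {
    Dynamic,
    Static,
}

/// The two directories named in the post. This is an assumption for the
/// sketch; real loaders consult more places than these.
const DEFAULT_LOADER_DIRS: [&str; 2] = ["/usr/lib", "/usr/local/lib"];

/// Suggest dynamic linking only when the library lives under one of the
/// default directories; otherwise suggest static linking.
fn suggested_link_kind(lib_dir: &Path) -> LinkKind {
    // `Path::starts_with` compares whole path components, so
    // "/usr/library" does not count as being under "/usr/lib".
    if DEFAULT_LOADER_DIRS.iter().any(|d| lib_dir.starts_with(d)) {
        LinkKind::Dynamic
    } else {
        LinkKind::Static
    }
}

fn main() {
    // Both paths are made-up examples.
    for dir in ["/usr/lib", "/home/me/.ghcup/ghc/lib"] {
        println!("{} -> {:?}", dir, suggested_link_kind(Path::new(dir)));
    }
}
```

Anything installed by a per-user tool into a home directory falls into the static branch, which is exactly the case that bit me.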

The only reason dynamic linking is even a thing, really, is that the world was built on C. Mostly C projects (there are other langs) get their libs installed in /usr/local/lib or /usr/lib by some package manager, and why have 20 copies of libc? Just link to the one copy. But this is a very Unix design, solving a size-restricted disk problem back in the 70s/80s, and it runs counter to modern language package managers like cargo, which will just do static linking. Tools like stack, ghcup, rustup, rbenv, and nvm, to name a few, are designed around the fact that different programs are built with different compiler versions and that the developer will manage this, not the system package manager. As such, if you want to link to their libs, they won't be in the places dynamic libs live, because putting them there usually requires sudo access, and therefore dynamic linking is just a pain to get working.

My biggest takeaway from this was that there is a specter haunting Unix: the specter of C, and of how we used to package software, butting heads with modern tooling. Dynamic linking has its place, just not as many places on the whole, given how we actually deploy our code these days. Still, I think this was a valuable lesson, and it kind of reflects this blog post's thoughts on The Rust Dependency Discourse (which I'm not gonna bother getting into beyond saying I am personally not a fan). C has a lot of deps; they just all live in 2 directories, whereas modern tooling like cargo makes you pay the cost upfront.

The fix is in, now what?

I'm not gonna do a release just yet. I'd like to do some more work on passing more than just numbers across the FFI boundary, and clean up the docs. I would also like to create a handbook of sorts on how to do all this safely and correctly, and how to use curryrs to do that. I'd like to expand test coverage, and who knows what else. I think there's a lot that can be done in this space, and if you're interested I invite you to come help out. I'd like to get this to a point where it can be maintained with or without me around, as I know I tend to drop in and out on things and feel guilty about letting things languish. I'd rather not let that happen here, given how well I think both languages can complement each other, how much I've learned from using them these past 5 years, and how cool it is that this is possible at all.

Most of this was just me ranting against dynamic linking, as the other two blog posts I linked cover much of the actual journey of getting them to work originally. Much of that still stands, except that you now link to the GHC libs statically in order to have access to the runtime. I've only really begun to do some spring cleaning on the repo, so take a look, and if there's something you want to see happen or that could use improvement, open an issue here. Hopefully I'll have more updates and a release coming out sooner rather than later.