Crossing the impossible FFI boundary, and my gradual descent into madness

(verdagon.dev)

106 points | by signa11 383 days ago

15 comments

jamilbk 382 days ago
The article provides a very detailed exploration of all of the fun challenges you can face designing FFIs with Rust, but there's a good chance you can "get away" with simpler approaches if you think ahead a bit.
In our case, we call into Rust from Kotlin using JNI [0] and Swift using swift-bridge [1]. Thankfully our use case for the FFI [2] is for non-performance-critical calls and the data structures are fairly simple, so we just serialize objects with JSON.
No major issues so far.
One thing I am surprised hasn't been mentioned so far is Mozilla's UniFFI [3] which seems to solve some of the issues brought up in the article. We plan to switch to that once our FFI requirements become more complex.
[0] https://docs.rs/jni/latest/jni/
[1] https://github.com/chinedufn/swift-bridge
[2] https://www.firezone.dev/kb/architecture/tech-stack#client-a...
[3] https://github.com/mozilla/uniffi-rs
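A minimal sketch of the "just serialize with JSON" approach, using only the Rust standard library. The `Peer` struct and the `peer_json_new` / `peer_json_free` names are hypothetical, and the JSON is hand-formatted to keep the example dependency-free; a real project would more likely use a JSON library. A real export would also need `#[no_mangle]`.

```rust
use std::ffi::{c_char, CString};

/// Hypothetical data structure that needs to cross the FFI boundary.
pub struct Peer {
    pub id: u32,
    pub name: String,
}

/// Hand-rolled JSON encoding to keep the sketch dependency-free.
/// Only `"` and `\` are escaped; a real encoder must also handle control characters.
pub fn peer_to_json(peer: &Peer) -> String {
    let escaped = peer.name.replace('\\', "\\\\").replace('"', "\\\"");
    format!("{{\"id\":{},\"name\":\"{}\"}}", peer.id, escaped)
}

/// C-ABI entry point: returns a heap-allocated, NUL-terminated JSON string.
/// The caller (Kotlin, Swift, ...) must hand it back to `peer_json_free`.
/// A real export would also be marked `#[no_mangle]`.
pub extern "C" fn peer_json_new(id: u32) -> *mut c_char {
    let peer = Peer { id, name: format!("peer-{id}") };
    // The generated text contains no interior NUL, so this cannot fail here.
    CString::new(peer_to_json(&peer)).expect("no interior NUL").into_raw()
}

/// Frees a string previously returned by `peer_json_new`.
///
/// # Safety
/// `ptr` must come from `peer_json_new` and must not be used afterwards.
pub unsafe extern "C" fn peer_json_free(ptr: *mut c_char) {
    if !ptr.is_null() {
        drop(unsafe { CString::from_raw(ptr) });
    }
}
```

The foreign side only ever sees a `char *` and a free function, so none of Rust's type layout leaks across the boundary.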
alexvitkov 383 days ago
If you want to interop well with Rust code, it feels to me like your language has to inherit so many Rust semantics that I question why I would use it over Rust.
If you're making a new language, just have good interop with C. Most libraries worth using are written in C. Calling into C is trivial* and enforces almost no limitations on what you can do language-design wise.
* trivial, with the somewhat sizable asterisk that you have to rewrite the header files in your language.
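To make the asterisk concrete, here is a hedged sketch of what "rewriting the header files in your language" looks like when the calling language happens to be Rust. The two declarations are hand-translated from the C standard library; the wrapper names `c_strlen` and `c_abs` are made up for illustration.

```rust
use std::ffi::{c_char, c_int, CStr};

// Hand-translated from <string.h> and <stdlib.h>:
//     size_t strlen(const char *s);
//     int    abs(int n);
// Nothing checks these signatures against the real headers; a mismatch is
// undefined behavior, which is the "sizable asterisk".
// (Toolchains older than Rust 1.82 spell this block `extern "C" { ... }`.)
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
    fn abs(n: c_int) -> c_int;
}

/// Safe wrapper: `CStr` guarantees a valid NUL-terminated buffer.
pub fn c_strlen(s: &CStr) -> usize {
    unsafe { strlen(s.as_ptr()) }
}

/// Safe wrapper around C's `abs`. `i32::MIN` is rejected because
/// `abs(INT_MIN)` is undefined in C.
pub fn c_abs(n: i32) -> Option<i32> {
    if n == i32::MIN {
        None
    } else {
        Some(unsafe { abs(n) })
    }
}
```

The call itself is one line; the cost is in keeping the hand-written declarations in sync with the C header.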
- jlarocco 382 days ago
  I wish Rust would standardize their ABI already. I started a project to call Rust from Common Lisp, but haven't got very far. It's a lot of work, and they can break compatibility at any time.
  If they really want to replace C and C++ then they really need to support being called from third party languages.
  - guipsp 382 days ago
    https://github.com/rust-lang/rust/issues/111423
  - SkiFire13 382 days ago
    Rust already has a C ABI for those cases. Also, the C++ example is kinda bad because it doesn't have a standard ABI (only a bunch of implementation specific ones); they also mostly treat this ABI as stable, but this is also detrimental because it is making the performance of some features suboptimal (e.g. `unique_ptr`)
- verdagon 382 days ago
  I've been looking into this, and I suspect that one actually needs surprisingly little to interoperate safely with Rust.
  TL;DR: The lowest common denominator between Rust and any other memory-safe language is a borrow-less affine type.
  The key insight is that Rust is actually several different mechanisms stacked on top of each other.
  To illustrate, imagine a program in a Rust-like language.
  Now, refactor it so you don't have any & references, only &mut. It actually works, if you're willing to refactor a bit: you'll be storing a lot of things in collections and referring to them by index, and cloning even more, but nothing too bad.
  Now, go even further and refactor the program to not have any &mut either. This requires some acrobatics: you'll be temporarily removing things from those collections and moving things into and out of functions like in [2], but it's still possible.
  You're left with something I refer to as "borrowless affine style" in [1] or "move-only programming" in [0].
  I believe that's the bare minimum needed to interoperate with Rust in a memory safe way: unreference-able moveable types.
  The big question then becomes: if our language has only these moveable types, and we want to call a Rust function that accepts a reference, what then?
  I'd say: make the language move the type in as an argument, take a temporary reference just for Rust, and then move-return the type back to the caller. The rest of our language doesn't need to know about borrowing, it's just a private implementation detail of the FFI.
  These weird moveable types are, of course, extremely unergonomic, but they serve as a foundation. A language could use these only for Rust interop, or it could go further: it could add other mechanisms on top such as & (hard), or &mut (easy), or both (like Rust), or a lot of cloning (like [3]), or generational references (like Vale), or some sort of RefCell/Rc blend, or linear types + garbage collection (like Haskell) and so on.
  (This is actually the topic of the next post, you can tell I've been thinking about it a lot, lol)
  [0] "Move-only programming" in https://verdagon.dev/grimoire/grimoire#the-list
  [1] "Borrowless affine style" in https://verdagon.dev/blog/vale-memory-safe-cpp
  [2] https://verdagon.dev/blog/linear-types-borrowing
  [3] https://web.archive.org/web/20230617045201/https://degaz.io/...
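  A minimal sketch of the "move in, borrow privately, move-return" idea described above. The two reference-taking functions stand in for an arbitrary Rust API, and all names are hypothetical; the point is only that the `_moved` shims expose no references in their signatures.

```rust
/// A Rust API that takes references, as most Rust APIs do.
fn longest_word(text: &str) -> usize {
    text.split_whitespace().map(str::len).max().unwrap_or(0)
}

fn append_line(buf: &mut String, line: &str) {
    buf.push_str(line);
    buf.push('\n');
}

/// Borrow-free shim for `longest_word`: the value is moved in, borrowed only
/// inside the shim, and moved back out next to the result.
pub fn longest_word_moved(text: String) -> (String, usize) {
    let n = longest_word(&text); // the temporary `&` lives only here
    (text, n)
}

/// Same pattern for a `&mut` parameter: both arguments are moved in and
/// both are moved back to the caller.
pub fn append_line_moved(mut buf: String, line: String) -> (String, String) {
    append_line(&mut buf, &line);
    (buf, line)
}
```

  A caller written in borrowless affine style would use it as `let (text, n) = longest_word_moved(text);`, rebinding the value it gave away.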
  - rng_civ 382 days ago
    Have you taken a look at the paper "Foreign Function Typing: Semantic Type Soundness for FFIs" [0]?
    > We wish to establish type soundness in such a setting, where there are two languages making foreign calls to one another. In particular, we want a notion of convertibility, that a type τA from language A is convertible to a type τB from language B, which we will write τA ∼ τB , such that conversions between these types maintain type soundness (dynamically or statically) of the overall system
    > ...the languages will be translated to a common target. We do this using a realizability model, that is, by setting up a logical relation indexed by source types but inhabited by target terms that behave as dictated by source types. The conversions τA ∼ τB that should be allowed, are the ones implemented by target-level translations that convert terms that semantically behave like τA to terms that semantically behave like τB (and vice versa)
    I've toyed with this approach to formalize the FFI for TypeScript and Pyret and it seemed to work pretty well. It might get messier with Rust because you would probably need to integrate the Stacked/Tree Borrows model into the common target.
    But if you can restrict the exposed FFI as a Rust-sublanguage without borrows, maybe you wouldn't need to.
    [0] (PDF Warning): https://wgt20.irif.fr/wgt20-final23-acmpaginated.pdf
  - alexvitkov 382 days ago
    Thanks for the write-up. My biggest fear is not references, overloads or memory management, but rather just the layout of their structures.
    We have this:
```
    sizeof(String) == 24
    sizeof(Option<String>) == 24
```
    Which is cool. But Option<T> is defined like this:
```
    enum Option<T> {
       Some(T),
       None,
    }
```
    I didn't find any "template specialization" tricks like you would see in C++; as far as I can see, the compiler figures out some trick to squeeze Option<String> into 24 bytes. Whatever those tricks are, unless rustc has an option to export the layout of a type, you will need to implement them yourself.
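    From inside Rust, at least, the layout the compiler picked can be queried rather than guessed. A small sketch (the helper names `layout_of` and `option_is_free` are made up); the results describe one particular compiler and target, not a stable ABI.

```rust
use std::mem::{align_of, size_of};

/// Size and alignment of `T` as this particular compiler laid it out.
pub fn layout_of<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

/// True when wrapping `T` in `Option` costs no extra bytes, i.e. `None`
/// is stored in an otherwise-invalid bit pattern of `T`.
pub fn option_is_free<T>() -> bool {
    size_of::<Option<T>>() == size_of::<T>()
}
```

    On current rustc, `option_is_free::<String>()` is true and `option_is_free::<u64>()` is false, which matches the sizes quoted above.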
    - vlovich123 382 days ago
      You don’t need to determine the internal representation as long as you’re dealing with opaque types and invoking Rust functions on them.
      As for the tricks used to make both 24 bytes, it’s the NonNull within String that Option detects, so it knows it can represent None without any enum tag. For what it’s worth, you can do similar tricks in C++ using zero-sized types and tags to declare nullable state (in fact std::optional already knows to do this for pointer types, if I recall correctly).
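      A sketch of the opaque-type pattern, assuming a hypothetical `Inventory` type. The foreign side holds only a pointer and calls back into Rust for every operation, so it never needs to know how `Vec<Option<String>>` is laid out. Real exports would also carry `#[no_mangle]`.

```rust
/// Hypothetical Rust type whose layout the foreign side never needs to know.
pub struct Inventory {
    items: Vec<Option<String>>,
}

/// Allocates an `Inventory` and hands out an opaque pointer to it.
pub extern "C" fn inventory_new() -> *mut Inventory {
    Box::into_raw(Box::new(Inventory { items: Vec::new() }))
}

/// Appends a filled or empty slot and returns the new slot count.
///
/// # Safety
/// `inv` must be a live pointer obtained from `inventory_new`.
pub unsafe extern "C" fn inventory_add_slot(inv: *mut Inventory, filled: bool) -> usize {
    let inv = unsafe { &mut *inv };
    inv.items.push(if filled { Some(String::from("item")) } else { None });
    inv.items.len()
}

/// Counts the filled slots.
///
/// # Safety
/// `inv` must be a live pointer obtained from `inventory_new`.
pub unsafe extern "C" fn inventory_filled(inv: *const Inventory) -> usize {
    let inv = unsafe { &*inv };
    inv.items.iter().filter(|slot| slot.is_some()).count()
}

/// Destroys the object.
///
/// # Safety
/// `inv` must come from `inventory_new` and must not be used again.
pub unsafe extern "C" fn inventory_free(inv: *mut Inventory) {
    if !inv.is_null() {
        drop(unsafe { Box::from_raw(inv) });
    }
}
```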
    - ithkuil 382 days ago
      Yeah, currently "niche optimization" is performed when the compiler can infer that some values of the structure are illegal.
      This can currently be done when a type declares the range of an integer to be incomplete with the rustc_layout_scalar_valid_range_start or _end attribute (requires #![feature(rustc_attrs)]).
      In your example it works for String, because String contains a Vec<u8>, which internally contains a capacity field of type struct Cap(usize) whose usize is effectively constrained to values in 0..=isize::MAX.
      The only way for you to know that is to effectively be the rustc compiler or be able to consume its output.
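      On stable Rust the same effect can be had without the internal attribute by building on a standard-library type that already carries a restricted range. A sketch with a hypothetical `UserId` wrapper around `NonZeroU32`; the sizes are what current rustc produces for a plain wrapper struct, not a documented guarantee for arbitrary user types.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical ID type: zero is never a valid ID, so the all-zero bit
/// pattern is a niche the compiler can reuse for `None`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct UserId(NonZeroU32);

impl UserId {
    /// Returns `None` for the reserved value 0.
    pub fn new(raw: u32) -> Option<UserId> {
        NonZeroU32::new(raw).map(UserId)
    }

    pub fn get(self) -> u32 {
        self.0.get()
    }
}

/// (size of `Option<UserId>`, size of `Option<u32>`) in bytes.
pub fn option_sizes() -> (usize, usize) {
    (size_of::<Option<UserId>>(), size_of::<Option<u32>>())
}
```

      With a niche available the `Option` fits in 4 bytes; a plain `u32` has no spare bit pattern, so `Option<u32>` needs a separate tag and grows to 8.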
kodablah 383 days ago
It seems like the struggle here is trying to use Rust transparently/automatically from another language instead of just making bindings easier. I have found that trying to auto-FFI existing Rust types is not the best for languages because there is often an impedance mismatch between how the language treats things and how Rust does. Therefore an always-works transparent binding may inevitably end up with people asking for more flexibility to fit the language better (e.g. controlling lifetime semantics, type mappings/copying, etc).
I think it's clearer to take the approach that Neon, PyO3 and other FFI-to-lang helpers take, where you just make it easy/safe to write these Vale functions in Rust.
- DSMan195276 382 days ago
  I agree with you, but it's always hard to ignore the allure of not needing to write all the bindings manually. If nobody is willing to write the bulk of the initial bindings then the chance of someone using it seems low, and in theory writing a transparent layer between the two takes less time/effort (in practice I agree that the incompatibilities will make it messy long term).
  Rust has the same problem with C APIs: in the past I've gone to use something and found that the binding was not there. For a couple of functions it's no big deal, but if, say, half or more of the ones I needed weren't there already, then I wouldn't have bothered trying to use it at all.
ar7hur 382 days ago
> Anyone trying to make a new mainstream language is completely insane, unless they're backed by a huge corporation. There are only two exceptions in the last 25 years that come close: Scala and Kotlin.
And Clojure! (also a JVM language)
- munchler 382 days ago
  I would also add Zig to the list. I certainly hear about it often enough on HN.
  - zem 382 days ago
    elixir and gleam in the erlang world
    - bobnamob 382 days ago
      Much less effort to build a language (without megacorp backing) if you're building off a battle tested runtime.
      I'm absolutely not saying this to discredit the work that's gone into Clojure, Elixir et al, but it does lend credence to the idea of building for an existing ecosystem instead of bootstrapping your own (along with "seamless" interop as a first class concern)
      If anyone can crack seamless interop between natively compiled languages to dodge ABI hell they'll earn a nice place in history
throwawaymaths 383 days ago
Rustler crosses the Rust/Erlang barrier relatively well, though its error messages when you try to cross it wrong are somewhat unhelpful.
toyg 383 days ago
> Anyone trying to make a new mainstream language is completely insane, unless they're backed by a huge corporation. There are only two exceptions in the last 25 years that come close: Scala and Kotlin
Kotlin was designed and backed by JetBrains from the start. Maybe not a "huge" corporation but a pretty big company still (by revenue).
- iudqnolq 383 days ago
  I don't know the story of how the Android team went Kotlin-first. If that wasn't a deliberate plan they got quite lucky. Could Kotlin arguably be backed by Google?
  - kernal 382 days ago
    Android Studio is based on IntelliJ and there's a lot of collaboration between both teams. The adoption of Kotlin was a logical next step, considering a lot of IntelliJ is written in Kotlin.
  - izacus 383 days ago
    No, the Android community adopted Kotlin before Google added any support for it from their side.
    - kernal 382 days ago
      I don't know when the first Kotlin Android app was published, but Kotlin 1.0 was released in 2016 and then announced as a first-class language for Android at Google I/O in 2017.
jauntywundrkind 383 days ago
There's a huge amount of doom & gloom, prophecies of failure against wasm's component-model, a latent expectation that trying to solve FFI is impossible & destined for failure. But what if?
It'd be so neat for language creators to be able to use & leverage other works. Getting there wouldn't be easy, but there'd be a standard path to getting the hard-fought capability here.
forrestthewoods 382 days ago
C APIs are the best APIs. I do a lot of mixed language work and I would never attempt anything like this. Just write a C API and provide trivial FFI bindings for your favorite language.
That said, I thoroughly enjoyed the article and the author's admission of its insanity! Great read. But do the simple thing and call it a day.
blaise-pabon 382 days ago
I'm a novice on this topic, but I'm surprised that no one has mentioned Python. Is that because it is a solved problem, thanks to https://github.com/PyO3/pyo3 and is no longer a challenge?
move-on-by 383 days ago
I’ve not used Rust, and quite frankly I think a lot of the post is over my head, but I enjoyed the read nonetheless.
> I don't have any specific plans to turn this C proof-of-concept into a production-quality tool that would enable calling Rust from C, but if anyone wants to take it from here, I'd be happy to assist!
I laughed at this, I’d bet my bottom dollar it’s an attempted nerd snip!
marklar423 383 days ago
With all this effort required (as the author points out), I start to wonder if a better solution is to communicate via RPC over local sockets.
There will be some overhead, but it might be a wash considering calling over an FFI often involves similar overhead to marshal / unmarshal objects. And the simplicity gains would be massive.
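A rough sketch of the RPC-over-local-sockets idea, with several assumptions: it uses TCP on the loopback interface for portability (a Unix domain socket would be the other obvious choice), a made-up one-line text protocol, and a thread standing in for what would normally be a separate process written in another language.

```rust
use std::io::{BufRead, BufReader, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Starts a one-shot "service" on a loopback port chosen by the OS and
/// returns that port. It answers a single line-based request by upper-casing it.
pub fn spawn_upper_service() -> std::io::Result<u16> {
    let listener = TcpListener::bind(("127.0.0.1", 0))?;
    let port = listener.local_addr()?.port();
    thread::spawn(move || {
        if let Ok((stream, _)) = listener.accept() {
            let mut reader = BufReader::new(&stream);
            let mut line = String::new();
            if reader.read_line(&mut line).is_ok() {
                let reply = line.trim_end().to_uppercase() + "\n";
                let _ = (&stream).write_all(reply.as_bytes());
            }
        }
    });
    Ok(port)
}

/// Client side: what the "foreign" language would do instead of an FFI call.
pub fn call_upper(port: u16, msg: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(("127.0.0.1", port))?;
    stream.write_all(msg.as_bytes())?;
    stream.write_all(b"\n")?;
    let mut reply = String::new();
    BufReader::new(&stream).read_line(&mut reply)?;
    Ok(reply.trim_end().to_string())
}
```

Neither side needs to know anything about the other's types or ABI; the whole contract is the bytes on the wire.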
- masfuerte 382 days ago
  COM [1] was a solution to these problems thirty years ago.
  In-process it's just function calls. Cross-process COM has automatic marshalling for standard types ("automation types") or you can define custom marshalling that does whatever you want.
  WinRT [2] is a more modern version. It builds on COM and (among other things) provides the basis for the latest UI frameworks in Windows.
  [1]: https://en.wikipedia.org/wiki/Component_Object_Model
  [2]: https://en.wikipedia.org/wiki/Windows_Runtime
  - nsguy 382 days ago
    A long time ago I worked on a project where we needed to distribute an in-process COM object, so we moved it to DCOM, instantiated multiple instances, and that worked! All in all COM was a fairly pleasant technology. Not really that different than gRPC (e.g. idl vs. proto).
- layer8 383 days ago
  Why over a socket? You could perform the same protocol more efficiently with normal functions in-process. Maybe we need a standard serializing LPC protocol just using the platform ABI. Or maybe this comes down to something like ZeroMQ in-process.
  - marklar423 382 days ago
    Mostly because sockets are supported by everything today, and they're easy to understand. What you're describing would certainly work but it looks similar to what the OP did in the blog post, with all the complexity it comes with.
    - layer8 382 days ago
      The OP doesn’t serialize. My proposal would still serialize as with RPC, but instead of passing the data over a socket, just pass the data as a binary blob over a regular function call.
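      A sketch of that proposal under stated assumptions: the 8-byte "add two numbers" request format is invented for illustration, and `handle_blob` plays the role of the function exported by the other language (a real export would also need `#[no_mangle]`). Only pointers, lengths and plain integers cross the call boundary.

```rust
/// Encodes a request as a flat blob: little-endian u32 `a`, then u32 `b`.
/// The two-field "add" request is a made-up example protocol.
pub fn encode_add(a: u32, b: u32) -> Vec<u8> {
    let mut blob = Vec::with_capacity(8);
    blob.extend_from_slice(&a.to_le_bytes());
    blob.extend_from_slice(&b.to_le_bytes());
    blob
}

/// Callee side: decodes the blob, does the work, and writes an 8-byte
/// little-endian u64 reply into `out`. Returns 0 on success, -1 on bad input.
///
/// # Safety
/// `req` must point to `req_len` readable bytes and `out` to 8 writable bytes.
pub unsafe extern "C" fn handle_blob(req: *const u8, req_len: usize, out: *mut u8) -> i32 {
    if req.is_null() || out.is_null() || req_len != 8 {
        return -1;
    }
    let bytes = unsafe { std::slice::from_raw_parts(req, req_len) };
    let a = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let b = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]);
    let sum = (a as u64 + b as u64).to_le_bytes();
    unsafe { std::ptr::copy_nonoverlapping(sum.as_ptr(), out, sum.len()) };
    0
}

/// Caller side: serialize, make one plain function call, deserialize.
pub fn add_via_blob(a: u32, b: u32) -> Option<u64> {
    let req = encode_add(a, b);
    let mut reply = [0u8; 8];
    let status = unsafe { handle_blob(req.as_ptr(), req.len(), reply.as_mut_ptr()) };
    (status == 0).then(|| u64::from_le_bytes(reply))
}
```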
      - spongebobstoes 382 days ago
        The main thing on my mind is that the build system would become more bespoke when doing it that way, compared to running a few processes that interact with each other.
        The overhead of socket read+write is typically much less than the serialization overhead, although both can be optimized to the point of irrelevance for many applications.
        It's also interesting because it ends up looking like a microservices architecture, except all on one machine (even all in one process tree).
  - marklar423 382 days ago
    https://zeromq.org/ -> TIL really cool, thanks for the pointer.
thatoneguy 382 days ago
For anyone else, OP is not referring to the original FFI that leads to madness (and worse):
https://en.wikipedia.org/wiki/Fatal_insomnia
ingve 383 days ago
This could be great for scripting with Neptune! [0]
[0] https://github.com/Srinivasa314/neptune-lang
- yobananaboy 382 days ago
  I'm gonna find some reason to use this for my Battleship game too!
renewedrebecca 382 days ago
One of the things I like about Crystal is how utterly painless it is to call C library functions.
revskill 383 days ago
Why not WASI?
- Retr0id 383 days ago
  How could WASI solve (or be involved in solving) this problem?
  - Findecanor 382 days ago
    I suspect confusion with the WebAssembly Component Model, whose development is somewhat intertwined with that of WASI.
    It defines a function call ABI between sandboxes. No object is in shared memory: parameters are passed by value or by handle. Has its own IDL and ABI that languages' ABIs need to have adaptors to, if they don't conform.