Recently, I put together an Erlang asynchronous port driver named keccakf1600 which implements the SHA-3 algorithms used in another one of my projects, jose.
See version 1.0.2 of keccakf1600 for the original port driver implementation.
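When interfacing with native C and the Erlang VM, you essentially have three options to choose from:
1. Port driver: a shared library linked with driver_entry (I/O heavy operations are typically best suited for this type)
2. NIF: a shared library linked with ERL_NIF_INIT (fast synchronous operations are typically best suited for this type)
3. Port: an external program that communicates over stdin and stdout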
My goal was to have a fast and asynchronous way to call blocking functions without keeping the Erlang VM schedulers from carrying out their work. The original plan was to use driver_async combined with ready_async to perform the blocking operations on “a thread separate from the emulator thread.” I used the ei library to communicate between the Erlang VM and the port driver written in C.
Having accomplished my goal, I decided to run a simple benchmark against the equivalent SHA-2 algorithms out of curiosity as to how my implementation might stack up against the native Erlang crypto library.
The results were not terribly impressive:
The two main concerns I had with the results were:
Concern #1 was ruled out by directly testing the C version of the algorithms: for small message sizes, they were typically within 1-2 μs of each other.
Concern #2 required more research, which eventually led me to the bitwise project by Steve Vinoski. The project explores some of the strategies for dealing with the synchronous nature of a NIF without blocking the scheduler by keeping track of reductions during a given timeslice. It also explores strategies using the experimental dirty NIF feature.
I highly recommend reading the two presentations from the bitwise project: vinoski-opt-native-code.pdf and vinoski-schedulers.pdf.
After experimenting with the two options, I decided to use enif_consume_timeslice combined with enif_schedule_nif to yield control back to the main Erlang VM on larger inputs to prevent blocking other schedulers.
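To make the yielding idea concrete, here is a minimal sketch of the pattern. It is not the actual keccakf1600 code. The checksum_step function is a hypothetical stand-in for the real hashing work, and the names yield_sketch, checksum_nif, checksum_start, CHUNK_BYTES, and the WITH_ERL_NIF guard are all made up for illustration. The 5 percent figure passed to enif_consume_timeslice is a guess that would need tuning against real measurements.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: how many bytes to process between timeslice checks. */
#define CHUNK_BYTES 4096u

/* Hypothetical stand-in for the real work. Folds at most one chunk of
 * bytes into *acc, advances *offset, and returns 1 once all input has
 * been consumed (0 otherwise). Requires *offset <= size. */
static int
checksum_step(const uint8_t *data, size_t size, size_t *offset, uint32_t *acc)
{
    size_t end = (size - *offset > CHUNK_BYTES) ? *offset + CHUNK_BYTES : size;
    for (size_t i = *offset; i < end; i++)
        *acc = (*acc * 31u) + data[i];
    *offset = end;
    return end == size;
}

#ifdef WITH_ERL_NIF /* hypothetical guard: define it when building as a NIF */
#include <erl_nif.h>

/* argv = {Binary, Offset, Acc}. Works chunk by chunk; when the scheduler
 * reports that the timeslice is used up, reschedules itself with the
 * updated offset and accumulator instead of blocking the scheduler. */
static ERL_NIF_TERM
checksum_nif(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[])
{
    ErlNifBinary in;
    unsigned long offset_ul;
    unsigned int acc_ui;

    if (argc != 3 || !enif_inspect_binary(env, argv[0], &in) ||
        !enif_get_ulong(env, argv[1], &offset_ul) ||
        !enif_get_uint(env, argv[2], &acc_ui) || offset_ul > in.size)
        return enif_make_badarg(env);

    size_t offset = (size_t)offset_ul;
    uint32_t acc = (uint32_t)acc_ui;

    for (;;) {
        if (checksum_step(in.data, in.size, &offset, &acc))
            return enif_make_uint(env, acc);
        /* Report roughly 5% of a timeslice per chunk (a guessed value). */
        if (enif_consume_timeslice(env, 5)) {
            ERL_NIF_TERM next[3] = {
                argv[0],
                enif_make_ulong(env, (unsigned long)offset),
                enif_make_uint(env, acc)
            };
            return enif_schedule_nif(env, "checksum_nif", 0,
                                     checksum_nif, 3, next);
        }
    }
}

/* Entry point visible from Erlang: yield_sketch:checksum(Binary). */
static ERL_NIF_TERM
checksum_start(ErlNifEnv *env, int argc, const ERL_NIF_TERM argv[])
{
    if (argc != 1)
        return enif_make_badarg(env);
    ERL_NIF_TERM args[3] = {
        argv[0], enif_make_ulong(env, 0), enif_make_uint(env, 0)
    };
    return checksum_nif(env, 3, args);
}

static ErlNifFunc nif_funcs[] = {
    {"checksum", 1, checksum_start}
};

ERL_NIF_INIT(yield_sketch, nif_funcs, NULL, NULL, NULL, NULL)
#endif
```

Small inputs finish inside the first call and never touch the reschedule path, so they pay almost nothing for the bookkeeping. Larger inputs carry their progress forward as ordinary terms in the argument list, which avoids needing a resource object for this simple case. A real hash would more likely keep its state in a resource.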
I rewrote the port driver as a NIF, released it as version 2.0.0 of keccakf1600, and ran the same benchmark again:
These results are much more consistent and closer to my original expectations. I plan on refactoring the erlang-libsodium project using the same technique.