See version 1.0.2 of keccakf1600 for the original port driver implementation.
When interfacing with native C and the Erlang VM, you essentially have 3 options to choose from:
driver_entry (I/O heavy operations are typically best suited for this type)
ERL_NIF_INIT (fast synchronous operations are typically best suited for this type)
My goal was to have a fast and asynchronous way to call blocking functions without disrupting the Erlang VM schedulers from carrying out their work. The original plan was to use driver_async combined with ready_async to perform the blocking operations on “a thread separate from the emulator thread.” I used the ei library in order to communicate between the Erlang VM and the port driver written in C.
Having accomplished my goal, I decided to run a simple benchmark against the equivalent SHA-2 algorithms out of curiosity as to how my implementation might stack up against the native Erlang
The results were not terribly impressive:
The two main concerns I had with the results were:
Concern #1 was ruled out by directly testing the C versions of the algorithms: for small message sizes, they were typically within 1-2μs of each other.
Concern #2 required more research, which eventually led me to the bitwise project by Steve Vinoski. The project explores some of the strategies for dealing with the synchronous nature of a NIF without blocking the scheduler by keeping track of reductions during a given timeslice. It also explores strategies using the experimental dirty NIF feature.
After experimenting with the two options, I decided to use enif_consume_timeslice combined with enif_schedule_nif to yield control back to the main Erlang VM on larger inputs to prevent blocking other schedulers.
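To make the chunk-and-yield idea concrete, here is a minimal sketch in plain C that runs outside the VM. It is not the keccakf1600 code. The names work_state, fake_timeslice, fake_consume_timeslice, work_step and run_to_completion are hypothetical, the 64-byte chunk size is an arbitrary assumption, and the toy checksum stands in for the real hashing work. In an actual NIF, the stand-in budget check would be a call to enif_consume_timeslice, and the early return would hand a continuation to enif_schedule_nif so the VM can call the function again later.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical continuation state. In a real NIF this would be carried
 * across calls through the argv handed to enif_schedule_nif. */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t offset; /* bytes consumed so far */
    uint32_t acc;  /* toy running checksum, NOT Keccak */
} work_state;

/* Assumed chunk size; a real NIF would tune this by measurement. */
enum { CHUNK_BYTES = 64 };

/* Stand-in for the VM's per-call budget. */
typedef struct {
    int used_percent;
} fake_timeslice;

/* Mimics the contract of enif_consume_timeslice: report `percent` of the
 * timeslice as used and return 1 once the timeslice is exhausted. */
int fake_consume_timeslice(fake_timeslice *ts, int percent)
{
    assert(percent >= 1 && percent <= 100);
    ts->used_percent += percent;
    return ts->used_percent >= 100;
}

/* One invocation of the work function.
 * Returns 1 when all input is processed, 0 when it yields early. */
int work_step(work_state *st, fake_timeslice *ts, int percent_per_chunk)
{
    while (st->offset < st->len) {
        size_t n = st->len - st->offset;
        if (n > CHUNK_BYTES)
            n = CHUNK_BYTES;
        for (size_t i = 0; i < n; i++)
            st->acc = st->acc * 31u + st->data[st->offset + i];
        st->offset += n;

        /* Only yield if work remains. A real NIF would return the result
         * of enif_schedule_nif(...) at this point. */
        if (st->offset < st->len &&
            fake_consume_timeslice(ts, percent_per_chunk))
            return 0;
    }
    return 1;
}

/* Plays the role of the VM re-invoking the rescheduled function with a
 * fresh timeslice each time. Returns the number of invocations needed. */
int run_to_completion(work_state *st, int percent_per_chunk)
{
    int calls = 0;
    int done = 0;
    while (!done) {
        fake_timeslice ts = {0};
        done = work_step(st, &ts, percent_per_chunk);
        calls++;
    }
    return calls;
}
```

Small inputs finish in a single invocation and never pay the rescheduling cost. Larger inputs are split across several short invocations, so in a real NIF the scheduler would get a chance to run other processes between them.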
I rewrote the port driver as a NIF, released it as version 2.0.0 of keccakf1600, and ran the same benchmark again:
These results are much more consistent and closer to my original expectations. I plan on refactoring the erlang-libsodium project using the same technique.