I produced a Keccak implementation in May 2019, through about one week of intensive study and hacking. It builds on some techniques and routines I'd been developing for small, self-contained C programs on Unix, whereby the standard I/O library is thrown out in favor of a minimalistic interface fitting the needs of the program, requiring for portability only the system call wrappers, which generally have direct translations to assembly language. The approach ensures that system errors are detected without requiring effort by calling code and leaves no uncertainty regarding signal behavior, flushing of buffers or integer overflow. The resulting binary here (musl/amd64, static, unstripped) weighs in at 19K and contains not so much as a "malloc".
Regarding the permutation and sponge construction themselves, I found the Keccak reference quite comprehensible; the only topics I needed to brush up on were Linear Feedback Shift Registers and the polynomial algebra used to describe them.
Having previously written what ended up as possibly the world's slowest hash functions in an interpreted language, I was itching to make something fast for once: not pouring vast efforts into optimization but at least avoiding obvious slowdowns. I wish I could say I got it all right on the first try, or that I identified all mistakes through careful re-reading, but not quite:
jfw: well my keccak proggy hashes; now to look for some test vectors
jfw: sure enough I botched it:
jfw: mathematically, (x-1) mod y is equivalent to (x+y-1) mod y
jfw: (pretty much the definition of mod...)
jfw: but if x is an unsigned type -- which I made it, because it's used as an array index and those better be nonnegative -- the subtraction wraps *mod 2*
jfw: er, mod power-of-two
jfw: the arrays in question are indexed mod 5, which being coprime to any power of 2, gives a decidedly different result from if x were a signed or mathematical integer.
jfw: (in another part of the code I had in fact anticipated this.)
The arithmetic in question sure looked correct on the surface, so I tracked this down by adding fine-grained test probes to compare each step of the permutation against provided data.(i) Further contemplation lead me to see that the mod-5 operations were entirely invariant in the permutation's input and thus amenable to a lookup table without introducing a timing side-channel.
I did a full re-read a month after writing; the only mistake found was a comment with off-by-one range description.
Capacity and output length parameters are user-selectable; I chose a default of 512 bits for both, as explained in the code:
Sponge capacity is an upper bound on security level because the permutation is readily inverted if its full output is known. Beyond that, its relationship to actual security is not clear to me (or perhaps anyone); some margin of safety seems prudent. In the FIPS202 parameters, capacity under 512 is seen only in "SHAKE128" and none of the "SHA3" fixed-width hashes. The EuCrypt default is 256 (bitrate 1344); I have not found a discussion of this choice.
I should have piped up and asked about this at the time; though I was still a WoT non-entity living in the shadows, it happens that blog comments are one of the easier ways to get started in the grand conversation, and certainly if you have something interesting to say or ask.
The past being what it is, I will ask now: do any mathematical minds in the readership have input on what constitutes a "good" choice of capacity and why? If as I'd been thinking it's more than 256 to be at least as secure as SHA3, it would seem to suggest a need to regrind the existing Keccak vpatches or otherwise deal with a multiplicity of standards.
A compiler with 64-bit integer support is required; there is no dependence on machine byte order. The little-endian convention is used for interpreting bytes as the bit strings the permutation is defined on. I believe the code to be timing-invariant with respect to potentially secret bits, with the exception of output hex encoding. This would be good to fix. The other main deficiency I see is lack of a working "-c" option to verify provided hashes.
Finally, some basic performance numbers, from a modest Core 2 @2.26GHz, 3072K cache:
$ dd if=/dev/zero bs=1048576 count=100 | keksum 100+0 records in 100+0 records out 104857600 bytes (100.0MB) copied, 2.656227 seconds, 37.6MB/s 7a37879dcc634482775f4c1ebf294a0a02c9bcf924222cf6d2fe1c6beca3574debfe73f9034b868deefd7bbad4f5c251333bc3c735f1a82de045c7980814a2c2 $ dd if=/dev/urandom bs=1048576 count=100 > rand $ dd if=rand bs=1048576 count=100 | keksum 100+0 records in 100+0 records out 104857600 bytes (100.0MB) copied, 2.648968 seconds, 37.8MB/s ec4eab1cff9198e1ef23d663cc9da505eaeda713168bf31ed7807ec63fb1ddfa3454aea212fab193b071ad98e2cc47b5d004a4d1737d23ae8804978995ae1b7a
Download vpatches
- The spiffy modular types in Ada would have prevented this mistake entirely, though not without some cost either in runtime performance or compiler complexity (and thus potential bugs...), I would imagine. [^]
Builds, runs, and matches output of Diana's classic Keccaktron (when set to -s256.)
Comment by Stanislav Datskovskiy — 2019-12-04 @ 17:56
[...] Bitcoin project presently rests on some shaky foundations. Despite apparently intricate efforts, a Keccak based vtree has seen little progress, in that the original patch signers besides mod6 have not [...]
Pingback by Basic getrawtransaction patch proposal for bitcoind « Fixpoint — 2019-12-05 @ 17:35
From http://logs.ossasepia.com/log/ossasepia/2019-12-05#1012028 :
Comment by Jacob Welsh — 2019-12-05 @ 19:02
> what constitutes a "good" choice of capacity and why?
The sizes are not really relevant ; even 128 will be far beyond amply sufficient, if the presuppositions hold. And if not, obviously, 128mb or any other size is in principle equally useless.
This leaves the choice of capacity, once freed from such considerations, to be matched to more practical concerns. If, for instance, you're contemplating deployment in a UDP scheme, you might want to make it packet size, or half packet size, or somesuch.
Comment by Mircea Popescu — 2019-12-05 @ 20:52