Category Archives: Hashing


SipHash: a fast short-input PRF

This is a hash function with a 64-bit output that is designed to be resistant to hash-flooding attacks. The authors are Jean-Philippe Aumasson and Daniel J. Bernstein. The algorithm is supposed to be competitive in speed with MurmurHash.

HASH-FLOODING DOS RELOADED: ATTACKS AND DEFENSES – talk on how to attack common hash functions. Video.

Related page: someone implementing it in Python, and re-implementing it in C. SipHash.


Sodium and NaCl


NaCl: Networking and Cryptography library

“NaCl (pronounced “salt”) is a new easy-to-use high-speed software library for network communication, encryption, decryption, signatures, etc. NaCl’s goal is to provide all of the core operations needed to build higher-level cryptographic tools.”

The main author is Daniel Bernstein, who has a long history of writing very secure and stable software.

The main page lists a paper that should be read that covers the design of the library.



Despite the proclamation, Sodium is Nacl, but a re-implementation sharing the same API. NaCl apparently only compiles on a few platforms, whereas Sodium builds on many platforms, including OpenBSD, Dragonfly BSD, NetBSD, FreeBSD, SmartOS, OSX, Linux, Windows, iOS and Android.

Not to denigrate the developer of Sodium at all – Frank Denis works at OpenDNS, and wrote DNSCrypt.

Sodium GitHub page.

Non-cryptographic hash functions

Here is a collection of all the “good” non-cryptographic hash functions that I am aware of. Also

Hash function page from Wikipedia.

Bob Jenkins

1997 – lookup8 (64-bit)

2006 – lookup3 (32-bit and 64-bit)

Also see SpookyHash


The CityHash family of hash functions (32-bit through 128-bit).


MurmurHash2- deprecated.

MurmurHash3 and SMHasher (SMHasher is a test suite).

Test statistics from 2008.


FNV Hash – only use FNV-1a.


Old Hash Functions page from 1990 era.

Hash Functions page by Paul Hsieh, 2008.

Which hashing algorithm is best for uniqueness and speed from the Programmers StackExchange.

Hash functions: an empirical comparison from

xxhash on

Hash functions page from Bret Mulvey, 2007

Creating a Fast Hash Function


SHA-1 implementations

I’m going to attempt to collect open-source implementations of SHA-1 here. There are many more of them than MD5, which is appropriate, considering that SHA-1 is the most widely used message digest function to date.

Obligatory Wikipedia page on SHA-1.

FIPS PUB 180-4: Secure Hash Standard, March 2012, covers SHA-1 and SHA-2. In this document, all but SHA-1 are variants of SHA-2 – e.g. SHA-256 and SHA-512 are 256-bit and 512-bit output variants of SHA-2, respectively.

NIST 800-107: Recommendations for Applications Using Approved Hash Algorithms is also worth reading.

NIST has a list of all validated SHA-1 implementations: SHS Validation List. As of 2/22/2013, there are 2024 of them, although to be fair, a lot of the entries are incremental updates by folks like IBM/Tivoli, and the majority of entries are in the past few years (I suppose this means increased scrutiny on security and trust).

1995, FIPS 180-1

FIPS 180-1 – Secure Hash Standard. This was the formal release of the SHA-1 algorithm. FIPS 180-1 was superceded by FIPS 180-2, FIPS 180-3 and now FIPS 180-4. The document had pseudo-code and several test vectors. While MD5 has a single author, Ronald Rivest, SHA-1 was invented by the NSA.

2001, RFC 3174 (Donald Eastlake and Paul Jones)

RFC 3174 has an implementation of SHA-1 in C by Donald Eastlake and Paul E. Jones. The date on the RFC is 2001, but it’s likely that drafts and code were written several years before this.

1998, Packetizer (Paul Jones)

There is a slight variant from Paul E. Jones, linked in Secure Hashing Algorithm (SHA-1),which appears to have been written around 1998, with an update in 2009. There are a handful of changes from the code in RFC 3174. I don’t know if this is the same Paul E. Jones from RFC 3174.

2006, RFC 4634 (Donald Eastlake and Paul Jones)

RFC 4634 is an update to handle input of arbitrary bit length. RFC and presumably code written by Donald Eastlake and Paul Jones, 2006.

2011, RFC 6234 (Donald Eastlake and Tony Hansen)

RFC 6234 replaces RFC 4634, and adding HMAC and HKDF functionality. The RFC (and presumably code) was written by Donald Eastlake and Tony Hansen, 2011.

1998, Ghostscript (Steve Reid, James Brown)

SHA-1 in C from ghostscript.


There are validated implementations of SHA-1 in the OpenSSL FIPS Object Model, at


IBM Crypto for C (ICC) is listed as “a non-proprietary FIPS 140-2 cryptographic module”. I couldn’t find source available, though, so I’ll probably remove this.

Mozilla NSS

Mozilla Network Security Services (NSS) contains SHA-1 implementations.

Crypto++ (Wei Dai)

Crypto++ has SHA-1 implementations

2001, Botan (Jack Lloyd)

Botan has SHA-1 implementations.

2004, Aaron Gifford

Secure Hash Algorithm (SHA) has BSD licensed code for SHA-1.

2006, Cryptlib (Peter Gutman)

Cryptlib has SHA-1.

2009, Git (Linus Torvalds)

Commit d7c208a9, Add new optimized C ‘block-sha1′ routines, introduced an optimized version of the Mozilla SHA1 function to git, reportedly about 30% faster. This version would not work on some architectures, because it requires an architecture that allows unaligned 32-bit loads.

Also see comments on mailing lists, including Re: x86 SHA1: Faster than OpenSSL for development history.

This is in coreutils, which is GPLv3, but a message from Linus indicated he might be willing to stick with the MPL license, since he started from mozilla-sha1; see thread Linus’ sha1 is much faster! There’s also a separate implementation from George Spelvin in a thread spun off from that thread.

2011, smallsha1 (Micael Hildenborg)

smallsha1 is written in C++. It uses an old-style BSD license, the one with an advertising clause, so it’s probably not suitable for use by most companies.

2012, Nayuki Minasi

Fast SHA-1 hash implementation in x86 assembly

Source code available, but not open source. Gets about 327 MiB/sec on a Core 2 Q6600 2.4 GHz, compared with (claimed) OpenSSL speed of 305 MiB/sec on equivalent hardware.


MD5 implementations

I’m going to try to collect all the open-source MD5 implementations that I know of, in this post. I’ll extend this to all useful cryptographic hash functions. I’m going to supply them directly as links here (for posterity’s sake), but also include links to as original sources. One reason to do this is to find or create a really fast one; all of the ones I’ve seen so far are reasonable, but not speedy.

MD5 Homepage (unofficial) is Mordechai Abzug’s collection of MD5 information.

1991, RSA (Ron Rivest)

RFC 1321: The MD5 Message-Digest Algorithm

Ron Rivest and RSA released MD5 in an open-source fashion in 1992 in the appendix of RFC1321, along with test vectors. The method listed in the RFC source – context, init-func, update-func, finalize-func – has been followed pretty faithfully ever since.

1993, Collin Plumb

MD5 Command Line Message Digest Utility

Collin Plumb wrote the first public-domain implementation of the MD5 algorithm in 1993, and his code was “hacked slightly by John Walker to turn it back into K&R”. This might need to be credited to the combination of Collin Plumb, Branko Lankester, Ian Jackson and Galen Hazelwood.

1999, L. Peter Deutsch/Aladdin


L. Peter Deutsch also wrote his own version of MD5 from RFC 1321, in order to have something with a more BSD-like license.

2001, Alexander Peslyak

A portable, fast, and free implementation of the MD5 Message-Digest Algorithm (RFC 1321)

Alexander Peslyak (posting as Solar Designer,, wrote an OpenSSL-compatible version of MD5 in 2001.

 2012, Nayuki Minase

Fast MD5 hash implementation in x86 assembly

Nayuki Minase optimized an MD5 implementation for speed. The assembly version is 10% faster than the C version, but the impressive part is that the C version is 390 MiB/sec, just short of OpenSSL’s speed of 410 MiB/sec. The speeds reported are on a Core2 Q6000 with a 2.4 Ghz clock speed, but the limiting factor might be memory access speeds, not CPU speed itself.

The code is specifically not open-source, although the source code is available, so it’s really more of a data point, and not something that can be used (without licensing).