UB is an abbreviation for undefined behaviour, as signalled with -fsanitize=undefined.
Typical undefined behaviour (UB) problems
Many word-wise hashes (as opposed to safe byte-wise processing)
don't check the input buffer for proper word alignment, which
fails under ubsan, on SPARC, and on old ARM CPUs. A word here is an
int32_t, an int64_t, or something even wider. On some old RISC
hardware a misaligned access raises a bus error; Intel hardware can
be made to raise such a bus error too by setting a CPU flag. On most
hardware, though, misaligned accesses generally work fine.
The affected hashes are: mx3, Spooky, mirhash (but not strict), MUM, fasthash, Murmur3*, Murmur2*, metrohash* (all but cmetro*), Crap8, discohash, beamsplitter, lookup3, fletcher4, fletcher2, all sanmayce FNV1a_ variants (FNV1a_YT, FNV1A_Pippip_Yurii, FNV1A_Totenschiff, ...), fibonacci.
The usual mitigation is to check the buffer alignment in the caller,
to provide a pre-processing loop for the misaligned prefix, or to
copy the whole buffer into a fresh aligned area.
Put that extra code inside #ifdef HAVE_ALIGNED_ACCESS_REQUIRED.
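As a rough illustration of the copy mitigation, the sketch below hashes through an aligned scratch array whenever the input pointer is misaligned. Everything in it is an assumption made for the example: toy_hash32 and toy_mix_words are hypothetical names, the mixer is a toy and not one of the hashes listed above, and the 64-word chunk size is arbitrary. The macro is defined locally only so that the guarded path gets compiled; a real build would set it from the build system.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Normally provided by the build system on strict-alignment targets.
   Defined here only so the guarded path of this sketch is compiled. */
#ifndef HAVE_ALIGNED_ACCESS_REQUIRED
#define HAVE_ALIGNED_ACCESS_REQUIRED 1
#endif

/* Hypothetical toy mixer over aligned 32-bit words; not a real hash. */
static uint32_t toy_mix_words(uint32_t h, const uint32_t *w, size_t nwords) {
    for (size_t i = 0; i < nwords; i++)
        h = (h ^ w[i]) * 16777619u;   /* unsigned arithmetic wraps, no UB */
    return h;
}

/* Mixes all full words of key, then the remaining len % 4 bytes one by one. */
uint32_t toy_hash32(const void *key, size_t len, uint32_t seed) {
    const uint8_t *p = (const uint8_t *)key;
    size_t nwords = len / 4;
    uint32_t h = seed;

#ifdef HAVE_ALIGNED_ACCESS_REQUIRED
    if (((uintptr_t)p % sizeof(uint32_t)) != 0) {
        /* Misaligned input: copy it chunk by chunk into an aligned array
           and only ever read words from that array. */
        uint32_t chunk[64];
        while (nwords > 0) {
            size_t n = nwords < 64 ? nwords : 64;
            memcpy(chunk, p, n * 4);
            h = toy_mix_words(h, chunk, n);
            p += n * 4;
            nwords -= n;
        }
    } else
#endif
    {
        /* Aligned input (or alignment not required): read words in place. */
        h = toy_mix_words(h, (const uint32_t *)(const void *)p, nwords);
        p += nwords * 4;
    }

    /* Tail bytes are read individually, never past key + len. */
    for (size_t i = 0; i < len % 4; i++)
        h = (h ^ p[i]) * 16777619u;
    return h;
}
```

With this shape the result does not depend on where the buffer happens to sit in memory, so the same bytes hash identically at an aligned and at a misaligned address.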
Some hash functions assume a padded input buffer that can be read past its length, up to the word size. This allows faster loop processing, since no second loop or switch table for the remaining bytes is needed, but it requires a cooperative calling environment and is therefore considered cheating. This is tested in the Sanity Tests.
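One possible way to handle the remaining bytes without relying on padding is to copy them into a zeroed word first. The helper below is a minimal sketch with a hypothetical name (load_tail32); it is not taken from any of the hashes mentioned here, and the value it returns depends on the byte order of the machine.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: loads the last rem bytes (0..3) starting at p into a
   zero-padded 32-bit word. It never reads memory at or beyond p + rem.
   The caller must guarantee rem <= 3. */
uint32_t load_tail32(const uint8_t *p, size_t rem) {
    uint32_t w = 0;
    memcpy(&w, p, rem);   /* copies exactly rem bytes, nothing more */
    return w;
}
```

A word-wise loop would call this once after the last full word, with rem set to len % 4, instead of reading a whole word across the end of the buffer.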
Signed integer overflow is a simple type error: the hash needs to use unsigned integer types internally to avoid undefined and inconsistent behaviour. For example, SuperFastHash reports: signed integer overflow: -2147483641 + -113 cannot be represented in type 'int'
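The fix can be sketched as follows, assuming the hash state is kept in a uint32_t and a possibly negative byte is added to it. The function name add_signed_byte is hypothetical and the code is not SuperFastHash itself; it only shows that the same addition is well defined once it is done in an unsigned type.

```c
#include <stdint.h>

/* Hypothetical helper: adds a signed byte to an unsigned 32-bit hash state.
   The byte is sign-extended and then converted to uint32_t, so the addition
   wraps modulo 2^32 instead of overflowing a signed int. */
uint32_t add_signed_byte(uint32_t h, signed char c) {
    return h + (uint32_t)(int32_t)c;
}
```

The operands from the ubsan report above map to h = 2147483655 (the bit pattern of -2147483641) and c = -113, which gives 2147483542 on every conforming compiler.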
Shifting a value by its full bit width or more is also undefined. Seen with FNV1A_Pippip_Yurii, FNV1A_Totenschiff, pair_multiply_shift, sumhash32: shift exponent 64 is too large for 64-bit type 'long unsigned int'
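A shift count of 64 on a 64-bit type can be avoided by checking or masking the count. The two helpers below are a sketch with hypothetical names (shr64_checked, rotl64_masked); which of them fits depends on whether the hash intended a plain shift or a rotation, and nothing here is taken from the hashes named above.

```c
#include <stdint.h>

/* Hypothetical helper: logical right shift that yields 0 for counts >= 64
   instead of invoking undefined behaviour. */
uint64_t shr64_checked(uint64_t x, unsigned n) {
    return n < 64 ? x >> n : 0;
}

/* Hypothetical helper: rotate left with the count reduced modulo 64, so
   neither of the two shifts ever uses a count of 64. */
uint64_t rotl64_masked(uint64_t x, unsigned r) {
    r &= 63;
    return (x << r) | (x >> ((64 - r) & 63));
}
```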