
Why does Java use a mediocre hashCode implementation for strings?

  • The Java hashCode() implementation for strings (and arrays of primitive types) is quite simple:

        int h = 0;
        for (int i = 0; i < input.length(); i++) {
            h = 31 * h + input.charAt(i);
        }

    This hash function isn't particularly good, especially in the higher bits [1]. There are much better string hash functions (e.g. Jenkins hash, FNV hash [2]) that are roughly equally fast to execute but have a better distribution. Why doesn't Java implement a more elaborate hashing scheme?

    [1] http://www.javamex.com/tutorials/collections/hash_function_technical_2.shtml
    [2] http://eternallyconfuzzled.com/tuts/algorithms/jsw_tut_hashing.aspx

  • Answer:

    Referring to the documentation of Object's hashCode: "This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable."

    http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/lang/Object.java#Object.hashCode%28%29

    The Java language designers added hashCode primarily for hash-based collections such as Hashtable and HashMap. String is no different. If you look at HashMap, you will notice that it uses a bitwise AND to assign objects to buckets; essentially, the low-order bits determine the bucket index.

        /**
         * Returns index for hash code h.
         */
        static int indexFor(int h, int length) {
            return h & (length-1);
        }

    From your school days, you may recall using

        myString.hashCode() % numOfBuckets

    to assign an object to a bucket. In Java's HashMap (starting in 1.4, I believe), Josh Bloch et al. changed this to

        myString.hashCode() & (numOfBuckets - 1)

    This is much cheaper in terms of CPU, as modulo (division) is more expensive than a bitwise AND or a bit shift. The mask can stand in for the modulo only because HashMap keeps its table length at a power of two. As long as the low-order bits of myString.hashCode() are random, this algorithm gives a uniform object-to-bucket distribution.

    However, since anyone can override hashCode for an object, Josh et al. needed to provide some safeguards. This is accomplished by a supplemental hash method (in the HashMap class).

        /**
         * Applies a supplemental hash function to a given hashCode, which
         * defends against poor quality hash functions.  This is critical
         * because HashMap uses power-of-two length hash tables, that
         * otherwise encounter collisions for hashCodes that do not differ
         * in lower bits. Note: Null keys always map to hash 0, thus index 0.
         */
        static int hash(int h) {
            // This function ensures that hashCodes that differ only by
            // constant multiples at each bit position have a bounded
            // number of collisions (approximately 8 at default load factor).
            h ^= (h >>> 20) ^ (h >>> 12);
            return h ^ (h >>> 7) ^ (h >>> 4);
        }

    To summarize, a key aim of Java's hashCode was to support hash-based collections. Hash-based collections like HashMap compensate for some bit bias and for other issues caused by poor hash function implementations, so a mediocre hashCode implementation in String is not a terribly big deal.
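To see the effect described in this answer, here is a minimal sketch that reuses the two HashMap snippets quoted above (from the JDK 6-era source; later JDKs may differ). The class name BucketSketch, the stringHash helper, and the sample values are hypothetical and chosen only for illustration. It feeds two hash codes that differ only in their high bits through the bucket mask, with and without the supplemental hash.

```java
// Illustrative sketch, not JDK code: BucketSketch and stringHash are
// hypothetical names; hash() and indexFor() copy the snippets quoted above.
class BucketSketch {

    // The 31-multiplier loop from the question.
    static int stringHash(String input) {
        int h = 0;
        for (int i = 0; i < input.length(); i++) {
            h = 31 * h + input.charAt(i);
        }
        return h;
    }

    // Supplemental hash as quoted from HashMap: folds high bits into low bits.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // Bucket index as quoted from HashMap; length must be a power of two.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        int buckets = 16;

        // Two hash codes that differ only above the low four bits.
        int a = 0x10000;
        int b = 0x20000;

        // Without the supplemental hash both land in bucket 0.
        System.out.println(indexFor(a, buckets) + " and " + indexFor(b, buckets)); // 0 and 0

        // With it, the high bits influence the index and the keys separate.
        System.out.println(indexFor(hash(a), buckets) + " and " + indexFor(hash(b), buckets)); // 1 and 2

        // For a power-of-two length the mask matches a non-negative modulo.
        int h = stringHash("hello");
        System.out.println(Math.floorMod(h, buckets) == indexFor(h, buckets)); // true
    }
}
```

The sketch uses Math.floorMod rather than % for the comparison because % can return a negative result for a negative hash code, while the mask never does.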

Siddharth Anand at Quora


Other answers

Because it's cheap. If you need something more sophisticated, then you probably won't be able to use any other default function either (and any default should also be cheap). There is a reason why hashCode() is easily overridable. Even if it is tempting, changing it would break any applications with stored hashes.

Toby Thain

Compatibility, mostly. It was a lot worse in JDK 1.0 and 1.1, where it would only sample some of the characters of longer strings. Later JDK versions multiply the running hash by an odd prime (31) and add each character in sequence, which reduces collisions. It seems like a decent tradeoff between speed and collision reduction.

Chris Longo

The hash code algorithm is specified in the documentation for String, so they are "not allowed" to change it (doing so would break backwards compatibility). Some people may have written programs that depend on this specific algorithm.
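The String documentation gives the hash as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], computed in int arithmetic. As a rough illustration of what "depending on this specific algorithm" can mean, the sketch below evaluates that formula term by term and compares it with String.hashCode(). The class name DocumentedHashSketch and the method documentedHash are hypothetical.

```java
// Illustrative sketch: DocumentedHashSketch and documentedHash are
// hypothetical names, not part of the JDK.
class DocumentedHashSketch {

    // Evaluates s[0]*31^(n-1) + ... + s[n-1] term by term, relying on
    // int overflow wrapping just as the 31 * h + c loop does.
    static int documentedHash(String s) {
        int n = s.length();
        int sum = 0;
        for (int i = 0; i < n; i++) {
            int power = 1;
            for (int k = 0; k < n - 1 - i; k++) {
                power *= 31;
            }
            sum += s.charAt(i) * power;
        }
        return sum;
    }

    public static void main(String[] args) {
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        System.out.println("Aa".hashCode()); // 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112, a well-known collision
        System.out.println("BB".hashCode()); // 2112
        // The hand-rolled formula agrees with the library method.
        System.out.println(documentedHash("hashCode") == "hashCode".hashCode()); // true
    }
}
```

Because these values are fixed by the specification, code that persists them or hard-codes them (for example in tests or on-disk indexes) would break if the algorithm changed.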

Jonathan Paulson
