What is Hash Table, how to create a hash table?

How do I create my own Hash Table implementation in Python?

  • I was thinking of a basic implementation where I could use a small array of some size, but what seems to be most problematic is the collision resolution part. I was thinking of using BST/Linked lists to place items with same hash(index), but conflict resolution seems hard. Can anyone show how it can be implemented?

  • Answer:

    Use a list of lists. Preliminaries A hash table maps a possibly infinite domain to a finite output range. To map a set of infinite inputs to a set of finite outputs, we use hash functions. For this demonstration we use a simple hash function, using the modulus operator such that h(x)=xmodkh(x)=xmodk h(x) = x\mod{k} where k is the table size. where the RHS denotes the index of the table in which we insert the record xx x We could use more exotic hash functions but for our purposes the mod function should suffice. The hash table should be engineered such that each slot in the table is equally likely to be picked by the hash function (to avoid collision) and to have a sub linear time complexity. P(X=index)=1kP(X=index)=1k P(X = index) = \dfrac{1}{k} where k is the table size. Crude Approach First, we shall demonstrate a crude approach and improvise it as and when we face difficulties. We are going to work only with integers as inputs in this demo. For now, I'm going to use a single python list as a table. The mod function with map the input to a specific index in the list. We'll use a table size of 10 for the time being. #Initialize a table of zeros. 10 is the table size. table = [0] * 10 def hash_function(x): return x % 10 Let's insert a few values. def insert(table,input,value): table[hash_function(input)] = value insert(table,41,'apple') insert(table,93,'banana') Our table as of now looks like this Now lets insert 13. insert(table,13,'tangerine') Now what do we do? There are several ways to resolve collisions, such as chaining (we are going to use this), open addressing and so on. Second Approach with chaining Now, instead of only a list as an output table, we use a list of lists. Every list inside this master list is a sort of a bucket in which we insert (key,value) pairs. table = [[] for x in range(10)] In other languages, you could use a linked list for this operation. Visually this looks like a linked list. Our hash function remains the same. We change the insert function as follows def insert(table,input,value): table[hash_function(input)].append((input,value)) Let's insert a few values. insert(table,41,'apple') insert(table,93,'banana') insert(table,13,'tangerine') Our table looks like this now Thus, we have implemented a rudimentary but working hash table. Note that to retrieve a value, we find the index, traverse the list at the index, match the input value to the key of that tuple and decide if it is present. If we reach the end of the list, the given value is not present. Extra If there are too many inputs, the load factor (number of inputs divided by table size) becomes too large, too many inputs will be inserted into one index and the table will become slow, that is, the time complexity will increase. We aim for a O(1)O(1) O(1) time complexity for a table. Load factor α=nkα=nk \alpha = \dfrac{n}{k} should be low. nnn is the number of inputs and kkk is the table size. If the number of inputs exceeds a particular threshold, we use techniques like table doubling to reduce the load factor.

Aditya N. Joshi at Quora Visit the source

Was this solution helpful to you?

Other answers

You can consider the approach developed in the website attached.http://interactivepython.org/runestone/static/pythonds/SortSearch/Hashing.html

Steven Huang

Related Q & A:

Just Added Q & A:

Find solution

For every problem there is a solution! Proved by Solucija.

  • Got an issue and looking for advice?

  • Ask Solucija to search every corner of the Web for help.

  • Get workable solutions and helpful tips in a moment.

Just ask Solucija about an issue you face and immediately get a list of ready solutions, answers and tips from other Internet users. We always provide the most suitable and complete answer to your question at the top, along with a few good alternatives below.