Sunday, March 19, 2006

Hash Tables

Couple of days back, I got into a discussion about hash tables with one of my friends. Initially we were confused about what good it is over an array. Before explaining the confusion, I'll describe an example. (You know I strongly believe what my sixth standard teacher told me once: "Eg makes it easy.")

Suppose we want to store a table containing roll numbers and names of students:
(1, Pundarikakshya Purakayastha)
(2, Prasenjit Sarvadhikari)
(... well I think it is better to shorten the names ...)
(3, Arun)
(4, Varun)
(5, Darth Vader)

Okay we can continue with this. Well we can store this data in an array containing five elements and store the ith entry in a[i-1]. That would be good enough. We were confused what is the point of defining something like a hash table. Well we realised the point is we don't construct hash tables for such data. To give a real example, hash tables are used to handle data like path names of Linux executables indexed by the command names, like (just as an example)
(ls, /bin/ls)
(rm, /bin/rm)
(shutdown, /sbin/shutdown)
(vi, /usr/bin/vi)
(acroread, /usr/local/bin/acroread)
... and so on

Suppose there are 1355 commands and if we put the whole data in a 2 x 1355 array, it might be a time consuming job to find the path for a command you want. First you have to find the position of the command in the array and then check the corresponding path. If the command lies towards the end of the array, it will take some time to find it out.

So the way a hash table solves the problem is that it uses a hash function to convert the command name into a number and then store the path in a one dimensional array at the corresponding place. Like if, for example, for a suitable hash function, "ls" converts to 6, the computer sets a[5] = "/bin/ls". And so on. So when the computer has to retrieve the path for ls, it calculates the value of the hash function at "ls" and gets 6 as it's value and then gets the path at a[5] to be "/bin/ls". That works much faster than searching an array for "ls".

Now, as usual, life is not all that fair and it's very hard to construct hash functions which are "good" in the sense that they do not give the same number for two different commands and at the same time the numbers it produces are not too huge compared to the total number of commands. Well if one is interested how one solves those problems, one can read the article on Hash Tables in Wikipedia or some advanced book on algorithms.

Tuesday, March 14, 2006

Memories from my mathematical childhood

The reason for writing such a strange blog is the sudden realisation that sometimes we don't give enough importance to the things we have around us. I don't want to get too philosophical but all this started when I was looking for a book which I treasured a lot in my schooldays. My father is a great admirer of Mathematics and used to get me books on Mathematics and mathematical puzzles. I used to like them. One of them which was my favourite at that time was by Ya. I. Perelman. It was titled "Mathematics can be Fun". There was another book by him: "Physics can be Fun" which also I liked. Now I don't know where they are. I just searched Google for the book. Got an Amazon link. "Mathematics can be Fun" is out of print.

I spent a lot of time with the book. I specially remember and liked the discussion on Diophantine equations and pythagorean triples. I was fascinated by the fact that, for three integers a,b and c, whenever a² + b² = c², there exists integers m and n such that

a = m² - n²,
b = 2mn,
c = m² + n².

The fact that one can logically argue to get these forms really fascinated me in my high school. I also liked the section on maxima and minima done with just some basic knowledge in high school algebra.

Anyway, I really liked the book and it's really sad that Mir had to stop publishing.

Monday, March 13, 2006

My first blog

I had been thinking of blogging for quite some time. The reason for blogging is completely personal. I'd like to put up some random stuff so that I don't lose them and so that I can share them with others more easily. Well the stuff will be mostly basic math trivia and computer related stuff though I don't really want to restrict myself about what I want to put in this blog right now. For example I might put up a few poems too. Or movie reviews. Anything. But then this is meant to be a serious blog and will be boring to most people.

Read at your own risk! :).

To the few friends who were asking me to blog for quite some time: Well finally, I'm here!