Imagine doing a 'foreach' to look up an item in a hundred of thousands of List
1. Equal items should have equal hash.
Class Employee
{
Public int Name { get; set; }
}
Employee suresh1 = new Employee { Name = "Suresh" };
Employee suresh2 = new Employee { Name = "Suresh" };
Ideally suresh1.GetHashCode() == suresh2.GetHashCode()
[Normally this is not the case, because String.GetHashCode will return different hash for same value, but the point here is we need to modify the GetHashCode()]
if this didn't return same value, ContainsKey will fail due to look up on different bucket than it is originally stored.
2. GetHashCode on a type should be based on its immutable fields or it's value should never change, if possible. At least it should not change while it is contained on a data structure based on hashing like Dictionary/HashSet/HashTable etc.
3. Should never throw exception.
4. Should be really fast.
5. Logically related records are physically stored together, but randomly. If we are going to store clustered data in a same bucket, it will affect performance as other buckets will never be used. So this needs to be random.
6. It is often suggested to use Big prime numbers while calculating Prime numbers instead of XOR.
int hash = 13;
hash = (hash * 7) + field1.GetHashCode();
hash = (hash * 7) + field2.GethashCode();
...
return hash;
7. Hash Collision - If two items have same hash code, it means they are equal(Rule 1), but not always. Hence we need to override equals() or IEqualityComparer
Some useful stuffs
http://blogs.msdn.com/b/ericlippert/archive/2010/03/22/socks-birthdays-and-hash-collisions.aspx
http://msdn.microsoft.com/en-us/library/ms379571%28VS.80%29.aspx#datastructures20_2_topic5
http://stackoverflow.com/questions/371328/why-is-it-important-to-override-gethashcode-when-equals-method-is-overriden-in-c
http://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode#263416
http://stackoverflow.com/questions/638761/c-gethashcode-override-of-object-containing-generic-array/639098#639098
No comments:
Post a Comment