Tuesday, 13 December 2011

SQL - Parse an XML array of GUIDs

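-- Drop any previous version so the script can be re-run.
IF OBJECT_ID('ParseList', 'P') IS NOT NULL
    DROP PROCEDURE ParseList
GO

CREATE PROCEDURE ParseList @list xml AS
BEGIN
    -- Shred each guid element under ArrayOfGuid into a table variable.
    DECLARE @Ids TABLE (Id uniqueidentifier)
    INSERT INTO @Ids (Id)
    SELECT T.Ids.value('.', 'uniqueidentifier')
    FROM @list.nodes('/ArrayOfGuid/guid') AS T(Ids)

    SELECT Id FROM @Ids
END
GO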

EXEC ParseList
N'<ArrayOfGuid>
        <guid>699f9527-1f9d-4a00-937b-a7637b0a8c03</guid>
        <guid>f766e5fb-27a7-4171-bf0c-c0e093baed27</guid>
        <guid>c22b2aa6-541e-495f-b315-233c0ed0e7a7</guid>
        <guid>ccf83d1e-5809-4214-a5a9-416ce22e62ca</guid>
     </ArrayOfGuid>
'
GO
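
The /ArrayOfGuid/guid shape is what .NET's XmlSerializer produces for a Guid[]. A minimal sketch of building the parameter on the C# side, assuming the caller is a .NET application; the class name GuidListXml and the method ToXml are hypothetical, not part of the procedure above:

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class GuidListXml
{
    // Serializes the ids to XML with an ArrayOfGuid root and one guid child per id.
    public static string ToXml(Guid[] ids)
    {
        var serializer = new XmlSerializer(typeof(Guid[]));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, ids);
            return writer.ToString();
        }
    }
}
```

The resulting string can then be passed as the @list parameter of ParseList.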

Tuesday, 6 December 2011

GetHashCode

We basically use hash functions for data integrity, faster lookup, etc.
Imagine doing a 'foreach' to look up an item in a List of hundreds of thousands of items; it will certainly hamper performance. One alternative is to use hashes to split the List into something called _buckets_. Essentially, it means List<T>[] instead of List<T>, so that a lookup only scans the one bucket selected by the item's hash. Eric Lippert's article explains this and lists all the guidelines for using GetHashCode, which I try to summarize below.
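
To make the bucket idea concrete, here is a toy sketch of a set built on an array of lists. BucketSet and its members are hypothetical names for illustration; the real Dictionary and HashSet are considerably more refined.

```csharp
using System;
using System.Collections.Generic;

// A toy hash set: an array of lists ("buckets"), not a production collection.
public class BucketSet<T>
{
    private readonly List<T>[] buckets;

    public BucketSet(int bucketCount)
    {
        buckets = new List<T>[bucketCount];
        for (int i = 0; i < bucketCount; i++)
            buckets[i] = new List<T>();
    }

    // The hash code picks the bucket; masking the sign bit keeps the index non-negative.
    private List<T> BucketFor(T item)
    {
        int hash = item == null ? 0 : item.GetHashCode();
        return buckets[(hash & 0x7FFFFFFF) % buckets.Length];
    }

    public void Add(T item)
    {
        List<T> bucket = BucketFor(item);
        if (!bucket.Contains(item))
            bucket.Add(item);
    }

    // Only one bucket is scanned, instead of every stored item.
    public bool Contains(T item)
    {
        return BucketFor(item).Contains(item);
    }
}
```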

1. Equal items should have equal hash.

class Employee
{
      public string Name { get; set; }
}

Employee suresh1 = new Employee { Name = "Suresh" };
Employee suresh2 = new Employee { Name = "Suresh" };

Ideally suresh1.GetHashCode() == suresh2.GetHashCode()
[By default this is not the case: Employee inherits Object.GetHashCode(), which is based on object identity rather than on Name, so two separate instances will generally get different hashes. The point here is that we need to override GetHashCode().]

If equal items did not return the same hash, ContainsKey would fail, because it would look in a different bucket from the one in which the item was originally stored.
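
A minimal sketch of Rule 1, assuming that two employees count as equal when their names match. Unlike the snippet above, Name is set only through a constructor here so that the hash cannot change later; EmployeeLookupDemo is a hypothetical helper for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Employee
{
    public string Name { get; private set; }

    public Employee(string name) { Name = name; }

    // Two employees are equal when their names are equal.
    public override bool Equals(object obj)
    {
        var other = obj as Employee;
        return other != null && Name == other.Name;
    }

    // Equal names give equal hashes, so equal employees land in the same bucket.
    public override int GetHashCode()
    {
        return Name == null ? 0 : Name.GetHashCode();
    }
}

public static class EmployeeLookupDemo
{
    // Stores one instance and looks it up through a second, equal instance.
    public static bool FindsEqualKey()
    {
        var salaries = new Dictionary<Employee, int>();
        salaries[new Employee("Suresh")] = 1000;
        return salaries.ContainsKey(new Employee("Suresh"));
    }
}
```

With both overrides in place the lookup succeeds; with only the default implementations it would not.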

2. GetHashCode on a type should be based on its immutable fields, so that its value never changes, if possible. At the very least, it should not change while the object is contained in a hash-based data structure such as Dictionary/HashSet/Hashtable.

3. Should never throw an exception.

4. Should be really fast.

5. Hash codes should be spread evenly across the buckets, as if random. If logically related items all land in the same bucket, lookups in that bucket degrade towards a linear scan while the other buckets are never used.

6. It is often suggested to combine field hashes by multiplying with prime numbers, rather than with XOR, when calculating a hash code.

unchecked // let the arithmetic overflow and wrap around instead of throwing
{
    int hash = 13;
    hash = (hash * 7) + field1.GetHashCode();
    hash = (hash * 7) + field2.GetHashCode();
    ...
    return hash;
}
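
One reason the multiply-and-add scheme above tends to be preferred over XOR: XOR gives the same result when two fields swap values, and gives 0 whenever both fields are equal. A small sketch, where HashCombine is a hypothetical helper and the two int arguments stand in for the field hash codes:

```csharp
public static class HashCombine
{
    // XOR is symmetric: Xor(a, b) == Xor(b, a), and Xor(a, a) == 0.
    public static int Xor(int a, int b)
    {
        return a ^ b;
    }

    // Same scheme as above: the order of the fields affects the result.
    public static int MultiplyAdd(int a, int b)
    {
        unchecked
        {
            int hash = 13;
            hash = (hash * 7) + a;
            hash = (hash * 7) + b;
            return hash;
        }
    }
}
```

For example, Xor(1, 2) and Xor(2, 1) are both 3, while MultiplyAdd(1, 2) is 646 and MultiplyAdd(2, 1) is 652.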
 

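7. Hash Collision - Equal items have the same hash code (Rule 1), but the reverse does not hold: two items with the same hash code are not necessarily equal. Hence we also need to override Equals() or supply an IEqualityComparer<T>, so that items sharing a bucket can be told apart.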
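
To see that a collision does not imply equality, here is a sketch with a deliberately bad comparer that gives every string the same hash code. CollidingComparer and CollisionDemo are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Every string collides (same hash), so only Equals can tell items apart.
public class CollidingComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Equals(x, y, StringComparison.Ordinal);
    }

    public int GetHashCode(string s)
    {
        return 1;
    }
}

public static class CollisionDemo
{
    // Counts the distinct names according to CollidingComparer.
    public static int CountDistinct(params string[] names)
    {
        var set = new HashSet<string>(new CollidingComparer());
        foreach (string name in names)
            set.Add(name);
        return set.Count;
    }
}
```

CountDistinct("Suresh", "Ramesh", "Suresh") returns 2: the set stays correct because Equals resolves the collisions, but every lookup now scans a single crowded bucket, which is the performance problem described in Rule 5.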

Some useful links

http://blogs.msdn.com/b/ericlippert/archive/2010/03/22/socks-birthdays-and-hash-collisions.aspx
http://msdn.microsoft.com/en-us/library/ms379571%28VS.80%29.aspx#datastructures20_2_topic5
http://stackoverflow.com/questions/371328/why-is-it-important-to-override-gethashcode-when-equals-method-is-overriden-in-c
http://stackoverflow.com/questions/263400/what-is-the-best-algorithm-for-an-overridden-system-object-gethashcode#263416
http://stackoverflow.com/questions/638761/c-gethashcode-override-of-object-containing-generic-array/639098#639098

Readonly vs Thread safe

Interesting article from Eric: basically, we should not assume that read-only operations are thread safe unless that is documented. In the article's example, a read-only operation mutates the structure for performance.

Friday, 2 December 2011

Primitive Types, Value Types and Reference Types

Just for my reference, good article

Signalling Threads

I am just trying to summarize what I have learned over a period of time regarding thread signalling.

To put it in simple words, a thread might signal something that is of interest to other threads. For example, an Order Thread (OT) might signal that it has created an order in the database, so the Email Thread (ET) can pull details such as the total and the email address to mail a _Confirmation_ to the customer. A Report Thread (RT) might also be interested (to create something called _Order Total Today_). I am not going to discuss why this needs to be done asynchronously (otherwise there would be no point in writing this article :)).

In .NET there are various ways in which we can do thread signalling, but we can broadly categorize them into two types:

1. Thread signalling within a process
2. Thread signalling across processes

AutoResetEvent and ManualResetEvent belong to Category 1
EventWaitHandle, Semaphore and Mutex belong to Category 2

Let us take a moment to distinguish thread signalling from thread synchronization. Thread synchronization allows serialized access to a shared resource; lock, Monitor, ReaderWriterLock, ReaderWriterLockSlim and Interlocked are such examples. We can even use thread synchronization primitives for signalling (in fact they have to do some kind of signalling internally, but the thread signalling classes provide better APIs). That said, thread signalling is itself a kind of thread synchronization: in my order example, the signal coordinates when ET and RT access the database.

With respect to thread signalling, when a thread signals, it means 'I am done'. We can say a thread's flowchart looks like Wait->Lock->Act->Signal. All the signalling classes help us with the _last step_, depending upon the scenario.

EventWaitHandle - Threads call WaitOne() to wait for a signal, and a thread calls Set() to signal that it is done. If more than one thread has called WaitOne(), will they all be able to proceed? Yes in the case of ManualResetEvent, no in the case of AutoResetEvent, which releases a single waiting thread per Set(). If we don't want all the threads to proceed, we need to call Reset() (which happens automatically for AutoResetEvent).
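
The difference can be observed on a single thread by polling with WaitOne(0), which tests the state without blocking. A minimal sketch; ResetEventDemo is a hypothetical name for illustration:

```csharp
using System.Threading;

public static class ResetEventDemo
{
    // AutoResetEvent: the first successful wait consumes the signal.
    public static bool[] AutoReset()
    {
        using (var gate = new AutoResetEvent(false))
        {
            gate.Set();
            bool first = gate.WaitOne(0);   // signaled, then resets automatically
            bool second = gate.WaitOne(0);  // no longer signaled
            return new[] { first, second };
        }
    }

    // ManualResetEvent: stays signaled until Reset() is called.
    public static bool[] ManualReset()
    {
        using (var gate = new ManualResetEvent(false))
        {
            gate.Set();
            bool first = gate.WaitOne(0);
            bool second = gate.WaitOne(0);
            gate.Reset();
            bool afterReset = gate.WaitOne(0);
            return new[] { first, second, afterReset };
        }
    }
}
```

AutoReset() returns { true, false } and ManualReset() returns { true, true, false }.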

What else is the difference between Manual and Auto? (copied)

It's like the difference between a tollbooth and a door. The ManualResetEvent is the door, which needs to be closed (reset). The AutoResetEvent is a tollbooth, allowing one car to go by and automatically closing before the next one can get through.

Note : Signaled State - threads can proceed further
           Non-signaled State - threads are blocked
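
The order scenario from the beginning of this post can be sketched with a ManualResetEvent, since both the email and the report thread should proceed once the order exists. This is only an illustration: OrderSignalDemo, the message strings and the total of 100 are made up, and the calling thread plays the role of the Order Thread.

```csharp
using System.Threading;

public static class OrderSignalDemo
{
    public static string[] Run()
    {
        var results = new string[2];
        int orderTotal = 0;

        using (var orderCreated = new ManualResetEvent(false))
        {
            // Email Thread (ET): blocked until the order has been created.
            var emailThread = new Thread(() =>
            {
                orderCreated.WaitOne();
                results[0] = "Confirmation mailed for total " + orderTotal;
            });

            // Report Thread (RT): waits on the same event.
            var reportThread = new Thread(() =>
            {
                orderCreated.WaitOne();
                results[1] = "Order Total Today updated with " + orderTotal;
            });

            emailThread.Start();
            reportThread.Start();

            // Order Thread (OT): act, then signal 'I am done'.
            orderTotal = 100;
            orderCreated.Set();

            emailThread.Join();
            reportThread.Join();
        }

        return results;
    }
}
```

With an AutoResetEvent in its place, a single Set() would release only one of the two waiting threads.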

EventWaitHandle, Semaphore and Mutex can all create named wait handles, which can therefore be accessed across processes. To open an existing named wait handle, we call OpenExisting().

Semaphore(from Wikipedia) - 

Suppose a library has 10 identical study rooms, intended to be used by one student at a time. To prevent disputes, students must request a room from the front counter if they wish to make use of a study room. When a student has finished using a room, the student must return to the counter and indicate that one room has become free. If no rooms are free, students wait at the counter until someone relinquishes a room.
The librarian at the front desk does not keep track of which room is occupied, only the number of free rooms available. When a student requests a room, the librarian decreases this number. When a student releases a room, the librarian increases this number. Once access to a room is granted, the room can be used for as long as desired, and so it is not possible to book rooms ahead of time.

So here we can set the initial count (number of free rooms) and the maximum number of rooms using 'new Semaphore(10, 10)'. Students request a room using WaitOne() and vacate it using Release(). So 'new Semaphore(10, 10)' means the first 10 students will be allocated rooms immediately, and any further student has to wait until a room is vacated.
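
A single-threaded sketch of the library, assuming all 10 rooms start out free (an initial count of 10 and a maximum of 10). WaitOne(0) is used so that a student who finds no free room is turned away instead of blocking; StudyRoomsDemo is a hypothetical name for illustration.

```csharp
using System.Threading;

public static class StudyRoomsDemo
{
    // Returns { students admitted out of 12, students admitted after one room is freed }.
    public static int[] Run()
    {
        using (var rooms = new Semaphore(10, 10))
        {
            int admitted = 0;
            for (int student = 0; student < 12; student++)
            {
                // Each successful wait takes one free room.
                if (rooms.WaitOne(0))
                    admitted++;
            }

            // One student returns to the counter and frees a room.
            rooms.Release();

            int admittedAfterRelease = rooms.WaitOne(0) ? 1 : 0;
            return new[] { admitted, admittedAfterRelease };
        }
    }
}
```

Run() returns { 10, 1 }: ten students get rooms immediately, two are turned away, and one more gets in after a room is released.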

Mutex - Similar to other thread synchronization techniques such as lock/Monitor. WaitOne() and ReleaseMutex() are used to acquire and release it. An important point is 'AbandonedMutexException': when a thread exits or is killed abruptly without releasing the Mutex, the next thread that acquires it through 'WaitOne()' gets this exception. So it is better to have a try/catch for this exception when dealing with Mutex.
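
A minimal sketch of that try/catch, along the lines of the pattern in the Stack Overflow answer linked in the next post. MutexGuard is a hypothetical helper; the key point is that the thread catching AbandonedMutexException does own the mutex afterwards and must still release it.

```csharp
using System;
using System.Threading;

public static class MutexGuard
{
    // Runs 'action' while holding 'mutex'.
    // Returns true if the previous owner abandoned the mutex.
    public static bool Run(Mutex mutex, Action action)
    {
        bool wasAbandoned = false;
        try
        {
            mutex.WaitOne();
        }
        catch (AbandonedMutexException)
        {
            // We own the mutex now, but whatever it protects
            // may have been left in an inconsistent state.
            wasAbandoned = true;
        }

        try
        {
            action();
        }
        finally
        {
            mutex.ReleaseMutex();
        }
        return wasAbandoned;
    }
}
```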

Before I close, I want to touch on 'immutability', which is an important concept in thread synchronization. An immutable class is one whose state cannot be changed once an instance is created. This allows us to share a reference to the object across threads without worry, as no thread can change the state of the object.
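
A small sketch of an immutable class, using a made-up OrderSummary type for the order example: all fields are readonly and a 'change' produces a new instance rather than modifying the shared one.

```csharp
public sealed class OrderSummary
{
    private readonly string email;
    private readonly int total;

    public OrderSummary(string email, int total)
    {
        this.email = email;
        this.total = total;
    }

    public string Email { get { return email; } }
    public int Total { get { return total; } }

    // Returns a new object; the instance other threads hold is untouched.
    public OrderSummary WithTotal(int newTotal)
    {
        return new OrderSummary(email, newTotal);
    }
}
```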

Hope this is of some use.

Thursday, 1 December 2011

Mutex - Good pattern?

Just for my reference

http://stackoverflow.com/questions/229565/what-is-a-good-pattern-for-using-a-global-mutex-in-c/229567#229567

The link below is about creating a single-instance application (referencing the mutex in a later call such as 'ReleaseMutex' should keep the garbage collector from collecting it early)
http://stackoverflow.com/questions/777744/how-can-i-check-for-a-running-process-per-user-session