Saturday, January 2, 2010

Association Rules: An Exercise

An insurance company is planning to launch a new insurance product by packaging some of the add-ons it has been offering with its motor insurance policies. It has listed the following five add-ons for this exercise.

a) Depreciation waiver benefit
b) Daily cash allowance while the car is under repair
c) Cover for personal luggage loss
d) On road assistance to car and car passengers
e) Medical Extension.

The company's records show the following number of policies for each combination of add-ons.

f(a) = 489, f(b) = 823, f(c) = 56, f(d) = 372, f(e) = 649,
f(a,b) = 682, f(a,c) = 68, f(a,d) = 532, f(a,e) = 89, f(b,c) = 312, f(b,d) = 279, f(b,e) = 24, f(c,d) = 52, f(c,e) = 10, f(d,e) = 189,
f(a,b,c) = 227, f(a,b,d) = 295, f(a,b,e) = 378, f(a,c,d) = 1, f(a,c,e) = 10, f(a,d,e) = 512, f(b,c,d) = 2, f(b,c,e) = 13, f(b,d,e) = 187, f(c,d,e) = 10,
f(a,b,c,d) = 0, f(a,c,d,e) = 5, f(b,c,d,e) = 2, f(a,b,d,e) = 198, f(a,d,c,e) = 4,
f(a,b,c,d,e) = 85.

Here f(x,y,z) means the number of policies in which add-ons x, y and z have been taken and none of the remaining add-ons has been taken.
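Because each f value counts only policies with exactly that combination, the number of policies that include a given set of add-ons (possibly along with others) is the sum of f over every combination containing that set. A minimal sketch of that bookkeeping follows. The names RAW, EXACT and count_with are my own, and the entry for a,b,c,e uses the value 4, following the correction the author gives in the comments below.

```python
# Exclusive counts from the post: each key is the exact set of add-ons on a policy.
RAW = {
    "a": 489, "b": 823, "c": 56, "d": 372, "e": 649,
    "ab": 682, "ac": 68, "ad": 532, "ae": 89, "bc": 312,
    "bd": 279, "be": 24, "cd": 52, "ce": 10, "de": 189,
    "abc": 227, "abd": 295, "abe": 378, "acd": 1, "ace": 10,
    "ade": 512, "bcd": 2, "bce": 13, "bde": 187, "cde": 10,
    "abcd": 0, "acde": 5, "bcde": 2, "abde": 198,
    "abce": 4,  # listed as f(a,d,c,e) in the post; corrected per the author's comment
    "abcde": 85,
}

# Use frozensets as keys so that the order of letters does not matter.
EXACT = {frozenset(combo): n for combo, n in RAW.items()}


def count_with(addons):
    """Number of policies that include every add-on in `addons`, and possibly others."""
    wanted = frozenset(addons)
    return sum(n for combo, n in EXACT.items() if wanted <= combo)


if __name__ == "__main__":
    # Policies that include add-on a, alone or with anything else.
    print(count_with("a"))
    # Policies that include both a and b.
    print(count_with("ab"))
```

Calling count_with with an empty string sums every entry, which gives the total number of policies in the records.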

Work on these data to find some association rules. The association rules will be in the following form:

“A person opting for add-ons x,y is likely to opt for w as well”.

Let us call the first set, i.e. x,y, the current set and the second set, w, the associated set. Each set can contain any number of add-ons from a, b, c, d, e, but an add-on cannot be in both sets.

Let there be a threshold value and a threshold probability for accepting an association as an association rule. The threshold value is the minimum number of incidences of the current set, and the threshold probability is the minimum probability that someone who has taken the current-set add-ons also takes the associated-set add-ons.

For this exercise take the threshold value as 2000 and the threshold probability as 0.4.
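The two thresholds can be sketched as a small test, together with a loop that tries every current set against every disjoint associated set. This is only one way to organise the search. The names is_rule and find_rules are hypothetical, and the toy counts at the bottom are invented purely for illustration; they are not the company's records. Like the f values, the toy counts are exclusive: each key is the exact set of items taken.

```python
from itertools import combinations


def is_rule(n_current, n_both, min_count=2000, min_prob=0.4):
    """Apply both thresholds of the exercise to one candidate rule.

    n_current: policies that include the whole current set
    n_both:    policies that include the current set and the associated set
    """
    if n_current == 0 or n_current < min_count:
        return False
    return n_both / n_current >= min_prob


def find_rules(exact_counts, min_count, min_prob):
    """List (current set, associated set, probability) for every accepted rule.

    exact_counts maps a frozenset of items to the number of records with
    exactly those items (an exclusive count).
    """
    items = sorted(set().union(*exact_counts))

    def count_with(wanted):
        # Inclusive count: sum the exclusive counts of all supersets.
        return sum(n for combo, n in exact_counts.items() if wanted <= combo)

    rules = []
    for k in range(1, len(items)):
        for current in map(frozenset, combinations(items, k)):
            n_current = count_with(current)
            rest = [i for i in items if i not in current]
            for m in range(1, len(rest) + 1):
                for assoc in map(frozenset, combinations(rest, m)):
                    n_both = count_with(current | assoc)
                    if is_rule(n_current, n_both, min_count, min_prob):
                        rules.append(
                            (sorted(current), sorted(assoc), n_both / n_current)
                        )
    return rules


# Hypothetical toy data (NOT from the post), with much smaller thresholds.
toy = {
    frozenset("x"): 10,
    frozenset("y"): 5,
    frozenset("z"): 55,
    frozenset("xy"): 30,
}
toy_rules = find_rules(toy, min_count=30, min_prob=0.5)

if __name__ == "__main__":
    for current, assoc, prob in toy_rules:
        print(current, "->", assoc, round(prob, 3))
```

The search is exhaustive, which is fine for five add-ons but grows quickly with more items. For the exercise itself, the same function could be given the f values converted to frozenset keys, with the thresholds 2000 and 0.4.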

If you need any clarification, please ask in a comment on this post.

4 comments:

Unknown said...

Dear sir, can you solve a problem with any one value so that we can understand it easily? How would the information regarding the threshold value be used? I am not very confident about it. Please clarify.

Girijesh said...

Suppose add-on x has been taken in 2800 policies, and in 1500 of those add-on z has also been taken. Then 'A person opting for x is likely to opt for z also' is a valid association rule in this exercise, because 2800 meets the threshold value of 2000 and 1500/2800 ≈ 0.54 meets the threshold probability of 0.4. If the numbers are 1900 and 1200 instead, the association is not taken as a rule, because 1900 is below the threshold value. Values of 6000 and 2100 do not give a valid rule either, because 2100/6000 = 0.35 is below the threshold probability.

If you are not confident, you have to practice harder and apply yourself more.

M. Mangal said...

Dear sir,
There may be a discrepancy in the given data. Among the combinations of 4 add-ons, both f(a,c,d,e)=5 and f(a,d,c,e)=4 are given, but these contain the same elements. Also, f(a,b,c,e) is missing.
Kindly throw some light on that.

Girijesh said...

Thanks. This is a typing error. f(a,d,c,e) should be f(a,b,c,e). So, you get the concept.