# Exercises

Hello everybody, today let's exercise our knowledge!

Q1.

Calculate the following lift values for the table correlating Burgers and Chips below:

• LIFT (Burgers, Chips)
• LIFT (Burgers, ^Chips)
• LIFT (^Burgers, Chips)
• LIFT (^Burgers, ^Chips)

| | Chips | ^Chips | Total Row |
|---|---|---|---|
| Burgers | 600 | 400 | 1000 |
| ^Burgers | 200 | 200 | 400 |
| Total Column | 800 | 600 | 1400 |

LIFT (Burgers, Chips)

s(Burgers ∪ Chips) = 600/1400 = 0.429

s(Burgers) = 1000/1400 = 0.714

s(Chips) = 800/1400 = 0.571

LIFT (Burgers, Chips) = (600/1400)/((1000/1400)*(800/1400)) = 1.05

LIFT (Burgers, Chips) > 1

My answer suggests that Burgers and Chips are positively correlated.

LIFT (Burgers, ^Chips)

s(Burgers ∪ ^Chips) = 400/1400 = 0.286

s(Burgers) = 1000/1400 = 0.714

s(^Chips) = 600/1400 = 0.429

LIFT (Burgers, ^Chips) = (400/1400)/((1000/1400)*(600/1400)) = 0.933

LIFT (Burgers, ^Chips) < 1

My answer suggests that Burgers and ^Chips are negatively correlated.

LIFT (^Burgers, Chips)

s(^Burgers ∪ Chips) = 200/1400 = 0.143

s(^Burgers) = 400/1400 = 0.286

s(Chips) = 800/1400 = 0.571

LIFT (^Burgers, Chips) = (200/1400)/((400/1400)*(800/1400)) = 0.875

LIFT (^Burgers, Chips) < 1

My answer suggests that ^Burgers and Chips are negatively correlated.

LIFT (^Burgers, ^Chips)

s(^Burgers ∪ ^Chips) = 200/1400 = 0.143

s(^Burgers) = 400/1400 = 0.286

s(^Chips) = 600/1400 = 0.429

LIFT (^Burgers, ^Chips) = (200/1400)/((400/1400)*(600/1400)) = 1.167

LIFT (^Burgers, ^Chips) > 1

My answer suggests that ^Burgers and ^Chips are positively correlated.

Overall, Burgers and Chips are positively correlated.
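
The four lift values above can be cross-checked with a short script. This is only a minimal sketch: the function name `lift_table` and the labels `A` (Burgers) and `B` (Chips) are assumptions made for illustration, and the counts are the ones from the Burgers/Chips table.

```python
def lift_table(both, a_only, b_only, neither):
    """Return the lift of each cell of a 2x2 contingency table.

    both    = transactions with A and B
    a_only  = transactions with A but not B
    b_only  = transactions with B but not A
    neither = transactions with neither A nor B
    """
    n = both + a_only + b_only + neither
    a, not_a = both + a_only, b_only + neither   # row totals
    b, not_b = both + b_only, a_only + neither   # column totals
    # lift(X, Y) = s(X, Y) / (s(X) * s(Y)) = count(X, Y) * n / (count(X) * count(Y))
    return {
        ("A", "B"): both * n / (a * b),
        ("A", "^B"): a_only * n / (a * not_b),
        ("^A", "B"): b_only * n / (not_a * b),
        ("^A", "^B"): neither * n / (not_a * not_b),
    }


# A = Burgers, B = Chips (counts taken from the table in Q1)
lifts = lift_table(both=600, a_only=400, b_only=200, neither=200)
for cell, value in lifts.items():
    relation = "positive" if value > 1 else "negative" if value < 1 else "independent"
    print(cell, round(value, 3), relation)
```

Working from the raw counts rather than from rounded supports avoids small rounding drift in the final lift values.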

Q2.

Calculate the following lift values for the table correlating Ketchup and Shampoo below:

• LIFT (Ketchup, Shampoo)
• LIFT (Ketchup, ^Shampoo)
• LIFT (^Ketchup, Shampoo)
• LIFT (^Ketchup, ^Shampoo)

Indicate whether each of your answers suggests independence, positive correlation, or negative correlation.

| | Shampoo | ^Shampoo | Total Row |
|---|---|---|---|
| Ketchup | 100 | 200 | 300 |
| ^Ketchup | 200 | 400 | 600 |
| Total Column | 300 | 600 | 900 |

LIFT (Ketchup, Shampoo)

s(Ketchup ∪ Shampoo) = 100/900 = 0.111

s(Ketchup) = 300/900 = 0.333

s(Shampoo) = 300/900 = 0.333

LIFT (Ketchup, Shampoo) = (100/900)/((300/900)*(300/900)) = 1

LIFT (Ketchup, Shampoo) = 1

My answer suggests that Ketchup and Shampoo are independent.

LIFT (Ketchup, ^Shampoo)

s(Ketchup ∪ ^Shampoo) = 200/900 = 0.222

s(Ketchup) = 300/900 = 0.333

s(^Shampoo) = 600/900 = 0.667

LIFT (Ketchup, ^Shampoo) = (200/900)/((300/900)*(600/900)) = 1

LIFT (Ketchup, ^Shampoo) = 1

My answer suggests that Ketchup and ^Shampoo are independent.

LIFT (^Ketchup, Shampoo)

s(^Ketchup ∪ Shampoo) = 200/900 = 0.222

s(^Ketchup) = 600/900 = 0.667

s(Shampoo) = 300/900 = 0.333

LIFT (^Ketchup, Shampoo) = (200/900)/((600/900)*(300/900)) = 1

LIFT (^Ketchup, Shampoo) = 1

My answer suggests that ^Ketchup and Shampoo are independent.

LIFT (^Ketchup, ^Shampoo)

s(^Ketchup ∪ ^Shampoo) = 400/900 = 0.444

s(^Ketchup) = 600/900 = 0.667

s(^Shampoo) = 600/900 = 0.667

LIFT (^Ketchup, ^Shampoo) = (400/900)/((600/900)*(600/900)) = 1

LIFT (^Ketchup, ^Shampoo) = 1

My answer suggests that ^Ketchup and ^Shampoo are independent.

Overall, Ketchup and Shampoo are independent.
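
When the supports are rounded to three decimals before dividing, the quotient can land slightly off 1 (for example 0.111/(0.333*0.333) ≈ 1.001) even though the exact value is 1. The sketch below, which assumes Python's standard `fractions` module and a hypothetical helper named `exact_lift`, computes every lift for the Ketchup/Shampoo table without rounding.

```python
from fractions import Fraction

# Observed counts from the Ketchup/Shampoo table.
counts = {
    ("Ketchup", "Shampoo"): 100,
    ("Ketchup", "^Shampoo"): 200,
    ("^Ketchup", "Shampoo"): 200,
    ("^Ketchup", "^Shampoo"): 400,
}
n = sum(counts.values())


def exact_lift(row, col):
    """Lift of one cell, computed with exact rational arithmetic."""
    row_total = sum(v for (r, _), v in counts.items() if r == row)
    col_total = sum(v for (_, c), v in counts.items() if c == col)
    joint = Fraction(counts[(row, col)], n)
    return joint / (Fraction(row_total, n) * Fraction(col_total, n))


exact = {cell: exact_lift(*cell) for cell in counts}
rounded = 0.111 / (0.333 * 0.333)  # same first cell, supports rounded first

for cell, value in exact.items():
    print(cell, value)
print("rounded first:", round(rounded, 3))
```

Every exact lift equals 1, which is why all four cells indicate independence.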

Q3.

Calculate the following chi-squared values for the table correlating Burgers and Chips below (expected values in brackets).

• Burgers & Chips
• Burgers & Not Chips
• Not Burgers & Chips
• Not Burgers & Not Chips

For each of the above, also indicate whether your answer suggests independence, positive correlation, or negative correlation.

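| | Chips | ^Chips | Total Row |
|---|---|---|---|
| Burgers | 900 (800) | 100 (200) | 1000 |
| ^Burgers | 300 (400) | 200 (100) | 500 |
| Total Column | 1200 | 300 | 1500 |

χ² (Burgers & Chips) = (900-800)²/800 = 12.5

Positive correlation (observed 900 > expected 800).

χ² (Burgers & Not Chips) = (100-200)²/200 = 50

Negative correlation (observed 100 < expected 200).

χ² (Not Burgers & Chips) = (300-400)²/400 = 25

Negative correlation (observed 300 < expected 400).

χ² (Not Burgers & Not Chips) = (200-100)²/100 = 100

Positive correlation (observed 200 > expected 100).

Total χ² = (900-800)²/800 + (100-200)²/200 + (300-400)²/400 + (200-100)²/100 = 187.5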
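
The per-cell contributions and their total can be reproduced in a few lines. This is a sketch under one stated assumption: the bracketed expected values are the usual independence expectations, row total × column total / grand total (for example 1000 × 1200 / 1500 = 800). The function name `chi_squared_cells` is hypothetical.

```python
def chi_squared_cells(observed):
    """observed: 2x2 list of counts [[a, b], [c, d]].

    Returns one (observed, expected, contribution, direction) tuple per cell,
    in row-major order.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    cells = []
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the row and column events were independent.
            expected = row_totals[i] * col_totals[j] / n
            contribution = (obs - expected) ** 2 / expected
            if obs > expected:
                direction = "positive"
            elif obs < expected:
                direction = "negative"
            else:
                direction = "independent"
            cells.append((obs, expected, contribution, direction))
    return cells


# Rows: Burgers, ^Burgers. Columns: Chips, ^Chips.
cells = chi_squared_cells([[900, 100], [300, 200]])
total = sum(contribution for _, _, contribution, _ in cells)
for cell in cells:
    print(cell)
print("total chi-squared:", total)
```

The sign of (observed - expected) gives the direction of the correlation for a cell, since the squared term itself is never negative.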

Q4.

Calculate the following chi-squared values for the table correlating Burgers and Sausages below (expected values in brackets).

• Burgers & Sausages
• Burgers & Not Sausages
• Sausages & Not Burgers
• Not Burgers & Not Sausages

For each of the above, also indicate whether your answer suggests independence, positive correlation, or negative correlation.

| | Chips | ^Chips | Total Row |
|---|---|---|---|
| Burgers | 800 (800) | 200 (200) | 1000 |
| ^Burgers | 400 (400) | 100 (100) | 500 |
| Total Column | 1200 | 300 | 1500 |

χ² = (800-800)²/800 + (200-200)²/200 + (400-400)²/400 + (100-100)²/100

= 0/800 + 0/200 + 0/400 + 0/100 = 0

χ² = 0, so Burgers and Chips are independent.

Burgers & Chips: observed = expected = 800, independent

Burgers & ^Chips: observed = expected = 200, independent

^Burgers & Chips: observed = expected = 400, independent

^Burgers & ^Chips: observed = expected = 100, independent

Q5.

Under what conditions would Lift and Chi Squared analysis prove to be poor measures of correlation/dependency between two events?

Lift and Chi Squared are poor measures of correlation/dependency between two events when the data contain too many null transactions, that is, transactions that contain neither of the two events. Neither measure is null-invariant, so a large number of null transactions can distort the result.

Suggest another measure that could be used to rectify this flaw in Lift and Chi Squared.

Null-invariant measures can be used to rectify the flaw in Lift and Chi Squared: AllConf, Cosine, Jaccard, MaxConf, or Kulczynski.
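
To illustrate the effect of null transactions, the sketch below compares lift with the Kulczynski measure, taken here as the average of the two conditional probabilities P(A|B) and P(B|A). It reuses the Burgers/Chips counts from Q1; the second null count of 20000 is a made-up value chosen purely for illustration, and the function names are assumptions.

```python
def lift(both, a_only, b_only, neither):
    """Lift of (A, B); depends on the number of null transactions."""
    n = both + a_only + b_only + neither
    return both * n / ((both + a_only) * (both + b_only))


def kulczynski(both, a_only, b_only, neither):
    """Average of P(A|B) and P(B|A).

    `neither` is accepted only to mirror lift(); it is never used,
    which is what makes the measure null-invariant.
    """
    return 0.5 * (both / (both + a_only) + both / (both + b_only))


# 200 is the null count from the Q1 table; 20000 is hypothetical.
results = {}
for nulls in (200, 20000):
    results[nulls] = (lift(600, 400, 200, nulls), kulczynski(600, 400, 200, nulls))
    print(nulls, round(results[nulls][0], 3), round(results[nulls][1], 3))
```

Lift moves from 1.05 to 15.9 when only the number of null transactions changes, while Kulczynski stays at 0.675 in both cases.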