Mathematics AI SL

Sample Internal Assessment

6/7


Table of contents

Rationale

Aim

Research question

Background information

What is strike rate

Physical benefits in athletics – height

Regression correlation coefficient

Pearson’s correlation coefficient

T – test

Hypothesis

Null hypothesis

Alternate hypothesis

Data collection

Source of data

Justification on selecting the source as IPL T20

Raw data table

Processed data table

Sample calculation

Processed data table analysis

Graphical analysis

Linear correlation

Polynomial correlation

Choice of axes

Trendline for linear correlation

Trendline for polynomial correlation

Outliers

Intercept for linear correlation

Calculation of maxima - minima for polynomial correlation

Calculation of correlation coefficient for linear trendline

Calculation of regression correlation coefficient

Analysis

Calculation of pearson’s correlation coefficient

Analysis

Evaluation of hypothesis

Processed data table

Calculation of t – value

Calculation of degree of freedom

Result of t – test

Conclusion

Reflection

Bibliography

I have loved sports from a very young age. When I was a kid, my dad took me to the park every day to play cricket, and I think that is how my bond with sports grew stronger. Even now that I have grown up, I still take an active part in sports, and my studies have never been an excuse to skip a game. I abide by the saying "All work and no play makes Jack a dull boy".

I do not play only for recreation; I follow sports religiously. I am on my school's cricket team, I take coaching classes, and I practise even after school. I love playing cricket, and I love being a batsman and the captain of a team.

Recently, it was announced that players would be selected for the interschool cricket tournament. I was super excited and wanted to grab the opportunity. While surfing the net for tips, I read various resources and learned many interesting facts, but a statement about height being a factor in selection caught my eye.

Being the captain of my school team, selecting players to an extent was my responsibility. I looked for confirmation everywhere but could not get a satisfactory answer. I could not decide on the players as I thought their heights should not overshadow their performances. It was a matter of their hard work as well as the name of the school.

Heaped with worries, I decided to research the question myself, and this IA is the result. In this IA, I have tried to find out whether the height of a batsman determines his strike rate. I will also try to find how far the height of a batsman acts as a deciding factor in the result of a cricket match. This research will help me decide with confidence which players to select for the competition.

The main aim of this IA is to study whether or not there exists a correlation between the strike rate of batsmen and their height in the game of cricket. Furthermore, this IA will give brief information about the advantage or disadvantage a batsman has, by default, due to his height in scoring runs at a faster rate, i.e., in his strike rate. This exploration could help team management and selection committees when signing contracts with players.

What is the relationship between strike rate of batsman and the height of the batsman?

Dr. Adam Nazha

Top IB Math Tutor: 45/45 IBDP, 7/7 Further Math, 7 Yrs Exp, Medicine Student

Strike rate is one of the most important parameters for measuring the performance of a batsman in the game of cricket. It expresses how many runs the batsman has scored relative to the number of balls he has faced. The formula for calculating strike rate is shown below:

\(Strike\ Rate=\frac{Runs\ Scored}{Number\ of\ balls\ played}\times100\)
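The formula above can be sketched in a few lines of Python. This is only a minimal illustration: the function name `strike_rate` and the sample innings (50 runs off 40 balls) are assumptions made for the example and are not taken from the data set.

```python
def strike_rate(runs_scored: float, balls_faced: int) -> float:
    """Return the batting strike rate: runs scored per 100 balls faced."""
    if balls_faced <= 0:
        # The ratio is undefined when no balls have been faced.
        raise ValueError("balls_faced must be positive")
    return runs_scored / balls_faced * 100


# Hypothetical innings, used only to illustrate the formula.
example = strike_rate(50, 40)
print(example)
```

Because runs and balls cannot be negative, a strike rate computed this way can never be negative either.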

Height can be an advantage for players in several games. For example, in games like football and basketball, taller players often stand a better chance of performing well than comparatively shorter players.

In the game of cricket, taller batsmen could have a better chance while playing short balls, which would allow them to score runs off difficult deliveries as well.


The regression correlation coefficient is a tool to measure the strength of the correlation between the independent variable and the dependent variable. The set of values \((x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\) is used to find the value of *r* as stated by the formula below:

\(r=\frac{n(\sum xy)-(\sum x)(\sum y)}{\sqrt{[n\sum x^2-(\sum x)^2][n\sum y^2-(\sum y)^2]}}\)

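In the above-mentioned formula, *x* is the value of the independent variable for each observation, *y* is the value of the dependent variable for each observation, *n* is the number of observations and ∑ denotes the sum over all observations.

By squaring the value of *r*, the value of the regression coefficient (\(r^2\)) is obtained.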

Pearson’s correlation coefficient is a tool to measure both the strength and the nature of the correlation between the independent variable and the dependent variable. The set of values \((x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)\) is used to find the value of \(\mathfrak{R}\) as stated by the formula below:

\(\mathfrak{R}=\frac{\sum (x-\bar x)(y-\bar y)}{\sqrt{\sum(x-\bar x)^2\times\sum (y-\bar y)^2}}\)

In the above-mentioned formula, *x* is the value of the independent variable for each observation, *y* is the value of the dependent variable for each observation, \(\bar x\) is the arithmetic mean of all the observations of the independent variable, \(\bar y\) is the arithmetic mean of all the observations of the dependent variable and ∑ denotes the sum over all the observations of the mentioned variable. The value of \(\mathfrak{R}\) lies between -1 and 1. A positive value of Pearson’s correlation coefficient implies a direct relationship between the independent and the dependent variable, whereas a negative value implies an inverse relationship between them. If the value of the correlation coefficient is close to 1 or -1, the correlation is strong. On the other hand, if the value of the correlation coefficient is close to 0, the correlation is very weak or does not exist.
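As a minimal sketch of the deviation form of this formula, the function below computes the coefficient for two lists of equal length. The function name `pearson` and the toy data are assumptions made purely for illustration; they are not part of the investigation's data set.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient from deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the sums of squared deviations.
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical toy data: a perfectly direct and a perfectly inverse relationship.
direct = pearson([1, 2, 3, 4], [2, 4, 6, 8])
inverse = pearson([1, 2, 3, 4], [8, 6, 4, 2])
print(direct, inverse)
```

The two toy cases sit at the extremes of the range described above. Note that the sketch divides by zero if either list is constant, since the coefficient is undefined in that case.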

The T – test is a kind of analysis which is used here to predict whether any correlation exists between an independent variable and a dependent variable. First, the T – value of the given set of data is calculated. Then, based on the type of data (for example, paired data or independent data), the T – value is checked against the T – table, which in turn predicts the existence of any correlation. The formula for the T – value is given below:

\(T\ value=\frac{|\bar x-\bar y|}{\sqrt{\frac{v_x^2}{n_x}+\frac{v_y^2}{n_y}}}\)

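Here, \(\bar x\) is the arithmetic mean of all the observations of the independent variable, \(\bar y\) is the arithmetic mean of all the observations of the dependent variable, \(v_x^2\) is the variance of the independent variable, \(v_y^2\) is the variance of the dependent variable, and \(n_x\) and \(n_y\) are the numbers of observations of the independent and the dependent variable respectively.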

Now, the T – value is checked in T – table which predicts the existence of any correlation. The T – table is shown below:


It is assumed that there does not exist any correlation between strike rate of batsman and the height of the batsman.


It is assumed that there is a correlation between the strike rate of batsman and the height of the batsman.

The strike rates of different batsmen, together with their heights, have been collected from the very recently organised cricket tournament, Indian Premier League 2020. The Indian Premier League, abbreviated as IPL T20, is a domestic cricket tournament organized by the BCCI (Board of Control for Cricket in India). Eight teams, each representing a particular city or state in India, compete in a tournament lasting two to three months, in which players from across the globe are contracted and assigned to the teams. As each match is a twenty-over game, the format is often abbreviated as T20.


IPL T20 has been selected for the collection of data for various reasons. Firstly, although the IPL is a domestic tournament organized by the BCCI, it brings together players from across the globe. This allows the data set to contain more generalized observations rather than ones specific to a single country. Secondly, IPL T20 is one of the most recently organized tournaments, which keeps the data set up to date with the current style of playing cricket. Thirdly, the IPL is a twenty-over game, and a twenty-over game requires scoring runs off a small number of balls. As a result, the strike rates of batsmen in this tournament are expected to be higher than in other tournaments. Higher observed values make it easier to find the correlation than smaller observed values would.

| Sl. No | Batsmen | Height (cm) | Strike rate |
|---|---|---|---|
| 1 | Shakib al Hassan | 155 | 82.05 |
| 2 | Mushfiqur Rahim | 160 | 92.67 |
| 3 | Rashid Khan | 168 | 100.96 |
| 4 | Kusal Perera | 168 | 110.97 |
| 5 | Rishabh Pant | 170 | 89.23 |
| 6 | David Warner | 170 | 89.36 |
| 7 | JP Duminy | 170 | 97.22 |
| 8 | Rohit Sharma | 170 | 98.33 |
| 9 | Kane Williamson | 173 | 99.8 |
| 10 | Nicholas Pooran | 173 | 100.27 |
| 11 | Mosaddek Hossain | 174 | 106.36 |
| 12 | MS Dhoni | 175 | 87.78 |
| 13 | Mohammed Hafeez | 175 | 88.77 |
| 14 | Virat Kohli | 175 | 94.04 |
| 15 | Liton Das | 175 | 110.17 |
| 16 | Eoin Morgan | 175 | 111.07 |
| 17 | Aaron Finch | 176 | 102.21 |
| 18 | Usman Khawaja | 177 | 88.26 |
| 19 | Jonny Bairstow | 178 | 92.84 |
| 20 | Colin Munro | 178 | 97.65 |
| 21 | Shimron Hetmyer | 178 | 101.58 |
| 22 | Mohammad Saifuddin | 179 | 120.83 |
| 23 | Najibullah Zadran | 180 | 88.8 |
| 24 | Mahmudullah | 180 | 89.75 |
| 25 | Haris Sohail | 180 | 94.28 |
| 26 | Shikhar Dhawan | 180 | 103.3 |
| 27 | Jos Buttler | 180 | 122.83 |
| 28 | Avishka Fernando | 181 | 105.72 |
| 29 | Jason Roy | 182 | 115.36 |
| 30 | Glen Maxwell | 182 | 150 |
| 31 | Alex Carey | 182 | 104.45 |
| 32 | Joe Root | 183 | 89.53 |
| 33 | Hazratullah Zazai | 183 | 94.11 |
| 34 | Colin de Grandhomme | 183 | 100.52 |
| 35 | Soumya Sarkar | 183 | 101.21 |
| 36 | Hardik Pandya | 183 | 112.43 |
| 37 | Chris Woakes | 185 | 89.93 |
| 38 | Ben Stokes | 185 | 93.18 |
| 39 | Thisara Perera | 185 | 95.31 |
| 40 | Wahab Riaz | 185 | 127.53 |
| 41 | Imad Wasim | 187 | 118.24 |
| 42 | Chris Gayle | 188 | 88.32 |
| 43 | Rassie van der Dussen | 188 | 90.37 |
| 44 | Martin Guptill | 188 | 143.13 |
| 45 | David Miller | 191 | 117.94 |
| 46 | Nathan Coulter-Nile | 191 | 136.11 |
| 47 | Carlos Brathwaite | 193 | 106.2 |
| 48 | Chris Morris | 196 | 121.31 |
| 49 | Mitchell Stark | 197 | 89.47 |
| 50 | Jason Holder | 201 | 108.97 |

\(\text{Mean}=\frac{y_1+y_2+\dots+y_n}{n}\)

\(\text{Arithmetic Mean}=\frac{82.05+92.67+100.96+\dots+89.47+108.97}{50} = 103.2144\)

\(\text{Standard Deviation}=\sqrt{\frac{(\bar y-y_1)^2+(\bar y-y_2)^2+\dots+(\bar y-y_n)^2}{n-1}}\)

\(\text{Standard Deviation}=\sqrt{\frac{(103.2144-82.05)^2+(103.2144-92.67)^2+\dots+(103.2144-108.97)^2}{49}} = 14.967\)

The mean strike rate of all the batsmen is 103.2144, and the standard deviation is 14.967. This fairly high standard deviation indicates that the strike rates are spread widely around the mean. As a result, it can be assumed that the strike rate varies greatly from one player to another.
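The mean and standard deviation can be cross-checked from the column totals reported in the processed data table (∑*y* = 5160.72 and ∑*y*^{2} = 543638.162 for *n* = 50). The sketch below is a check under those totals, not the method used in the investigation; the variable names are assumptions. It suggests that the reported value of 14.967 corresponds to dividing by *n* − 1 (the sample standard deviation) rather than by *n*.

```python
from math import sqrt

n = 50                 # number of batsmen
sum_y = 5160.72        # total of the strike rates, from the processed data table
sum_y2 = 543638.162    # total of the squared strike rates, from the same table

mean_y = sum_y / n
# Sum of squared deviations about the mean, via the identity
# sum((y - mean)^2) = sum(y^2) - n * mean^2.
squared_deviations = sum_y2 - n * mean_y ** 2

sd_sample = sqrt(squared_deviations / (n - 1))   # divisor n - 1
sd_population = sqrt(squared_deviations / n)     # divisor n

print(round(mean_y, 4), round(sd_sample, 3), round(sd_population, 3))
```

The two divisors give slightly different values, so it is worth stating which convention a reported standard deviation follows.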


The X – Axis of the graph denotes the height of the batsman measured in centimetre (independent variable).

The Y – Axis of the graph denotes the strike rate of the batsman (dependent variable).


In this graph, a linear trendline has been obtained using the data that has been collected based on the most recent performance of the players in IPL 2020. The equation of the trendline is shown below:

*y* = 0.5913*x* - 3.1403

From the graph, it can be stated that, there exists a positive increasing correlation between the strike rate and height of each batsman. However, a lot of outliers are seen in the graph.
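The linear trendline can be reproduced, at least approximately, with the standard least-squares formulas applied to the column totals from the processed data table. This is a minimal sketch under the assumption that the graphing software fitted an ordinary least-squares line; the variable names are illustrative.

```python
n = 50                  # number of observations
sum_x = 8994            # total of heights (cm)
sum_y = 5160.72         # total of strike rates
sum_x2 = 1621646        # total of squared heights
sum_xy = 930560.2       # total of height * strike rate

# Ordinary least-squares slope and intercept for y = slope * x + intercept.
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = sum_y / n - slope * sum_x / n

print(round(slope, 4), round(intercept, 4))
```

Under these totals the slope and intercept agree with the trendline equation to the reported precision, which supports reading the trendline as a least-squares fit.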

In this graph, a polynomial trendline has been obtained using the data that has been collected based on the most recent performance of the players in IPL 2020. The equation of the trendline is shown below:

*y* = -0.0089*x*^{2} + 3.7722*x* - 287.37

From the graph, it can be stated that, there exists a positive increasing correlation between the strike rate and height of each batsman. However, the slope of the curve is decreasing which implies the fact that with further increase in height, the strike rate will start to decrease.

There are a lot of outliers in the height range of 170 cm to 190 cm. This may be because of several other parameters which offer a partial benefit to the batsman in cricket. For example, if a bowler is at the top of his form and dismisses a batsman through sheer bowling skill, it significantly affects the correlation study. Other factors responsible for such a high number of outliers include the current form of the batsman, pitch conditions, weather conditions, etc. All of these factors directly affect the performance of a batsman, which in turn affects the correlation study. Due to the high number of outliers, the value of the regression coefficient is 0.12. Such a small value (close to zero) of the regression coefficient indicates that there is hardly any linear correlation between the dependent and the independent variable.


The Y – intercept of the graph can be studied to comment on the existence of the linear correlation. From the equation of the trendline, the Y – intercept of the trendline has been calculated:

*y* = 0.5913*x* − 3.1403

The value of y for x = 0 will be:

*y* = 0.5913 × 0 − 3.1403

=> *y* = −3.1403

The value of the Y – intercept is -3.1403. A negative intercept is absurd in this correlation, because it means that for a height of zero centimetres the strike rate comes out as -3.1403. From the formula of strike rate given in the Background Information section, the strike rate cannot be negative. Thus, it supports the view that the correlation between strike rate and height of batsman should not be linear.

From the equation of polynomial correlation, the value of maxima of the strike rate can be measured.

*y* = −0.0089*x*^{2} + 3.7722*x* − 287.37

Differentiating both sides with respect to x, we get,

\(\frac{dy}{dx}=-\frac{d(0.0089x^2)}{dx}+\frac{d(3.7722x)}{dx}-\frac{d(287.37)}{dx}\)

\(=>\frac{dy}{dx} = −0.0178x + 3.7722 − 0\)

\(=>\frac{dy}{dx} = −0.0178x + 3.7722\)

Further, differentiating both sides with respect to *x*, we get,

\(\frac{d^2y}{dx^2}=-\frac{d(0.0178x)}{dx}+\frac{d(3.7722)}{dx}\)

\(\frac{d^2y}{dx^2} = − 0.0178 + 0\)

\(\frac{d^2y}{dx^2} = − 0.0178\)

As the value of \(\frac{d^2y}{dx^2}\) is negative, the stationary point of the curve is a maximum. It is found by putting \(\frac{dy}{dx} = 0\):

\(\frac{dy}{dx} = 0\)

=> −0.0178*x* + 3.7722 = 0

=> −0.0178*x* = −3.7722

\(=> x=\frac{-3.7722}{-0.0178}\)

=> *x* = 211.92

Thus, the maximum of the polynomial trendline occurs at x = 211.92 cm. In other words, a batsman with a height of 211.92 cm will have the maximum strike rate as per the polynomial correlation.
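The stationary point can be double-checked numerically with the vertex formula for a quadratic, x = −b / (2a), which is what setting the first derivative to zero amounts to. This is a small sketch using the trendline coefficients quoted above; the variable names are assumptions.

```python
# Coefficients of the polynomial trendline y = a*x^2 + b*x + c.
a, b, c = -0.0089, 3.7722, -287.37

# dy/dx = 2*a*x + b = 0  gives the x-coordinate of the stationary point.
x_vertex = -b / (2 * a)
# Strike rate predicted by the trendline at that height.
y_vertex = a * x_vertex ** 2 + b * x_vertex + c

# The second derivative is the constant 2*a; a negative value means a maximum.
is_maximum = 2 * a < 0

print(round(x_vertex, 2), round(y_vertex, 2), is_maximum)
```

The computed height is about 211.92 cm, which lies beyond the tallest batsman in the data set (201 cm), so the maximum is an extrapolation of the trendline rather than an observed value.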


There are five headers in the processed data table, expressed as *x*, *y*, *x*^{2}, *y*^{2} and *xy*. The height of the batsman is represented by *x* and the strike rate of the batsman is represented by *y*.

| *x* | *y* | *x*^{2} | *y*^{2} | *xy* |
|---|---|---|---|---|
| 155 | 82.05 | 24025 | 6732.2025 | 12717.75 |
| 160 | 92.67 | 25600 | 8587.7289 | 14827.2 |
| 168 | 100.96 | 28224 | 10192.9216 | 16961.28 |
| 168 | 110.97 | 28224 | 12314.3409 | 18642.96 |
| 170 | 89.23 | 28900 | 7961.9929 | 15169.1 |
| 170 | 89.36 | 28900 | 7985.2096 | 15191.2 |
| 170 | 97.22 | 28900 | 9451.7284 | 16527.4 |
| 170 | 98.33 | 28900 | 9668.7889 | 16716.1 |
| 173 | 99.8 | 29929 | 9960.04 | 17265.4 |
| 173 | 100.27 | 29929 | 10054.0729 | 17346.71 |
| 174 | 106.36 | 30276 | 11312.4496 | 18506.64 |
| 175 | 87.78 | 30625 | 7705.3284 | 15361.5 |
| 175 | 88.77 | 30625 | 7880.1129 | 15534.75 |
| 175 | 94.04 | 30625 | 8843.5216 | 16457 |
| 175 | 110.17 | 30625 | 12137.4289 | 19279.75 |
| 175 | 111.07 | 30625 | 12336.5449 | 19437.25 |
| 176 | 102.21 | 30976 | 10446.8841 | 17988.96 |
| 177 | 88.26 | 31329 | 7789.8276 | 15622.02 |
| 178 | 92.84 | 31684 | 8619.2656 | 16525.52 |
| 178 | 97.65 | 31684 | 9535.5225 | 17381.7 |
| 178 | 101.58 | 31684 | 10318.4964 | 18081.24 |
| 179 | 120.83 | 32041 | 14599.8889 | 21628.57 |
| 180 | 88.8 | 32400 | 7885.44 | 15984 |
| 180 | 89.75 | 32400 | 8055.0625 | 16155 |
| 180 | 94.28 | 32400 | 8888.7184 | 16970.4 |
| 180 | 103.3 | 32400 | 10670.89 | 18594 |
| 180 | 122.83 | 32400 | 15087.2089 | 22109.4 |
| 181 | 105.72 | 32761 | 11176.7184 | 19135.32 |
| 182 | 115.36 | 33124 | 13307.9296 | 20995.52 |
| 182 | 150 | 33124 | 22500 | 27300 |
| 182 | 104.45 | 33124 | 10909.8025 | 19009.9 |
| 183 | 89.53 | 33489 | 8015.6209 | 16383.99 |
| 183 | 94.11 | 33489 | 8856.6921 | 17222.13 |
| 183 | 100.52 | 33489 | 10104.2704 | 18395.16 |
| 183 | 101.21 | 33489 | 10243.4641 | 18521.43 |
| 183 | 112.43 | 33489 | 12640.5049 | 20574.69 |
| 185 | 89.93 | 34225 | 8087.4049 | 16637.05 |
| 185 | 93.18 | 34225 | 8682.5124 | 17238.3 |
| 185 | 95.31 | 34225 | 9083.9961 | 17632.35 |
| 185 | 127.53 | 34225 | 16263.9009 | 23593.05 |
| 187 | 118.24 | 34969 | 13980.6976 | 22110.88 |
| 188 | 88.32 | 35344 | 7800.4224 | 16604.16 |
| 188 | 90.37 | 35344 | 8166.7369 | 16989.56 |
| 188 | 143.13 | 35344 | 20486.1969 | 26908.44 |
| 191 | 117.94 | 36481 | 13909.8436 | 22526.54 |
| 191 | 136.11 | 36481 | 18525.9321 | 25997.01 |
| 193 | 106.2 | 37249 | 11278.44 | 20496.6 |
| 196 | 121.31 | 38416 | 14716.1161 | 23776.76 |
| 197 | 89.47 | 38809 | 8004.8809 | 17625.59 |
| 201 | 108.97 | 40401 | 11874.4609 | 21902.97 |

∑ *x* = 8994

∑ *y* = 5160.72

∑ *x*^{2} = 1621646

∑ *y*^{2} = 543638.162

∑ *xy* = 930560.2

The formula of the regression coefficient as mentioned in the background information has been used to find the correlation coefficient. Here, *x* is the value of the independent variable (height) for each observation, *y* is the value of the dependent variable (strike rate) for each observation, and *n* = 50 is the number of observations.

Calculation -

\(r =\frac{n(\sum xy)-(\sum x)(\sum y)}{\sqrt{[n\sum x^2-(\sum x)^2][n\sum y^2-(\sum y)^2]}}\)

\(=>r =\frac{50(930560.2) - (8994)(5160.72)}{\sqrt{[50 \times 1621646 - (8994)^2][50 \times 543638.162 - (5160.72)^2]}}\)

=> *r *= 0.348

=> *r*^{2}* *= 0.1212
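The substitution above can be verified with a short script that plugs the five column totals into the formula from the background information. It is a sketch for checking the arithmetic only; the variable names are assumptions.

```python
from math import sqrt

n = 50
sum_x = 8994
sum_y = 5160.72
sum_x2 = 1621646
sum_y2 = 543638.162
sum_xy = 930560.2

# Numerator and the two bracketed terms of the denominator.
numerator = n * sum_xy - sum_x * sum_y
x_term = n * sum_x2 - sum_x ** 2
y_term = n * sum_y2 - sum_y ** 2

r = numerator / sqrt(x_term * y_term)
r_squared = r ** 2

print(round(r, 3), round(r_squared, 4))
```

Both rounded values match the figures reported in the calculation.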


The value of the regression coefficient is 0.12. Such a small value (close to zero) of the regression coefficient indicates that there is hardly any linear correlation between the dependent and the independent variable.

There are seven headers in the processed data table for the calculation of Pearson’s correlation coefficient, expressed as *x*, *y*, \(x-\bar x\), \(y-\bar y\), \((x-\bar x)(y-\bar y)\), \((x-\bar x)^2\) and \((y-\bar y)^2\).

| *x* | *y* | \(x-\bar x\) | \(y-\bar y\) | \((x-\bar x)(y-\bar y)\) | \((x-\bar x)^2\) | \((y-\bar y)^2\) |
|---|---|---|---|---|---|---|
| 155 | 82.05 | -24.88 | -21.1644 | 526.570272 | 619.0144 | 447.931827 |
| 160 | 92.67 | -19.88 | -10.5444 | 209.622672 | 395.2144 | 111.184371 |
| 168 | 100.96 | -11.88 | -2.2544 | 26.782272 | 141.1344 | 5.08231936 |
| 168 | 110.97 | -11.88 | 7.7556 | -92.136528 | 141.1344 | 60.1493314 |
| 170 | 89.23 | -9.88 | -13.9844 | 138.165872 | 97.6144 | 195.563443 |
| 170 | 89.36 | -9.88 | -13.8544 | 136.881472 | 97.6144 | 191.944399 |
| 170 | 97.22 | -9.88 | -5.9944 | 59.224672 | 97.6144 | 35.9328314 |
| 170 | 98.33 | -9.88 | -4.8844 | 48.257872 | 97.6144 | 23.8573634 |
| 173 | 99.8 | -6.88 | -3.4144 | 23.491072 | 47.3344 | 11.6581274 |
| 173 | 100.27 | -6.88 | -2.9444 | 20.257472 | 47.3344 | 8.66949136 |
| 174 | 106.36 | -5.88 | 3.1456 | -18.496128 | 34.5744 | 9.89479936 |
| 175 | 87.78 | -4.88 | -15.4344 | 75.319872 | 23.8144 | 238.220703 |
| 175 | 88.77 | -4.88 | -14.4444 | 70.488672 | 23.8144 | 208.640691 |
| 175 | 94.04 | -4.88 | -9.1744 | 44.771072 | 23.8144 | 84.1696154 |
| 175 | 110.17 | -4.88 | 6.9556 | -33.943328 | 23.8144 | 48.3803714 |
| 175 | 111.07 | -4.88 | 7.8556 | -38.335328 | 23.8144 | 61.7104514 |
| 176 | 102.21 | -3.88 | -1.0044 | 3.897072 | 15.0544 | 1.00881936 |
| 177 | 88.26 | -2.88 | -14.9544 | 43.068672 | 8.2944 | 223.634079 |
| 178 | 92.84 | -1.88 | -10.3744 | 19.503872 | 3.5344 | 107.628175 |
| 178 | 97.65 | -1.88 | -5.5644 | 10.461072 | 3.5344 | 30.9625474 |
| 178 | 101.58 | -1.88 | -1.6344 | 3.072672 | 3.5344 | 2.67126336 |
| 179 | 120.83 | -0.88 | 17.6156 | -15.501728 | 0.7744 | 310.309363 |
| 180 | 88.8 | 0.12 | -14.4144 | -1.729728 | 0.0144 | 207.774927 |
| 180 | 89.75 | 0.12 | -13.4644 | -1.615728 | 0.0144 | 181.290067 |
| 180 | 94.28 | 0.12 | -8.9344 | -1.072128 | 0.0144 | 79.8235034 |
| 180 | 103.3 | 0.12 | 0.0856 | 0.010272 | 0.0144 | 0.00732736 |
| 180 | 122.83 | 0.12 | 19.6156 | 2.353872 | 0.0144 | 384.771763 |
| 181 | 105.72 | 1.12 | 2.5056 | 2.806272 | 1.2544 | 6.27803136 |
| 182 | 115.36 | 2.12 | 12.1456 | 25.748672 | 4.4944 | 147.515599 |
| 182 | 150 | 2.12 | 46.7856 | 99.185472 | 4.4944 | 2188.89237 |
| 182 | 104.45 | 2.12 | 1.2356 | 2.619472 | 4.4944 | 1.52670736 |
| 183 | 89.53 | 3.12 | -13.6844 | -42.695328 | 9.7344 | 187.262803 |
| 183 | 94.11 | 3.12 | -9.1044 | -28.405728 | 9.7344 | 82.8900994 |
| 183 | 100.52 | 3.12 | -2.6944 | -8.406528 | 9.7344 | 7.25979136 |
| 183 | 101.21 | 3.12 | -2.0044 | -6.253728 | 9.7344 | 4.01761936 |
| 183 | 112.43 | 3.12 | 9.2156 | 28.752672 | 9.7344 | 84.9272834 |
| 185 | 89.93 | 5.12 | -13.2844 | -68.016128 | 26.2144 | 176.475283 |
| 185 | 93.18 | 5.12 | -10.0344 | -51.376128 | 26.2144 | 100.689183 |
| 185 | 95.31 | 5.12 | -7.9044 | -40.470528 | 26.2144 | 62.4795394 |
| 185 | 127.53 | 5.12 | 24.3156 | 124.495872 | 26.2144 | 591.248403 |
| 187 | 118.24 | 7.12 | 15.0256 | 106.982272 | 50.6944 | 225.768655 |
| 188 | 88.32 | 8.12 | -14.8944 | -120.94253 | 65.9344 | 221.843151 |
| 188 | 90.37 | 8.12 | -12.8444 | -104.29653 | 65.9344 | 164.978611 |
| 188 | 143.13 | 8.12 | 39.9156 | 324.114672 | 65.9344 | 1593.25512 |
| 191 | 117.94 | 11.12 | 14.7256 | 163.748672 | 123.6544 | 216.843295 |
| 191 | 136.11 | 11.12 | 32.8956 | 365.799072 | 123.6544 | 1082.1205 |
| 193 | 106.2 | 13.12 | 2.9856 | 39.171072 | 172.1344 | 8.91380736 |
| 196 | 121.31 | 16.12 | 18.0956 | 291.701072 | 259.8544 | 327.450739 |
| 197 | 89.47 | 17.12 | -13.7444 | -235.30413 | 293.0944 | 188.908531 |
| 201 | 108.97 | 21.12 | 5.7556 | 121.558272 | 446.0544 | 33.1269314 |

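The formula of Pearson’s correlation coefficient as mentioned in the background information has been used to find the correlation coefficient. Here, *x* is the value of the independent variable (height) for each observation, *y* is the value of the dependent variable (strike rate) for each observation, \(\bar x\) and \(\bar y\) are the respective arithmetic means, and *N* = 50 is the number of observations.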

Calculation -

\(\bar x=\frac{∑x}{N}=\frac{8994}{50} = 179.88\)

\(\bar y=\frac{∑y}{N}=\frac{5160.72}{50} = 103.2144\)

\(∑(x-\bar x)(y-\bar y)= 2249.8864\)

\(∑(x-\bar x)^2= 3805.28\)

\(∑(y-\bar y)^2= 10977.544\)

Let, the Pearson’s Correlation Coefficient be \(\mathfrak{R}\).

\(\mathfrak{R}=\frac{∑(x-\bar x)(y-\bar y)}{\sqrt{∑(x-\bar x)^2\times∑(y-\bar y)^2}}\)

\(\mathfrak{R}=\frac{2249.8864}{\sqrt{3805.28 × 10977.544}} = 0.3481\)

\(\mathfrak{R}=0.348\)
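It is worth noting that the deviation form used here and the sum form used for the regression correlation coefficient are algebraically the same quantity, so the two calculations are expected to agree. The sketch below checks this with the totals reported in this investigation; the variable names are assumptions, and small differences can arise from rounding in the reported totals.

```python
from math import sqrt

n = 50

# Deviation form, using the three sums reported in the calculation above.
s_xy = 2249.8864      # sum of (x - mean_x) * (y - mean_y)
s_xx = 3805.28        # sum of (x - mean_x)^2
s_yy = 10977.544      # sum of (y - mean_y)^2
r_deviation = s_xy / sqrt(s_xx * s_yy)

# Sum form, using the column totals of the first processed data table.
sum_x, sum_y = 8994, 5160.72
sum_x2, sum_y2, sum_xy = 1621646, 543638.162, 930560.2
r_sums = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

print(round(r_deviation, 4), round(r_sums, 4))
```

Since both forms give the same coefficient, the second calculation serves as a check on the first rather than as independent evidence about the correlation.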

The value of Pearson’s correlation coefficient is 0.348. As it is a positive value, it can be stated that the correlation is increasing in nature, i.e., with an increase in the height of a batsman, the strike rate also increases. This might be because taller batsmen have an advantage in playing short-pitched balls, which allows them to score runs off a wider range of deliveries. However, the value of Pearson’s correlation coefficient is much closer to zero than to 1, which signifies that the correlation is weak.


There are two headers in the processed data table, expressed as *x* and *y*, where *x* is the height of the batsman and *y* is his strike rate.

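| *x* | *y* |
|---|---|
| 155 | 82.05 |
| 160 | 92.67 |
| 168 | 100.96 |
| 168 | 110.97 |
| 170 | 89.23 |
| 170 | 89.36 |
| 170 | 97.22 |
| 170 | 98.33 |
| 173 | 99.8 |
| 173 | 100.27 |
| 174 | 106.36 |
| 175 | 87.78 |
| 175 | 88.77 |
| 175 | 94.04 |
| 175 | 110.17 |
| 175 | 111.07 |
| 176 | 102.21 |
| 177 | 88.26 |
| 178 | 92.84 |
| 178 | 97.65 |
| 178 | 101.58 |
| 179 | 120.83 |
| 180 | 88.8 |
| 180 | 89.75 |
| 180 | 94.28 |
| 180 | 103.3 |
| 180 | 122.83 |
| 181 | 105.72 |
| 182 | 115.36 |
| 182 | 150 |
| 182 | 104.45 |
| 183 | 89.53 |
| 183 | 94.11 |
| 183 | 100.52 |
| 183 | 101.21 |
| 183 | 112.43 |
| 185 | 89.93 |
| 185 | 93.18 |
| 185 | 95.31 |
| 185 | 127.53 |
| 187 | 118.24 |
| 188 | 88.32 |
| 188 | 90.37 |
| 188 | 143.13 |
| 191 | 117.94 |
| 191 | 136.11 |
| 193 | 106.2 |
| 196 | 121.31 |
| 197 | 89.47 |
| 201 | 108.97 |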

The formula of the T – value is shown below:

\(T \,value = \frac{|\bar x-\bar y|}{\sqrt{\frac{v_x^2}{n_x}+\frac{v_y^2}{n_y}}}\)

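Here, \(\bar x\) is the arithmetic mean of all the observations of the height of batsman, \(\bar y\) is the arithmetic mean of all the observations of the strike rate of the batsman, \(v_x^2\) is the variance of the height of batsman, \(v_y^2\) is the variance of the strike rate of the batsman, and \(n_x\) and \(n_y\) are the numbers of observations of height and strike rate respectively.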


\(\bar x=\frac{x_1+x_2+\dots+x_n}{n_x}\)

\(=>\bar x=\frac{155 + 160 + \dots + 197 + 201}{50} = 179.88\)

\(\bar y=\frac{y_1+y_2+\dots+y_n}{n_y}\)

\(=>\bar y=\frac{82.05 + 92.67 + \dots + 89.47 + 108.97}{50} = 103.2144\)

\(v_x^2=\frac{(\bar x-x_1)^2+(\bar x-x_2)^2+\dots+(\bar x-x_n)^2}{n_x-1}\)

\(=>v_x^2=\frac{(179.88 - 155)^2 + (179.88 - 160)^2+ \dots + (179.88 - 201)^2}{49} = 77.65877\)

\(v_y^2=\frac{(\bar y-y_1)^2+(\bar y-y_2)^2+\dots+(\bar y-y_n)^2}{n_y-1}\)

\(=>v_y^2=\frac{(103.2144 - 82.05)^2 + (103.2144 - 92.67)^2 + \dots + (103.2144 - 108.97)^2}{49} = 224.03151\)

Therefore, the T – value can be computed as -

\(T\ value =\frac{|179.88 − 103.2144|}{\sqrt{\frac{77.65877}{50}+\frac{224.03151}{50}}}\)

\(=\frac{76.6656}{\sqrt{1.5531754+4.4806302}}\)

\(=\frac{76.6656}{\sqrt{6.0338056}}\)

\(=\frac{76.6656}{2.45638}\)

= 31.210798
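The arithmetic of the T – value can be re-run in a few lines. The sketch below simply evaluates the formula given in this section with the means and variances computed above; the function name `t_value` is an assumption. The statistic has the form of an unpooled two-sample statistic, in which each variance is divided by its own sample size.

```python
from math import sqrt


def t_value(mean_x, mean_y, var_x, var_y, n_x, n_y):
    """Absolute difference of the means divided by its unpooled standard error."""
    standard_error = sqrt(var_x / n_x + var_y / n_y)
    return abs(mean_x - mean_y) / standard_error


# Means and variances of height and strike rate as computed in this section.
t = t_value(179.88, 103.2144, 77.65877, 224.03151, 50, 50)
print(round(t, 4))
```

The result agrees with the hand calculation to the reported precision.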

Degree of Freedom = *n_{x}* + *n_{y}* − 2 = 50 + 50 − 2 = 98

The value of T – Test can be found from the table of values of T as mentioned in Background Information Section. From that table, it can be concluded that the Null Hypothesis is accepted and the alternate hypothesis has been rejected. Thus, it can be stated that there is no correlation between the height of batsman and the strike rate of the batsman.


There is no profound correlation between the height of a batsman (measured in cm) and his strike rate in the game of cricket.

- The average strike rate of all the batsmen studied in this correlative analysis is 103.2144.
- The standard deviation of the strike rates is 14.967. Such a high value of standard deviation suggests that the strike rate varies greatly from one batsman to another.
- A linear correlation trendline was observed between the height of a batsman and his strike rate. The equation of the trendline was **y = 0.5913x - 3.1403**. However, due to the very weak correlation given by the value of the regression correlation coefficient (0.1212), the correlation was rejected.
- A polynomial correlation trendline was observed between the height of a batsman and his strike rate. The equation of the trendline was **y = -0.0089x**^{2} **+ 3.7722x - 287.37**. However, again due to the very weak correlation given by the value of the regression correlation coefficient (0.1266), the correlation was rejected.
- The value of the Y – intercept of the linear correlation trendline was negative, which is absurd, as strike rate cannot be negative. This also supports the claim that there exists no correlation between the height and strike rate of the batsman.
- The maximum of the polynomial correlation trendline was found at a height of 211.92 cm. Thus, a batsman with a height of 211.92 cm will have the maximum strike rate as given by the polynomial trendline.
- The result of the T – test also supports the claim that the null hypothesis is true for the above-performed correlative study between height and strike rate of batsman.

In this investigation, several processes and mathematical tools have been used to find the correlation along with its strength. The choice of tournament is one of the most important strengths of this investigation, as it has provided a data sheet with accurate observations of strike rate and height based on the current form of cricket. The use of two different correlation coefficients, the regression and Pearson’s correlation coefficients, has provided the strength and nature of the correlation. Furthermore, the values of mean and standard deviation have enabled the investigation to analyse the variation of strike rate (dependent variable) in the observed data sheet. Lastly, the use of the T – test has provided the conclusion regarding the correlation.

However, a few weaknesses have been observed during this mathematical investigation. As cricket is a game of uncertainty, there are a lot of parameters which govern the strike rate of a batsman. A few such parameters are pitch quality, weather conditions, the bowler, etc. Different batsmen also have different cricketing techniques, which is another parameter that governs the strike rate. As there are a lot of variables affecting the dependent variable (strike rate) apart from height, the correlation study cannot be carried out efficiently. In order to perform an efficient correlative analysis of the research question, all of these parameters must be controlled or held constant.

- ‘Batting Strike Rate (SR) Calculator (Cricket)’. Captain Calculator, https://captaincalculator.com/sports/cricket/batting-strike-rate-calculator/. Accessed 23 Nov. 2020.
- ‘The Advantages of Short Soccer Players’. Sports Rec, https://www.sportsrec.com/1006527-advantages-short-soccer-players.html. Accessed 23 Nov. 2020.
- Correlation. http://www.stat.yale.edu/Courses/1997-98/101/correl.htm. Accessed 22 Nov. 2020.
- Data Analysis Pearson’s Correlation Coefficient. http://learntech.uwe.ac.uk/da/default.aspx?pageid=1442. Accessed 22 Nov. 2020.
- ‘T Test (Student’s T-Test): Definition and Examples’. Statistics How To, https://www.statisticshowto.com/probability-and-statistics/t-test/. Accessed 23 Nov. 2020.
- https://www.sjsu.edu/faculty/gerstman/StatPrimer/t-table.pdf
- IPLT20.Com - Indian Premier League Official Website. https://www.iplt20.com/. Accessed 23 Nov. 2020.
- ‘Board of Control for Cricket in India’. The Board of Control for Cricket in India, http://www.bcci.tv/. Accessed 23 Nov. 2020.
