Mathematics AI SL's Sample Internal Assessment


To what extent is there a correlation between the strike rate of a batsman and the height of the batsman?

6/7
10 mins read
Candidate Name: N/A
Candidate Number: N/A
Session: N/A
Word count: 1,906


Rationale

I have loved sports from a very young age. When I was a kid, my dad took me to the park every day to play cricket. I guess this is how my relationship with sports grew stronger. Though I have grown up now, I still participate actively in sports. My studies have never been an excuse to skip games. "All work and no play makes Jack a dull boy" is a statement I abide by.

 

I do not play only for recreation; I follow sports religiously. I am on my school's cricket team. I take coaching classes and practise even after school. I love to play cricket, and I love being a batsman and the captain of a team.

 

Recently, it was announced that players would be selected for the interschool cricket tournament. I was very excited and wished to grab the opportunity. While surfing the net for tips, I read various resources and learned many interesting facts, but a statement about height being a factor in selection caught my eye.

 

Being the captain of my school team, selecting players to an extent was my responsibility. I looked for confirmation everywhere but could not get a satisfactory answer. I could not decide on the players as I thought their heights should not overshadow their performances. It was a matter of their hard work as well as the name of the school.

 

Weighed down by these worries, I decided to research the question myself. In this IA, I have tried to find out whether the height of a batsman determines his strike rate. I will also try to find out how far the height of a batsman acts as a deciding factor in the result of a cricket match. This research will help me decide with confidence when selecting players for the competition.

Aim

The main aim of this IA is to study whether or not there exists a correlation between the strike rate of batsmen and their height in the game of cricket. Furthermore, this IA will give brief information about the advantage or disadvantage a batsman has, purely because of his height, in scoring runs at a faster rate, i.e., in his strike rate. This exploration could help team management and selection committees when signing contracts with players.

Research question

What is the relationship between the strike rate of a batsman and the height of the batsman?

Background information

What is strike rate

Strike rate¹ is one of the most important parameters for measuring the performance of a batsman in the game of cricket. It expresses how many runs the batsman has scored relative to the number of balls he has played. The formula for calculating strike rate is shown below:

 

\(Strike\ Rate=\frac{Runs\ Scored}{Number\ of\ balls\ played}\times100\)
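As a quick illustration of the formula above, the following minimal sketch computes a strike rate in Python. The function name `strike_rate` and the innings figures (45 runs off 30 balls) are hypothetical and are not taken from the data set.

```python
def strike_rate(runs_scored: int, balls_played: int) -> float:
    """Return runs scored per 100 balls played."""
    if balls_played <= 0:
        raise ValueError("balls_played must be positive")
    return runs_scored / balls_played * 100


# Hypothetical innings: 45 runs scored off 30 balls.
print(strike_rate(45, 30))  # 150.0
```

A strike rate above 100 therefore means the batsman scores more than one run per ball on average.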

Physical benefits in athletics – height

Height can be an advantage for players in several games. For example, in games like football and basketball, taller players often stand a better chance of performing well than comparatively shorter players.

 

In the game of cricket, a taller batsman could have a better chance when playing short balls, which would allow him to score runs even off difficult deliveries.

Regression correlation coefficient

The regression correlation coefficient (r) is a tool to measure the strength of the correlation between the independent variable and the dependent variable. The set of values \((x_1,y_1), (x_2,y_2), \dots, (x_n,y_n)\) is used to find the value of r as stated by the formula below:

 

\(r=\frac{n(\sum xy)-(\sum x)(\sum y)}{\sqrt{[n\sum x^2-(\sum x)^2][n\sum y^2-(\sum y)^2]}}\)

 

In the above-mentioned formula, x is the value of the independent variable for each observation, y is the value of the dependent variable for each observation, xy is the product of the independent and the dependent variable for each observation, n is the number of observations and \(\sum\) denotes the sum over all observations of the stated quantity.

 

By squaring the value of r, the value of the regression coefficient \(r^2\) is obtained. The value of \(r^2\) lies between 0 and 1, where 1 signifies a perfect linear correlation and 0 signifies no linear correlation.
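The formula for r can be evaluated directly from the five sums it contains. The sketch below is only an illustration: the function name `correlation_from_sums` and the four data points are hypothetical, chosen to be perfectly linear so that the result is easy to check by hand.

```python
import math


def correlation_from_sums(xs, ys):
    """Evaluate r from the sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical, perfectly linear data (y = 2x).
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
r = correlation_from_sums(xs, ys)
print(r, r ** 2)  # 1.0 1.0
```

Here the numerator is 4(60) - (10)(20) = 40 and the denominator is the square root of (20)(80) = 1600, which is 40, so r = 1 and \(r^2\) = 1.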

Pearson’s correlation coefficient

Pearson’s correlation coefficient is a tool to measure both the strength and the nature of the correlation between the independent variable and the dependent variable. The set of values \((x_1,y_1), (x_2,y_2), \dots, (x_n,y_n)\) is used to find the value of \(\mathfrak{R}\) as stated by the formula below:

 

 \(\mathfrak{R}=\frac{\sum (x-\bar x)(y-\bar y)}{\sqrt{\sum(x-\bar x)^2\times\sum (y-\bar y)^2}}\)

 

In the above-mentioned formula, x is the value of the independent variable for each observation, y is the value of the dependent variable for each observation, \(\bar x\) is the arithmetic mean of all the observations of the independent variable, \(\bar y\) is the arithmetic mean of all the observations of the dependent variable and \(\sum\) denotes the sum over all observations of the stated quantity. The value of \(\mathfrak{R}\) lies between -1 and 1. A positive value of Pearson’s correlation coefficient implies a direct relationship between the independent and the dependent variable, whereas a negative value implies an inverse relationship between them. If the value of the correlation coefficient is close to 1 or -1, it signifies a strong linear correlation. On the other hand, if the value of the correlation coefficient is close to 0, it signifies little or no linear correlation.
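As a check on how this formula relates to the sum-based formula for r given earlier, the short derivation below uses only the definitions \(\bar x=\frac{\sum x}{n}\) and \(\bar y=\frac{\sum y}{n}\). It is a sketch rather than part of the original exploration, and it suggests that the two formulas describe the same quantity written in two ways.

\(\sum (x-\bar x)(y-\bar y)=\sum xy-\bar y\sum x-\bar x\sum y+n\bar x\bar y=\sum xy-\frac{(\sum x)(\sum y)}{n}\)

\(\sum (x-\bar x)^2=\sum x^2-\frac{(\sum x)^2}{n}\) and \(\sum (y-\bar y)^2=\sum y^2-\frac{(\sum y)^2}{n}\)

\(\mathfrak{R}=\frac{\sum xy-\frac{(\sum x)(\sum y)}{n}}{\sqrt{[\sum x^2-\frac{(\sum x)^2}{n}][\sum y^2-\frac{(\sum y)^2}{n}]}}=\frac{n(\sum xy)-(\sum x)(\sum y)}{\sqrt{[n\sum x^2-(\sum x)^2][n\sum y^2-(\sum y)^2]}}=r\)

The last step multiplies the numerator and the denominator by n. On any given data set, the two formulas should therefore return the same value.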

T – test

The T – test is a kind of analysis used to assess whether a correlation exists between an independent variable and a dependent variable. The T – value of the given set of data is calculated first. Then, based on the type of data (for example, paired data or independent data), the T – value is compared with the critical value in the T – table, which indicates whether a correlation exists. The formula for the T – value is given below:

 

\(T\ value=\frac{|\bar x-\bar y|}{\sqrt{\frac{v_x^2}{n_x}+\frac{v_y^2}{n_y}}}\)

 

Here, \(\bar x\) is the arithmetic mean of all the observations of the independent variable, \(\bar y\) is the arithmetic mean of all the observations of the dependent variable, \(v_x^2\) is the variance of the independent variable, \(v_y^2\) is the variance of the dependent variable, \(n_x\) is the number of observations of the independent variable, and \(n_y\) is the number of observations of the dependent variable.
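The T – value formula can be sketched in a few lines of Python. This is an illustration under stated assumptions: \(v_x^2\) and \(v_y^2\) are treated as sample variances (divisor n - 1, as computed by `statistics.variance`), and the function name `t_value` and the two small samples are hypothetical.

```python
import math
import statistics


def t_value(sample_x, sample_y):
    """Evaluate |mean_x - mean_y| / sqrt(var_x / n_x + var_y / n_y)."""
    mean_x = statistics.mean(sample_x)
    mean_y = statistics.mean(sample_y)
    # Assumption: sample variances (divisor n - 1).
    var_x = statistics.variance(sample_x)
    var_y = statistics.variance(sample_y)
    standard_error = math.sqrt(var_x / len(sample_x) + var_y / len(sample_y))
    return abs(mean_x - mean_y) / standard_error


# Hypothetical samples, not taken from the data set.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
print(round(t_value(group_a, group_b), 3))  # 1.897
```

For these samples the means are 3 and 6 and the variances are 2.5 and 10, so the T – value is 3 divided by the square root of 2.5, which is about 1.897. This value would then be compared with the critical value in the T – table.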

 

Now, the T – value is checked in T – table which predicts the existence of any correlation. The T – table is shown below:

Figure 1 - Table On T – table

Hypothesis

Null hypothesis

It is assumed that there is no correlation between the strike rate of a batsman and the height of the batsman.

Alternate hypothesis

It is assumed that there is a correlation between the strike rate of a batsman and the height of the batsman.

Data collection

Source of data

The strike rates of different batsmen, together with their heights, have been collected from a recently organised cricket tournament, the Indian Premier League 2020. The Indian Premier League, abbreviated as IPL T20, is a domestic cricket tournament organized by the BCCI (Board of Control for Cricket in India). Eight teams, each representing a particular city or state in India, compete in a tournament lasting two to three months, for which players from across the globe are contracted and assigned to teams. As each match is a twenty-over match, it is often abbreviated as a T20 series.

Justification on selecting the source as IPL T20

IPL T20 has been selected as the data source for several reasons. Firstly, although the IPL is a domestic tournament organized by the BCCI, it brings together players from across the globe. This allows the data set to contain more generalized observations rather than ones specific to a single country. Secondly, IPL T20 is one of the most recently organized tournaments, so the data set reflects the current style of playing cricket. Thirdly, IPL is a twenty-over game, which requires runs to be scored off a smaller number of balls. As a result, the strike rates of batsmen in this tournament tend to be higher than in longer formats. Higher observed values make it easier to identify a correlation than smaller observed values would.

Raw data table

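| Sl. No | Batsmen | Height (cm) | Strike rate |
|---|---|---|---|
| 1 | Shakib al Hassan | 155 | 82.05 |
| 2 | Mushfiqur Rahim | 160 | 92.67 |
| 3 | Rashid Khan | 168 | 100.96 |
| 4 | Kusal Perera | 168 | 110.97 |
| 5 | Rishabh Pant | 170 | 89.23 |
| 6 | David Warner | 170 | 89.36 |
| 7 | JP Duminy | 170 | 97.22 |
| 8 | Rohit Sharma | 170 | 98.33 |
| 9 | Kane Williamson | 173 | 99.8 |
| 10 | Nicholas Pooran | 173 | 100.27 |
| 11 | Mosaddek Hossain | 174 | 106.36 |
| 12 | MS Dhoni | 175 | 87.78 |
| 13 | Mohammed Hafeez | 175 | 88.77 |
| 14 | Virat Kohli | 175 | 94.04 |
| 15 | Liton Das | 175 | 110.17 |
| 16 | Eoin Morgan | 175 | 111.07 |
| 17 | Aaron Finch | 176 | 102.21 |
| 18 | Usman Khawaja | 177 | 88.26 |
| 19 | Jonny Bairstow | 178 | 92.84 |
| 20 | Colin Munro | 178 | 97.65 |
| 21 | Shimron Hetmyer | 178 | 101.58 |
| 22 | Mohammad Saifuddin | 179 | 120.83 |
| 23 | Najibullah Zadran | 180 | 88.8 |
| 24 | Mahmudullah | 180 | 89.75 |
| 25 | Haris Sohail | 180 | 94.28 |
| 26 | Shikhar Dhawan | 180 | 103.3 |
| 27 | Jos Buttler | 180 | 122.83 |
| 28 | Avishka Fernando | 181 | 105.72 |
| 29 | Jason Roy | 182 | 115.36 |
| 30 | Glen Maxwell | 182 | 150 |
| 31 | Alex Carey | 182 | 104.45 |
| 32 | Joe Root | 183 | 89.53 |
| 33 | Hazratullah Zazai | 183 | 94.11 |
| 34 | Colin de Grandhomme | 183 | 100.52 |
| 35 | Soumya Sarkar | 183 | 101.21 |
| 36 | Hardik Pandya | 183 | 112.43 |
| 37 | Chris Woakes | 185 | 89.93 |
| 38 | Ben Stokes | 185 | 93.18 |
| 39 | Thisara Perera | 185 | 95.31 |
| 40 | Wahab Riaz | 185 | 127.53 |
| 41 | Imad Wasim | 187 | 118.24 |
| 42 | Chris Gayle | 188 | 88.32 |
| 43 | Rassie van der Dussen | 188 | 90.37 |
| 44 | Martin Guptill | 188 | 143.13 |
| 45 | David Miller | 191 | 117.94 |
| 46 | Nathan Coulter-Nile | 191 | 136.11 |
| 47 | Carlos Brathwaite | 193 | 106.2 |
| 48 | Chris Morris | 196 | 121.31 |
| 49 | Mitchell Stark | 197 | 89.47 |
| 50 | Jason Holder | 201 | 108.97 |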

Figure 2 - Table On Strike Rate Of 50 Batsman Along With Their Height (In Cm)

Processed data table

Figure 3 - Table On Processed Data Table For Strike Rate Of 50 Batsman Along With Their Height (In Cm)

Sample calculation

\(\text{Mean }= \frac{y_1+y_2+...+y_n}{n}\)

 

\(\text{Arithmetic Mean }= \frac{82.05+92.67+100.96+...+89.47+108.97}{50} = 103.2144\)

 

\(\text{Standard Deviation }= \sqrt{\frac{(\bar y-y_1)^2+(\bar y-y_2)^2+...+(\bar y-y_n)^2}{n-1}}\)

\(\text{Standard Deviation }= \sqrt{\frac{(103.2144-82.05)^2+(103.2144-92.67)^2+...+(103.2144-108.97)^2}{49}} = 14.967\)
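The sample calculation can be reproduced with Python's standard `statistics` module. The sketch below copies the 50 strike rates from the raw data table and assumes the sample form of the standard deviation (divisor n - 1, which is what `statistics.stdev` computes), since that is the form that agrees with the reported value of about 14.967.

```python
import statistics

# Strike rates of the 50 batsmen, in the order of the raw data table.
strike_rates = [
    82.05, 92.67, 100.96, 110.97, 89.23, 89.36, 97.22, 98.33, 99.8, 100.27,
    106.36, 87.78, 88.77, 94.04, 110.17, 111.07, 102.21, 88.26, 92.84, 97.65,
    101.58, 120.83, 88.8, 89.75, 94.28, 103.3, 122.83, 105.72, 115.36, 150,
    104.45, 89.53, 94.11, 100.52, 101.21, 112.43, 89.93, 93.18, 95.31, 127.53,
    118.24, 88.32, 90.37, 143.13, 117.94, 136.11, 106.2, 121.31, 89.47, 108.97,
]

mean_sr = statistics.mean(strike_rates)
# Assumption: sample standard deviation (divisor n - 1).
sd_sr = statistics.stdev(strike_rates)

print(f"mean = {mean_sr:.4f}")
print(f"standard deviation = {sd_sr:.3f}")
```

The sum of the 50 strike rates is 5160.72, so the mean is 5160.72 / 50 = 103.2144, matching the value above.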

Processed data table analysis

The mean strike rate of all the batsmen is 103.2144, and the standard deviation is 14.967, which is about 14.5% of the mean. This fairly high standard deviation indicates that the strike rates are spread over a wide range around the mean. As a result, it can be assumed that the strike rate varies considerably from one player to another.

Graphical analysis

Linear correlation

Figure 4 - Linear Correlation Between Strike Rate And Height Of Batsman

Polynomial correlation

Figure 5 - Polynomial Correlation Between Strike Rate And Height Of Batsman

Choice of axes

The X – Axis of the graph denotes the height of the batsman measured in centimetres (independent variable).

 

The Y – Axis of the graph denotes the strike rate of the batsman (dependent variable).
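With height on the X – Axis and strike rate on the Y – Axis, a least-squares trend line and the correlation coefficient can be computed from the raw data table. The sketch below is one possible way to do this and is not necessarily the method used to produce the figures; the function name `linear_fit` is hypothetical, and the two lists are copied from the raw data table in the same order.

```python
import math

# Heights (cm) and strike rates of the 50 batsmen, in table order.
heights = [
    155, 160, 168, 168, 170, 170, 170, 170, 173, 173,
    174, 175, 175, 175, 175, 175, 176, 177, 178, 178,
    178, 179, 180, 180, 180, 180, 180, 181, 182, 182,
    182, 183, 183, 183, 183, 183, 185, 185, 185, 185,
    187, 188, 188, 188, 191, 191, 193, 196, 197, 201,
]
strike_rates = [
    82.05, 92.67, 100.96, 110.97, 89.23, 89.36, 97.22, 98.33, 99.8, 100.27,
    106.36, 87.78, 88.77, 94.04, 110.17, 111.07, 102.21, 88.26, 92.84, 97.65,
    101.58, 120.83, 88.8, 89.75, 94.28, 103.3, 122.83, 105.72, 115.36, 150,
    104.45, 89.53, 94.11, 100.52, 101.21, 112.43, 89.93, 93.18, 95.31, 127.53,
    118.24, 88.32, 90.37, 143.13, 117.94, 136.11, 106.2, 121.31, 89.47, 108.97,
]


def linear_fit(xs, ys):
    """Return (slope, intercept, r) of the least-squares line y = a + bx."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r = s_xy / math.sqrt(s_xx * s_yy)
    return slope, intercept, r


slope, intercept, r = linear_fit(heights, strike_rates)
print(f"strike rate = {intercept:.3f} + {slope:.3f} * height")
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

The fitted line always passes through the point formed by the mean height and the mean strike rate, which gives a simple check on the output.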