Mathematics AI SL

Sample Internal Assessment

7/7

Table of contents

Rationale

Aim

Research question

Background information

Atomic power plant

Regression correlation coefficient

Pearson’s correlation coefficient

Exploration methodology

Hypothesis

Null hypothesis

Alternate hypothesis

Data collection

Case 1: for group 1 (30 years to 45 years)

Analysis of graph 1

Case 2: for group 2 (45 years to 60 years)

Case 3: for group 3 (60 years to 75 years)

Conclusion

Reflection

Strength

Weakness

Future scope

Bibliography

As an inquirer and a creative thinker, I have always aspired to contribute to society with the skills and knowledge I acquire. I believe that real-life experience is what genuinely motivates an investigation. I recently encountered cancer, one of the most harmful diseases, when one of my neighbours was diagnosed with it. Because he works at a nuclear power plant, his doctors suspected that radiation leakage was one of the causes. This claim raised several questions in my mind. Does working in a nuclear power station cause cancer? Does the age of nuclear power plant employees increase their chance of developing cancer? To answer these questions, I did some research. I read a few research journals on cancer and medical science, which helped me understand the different causative agents of cancer.

Having understood several causes of cancer, I explored the probability of developing cancer in relation to one significant parameter of a nuclear power plant: the number of employees. To establish a correlation between the number of cancer cases at a nuclear power station and its total number of employees, I also researched different correlation coefficients with which to justify the result. In the process, I learnt to use Pearson’s correlation coefficient, which is closely related to the regression correlation coefficient that I studied in the IB curriculum.

After this research, I arrived at the research question of this exploration, which asks whether a person working in a nuclear power plant with a larger number of employees has a greater chance of developing cancer than a person working in a plant with fewer employees.

The prime objective of this exploration is to derive a relationship between the number of workers at an atomic power station who are diagnosed with cancer and the total number of people working at that station.

To what extent is there a correlation, for three different age groups of individuals (Gr 1: 30 years to 45 years, Gr 2: 45 years to 60 years, and Gr 3: 60 years to 75 years), between the number of workers diagnosed with cancer during their period of service or after retirement at different atomic power plants in the United States of America and the total number of workers at each atomic power plant?

Dr. Adam Nazha

Top IB Math Tutor: 45/45 IBDP, 7/7 Further Math, 7 Yrs Exp, Medicine Student

An atomic power plant uses nuclear fission to generate energy. Fission takes place in nuclear reactors, where it produces heat that is then used to generate electricity. During the process, several kinds of radiation are emitted, such as α-rays, β-rays and γ-rays. Of these, the γ-ray is the most penetrating and is therefore regarded here as the most harmful. Although many precautions are taken in atomic power plants to prevent radiation leakage, cases of leakage are still observed, and they affect human life and the environment.

The regression correlation coefficient, \(r^2\), indicates the strength of a correlation between a dependent variable and its corresponding independent variable. Its value lies between 0 and 1: the maximum strength of correlation is denoted by 1, whereas the minimum strength, or no correlation, is denoted by 0. The mathematical formulation of the regression correlation coefficient for a linear trend is shown below:

\(r^2=\bigg[\frac{n\big(\sum xy\big)-(\sum x)(\sum y)}{\sqrt{[n\sum x^2-\big(\sum x\big)^2][n\sum y^2-\big(\sum y\big)^2]}}\bigg]^2\)

*x = independent variable*

*y = dependent variable*

*r² = regression correlation coefficient*

*n = number of observations*
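The formula above can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original exploration: the function name `regression_r_squared` and the four example points are assumptions chosen so that the points lie exactly on a straight line.

```python
from math import sqrt


def regression_r_squared(xs, ys):
    """Return r^2 for paired observations using the sums-based formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return (numerator / denominator) ** 2


# Illustrative values only (not taken from the data tables): y = 2x + 1 exactly
example_x = [1, 2, 3, 4]
example_y = [3, 5, 7, 9]
r2_example = regression_r_squared(example_x, example_y)
```

Because the example points are perfectly linear, the function returns the maximum value of 1 for them.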

Pearson’s correlation coefficient indicates both the strength and the nature of a correlation between a dependent variable and its corresponding independent variable. Its value lies between −1 and 1. The maximum strength of correlation is denoted by ±1, whereas the minimum strength, or no correlation, is denoted by 0. A positive value of Pearson’s coefficient signifies that the relationship is increasing, and a negative value indicates that the relationship is decreasing. The mathematical formulation of Pearson’s correlation coefficient for a linear trend is shown below:

\(R=\frac{\sum(x-\bar x)(y-\bar y)}{\sqrt{\sum(x-\bar x)^2\times\sum(y-\bar y)^2}}\)

*x = independent variable*

*y = dependent variable*

*R = Pearson's correlation coefficient*

\(\bar x\) *= mean value of all observations of the independent variable*

\(\bar y\) *= mean value of all observations of the dependent variable*
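As a minimal sketch of the deviation form above, the following Python function computes *R* from the deviations about the two means. The function name `pearson_r` and both small data sets are illustrative assumptions; they are chosen to show the sign convention, with one perfectly increasing and one perfectly decreasing relationship.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's R from deviations about the means of xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_dev_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_dev_x2 = sum((x - mean_x) ** 2 for x in xs)
    sum_dev_y2 = sum((y - mean_y) ** 2 for y in ys)
    return sum_dev_xy / sqrt(sum_dev_x2 * sum_dev_y2)


# Illustrative values only (not taken from the data tables)
increasing = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])   # y rises with x
decreasing = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # y falls as x rises
```

The first call gives a coefficient of +1 and the second gives −1, matching the interpretation of the sign described above.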

In this exploration, ten major atomic power stations in the United States of America are chosen. For each station, the total number of employees who currently work or have previously worked there has been collected for the three age groups mentioned in the research question, together with the number of those employees diagnosed with cancer during their tenure of service or after retirement. To check the consistency of the collected data, the percentage of employees diagnosed with cancer has been calculated for each power station. Finally, for each age group, the number of employees diagnosed with cancer at each power station has been plotted against the total number of employees who work or have worked at that station. To verify the correlation, the regression correlation coefficient and Pearson's correlation coefficient have been calculated, and the correlation has been evaluated using a T-test.

**Null hypothesis:** It is assumed that there is no correlation between the number of employees diagnosed with cancer during their period of service or after retirement at different nuclear power plants in the United States of America and the total number of employees working at each nuclear power plant.

**Alternate hypothesis:** It is assumed that there is a correlation between the number of employees diagnosed with cancer during their period of service or after retirement at different nuclear power plants in the United States of America and the total number of employees working at each nuclear power plant.

**Data table -**

| Name | Total employees | Diagnosed with cancer | Percentage (%) |
|---|---|---|---|
| Rochester City Project | 328 | 33 | 10.06 |
| Chicago City Project | 348 | 36 | 10.34 |
| San Diego City Project | 386 | 42 | 10.88 |
| Newark City Project | 452 | 72 | 15.93 |
| Texas City Project | 458 | 53 | 11.57 |
| Dayton City Project | 673 | 88 | 13.08 |
| Virginia City Project | 724 | 102 | 14.09 |
| Utah City Project | 977 | 177 | 18.12 |
| Boston City Project | 1563 | 301 | 19.26 |
| Austin City Project | 3874 | 878 | 22.66 |

*Sample Calculation:*

Percentage of workers diagnosed with cancer in Rochester City Project

\(= \frac{33}{328}\times 100 = 10.06\%\)
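The same calculation can be repeated for every row of the table. The short Python sketch below uses the Group 1 totals and case counts from the data table; the names `table_1` and `percentages` are illustrative, and rounding to two decimal places is an assumption that matches the table.

```python
# (total employees, employees diagnosed with cancer) for Group 1, from the data table
table_1 = {
    "Rochester City Project": (328, 33),
    "Chicago City Project": (348, 36),
    "San Diego City Project": (386, 42),
    "Newark City Project": (452, 72),
    "Texas City Project": (458, 53),
    "Dayton City Project": (673, 88),
    "Virginia City Project": (724, 102),
    "Utah City Project": (977, 177),
    "Boston City Project": (1563, 301),
    "Austin City Project": (3874, 878),
}

# Percentage = cases / total x 100, rounded to two decimal places
percentages = {
    name: round(100 * cases / total, 2)
    for name, (total, cases) in table_1.items()
}
```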

**Graphical Analysis:**

The above graph represents, for employees aged between 30 and 45, the relationship between the number diagnosed with cancer during their tenure of service and the total number of employees at different nuclear power plants in the USA. The total number of employees at each power plant, being the independent variable of the exploration, is plotted along the x-axis. The number of those employees diagnosed with cancer, being the dependent variable, is plotted along the y-axis. As the total number of employees at a power plant increases from 328 to 3874, the number diagnosed with cancer increases from 33 to 878. Hence, an increasing linear trend has been obtained in the graph, i.e., with an increase in the number of workers at a power plant, the number of employees diagnosed with cancer increases. The equation of the trend line obtained in the graph is shown below:

\(y = 0.2386x - 54.366\)

Here, *x* represents the total number of employees at a power plant, and *y* represents the number of those employees diagnosed with cancer.
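To illustrate how the trend equation is read, the sketch below evaluates it at the smallest and largest plant sizes in the Group 1 data. The coefficients are those quoted above; the function name `trend_group_1` is a hypothetical label introduced only for this example.

```python
def trend_group_1(total_employees):
    """Estimated number of cancer cases from the Group 1 trend line."""
    return 0.2386 * total_employees - 54.366


# Estimates at the two ends of the observed range of x
estimate_smallest_plant = trend_group_1(328)    # observed value in the table: 33
estimate_largest_plant = trend_group_1(3874)    # observed value in the table: 878
```

Comparing each estimate with the observed value in the table shows how closely the line follows the data at either end of the range.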

Despite the very high regression coefficient of 0.99, the data set itself calls the reliability of the correlation into question, because there is a large gap in the total number of employees (independent variable) between 1600 and 3800. As no values of the dependent variable are available for that range of the independent variable, the correlation cannot be said to be reliable.
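One way to probe this concern, offered here as a hedged sketch rather than as part of the original method, is to recompute Pearson's coefficient with the largest plant left out and compare the two values. The data are the Group 1 pairs from the data table; the helper `pearson_r` and the variable names are illustrative.

```python
from math import sqrt


def pearson_r(pairs):
    """Pearson's R for a list of (x, y) pairs, using deviations about the means."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / sqrt(sxx * syy)


# (total employees, employees diagnosed with cancer) for Group 1
group_1 = [(328, 33), (348, 36), (386, 42), (452, 72), (458, 53),
           (673, 88), (724, 102), (977, 177), (1563, 301), (3874, 878)]

r_all = pearson_r(group_1)

# Drop the plant with the largest workforce, which sits beyond the gap in x
largest = max(group_1)
r_without_largest = pearson_r([p for p in group_1 if p != largest])
```

If the coefficient changed sharply when the isolated point was removed, that would suggest the correlation depends heavily on that single plant.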

**Calculation of Regression Coefficient -**

In the processed data table, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, and ∑ denotes summation.

| x | y | \(x^2\) | \(y^2\) | xy |
|---|---|---|---|---|
| 328 | 33 | 107584 | 1089 | 10824 |
| 348 | 36 | 121104 | 1296 | 12528 |
| 386 | 42 | 148996 | 1764 | 16212 |
| 452 | 72 | 204304 | 5184 | 32544 |
| 458 | 53 | 209764 | 2809 | 24274 |
| 673 | 88 | 452929 | 7744 | 59224 |
| 724 | 102 | 524176 | 10404 | 73848 |
| 977 | 177 | 954529 | 31329 | 172929 |
| 1563 | 301 | 2442969 | 90601 | 470463 |
| 3874 | 878 | 15007876 | 770884 | 3401372 |
| Σx = 9783 | Σy = 1782 | Σx² = 20174231 | Σy² = 923104 | Σxy = 4274218 |

Calculation:

\(r^2=\bigg[\frac{n(Σxy)-(Σx)(Σy)}{\sqrt{[nΣx^2-(Σx)^2][nΣy^2-(Σy)^2]}}\bigg]^2\)

\(\Rightarrow r^2=\bigg[\frac{10(4274218)-(9783)(1782)}{\sqrt{[10×20174231-(9783)^2][10×923104-(1782)^2]}}\bigg]^2\)

\(\Rightarrow r^2=(0.9987)^2=0.9975\)
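As a check on the arithmetic, the substitution can be evaluated term by term. The intermediate values below are computed from the column totals of the processed data table; the final figures are rounded, so a difference in the last decimal place from the values quoted above is to be expected.

\(nΣxy-(Σx)(Σy)=42742180-17433306=25308874\)

\(nΣx^2-(Σx)^2=201742310-95707089=106035221\)

\(nΣy^2-(Σy)^2=9231040-3175524=6055516\)

\(r=\frac{25308874}{\sqrt{106035221×6055516}}\approx 0.9988,\quad r^2\approx 0.9976\)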

**Calculation of Pearson’s Correlation Coefficient -**

In the processed data table, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, \(\bar x\) denotes the mean number of employees per plant, \(\bar y\) denotes the mean number of employees diagnosed with cancer, and ∑ denotes summation.

| x | y | \(x-\bar x\) | \(y-\bar y\) | \((x-\bar x)(y-\bar y)\) | \((x-\bar x)^2\) | \((y-\bar y)^2\) |
|---|---|---|---|---|---|---|
| 328 | 33 | -650.30 | -145.20 | 94423.56 | 422890.09 | 21083.04 |
| 348 | 36 | -630.30 | -142.20 | 89628.66 | 397278.09 | 20220.84 |
| 386 | 42 | -592.30 | -136.20 | 80671.26 | 350819.29 | 18550.44 |
| 452 | 72 | -526.30 | -106.20 | 55893.06 | 276991.69 | 11278.44 |
| 458 | 53 | -520.30 | -125.20 | 65141.56 | 270712.09 | 15675.04 |
| 673 | 88 | -305.30 | -90.20 | 27538.06 | 93208.09 | 8136.04 |
| 724 | 102 | -254.30 | -76.20 | 19377.66 | 64668.49 | 5806.44 |
| 977 | 177 | -1.30 | -1.20 | 1.56 | 1.69 | 1.44 |
| 1563 | 301 | 584.70 | 122.80 | 71801.16 | 341874.09 | 15079.84 |
| 3874 | 878 | 2895.70 | 699.80 | 2026410.86 | 8385078.49 | 489720.04 |

Calculation -

\(\bar x=\frac{Σx}{N}=\frac{9783}{10}=978.3\)

\(\bar y=\frac{Σy}{N}=\frac{1782}{10}=178.2\)

\(Σ(x-\bar x)(y-\bar y)=2530887.40\)

\(Σ(x-\bar x)^2=10603522.10\)

\(Σ(y-\bar y)^2=605551.60\)

\(R=\frac{Σ(x-\bar x)(y-\bar y)}{\sqrt{Σ(x-\bar x)^2×Σ(y-\bar y)^2}}\)

\(R=\frac{2530887.40}{\sqrt{10603522.10×605551.60}}=0.998\)
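The column totals used in this calculation can be reproduced directly from the Group 1 data. The sketch below is a verification aid under the same definitions of *x*, *y*, \(\bar x\) and \(\bar y\); the variable names are illustrative.

```python
from math import sqrt

# Group 1 data: total employees (x) and employees diagnosed with cancer (y)
x_values = [328, 348, 386, 452, 458, 673, 724, 977, 1563, 3874]
y_values = [33, 36, 42, 72, 53, 88, 102, 177, 301, 878]

n = len(x_values)
mean_x = sum(x_values) / n
mean_y = sum(y_values) / n

# Totals of the three deviation columns of the processed data table
sum_dev_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
sum_dev_x2 = sum((x - mean_x) ** 2 for x in x_values)
sum_dev_y2 = sum((y - mean_y) ** 2 for y in y_values)

pearson_group_1 = sum_dev_xy / sqrt(sum_dev_x2 * sum_dev_y2)
```

The three totals agree with the sums stated above, and the resulting coefficient matches the reported value up to rounding in the last decimal place.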

**Evaluation by T – Test -**

In the calculation shown below, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, \(\bar x\) denotes the mean number of employees per plant, \(\bar y\) denotes the mean number of employees diagnosed with cancer, \(n_x\) represents the number of observations of the total number of employees (independent variable), and \(n_y\) represents the number of observations of employees diagnosed with cancer (dependent variable). The quantity \(S^2\) used in the test is calculated as shown below:

\(S^2=\frac{Σ(x-\bar x)^2+Σ(x-\bar y)^2}{n_x+n_y-2}\)

The mathematical formulation of T – Value is also shown below:

\(T\ value=\frac{|\bar x-\bar y|}{\sqrt{\frac{S^2}{n_x}+\frac{S^2}{n_y}}}\)

For calculation of T – Value required for this test, Table 1 has been followed:

\(\bar x=\frac{9783}{10}=978.3\)

\(\bar y=\frac{1782}{10}=178.2\)

\(S^2=\frac{Σ(x-\bar x)^2+Σ(x-\bar y)^2}{n_x+n_y-2}\)

\(=\frac{(328-978.3)^2+...+(3874-978.3)^2+(328-178.2)^2+...+(3874-178.2)^2}{10+10-2}\)

\(= 1533813.57\)

\(T\ value=\frac{|978.3-178.2|}{\sqrt{\frac{1533813.57}{10}+\frac{1533813.57}{10}}}=\frac{800.1}{553.86}=1.44\)

Comparing the T – Value with respect to the values in T – Table, it can be stated that the Alternate Hypothesis is true.
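As a cross-check, a common alternative way to test whether a correlation coefficient differs from zero is the statistic \(t = R\sqrt{n-2}/\sqrt{1-R^2}\) with \(n-2\) degrees of freedom. This is a different technique from the calculation used above and is not part of the original exploration. The sketch below applies it to the Group 1 coefficient reported earlier; the function name is illustrative, and the critical value 2.306 is the standard two-tailed 5% value of Student's t for 8 degrees of freedom.

```python
from math import sqrt


def correlation_t_statistic(r, n):
    """t statistic for testing whether a correlation coefficient is zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


R_GROUP_1 = 0.998   # Pearson's coefficient reported for Group 1
N_PLANTS = 10       # number of power plants (paired observations)

# Two-tailed 5% critical value of Student's t for n - 2 = 8 degrees of freedom
T_CRITICAL_DF_8 = 2.306

t_statistic = correlation_t_statistic(R_GROUP_1, N_PLANTS)
reject_null = abs(t_statistic) > T_CRITICAL_DF_8
```

Under this alternative test the statistic is far above the critical value, which points in the same direction as the alternate hypothesis.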

**Data Table:**

| Name | Total employees | Diagnosed with cancer | Percentage (%) |
|---|---|---|---|
| Rochester City Project | 333 | 36 | 10.81 |
| Chicago City Project | 344 | 38 | 11.05 |
| San Diego City Project | 378 | 57 | 15.08 |
| Newark City Project | 462 | 99 | 21.43 |
| Texas City Project | 486 | 102 | 20.99 |
| Dayton City Project | 620 | 114 | 18.39 |
| Virginia City Project | 797 | 144 | 18.07 |
| Utah City Project | 971 | 160 | 16.48 |
| Boston City Project | 1497 | 297 | 19.84 |
| Austin City Project | 3388 | 790 | 23.32 |

*Sample Calculation:*

Refer to the Sample Calculation shown for Table No. 1.

**Graphical Analysis:**

*Analysis of Graph 2:*

The above graph represents, for employees aged between 45 and 60, the relationship between the number diagnosed with cancer during their tenure of service and the total number of employees at different nuclear power plants in the USA. The total number of employees at each power plant, being the independent variable of the exploration, is plotted along the x-axis, and the number of those employees diagnosed with cancer, being the dependent variable, is plotted along the y-axis. As the total number of employees at a power plant increases from 333 to 3388, the number diagnosed with cancer increases from 36 to 790. Hence, an increasing linear trend has been obtained in the graph, i.e., with an increase in the number of workers at a power plant, the number of employees diagnosed with cancer increases. The equation of the trend line obtained in the graph is shown below:

\(y = 0.2401x - 38.697\)

Here, *x* represents the total number of employees at a power plant, and *y* represents the number of those employees diagnosed with cancer.

Despite the very high regression coefficient of 0.99, the data set itself calls the reliability of the correlation into question, because there is a large gap in the total number of employees (independent variable) between 1500 and 3400. As no values of the dependent variable are available for that range of the independent variable, the correlation cannot be said to be reliable.

**Calculation of Regression Coefficient:**

In the processed data table, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, and ∑ denotes summation.

| x | y | \(x^2\) | \(y^2\) | xy |
|---|---|---|---|---|
| 333 | 36 | 110889 | 1296 | 11988 |
| 344 | 38 | 118336 | 1444 | 13072 |
| 378 | 57 | 142884 | 3249 | 21546 |
| 462 | 99 | 213444 | 9801 | 45738 |
| 486 | 102 | 236196 | 10404 | 49572 |
| 620 | 114 | 384400 | 12996 | 70680 |
| 797 | 144 | 635209 | 20736 | 114768 |
| 971 | 160 | 942841 | 25600 | 155360 |
| 1497 | 297 | 2241009 | 88209 | 444609 |
| 3388 | 790 | 11478544 | 624100 | 2676520 |
| Σx = 9276 | Σy = 1837 | Σx² = 16503752 | Σy² = 797835 | Σxy = 3603853 |

*Calculation -*

\(r^2 = 0.9925\)

For calculation, refer to the calculation of regression coefficient as shown in Case 1.
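For reference, the Case 1 formula can be evaluated with the Group 2 column totals as follows. The intermediate values are computed from the totals in the processed data table, and the final figure is rounded to four decimal places.

\(nΣxy-(Σx)(Σy)=36038530-17040012=18998518\)

\(nΣx^2-(Σx)^2=165037520-86044176=78993344\)

\(nΣy^2-(Σy)^2=7978350-3374569=4603781\)

\(r^2=\frac{(18998518)^2}{78993344×4603781}\approx 0.9925\)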

**Calculation of Pearson’s Correlation Coefficient -**

In the processed data table, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, \(\bar x\) denotes the mean number of employees per plant, \(\bar y\) denotes the mean number of employees diagnosed with cancer, and ∑ denotes summation.

| x | y | \(x-\bar x\) | \(y-\bar y\) | \((x-\bar x)(y-\bar y)\) | \((x-\bar x)^2\) | \((y-\bar y)^2\) |
|---|---|---|---|---|---|---|
| 333 | 36 | -594.6 | -147.7 | 87822.42 | 353549.16 | 21815.29 |
| 344 | 38 | -583.6 | -145.7 | 85030.52 | 340588.96 | 21228.49 |
| 378 | 57 | -549.6 | -126.7 | 69634.32 | 302060.16 | 16052.89 |
| 462 | 99 | -465.6 | -84.7 | 39436.32 | 216783.36 | 7174.09 |
| 486 | 102 | -441.6 | -81.7 | 36078.72 | 195010.56 | 6674.89 |
| 620 | 114 | -307.6 | -69.7 | 21439.72 | 94617.76 | 4858.09 |
| 797 | 144 | -130.6 | -39.7 | 5184.82 | 17056.36 | 1576.09 |
| 971 | 160 | 43.4 | -23.7 | -1028.58 | 1883.56 | 561.69 |
| 1497 | 297 | 569.4 | 113.3 | 64513.02 | 324216.36 | 12836.89 |
| 3388 | 790 | 2460.4 | 606.3 | 1491740.52 | 6053568.16 | 367599.69 |

*Calculation -*

R = 0.996

For calculation, refer to the calculation of Pearson’s coefficient shown for Case 1.
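The means and deviations for Group 2 can be checked with a short sketch. It uses the Group 2 values of *x* and *y* from the data table, with \(\bar x = Σx/n\) and \(\bar y = Σy/n\) as in Case 1; the variable names are illustrative.

```python
from math import sqrt

# Group 2 data: total employees (x) and employees diagnosed with cancer (y)
x_values = [333, 344, 378, 462, 486, 620, 797, 971, 1497, 3388]
y_values = [36, 38, 57, 99, 102, 114, 144, 160, 297, 790]

n = len(x_values)
mean_x = sum(x_values) / n   # 9276 / 10
mean_y = sum(y_values) / n   # 1837 / 10

# Deviations of each observation from its mean, row by row
deviations = [(x - mean_x, y - mean_y) for x, y in zip(x_values, y_values)]

sum_dev_xy = sum(dx * dy for dx, dy in deviations)
sum_dev_x2 = sum(dx ** 2 for dx, _ in deviations)
sum_dev_y2 = sum(dy ** 2 for _, dy in deviations)

pearson_group_2 = sum_dev_xy / sqrt(sum_dev_x2 * sum_dev_y2)
```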

**Evaluation by T – Test -**

In the calculation shown below, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, \(\bar x\) denotes the mean number of employees per plant, \(\bar y\) denotes the mean number of employees diagnosed with cancer, \(n_x\) represents the number of observations of the total number of employees (independent variable), and \(n_y\) represents the number of observations of employees diagnosed with cancer (dependent variable). The quantity \(S^2\) used in the test is calculated as shown below:

\(S^2=\frac{Σ(x-\bar x)^2+Σ(x-\bar y)^2}{n_x+n_y-2}\)

The mathematical formulation of T – Value is also shown below:

\(T\ value=\frac{|\bar x-\bar y|}{\sqrt{\frac{S^2}{n_x}+\frac{S^2}{n_y}}}\)

For calculation of T – Value required for this test, Table 4 has been followed:

T - value = 1.45

For calculation, refer to the calculation of T – value as shown in Case 1.

Comparing the T – Value with respect to the values in T – Table, it can be stated that the Alternate Hypothesis is true.

**Data Table:**

| Name | Total employees | Diagnosed with cancer | Percentage (%) |
|---|---|---|---|
| Rochester City Project | 290 | 45 | 15.52 |
| Chicago City Project | 299 | 53 | 17.73 |
| San Diego City Project | 302 | 66 | 21.85 |
| Newark City Project | 402 | 135 | 33.58 |
| Texas City Project | 435 | 137 | 31.49 |
| Dayton City Project | 544 | 106 | 19.49 |
| Virginia City Project | 643 | 188 | 29.24 |
| Utah City Project | 878 | 191 | 21.75 |
| Boston City Project | 1271 | 399 | 31.39 |
| Austin City Project | 2893 | 983 | 33.98 |

*Sample Calculation:*

Refer to the Sample Calculation shown for Table No. 1.

**Graphical Analysis:**

*Analysis of Graph 3 -*

The above graph represents, for employees aged between 60 and 75, the relationship between the number diagnosed with cancer during their tenure of service and the total number of employees at different nuclear power plants in the USA. The total number of employees at each power plant, being the independent variable of the exploration, is plotted along the x-axis, and the number of those employees diagnosed with cancer, being the dependent variable, is plotted along the y-axis. As the total number of employees at a power plant increases from 290 to 2893, the number diagnosed with cancer increases from 45 to 983. Hence, an increasing linear trend has been obtained in the graph, i.e., with an increase in the number of workers at a power plant, the number of employees diagnosed with cancer increases. The equation of the trend line obtained in the graph is shown below:

\(y = 0.352x - 50.422\)

Here, *x* represents the total number of employees at a power plant, and *y* represents the number of those employees diagnosed with cancer.

Despite the very high regression coefficient of 0.98, the data set itself calls the reliability of the correlation into question, because there is a large gap in the total number of employees (independent variable) between 1400 and 2700. As no values of the dependent variable are available for that range of the independent variable, the correlation cannot be said to be reliable.

**Calculation of Regression Coefficient -**

In the processed data table, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, and ∑ denotes summation.

| x | y | \(x^2\) | \(y^2\) | xy |
|---|---|---|---|---|
| 290 | 45 | 84100 | 2025 | 13050 |
| 299 | 53 | 89401 | 2809 | 15847 |
| 302 | 66 | 91204 | 4356 | 19932 |
| 402 | 135 | 161604 | 18225 | 54270 |
| 435 | 137 | 189225 | 18769 | 59595 |
| 544 | 106 | 295936 | 11236 | 57664 |
| 643 | 188 | 413449 | 35344 | 120884 |
| 878 | 191 | 770884 | 36481 | 167698 |
| 1271 | 399 | 1615441 | 159201 | 507129 |
| 2893 | 983 | 8369449 | 966289 | 2843819 |
| Σx = 7957 | Σy = 2303 | Σx² = 12080693 | Σy² = 1254735 | Σxy = 3859888 |

*Calculation -*

\(r^2 = 0.987\)

For calculation, refer to the calculation of regression coefficient as shown in Case 1.
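For reference, the Case 1 formula can be evaluated with the Group 3 column totals as follows. The intermediate values are computed from the totals in the processed data table, and the final figure is rounded to three decimal places.

\(nΣxy-(Σx)(Σy)=38598880-18324971=20273909\)

\(nΣx^2-(Σx)^2=120806930-63313849=57493081\)

\(nΣy^2-(Σy)^2=12547350-5303809=7243541\)

\(r^2=\frac{(20273909)^2}{57493081×7243541}\approx 0.987\)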

**Calculation of Pearson’s Correlation Coefficient -**

In the processed data table, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, \(\bar x\) denotes the mean number of employees per plant, \(\bar y\) denotes the mean number of employees diagnosed with cancer, and ∑ denotes summation.

| x | y | \(x-\bar x\) | \(y-\bar y\) | \((x-\bar x)(y-\bar y)\) | \((x-\bar x)^2\) | \((y-\bar y)^2\) |
|---|---|---|---|---|---|---|
| 290 | 45 | -505.70 | -185.30 | 93706.21 | 255732.49 | 34336.09 |
| 299 | 53 | -496.70 | -177.30 | 88064.91 | 246710.89 | 31435.29 |
| 302 | 66 | -493.70 | -164.30 | 81114.91 | 243739.69 | 26994.49 |
| 402 | 135 | -393.70 | -95.30 | 37519.61 | 154999.69 | 9082.09 |
| 435 | 137 | -360.70 | -93.30 | 33653.31 | 130104.49 | 8704.89 |
| 544 | 106 | -251.70 | -124.30 | 31286.31 | 63352.89 | 15450.49 |
| 643 | 188 | -152.70 | -42.30 | 6459.21 | 23317.29 | 1789.29 |
| 878 | 191 | 82.30 | -39.30 | -3234.39 | 6773.29 | 1544.49 |
| 1271 | 399 | 475.30 | 168.70 | 80183.11 | 225910.09 | 28459.69 |
| 2893 | 983 | 2097.30 | 752.70 | 1578637.71 | 4398667.29 | 566557.29 |

*Calculation -*

R = 0.993

For calculation, refer to the calculation of Pearson’s coefficient as shown in Case 1.

**Evaluation by T – Test -**

In the calculation shown below, the total number of employees working at each nuclear power plant is denoted by *x*, the number of employees diagnosed with cancer is denoted by *y*, \(\bar x\) denotes the mean number of employees per plant, \(\bar y\) denotes the mean number of employees diagnosed with cancer, \(n_x\) represents the number of observations of the total number of employees (independent variable), and \(n_y\) represents the number of observations of employees diagnosed with cancer (dependent variable). The quantity \(S^2\) used in the test is calculated as shown below:

\(S^2=\frac{Σ(x-\bar x)^2+Σ(x-\bar y)^2}{n_x+n_y-2}\)

The mathematical formulation of T – Value is also shown below:

\(T\ value=\frac{|\bar x-\bar y|}{\sqrt{\frac{S^2}{n_x}+\frac{S^2}{n_y}}}\)

For calculation of T – Value required for this test, Table 4 has been followed:

T - value = 1.43

For calculation, refer to the calculation of T – value as shown in Case 1.

Comparing the T – Value with respect to the values in T – Table, it can be stated that the Alternate Hypothesis is true.

*To what extent is there a correlation, for three different age groups of individuals (Gr 1: 30 years to 45 years, Gr 2: 45 years to 60 years, and Gr 3: 60 years to 75 years), between the number of employees diagnosed with cancer during their period of service or after retirement at different nuclear power plants in the United States of America and the total number of employees working at each nuclear power plant?*

A linear and increasing trend has been obtained, for all three age groups, between the number of employees diagnosed with cancer during their period of service or after retirement at different nuclear power plants in the United States of America and the total number of employees working at each nuclear power plant.

- For Group 1, as the total number of employees at a power plant increases from 328 to 3874, the number of employees diagnosed with cancer increases from 33 to 878.
- The equation of the trend line for Group 1 is \(y = 0.2386x - 54.366\), where *x* represents the total number of employees at a power plant and *y* represents the number of those employees diagnosed with cancer.
- As the regression coefficient (0.99) and Pearson’s correlation coefficient (0.99) for Group 1 are both very close to 1, the correlation can be stated to be existent and valid.
- The alternate hypothesis has been established for Group 1 using the T-test.
- For Group 2, as the total number of employees at a power plant increases from 333 to 3388, the number of employees diagnosed with cancer increases from 36 to 790.
- The equation of the trend line for Group 2 is \(y = 0.2401x - 38.697\), where *x* represents the total number of employees at a power plant and *y* represents the number of those employees diagnosed with cancer.
- As the regression coefficient (0.99) and Pearson’s correlation coefficient (0.99) for Group 2 are both very close to 1, the correlation can be stated to be existent and valid.
- The alternate hypothesis has been established for Group 2 using the T-test.
- For Group 3, as the total number of employees at a power plant increases from 290 to 2893, the number of employees diagnosed with cancer increases from 45 to 983.
- The equation of the trend line for Group 3 is \(y = 0.352x - 50.422\), where *x* represents the total number of employees at a power plant and *y* represents the number of those employees diagnosed with cancer.
- As the regression coefficient (0.98) and Pearson’s correlation coefficient (0.99) for Group 3 are both very close to 1, the correlation can be stated to be existent and valid.
- The alternate hypothesis has been established for Group 3 using the T-test.

- The use of two different correlation coefficients has helped to justify the validity of the correlation. Moreover, Pearson’s coefficient has enabled the investigation to conclude mathematically the nature of the correlation (increasing or decreasing).
- The age groups have been formed with an equal interval of 15 years, which has been useful in maintaining regularity throughout the exploration.
- Calculating the percentage of employees diagnosed with cancer has enabled the exploration to check the reliability of the data. Because the number of employees at different power plants varies significantly with the size of the plant, a standard deviation would not indicate the reliability of the data for each age group.
- Apart from the graphical derivation, the T-test has supported the correlation mathematically, which strengthens the correlation and hence the exploration.

- The data on the total number of employees and the number of employees diagnosed with cancer were gathered from different sources, such as news articles, newspaper surveys and the official websites of various nuclear power plants. Although these sources are considered authentic, the reliability of the data cannot be determined.

- Cancer is one of the very few diseases that cannot be claimed to be completely curable. Mathematics can therefore be used to examine the relationship between the chance of developing cancer and the presence of different causative agents. Hence, the methodology followed in this exploration could be repeated to explore the effect of other causative agents of cancer. Another research question could thus be framed as follows: “To what extent is there a correlation, for three different age groups of individuals (Gr 1: 30 years to 45 years, Gr 2: 45 years to 60 years, and Gr 3: 60 years to 75 years), between the number of traffic police employees diagnosed with cancer during their period of service or after retirement in cities in India and the carbon dioxide index of the atmosphere in the respective cities?”

- What Is Cancer? - National Cancer Institute. 17 Sept. 2007, https://www.cancer.gov/about-cancer/understanding/what-is-cancer.
- Risk Factors: Radiation - National Cancer Institute. 29 Apr. 2015, https://www.cancer.gov/about-cancer/causes-prevention/risk/radiation.
- ‘UV Radiation’. The Skin Cancer Foundation, https://www.skincancer.org/risk-factors/uv-radiation/. Accessed 22 Nov. 2020.
- Nuclear Power Plants - U.S. Energy Information Administration (EIA). https://www.eia.gov/energyexplained/nuclear/nuclear-power-plants.php. Accessed 22 Nov. 2020.
- ‘Electromagnetic Radiation - Gamma Rays’. Encyclopedia Britannica, https://www.britannica.com/science/electromagnetic-radiation. Accessed 22 Nov. 2020.
- Data Analysis - Pearson’s Correlation Coefficient. http://learntech.uwe.ac.uk/da/default.aspx?pageid=1442. Accessed 22 Nov. 2020.
- ‘Nuclear Workers May Face Higher Cancer Risk’. WebMD, https://www.webmd.com/cancer/news/20050628/nuclear-workers-may-face-higher-cancer-risk. Accessed 22 Nov. 2020.
- Parthasarathy, K. S. ‘Is Working in a Nuclear Power Plant Risky?’ The Hindu, 1 Jan. 2014. www.thehindu.com, https://www.thehindu.com/sci-tech/science/is-working-in-a-nuclear-power-plant-risky/article5526497.ece
- Accidents at Nuclear Power Plants and Cancer Risk - National Cancer Institute. 19 Apr. 2011, https://www.cancer.gov/about-cancer/causes-prevention/risk/radiation/nuclear-accidents-fact-sheet.
- Peach Bottom Atomic Power Station Receives Approval to Operate an Additional 20 Years | Transmission Intelligence Service. https://www.transmissionhub.com/articles/2020/03/peach-bottom-atomic-power-station-receives-approval-to-operate-an-additional-20-years.html. Accessed 25 Nov. 2020.
- NRC: Oconee Nuclear Station, Unit 1. https://www.nrc.gov/info-finder/reactors/oco1.html. Accessed 25 Nov. 2020.
- ‘Braidwood Generating Station | Braceville, Ill.’ Nuclear Powers IL, https://www.nuclearpowersillinois.com/braidwood_generating_station. Accessed 25 Nov. 2020.
- NRC: South Texas Project, Unit 1. https://www.nrc.gov/info-finder/reactors/stp1.html. Accessed 25 Nov. 2020.
- NRC: Susquehanna Steam Electric Station, Unit 1. https://www.nrc.gov/info-finder/reactors/susq1.html. Accessed 25 Nov. 2020.
- Energy, Duke. ‘McGuire Nuclear Station Focuses on Operational Excellence and Community Outreach’. Duke Energy | Nuclear Information Center, https://nuclear.duke-energy.com/2013/06/25/mcguire-nuclear-station-focuses-on-operational-excellence-and-community-outreach. Accessed 25 Nov. 2020.
- ‘Aps – Arizona Public Service Electric’. Aps, https://www.aps.com/en/About/Our-Company/Clean-Energy/Nuclear-generation. Accessed 25 Nov. 2020.
- ‘Vogtle 3 and 4’. Georgia Power, http://www.georgiapower.com/company/plant-vogtle.html. Accessed 25 Nov. 2020.
