Mathematics AI SL

Sample Internal Assessment

Table of contents

Rationale

Aim

Research question

Introduction

Hypothesis

Data collection

Calculation of correlation coefficient

Evaluation of hypothesis

Conclusion

Reflection

Bibliography


"Although September 11 was horrible, it didn't threaten the survival of the human race, like nuclear weapons do." – Stephen Hawking. Despite the heinous acts of terrorism that occur across the globe, it is the destructive power of the nuclear bomb that has left a scar and a lasting fear in people all over the world.

Since childhood, I have heard about the devastation of Hiroshima and Nagasaki in 1945, which marked the end of World War II. The destructive capacity of nuclear energy has therefore been clear to me since my early school days.

It was during secondary education that my picture of nuclear power began to change, when I studied nuclear energy as a non-conventional source of energy. Despite the initial doubts that arose from the childhood stories of catastrophe, I learned that nuclear energy is changing the way energy is provided to mankind. The amount of energy that can be produced by a nuclear reaction is unparalleled by any other source of energy.

As time passed, the curriculum became more intense and I started learning the concepts of nuclear energy in depth. About two years ago, I studied in Physics the nuclear fission reaction that generates nuclear energy. Gradually, I felt a deep inclination towards the subject. Because of the highly constructive role that it plays for mankind, I thought of pursuing higher studies in nuclear energy and working in a nuclear power station.

However, I recently studied cancer in Biology, and some of the facts shook my dream of working in a nuclear power plant. In the curriculum, I learned that γ-rays can cause cancer. The subtle fear that I had developed regarding the devastating effects of nuclear energy returned, because γ-rays are emitted in the nuclear fission reaction, the very reaction by which nuclear energy is generated.

To overcome this fear and concentrate on my career, I started doing some research. I read a few journals on the side effects of nuclear energy. There were several reported instances of an increased chance of developing cancer when an individual is exposed to harmful γ-rays. However, I also came across many articles discussing the preventive measures taken in every nuclear power plant to protect employees from radiation. To be more confident, I read further news reports and articles, from which I learned that employees working in nuclear power plants are still often affected by cancer. However, I could not find any information on how likely a nuclear plant employee is to be affected by cancer.

To find the answer, I am carrying out this mathematical exploration so that I can derive a relationship describing the chances of being affected by cancer if I pursue my dream job.

The main aim of this investigation is to explore the correlation between the number of employees working in a nuclear power plant and the number of those employees affected by cancer.

What is the relationship between the number of employees working in a Nuclear Power Station and the number of employees affected by cancer during the working period or after retirement, for three different age groups: Gr 1 (50 to 60 years), Gr 2 (60 to 70 years) and Gr 3 (70 to 80 years)?

Cancer [1] is a disease characterized by uncontrolled cell division. The repeated division of cells often leads to the formation of a tumor, cyst, fibroid, etc. Tumors are categorized into two types, benign and malignant; only malignant tumors are considered cancerous. Cells of a malignant tumor can spread through the bloodstream and initiate the formation of tumors in other parts of the body. A growing tumor puts pressure on the vital organ in which it has originated, which can lead to organ failure. A tumor can also constrict blood vessels in its vicinity, raising heart rate and blood pressure and eventually increasing the chances of stroke or heart failure.

There are several causative agents that trigger cells to divide in an uncontrolled manner. In the context of this mathematical exploration, the relevant one is radiation. Radiations such as gamma rays and X-rays are considered to be among the most prominent causative agents of cancer. These radiations carry enough energy to ionize atoms and so damage the DNA of a cell; in other words, they act as mutagens. Once the DNA of a cell has been mutated in this way, the cell may begin to divide continuously without following the normal cell cycle, which leads to the formation of a malignant, or cancerous, tumor.

From several news reports and scientific studies, it is now clear that, due to the emission of ozone-depleting gases, depletion of the ozone layer has allowed more harmful ultraviolet rays to pass through the Earth’s atmosphere. As a result, cases of skin cancer have increased across the world. This illustrates the effect of radiation in causing cancer.

A nuclear power station [4], or nuclear power plant, is a power plant that generates energy by nuclear fission. The fission reaction takes place in a nuclear reactor, where the heat generated by the reaction is used to convert water into steam. The steam thus generated is used to run a turbine, which generates electricity.

The nuclear fission reaction is accompanied by the emission of radiation, such as α-rays, β-rays and γ-rays, of which γ-rays are considered the most harmful. The nuclear reactor is constructed in such a way that leakage of radiation is intended to be nil. In addition, a number of preventive measures, such as protective clothing and regular medical check-ups, are taken for employees working in nuclear power plants. Despite such measures, instances have been noted where radiation has leaked and caused severe illness, not only to employees but also to individuals living near the power plant. This is because γ-rays can pass through even inches of a dense metal such as lead.

The regression correlation coefficient is a tool to measure the strength of the correlation between the independent variable and the dependent variable. The sums Σx, Σy, Σxy, Σx² and Σy² are used to find the value of r as stated by the formula below:

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^{2} - \left(\sum x\right)^{2}\right]\left[n\sum y^{2} - \left(\sum y\right)^{2}\right]}}$$

In the above-mentioned formula, x is the value of the independent variable for each observation, y is the value of the dependent variable for each observation, xy is the product of the independent and the dependent variable for each observation, n is the number of observations and Σ denotes the sum over all the observations of the mentioned variable.

By squaring the value of r, the value of the regression coefficient (r^{2}), also known as the coefficient of determination, is obtained. The value of r^{2} lies between 0 and 1, where 1 signifies perfect linear correlation and 0 signifies no linear correlation.

Pearson’s correlation coefficient is a tool to measure both the strength and the nature (direction) of the correlation between the independent variable and the dependent variable. The deviations of each observation from the means, (x − x̄) and (y − ȳ), are used to find the value of r as stated by the formula below:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^{2} \, \sum (y - \bar{y})^{2}}}$$

In the above-mentioned formula, x is the value of the independent variable for each observation, y is the value of the dependent variable for each observation, x̄ is the arithmetic mean of all the observations of the independent variable, ȳ is the arithmetic mean of all the observations of the dependent variable and Σ denotes the sum over all the observations of the mentioned variable.

The value of r lies between -1 and 1. A positive value of Pearson’s correlation coefficient implies a direct relationship between the independent and the dependent variable, whereas a negative value implies an inverse relationship. If the value of the correlation coefficient is close to 1 or -1, the linear correlation is strong. On the other hand, if the value is close to 0, there is little or no linear correlation.

The Chi squared test is a kind of analysis that assesses whether an association exists between an independent variable and a dependent variable. The Chi squared value of the given set of data is calculated first. Then, based on the degrees of freedom (df) of the data and the chosen significance level, the calculated value is compared with the critical value in the Chi squared table, which indicates whether the association is statistically significant.

The formula of Chi squared value is given below:

$$\chi^{2} = \sum \frac{(O - E)^{2}}{E}$$

Here, O is the observed value, E is the expected value and Σ denotes the sum over all the observations.
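The Chi squared formula can be sketched in a few lines of Python. This is only an illustration: the 2 × 2 table is hypothetical (it is not data from this study), the function names `expected_counts` and `chi_squared` are made up for the example, and the expected values are assumed to follow the usual rule for a two-way table (row total × column total ÷ grand total).

```python
def expected_counts(table):
    """Expected value of each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical 2 x 2 table of observed values (NOT data from this study).
observed = [[10, 20], [30, 40]]
expected = expected_counts(observed)
chi2 = chi_squared(observed, expected)

# Degrees of freedom for a two-way table: (rows - 1) * (columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)
print(round(chi2, 4), degrees_of_freedom)
```

For this hypothetical table the statistic is 50/63 ≈ 0.79 with 1 degree of freedom. The critical value for df = 1 at the 0.05 level is 3.841, so the statistic is below the critical value and the null hypothesis of no association would not be rejected for this made-up example.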

| df | 0.995 | 0.99 | 0.975 | 0.95 | 0.90 | 0.10 | 0.05 | 0.025 | 0.01 | 0.005 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | --- | --- | 0.001 | 0.004 | 0.016 | 2.706 | 3.841 | 5.024 | 6.635 | 7.879 |
| 2 | 0.010 | 0.020 | 0.051 | 0.103 | 0.211 | 4.605 | 5.991 | 7.378 | 9.210 | 10.597 |
| 3 | 0.072 | 0.115 | 0.216 | 0.352 | 0.584 | 6.251 | 7.815 | 9.348 | 11.345 | 12.838 |
| 4 | 0.207 | 0.297 | 0.484 | 0.711 | 1.064 | 7.779 | 9.488 | 11.143 | 13.277 | 14.860 |
| 5 | 0.412 | 0.554 | 0.831 | 1.145 | 1.610 | 9.236 | 11.070 | 12.833 | 15.086 | 16.750 |
| 6 | 0.676 | 0.872 | 1.237 | 1.635 | 2.204 | 10.645 | 12.592 | 14.449 | 16.812 | 18.548 |
| 7 | 0.989 | 1.239 | 1.690 | 2.167 | 2.833 | 12.017 | 14.067 | 16.013 | 18.475 | 20.278 |
| 8 | 1.344 | 1.646 | 2.180 | 2.733 | 3.490 | 13.362 | 15.507 | 17.535 | 20.090 | 21.955 |
| 9 | 1.735 | 2.088 | 2.700 | 3.325 | 4.168 | 14.684 | 16.919 | 19.023 | 21.666 | 23.589 |
| 10 | 2.156 | 2.558 | 3.247 | 3.940 | 4.865 | 15.987 | 18.307 | 20.483 | 23.209 | 25.188 |
| 11 | 2.603 | 3.053 | 3.816 | 4.575 | 5.578 | 17.275 | 19.675 | 21.920 | 24.725 | 26.757 |
| 12 | 3.074 | 3.571 | 4.404 | 5.226 | 6.304 | 18.549 | 21.026 | 23.337 | 26.217 | 28.300 |
| 13 | 3.565 | 4.107 | 5.009 | 5.892 | 7.042 | 19.812 | 22.362 | 24.736 | 27.688 | 29.819 |
| 14 | 4.075 | 4.660 | 5.629 | 6.571 | 7.790 | 21.064 | 23.685 | 26.119 | 29.141 | 31.319 |
| 15 | 4.601 | 5.229 | 6.262 | 7.261 | 8.547 | 22.307 | 24.996 | 27.488 | 30.578 | 32.801 |
| 16 | 5.142 | 5.812 | 6.908 | 7.962 | 9.312 | 23.542 | 26.296 | 28.845 | 32.000 | 34.267 |
| 17 | 5.697 | 6.408 | 7.564 | 8.672 | 10.085 | 24.769 | 27.587 | 30.191 | 33.409 | 35.718 |
| 18 | 6.265 | 7.015 | 8.231 | 9.390 | 10.865 | 25.989 | 28.869 | 31.526 | 34.805 | 37.156 |
| 19 | 6.844 | 7.633 | 8.907 | 10.117 | 11.651 | 27.204 | 30.144 | 32.852 | 36.191 | 38.582 |
| 20 | 7.434 | 8.260 | 9.591 | 10.851 | 12.443 | 28.412 | 31.410 | 34.170 | 37.566 | 39.997 |
| 21 | 8.034 | 8.897 | 10.283 | 11.591 | 13.240 | 29.615 | 32.671 | 35.479 | 38.932 | 41.401 |
| 22 | 8.643 | 9.542 | 10.982 | 12.338 | 14.041 | 30.813 | 33.924 | 36.781 | 40.289 | 42.796 |
| 23 | 9.260 | 10.196 | 11.689 | 13.091 | 14.848 | 32.007 | 35.172 | 38.076 | 41.638 | 44.181 |
| 24 | 9.886 | 10.856 | 12.401 | 13.848 | 15.659 | 33.196 | 36.415 | 39.364 | 42.980 | 45.559 |
| 25 | 10.520 | 11.524 | 13.120 | 14.611 | 16.473 | 34.382 | 37.652 | 40.646 | 44.314 | 46.928 |
| 26 | 11.160 | 12.198 | 13.844 | 15.379 | 17.292 | 35.563 | 38.885 | 41.923 | 45.642 | 48.290 |
| 27 | 11.808 | 12.879 | 14.573 | 16.151 | 18.114 | 36.741 | 40.113 | 43.195 | 46.963 | 49.645 |
| 28 | 12.461 | 13.565 | 15.308 | 16.928 | 18.939 | 37.916 | 41.337 | 44.461 | 48.278 | 50.993 |
| 29 | 13.121 | 14.256 | 16.047 | 17.708 | 19.768 | 39.087 | 42.557 | 45.722 | 49.588 | 52.336 |
| 30 | 13.787 | 14.953 | 16.791 | 18.493 | 20.599 | 40.256 | 43.773 | 46.979 | 50.892 | 53.672 |
| 40 | 20.707 | 22.164 | 24.433 | 26.509 | 29.051 | 51.805 | 55.758 | 59.342 | 63.691 | 66.766 |
| 50 | 27.991 | 29.707 | 32.357 | 34.764 | 37.689 | 63.167 | 67.505 | 71.420 | 76.154 | 79.490 |
| 60 | 35.534 | 37.485 | 40.482 | 43.188 | 46.459 | 74.397 | 79.082 | 83.298 | 88.379 | 91.952 |
| 70 | 43.275 | 45.442 | 48.758 | 51.739 | 55.329 | 85.527 | 90.531 | 95.023 | 100.425 | 104.215 |
| 80 | 51.172 | 53.540 | 57.153 | 60.391 | 64.278 | 96.578 | 101.879 | 106.629 | 112.329 | 116.321 |
| 90 | 59.196 | 61.754 | 65.647 | 69.126 | 73.291 | 107.565 | 113.145 | 118.136 | 124.116 | 128.299 |
| 100 | 67.328 | 70.065 | 74.222 | 77.929 | 82.358 | 118.498 | 124.342 | 129.561 | 135.807 | 140.169 |

Null hypothesis: It is assumed that there is no correlation between the number of employees working in a Nuclear Power Station and the number of employees affected by cancer during the working period or after retirement, for three different age groups: Gr 1 (50 to 60 years), Gr 2 (60 to 70 years) and Gr 3 (70 to 80 years).

Alternate hypothesis: It is assumed that there is a correlation between the number of employees working in a Nuclear Power Station and the number of employees affected by cancer during the working period or after retirement, for three different age groups: Gr 1 (50 to 60 years), Gr 2 (60 to 70 years) and Gr 3 (70 to 80 years).

A data sheet has been prepared based on several news articles, reports and surveys covering different nuclear power plants. It has been possible to record the number of employees affected by cancer during their tenure of service because of the health insurance policy that each company offers to all its employees. Similarly, the health status of retired employees has been obtained from the health benefits that the company continues to offer after retirement.

The employees working in nuclear power plants have been categorized into three age groups to examine the correlation more thoroughly. It has been studied that resistance to cancer is stronger at a young age than in the elderly. However, there are many exceptions: in some older people, cancer-causing mutations are triggered by far less exposure to radiation than in others. On the other hand, it has been observed that an individual exposed to cancer-causing radiation at a young age may develop cancer only much later in life. Thus, the age groups have been chosen with the strength of an individual's immunity in mind.

Table 1: Total employees and employees affected by cancer, Group 1 (50 to 60 years)

| Name | Total | Infected |
|---|---|---|
| Byron Nuclear Power Station | 329 | 34 |
| Peach Bottom Atomic Power Station | 347 | 37 |
| Oconee Nuclear Station | 387 | 47 |
| Braidwood Generating Station | 451 | 71 |
| South Texas Project Electric Generating Station | 459 | 52 |
| Susquehanna Nuclear Power Plant | 674 | 89 |
| McGuire Nuclear Power Plant | 725 | 103 |
| Browns Ferry Nuclear Plant | 978 | 178 |
| Palo Verde Generation Station | 1564 | 302 |
| Vogtle Nuclear Power Station | 3875 | 879 |

Table 2: Total employees and employees affected by cancer, Group 2 (60 to 70 years)

| Name | Total | Infected |
|---|---|---|
| Byron Nuclear Power Station | 334 | 37 |
| Peach Bottom Atomic Power Station | 345 | 38 |
| Oconee Nuclear Station | 379 | 58 |
| Braidwood Generating Station | 463 | 98 |
| South Texas Project Electric Generating Station | 487 | 103 |
| Susquehanna Nuclear Power Plant | 621 | 115 |
| McGuire Nuclear Power Plant | 798 | 145 |
| Browns Ferry Nuclear Plant | 970 | 161 |
| Palo Verde Generation Station | 1498 | 298 |
| Vogtle Nuclear Power Station | 3389 | 789 |

Table 3: Total employees and employees affected by cancer, Group 3 (70 to 80 years)

| Name | Total | Infected |
|---|---|---|
| Byron Nuclear Power Station | 289 | 46 |
| Peach Bottom Atomic Power Station | 297 | 52 |
| Oconee Nuclear Station | 303 | 67 |
| Braidwood Generating Station | 401 | 132 |
| South Texas Project Electric Generating Station | 432 | 136 |
| Susquehanna Nuclear Power Plant | 543 | 105 |
| McGuire Nuclear Power Plant | 641 | 187 |
| Browns Ferry Nuclear Plant | 879 | 190 |
| Palo Verde Generation Station | 1273 | 398 |
| Vogtle Nuclear Power Station | 2894 | 982 |

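Table 4: Percentage of infected employees, Group 1 (50 to 60 years)

| Total No. of Employees | Infected Employees | Percentage |
|---|---|---|
| 329 | 34 | 10.33 |
| 347 | 37 | 10.66 |
| 387 | 47 | 12.14 |
| 451 | 71 | 15.74 |
| 459 | 52 | 11.32 |
| 674 | 89 | 13.20 |
| 725 | 103 | 14.20 |
| 978 | 178 | 18.20 |
| 1564 | 302 | 19.30 |
| 3875 | 879 | 22.68 |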

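Table 5: Percentage of infected employees, Group 2 (60 to 70 years)

| Total No. of Employees | Infected Employees | Percentage |
|---|---|---|
| 334 | 37 | 11.08 |
| 345 | 38 | 11.01 |
| 379 | 58 | 15.30 |
| 463 | 98 | 21.17 |
| 487 | 103 | 21.15 |
| 621 | 115 | 18.52 |
| 798 | 145 | 18.17 |
| 970 | 161 | 16.60 |
| 1498 | 298 | 19.89 |
| 3389 | 789 | 23.28 |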

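Table 6: Percentage of infected employees, Group 3 (70 to 80 years)

| Total No. of Employees | Infected Employees | Percentage |
|---|---|---|
| 289 | 46 | 15.92 |
| 297 | 52 | 17.51 |
| 303 | 67 | 22.11 |
| 401 | 132 | 32.92 |
| 432 | 136 | 31.48 |
| 543 | 105 | 19.34 |
| 641 | 187 | 29.17 |
| 879 | 190 | 21.62 |
| 1273 | 398 | 31.26 |
| 2894 | 982 | 33.93 |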

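Sample Calculation

$$\text{Percentage of Infected Employees} = \frac{\text{Number of infected employees}}{\text{Total number of employees}} \times 100$$

For the first plant in the 50 to 60 years group (329 employees, 34 infected):

$$\frac{34}{329} \times 100 \approx 10.33\%$$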

In Table 4 to Table 6, the percentage of employees affected by cancer out of the total number of employees has been found. As the interval between the totals (independent variable) is not regular, the mean and standard deviation would not serve any purpose in analyzing the data. Rather, the number of employees affected by cancer depends on the total number of employees working in that particular power plant. Thus, the percentage has been calculated.
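The percentage calculation can be reproduced with a short Python sketch. The pairs below are the Group 1 values (total employees, affected employees) copied from the data collection section; the function name `percentage_affected` is only an illustrative choice.

```python
# (total employees, affected employees) for Group 1 (50 to 60 years),
# copied from the collected data.
group_1 = [
    (329, 34), (347, 37), (387, 47), (451, 71), (459, 52),
    (674, 89), (725, 103), (978, 178), (1564, 302), (3875, 879),
]


def percentage_affected(total, affected):
    """Share of affected employees, expressed as a percentage of the total."""
    return affected / total * 100


percentages = [round(percentage_affected(t, a), 2) for t, a in group_1]
print(percentages)
```

Rounding to two decimal places can differ by 0.01 from a tabulated entry that was truncated instead of rounded; for example, 52 ÷ 459 × 100 ≈ 11.329.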

In Table 4, the percentage of employees affected by cancer ranges between 10% and 23%, and the number of affected employees increases with the total number of employees working in a power plant. Similarly, in Table 5, the percentage of affected employees ranges between 11% and 23%, with about 11% at the plants with the fewest employees and 23% at the plant with the most employees. In Table 6, as the age group is between 70 and 80 years, it can be assumed that the total number of former employees has decreased because of mortality at that age. Thus, the total number of employees currently alive is lower than in the other groups. At the same time, the percentage of affected employees is higher than in the other groups, ranging between 16% and 34%, with 16% at the plant with the fewest employees and 34% at the plant with the most employees.

The X-axis of each graph denotes the total number of employees who work or have worked in the nuclear power plants (independent variable).

The Y-axis of each graph denotes the number of those employees who are currently affected by cancer (dependent variable).

In each of the graphs no. 1 to no. 3, a linear trendline has been fitted to the data collected from the official websites of the nuclear power plants, newspapers, journals, articles, etc.

In graph 1, the equation of trendline is:

y = 0.2386x - 54.366

In graph 2, the equation of trendline is:

y = 0.2401x - 38.697

In graph 3, the equation of trendline is:

y = 0.352x - 50.422

From the graphs, it can be stated that there is a positive (increasing) correlation between the number of employees affected by cancer and the total number of employees who work or have worked in the nuclear power plants.

A few outliers are noticeable where the total number of employees is in the range of 500 to 750. Because there are very few outliers, the value of the regression coefficient is about 0.99. Such a high value (close to one) indicates a strong linear correlation between the dependent and the independent variable.
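The trendline of graph 1 can be checked with a least-squares fit. The sketch below assumes the trendline was produced by ordinary least squares (the method spreadsheet trendlines normally use); the function name `least_squares_line` is illustrative, and the data are the Group 1 values from the data collection section.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares line y = m*x + c."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Group 1 (50 to 60 years): total employees (x) and affected employees (y).
total_employees = [329, 347, 387, 451, 459, 674, 725, 978, 1564, 3875]
affected_employees = [34, 37, 47, 71, 52, 89, 103, 178, 302, 879]

slope, intercept = least_squares_line(total_employees, affected_employees)
print(f"y = {slope:.4f}x + ({intercept:.3f})")
```

Under that assumption, the fitted slope and intercept agree with the quoted equation y = 0.2386x - 54.366 to the precision shown. The same function can be applied to the data of the other two groups.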

From the equation of the trendline of graph 1, the Y-intercept of the trendline has been calculated. The value of y for x = 0 will be:

$$y = 0.2386(0) - 54.366 = -54.366$$

From the equation of the trendline of graph 2, the Y-intercept of the trendline has been calculated. The value of y for x = 0 will be:

$$y = 0.2401(0) - 38.697 = -38.697$$

From the equation of the trendline of graph 3, the Y-intercept of the trendline has been calculated. The value of y for x = 0 will be:

$$y = 0.352(0) - 50.422 = -50.422$$

The value of the Y-intercept is -54.366, -38.697 and -50.422 for graph 1, graph 2 and graph 3 respectively. A negative intercept suggests that if the total number of employees were zero, the number of affected individuals would be negative, which is not possible in reality. In practical terms, if the total number of employees is zero, there cannot be any affected employee either, so the trendline should only be interpreted within the range of the collected data. A negative intercept combined with a positive slope also means that the predicted proportion of affected employees rises as the total number of employees increases, which is consistent with the increase in the rate of cancer observed for nuclear power plants.

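From the equation of the trendline of graph 1, the X-intercept of the trendline has been calculated. The value of x for y = 0 will be:

$$0 = 0.2386x - 54.366 \implies x = \frac{54.366}{0.2386} \approx 227.85$$

From the equation of the trendline of graph 2, the X-intercept of the trendline has been calculated. The value of x for y = 0 will be:

$$0 = 0.2401x - 38.697 \implies x = \frac{38.697}{0.2401} \approx 161.17$$

From the equation of the trendline of graph 3, the X-intercept of the trendline has been calculated. The value of x for y = 0 will be:

$$0 = 0.352x - 50.422 \implies x = \frac{50.422}{0.352} \approx 143.24$$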

The value of the X-intercept is approximately 228, 161 and 143 for graph 1, graph 2 and graph 3 respectively. Mathematically, this means that the trendline predicts no affected employees when the total number of employees is 228 for the age group 50 to 60 years, 161 for the age group 60 to 70 years and 143 for the age group 70 to 80 years.
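The intercepts follow directly from the slope m and constant c of each trendline y = mx + c: the Y-intercept is c and the X-intercept is -c / m. A minimal Python sketch, using the three trendline equations quoted above (the dictionary layout and the function name `x_intercept` are illustrative choices):

```python
# (slope, constant) of each trendline, as quoted in the text.
trendlines = {
    "Graph 1 (50 to 60 years)": (0.2386, -54.366),
    "Graph 2 (60 to 70 years)": (0.2401, -38.697),
    "Graph 3 (70 to 80 years)": (0.352, -50.422),
}


def x_intercept(slope, constant):
    """Value of x at which y = slope * x + constant equals zero."""
    return -constant / slope


x_intercepts = {
    label: x_intercept(m, c) for label, (m, c) in trendlines.items()
}

for label, value in x_intercepts.items():
    y_at_zero = trendlines[label][1]
    print(f"{label}: Y-intercept = {y_at_zero}, X-intercept = {value:.2f}")
```

Rounded to whole employees, the X-intercepts are 228, 161 and 143.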

Processed Data for calculation of R^{2}:

There are five headers in the processed data tables: x, y, x^{2}, y^{2} and xy. The total number of employees is represented by x and the number of employees affected by cancer is represented by y. The remaining headers have their usual meaning. The calculation of the R^{2} correlation coefficient is shown to explore the efficiency and stability of the trendline and the correlation.

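Group 1 (50 to 60 years):

| x | y | x^{2} | y^{2} | xy |
|---|---|---|---|---|
| 329 | 34 | 108241 | 1156 | 11186 |
| 347 | 37 | 120409 | 1369 | 12839 |
| 387 | 47 | 149769 | 2209 | 18189 |
| 451 | 71 | 203401 | 5041 | 32021 |
| 459 | 52 | 210681 | 2704 | 23868 |
| 674 | 89 | 454276 | 7921 | 59986 |
| 725 | 103 | 525625 | 10609 | 74675 |
| 978 | 178 | 956484 | 31684 | 174084 |
| 1564 | 302 | 2446096 | 91204 | 472328 |
| 3875 | 879 | 15015625 | 772641 | 3406125 |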

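Group 2 (60 to 70 years):

| x | y | x^{2} | y^{2} | xy |
|---|---|---|---|---|
| 334 | 37 | 111556 | 1369 | 12358 |
| 345 | 38 | 119025 | 1444 | 13110 |
| 379 | 58 | 143641 | 3364 | 21982 |
| 463 | 98 | 214369 | 9604 | 45374 |
| 487 | 103 | 237169 | 10609 | 50161 |
| 621 | 115 | 385641 | 13225 | 71415 |
| 798 | 145 | 636804 | 21025 | 115710 |
| 970 | 161 | 940900 | 25921 | 156170 |
| 1498 | 298 | 2244004 | 88804 | 446404 |
| 3389 | 789 | 11485321 | 622521 | 2673921 |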

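Group 3 (70 to 80 years):

| x | y | x^{2} | y^{2} | xy |
|---|---|---|---|---|
| 289 | 46 | 83521 | 2116 | 13294 |
| 297 | 52 | 88209 | 2704 | 15444 |
| 303 | 67 | 91809 | 4489 | 20301 |
| 401 | 132 | 160801 | 17424 | 52932 |
| 432 | 136 | 186624 | 18496 | 58752 |
| 543 | 105 | 294849 | 11025 | 57015 |
| 641 | 187 | 410881 | 34969 | 119867 |
| 879 | 190 | 772641 | 36100 | 167010 |
| 1273 | 398 | 1620529 | 158404 | 506654 |
| 2894 | 982 | 8375236 | 964324 | 2841908 |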

The formula of the regression coefficient given in the background information has been used to find the correlation coefficient. Here, x is the value of the independent variable for each observation, y is the value of the dependent variable for each observation, xy is the product of the independent and the dependent variable for each observation, n is the number of observations and Σ denotes the sum over all the observations of the mentioned variable.

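Calculation for Group 1 (n = 10, Σx = 9789, Σy = 1792, Σxy = 4285301, Σx² = 20190607, Σy² = 926538):

$$r = \frac{10(4285301) - (9789)(1792)}{\sqrt{\left[10(20190607) - (9789)^{2}\right]\left[10(926538) - (1792)^{2}\right]}} = \frac{25311122}{\sqrt{(106081549)(6054116)}} \approx 0.9988, \qquad r^{2} \approx 0.9975$$

Calculation for Group 2 (n = 10, Σx = 9284, Σy = 1842, Σxy = 3606605, Σx² = 16518430, Σy² = 797886):

$$r = \frac{10(3606605) - (9284)(1842)}{\sqrt{\left[10(16518430) - (9284)^{2}\right]\left[10(797886) - (1842)^{2}\right]}} = \frac{18964922}{\sqrt{(78991644)(4585896)}} \approx 0.9964, \qquad r^{2} \approx 0.9929$$

Calculation for Group 3 (n = 10, Σx = 7952, Σy = 2295, Σxy = 3853177, Σx² = 12085100, Σy² = 1250051):

$$r = \frac{10(3853177) - (7952)(2295)}{\sqrt{\left[10(12085100) - (7952)^{2}\right]\left[10(1250051) - (2295)^{2}\right]}} = \frac{20281930}{\sqrt{(57616696)(7233485)}} \approx 0.9935, \qquad r^{2} \approx 0.9870$$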

The value of the regression coefficient is 0.9975, 0.9929 and 0.987 for group 1, group 2 and group 3 respectively. Such a high value (close to one) indicates a strong linear correlation between the dependent and the independent variable.
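The regression coefficients can be re-computed from the raw data with the sums formula from the background information. The sketch below is one possible way to do this in Python; the function name `regression_r` and the dictionary layout are illustrative, and the data are the values from the data collection section.

```python
from math import sqrt


def regression_r(xs, ys):
    """Correlation coefficient r from the sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Total employees (x) and affected employees (y) for each age group.
groups = {
    "Group 1": ([329, 347, 387, 451, 459, 674, 725, 978, 1564, 3875],
                [34, 37, 47, 71, 52, 89, 103, 178, 302, 879]),
    "Group 2": ([334, 345, 379, 463, 487, 621, 798, 970, 1498, 3389],
                [37, 38, 58, 98, 103, 115, 145, 161, 298, 789]),
    "Group 3": ([289, 297, 303, 401, 432, 543, 641, 879, 1273, 2894],
                [46, 52, 67, 132, 136, 105, 187, 190, 398, 982]),
}

r_squared = {name: regression_r(xs, ys) ** 2 for name, (xs, ys) in groups.items()}

for name, value in r_squared.items():
    print(f"{name}: r^2 = {value:.4f}")
```

Working with integer sums keeps the numerator and the two bracketed terms exact, so the only rounding happens in the final square root and division.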

Processed Data Table for calculation of Pearson’s Correlation:

There are seven headers in the processed data table for the calculation of Pearson’s correlation coefficient: x, y, (x − x̄), (y − ȳ), (x − x̄)(y − ȳ), (x − x̄)² and (y − ȳ)². The total number of employees is represented by x and the number of employees affected by cancer is represented by y; x̄ is the arithmetic mean of all the observations of the total number of employees and ȳ is the arithmetic mean of all the observations of the number of employees affected by cancer. The remaining headers have their usual meaning. The calculation of Pearson’s correlation coefficient is shown to explore the efficiency and stability of the trendline and the correlation.

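Table 8A (Group 1, 50 to 60 years; x̄ = 978.9, ȳ = 179.2):

| x | y | x − x̄ | y − ȳ | (x − x̄)(y − ȳ) | (x − x̄)² | (y − ȳ)² |
|---|---|---|---|---|---|---|
| 329 | 34 | -649.9 | -145.2 | 94365.48 | 422370.01 | 21083.04 |
| 347 | 37 | -631.9 | -142.2 | 89856.18 | 399297.61 | 20220.84 |
| 387 | 47 | -591.9 | -132.2 | 78249.18 | 350345.61 | 17476.84 |
| 451 | 71 | -527.9 | -108.2 | 57118.78 | 278678.41 | 11707.24 |
| 459 | 52 | -519.9 | -127.2 | 66131.28 | 270296.01 | 16179.84 |
| 674 | 89 | -304.9 | -90.2 | 27501.98 | 92964.01 | 8136.04 |
| 725 | 103 | -253.9 | -76.2 | 19347.18 | 64465.21 | 5806.44 |
| 978 | 178 | -0.9 | -1.2 | 1.08 | 0.81 | 1.44 |
| 1564 | 302 | 585.1 | 122.8 | 71850.28 | 342342.01 | 15079.84 |
| 3875 | 879 | 2896.1 | 699.8 | 2026690.78 | 8387395.21 | 489720.04 |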

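Table 8B (Group 2, 60 to 70 years; x̄ = 928.4, ȳ = 184.2):

| x | y | x − x̄ | y − ȳ | (x − x̄)(y − ȳ) | (x − x̄)² | (y − ȳ)² |
|---|---|---|---|---|---|---|
| 334 | 37 | -594.4 | -147.2 | 87495.68 | 353311.36 | 21667.84 |
| 345 | 38 | -583.4 | -146.2 | 85293.08 | 340355.56 | 21374.44 |
| 379 | 58 | -549.4 | -126.2 | 69334.28 | 301840.36 | 15926.44 |
| 463 | 98 | -465.4 | -86.2 | 40117.48 | 216597.16 | 7430.44 |
| 487 | 103 | -441.4 | -81.2 | 35841.68 | 194833.96 | 6593.44 |
| 621 | 115 | -307.4 | -69.2 | 21272.08 | 94494.76 | 4788.64 |
| 798 | 145 | -130.4 | -39.2 | 5111.68 | 17004.16 | 1536.64 |
| 970 | 161 | 41.6 | -23.2 | -965.12 | 1730.56 | 538.24 |
| 1498 | 298 | 569.6 | 113.8 | 64820.48 | 324444.16 | 12950.44 |
| 3389 | 789 | 2460.6 | 604.8 | 1488170.88 | 6054552.36 | 365783.04 |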

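Table 8C (Group 3, 70 to 80 years; x̄ = 795.2, ȳ = 229.5):

| x | y | x − x̄ | y − ȳ | (x − x̄)(y − ȳ) | (x − x̄)² | (y − ȳ)² |
|---|---|---|---|---|---|---|
| 289 | 46 | -506.2 | -183.5 | 92887.7 | 256238.44 | 33672.25 |
| 297 | 52 | -498.2 | -177.5 | 88430.5 | 248203.24 | 31506.25 |
| 303 | 67 | -492.2 | -162.5 | 79982.5 | 242260.84 | 26406.25 |
| 401 | 132 | -394.2 | -97.5 | 38434.5 | 155393.64 | 9506.25 |
| 432 | 136 | -363.2 | -93.5 | 33959.2 | 131914.24 | 8742.25 |
| 543 | 105 | -252.2 | -124.5 | 31398.9 | 63604.84 | 15500.25 |
| 641 | 187 | -154.2 | -42.5 | 6553.5 | 23777.64 | 1806.25 |
| 879 | 190 | 83.8 | -39.5 | -3310.1 | 7022.44 | 1560.25 |
| 1273 | 398 | 477.8 | 168.5 | 80509.3 | 228292.84 | 28392.25 |
| 2894 | 982 | 2098.8 | 752.5 | 1579347 | 4404961.44 | 566256.25 |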

The formula of Pearson’s correlation coefficient given in the background information has been used to find the correlation coefficient. Here, x is the value of the independent variable for each observation, y is the value of the dependent variable for each observation, x̄ is the arithmetic mean of all the observations of the independent variable, ȳ is the arithmetic mean of all the observations of the dependent variable and Σ denotes the sum over all the observations of the mentioned variable.

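Calculation for Table 8A:

Let the Pearson’s Correlation Coefficient be r.

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^{2} \, \sum (y - \bar{y})^{2}}} = \frac{2531112.20}{\sqrt{(10608154.90)(605411.60)}} \approx 0.9988$$

Calculation for Table 8B:

Let the Pearson’s Correlation Coefficient be r.

$$r = \frac{1896492.20}{\sqrt{(7899164.40)(458589.60)}} \approx 0.9964$$

Calculation for Table 8C:

Let the Pearson’s Correlation Coefficient be r.

$$r = \frac{2028193.00}{\sqrt{(5761669.60)(723348.50)}} \approx 0.9935$$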

The values of Pearson’s correlation coefficient for the three groups are 0.998, 0.996 and 0.993 respectively. As each value is positive, it can be stated that the correlation is increasing in nature, i.e., with an increase in the total number of employees who work or have worked in the nuclear power plant, the number of employees affected by cancer also increases. Moreover, each value is very close to one, which signifies that the correlation is very strong.
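The deviation form of Pearson’s formula can also be checked in Python. The sketch below is illustrative only: the function name `pearson_r` is an assumption, and the data are the values from the data collection section.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's r from the deviations of x and y about their means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_products = sum(dx * dy for dx, dy in zip(dev_x, dev_y))
    sum_sq_x = sum(dx ** 2 for dx in dev_x)
    sum_sq_y = sum(dy ** 2 for dy in dev_y)
    return sum_products / sqrt(sum_sq_x * sum_sq_y)


# Total employees (x) and affected employees (y) for each age group.
groups = {
    "Group 1": ([329, 347, 387, 451, 459, 674, 725, 978, 1564, 3875],
                [34, 37, 47, 71, 52, 89, 103, 178, 302, 879]),
    "Group 2": ([334, 345, 379, 463, 487, 621, 798, 970, 1498, 3389],
                [37, 38, 58, 98, 103, 115, 145, 161, 298, 789]),
    "Group 3": ([289, 297, 303, 401, 432, 543, 641, 879, 1273, 2894],
                [46, 52, 67, 132, 136, 105, 187, 190, 398, 982]),
}

pearson = {name: pearson_r(xs, ys) for name, (xs, ys) in groups.items()}

for name, r in pearson.items():
    direction = "direct" if r > 0 else "inverse"
    print(f"{name}: r = {r:.4f} ({direction} relationship)")
```

The deviation form and the sums form of the formula are algebraically equivalent, so squaring these values should reproduce the regression coefficients calculated earlier.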

The hypothesis has been evaluated with the help of the χ² test in this section of the mathematical exploration. The χ² test indicates whether the null hypothesis should be rejected in favour of the alternate hypothesis.

| Observed Value (O) | Expected Value (E) | O − E | (O − E)² | (O − E)² / E |
|---|---|---|---|---|
| 10.33 | 9.52391937 | 0.80608063 | 0.64976598 | 0.06822464 |
| 11.08 | 11.3543268 | -0.2743268 | 0.07525519 | 0.00662789 |
| 15.92 | 16.4517538 | -0.5317538 | 0.2827621 | 0.01718735 |
| 10.66 | 9.99590573 | 0.66409427 | 0.4410212 | 0.04412018 |
| 11.01 | 11.9170245 | -0.9070245 | 0.82269344 | 0.06903514 |
| 17.51 | 17.2670698 | 0.2429302 | 0.05901508 | 0.00341778 |
| 12.14 | 12.6415806 | -0.5015806 | 0.2515831 | 0.01990124 |
| 15.3 | 15.0711732 | 0.2288268 | 0.0523617 | 0.0034743 |
| 22.11 | 21.8372462 | 0.2727538 | 0.07439464 | 0.00340678 |
| 15.74 | 17.8155717 | -2.0755717 | 4.30799788 | 0.24181081 |
| 21.17 | 21.2395565 | -0.0695565 | 0.00483811 | 0.00022779 |
| 32.92 | 30.7748719 | 2.1451281 | 4.60157457 | 0.14952376 |
| 11.32 | 16.3154204 | -4.9954204 | 24.954225 | 1.5294871 |
| 21.15 | 19.4510903 | 1.6989097 | 2.88629417 | 0.14838727 |
| 31.48 | 28.1834893 | 3.2965107 | 10.8669828 | 0.38557975 |
| 13.2 | 13.0268235 | 0.1731765 | 0.0299901 | 0.00230218 |
| 18.52 | 15.5304561 | 2.9895439 | 8.93737273 | 0.57547394 |
| 19.34 | 22.5027203 | -3.1627203 | 10.0027997 | 0.44451513 |
| 14.2 | 15.7005625 | -1.5005625 | 2.25168782 | 0.14341447 |
| 18.17 | 18.7180625 | -0.5480625 | 0.3003725 | 0.0160472 |
| 29.17 | 27.121375 | 2.048625 | 4.19686439 | 0.15474379 |
| 18.2 | 14.3943084 | 3.8056916 | 14.4832886 | 1.00618162 |
| 16.6 | 17.1607586 | -0.5607586 | 0.31445021 | 0.01832379 |
| 21.62 | 24.864933 | -3.244933 | 10.5295902 | 0.42347149 |
| 19.3 | 17.9737509 | 1.3262491 | 1.75893668 | 0.09786141 |
| 19.89 | 21.4281362 | -1.5381362 | 2.36586297 | 0.11040918 |
| 31.26 | 31.0481129 | 0.2118871 | 0.04489614 | 0.00144602 |
| 22.68 | 20.3821569 | 2.2978431 | 5.28008291 | 0.25905418 |
| 23.28 | 24.2994152 | -1.0194152 | 1.03920735 | 0.04276676 |
| 33.93 | 35.2084278 | -1.2784278 | 1.63437764 | 0.04642007 |

Examining the value of χ² (the sum of the values in the last column of the table) with respect to the degrees of freedom using the table shown in the Background Information section, it is concluded that the Null Hypothesis is rejected and the Alternate Hypothesis is accepted.

What is the relationship between the number of employees working in a Nuclear Power Station and the number of employees affected by cancer during the working period or after retirement, for three different age groups: Gr 1 (50 to 60 years), Gr 2 (60 to 70 years) and Gr 3 (70 to 80 years)?

The relationship between the number of employees who work or have worked in a nuclear power plant and the number of those employees who are or have been affected by cancer is direct, i.e., as the total number of employees increases, the number of employees affected by cancer also increases.

- The equation of trendline for Group 1, i.e., the age group of 50 to 60 years, is: y = 0.2386x - 54.366.
- The equation of trendline for Group 2, i.e., the age group of 60 to 70 years, is: y = 0.2401x - 38.697.
- The equation of trendline for Group 3, i.e., the age group of 70 to 80 years, is: y = 0.352x - 50.422.
- The value of the regression coefficient for Group 1 is 0.997, which supports the existence of an increasing correlation between the independent and the dependent variable.
- The value of the regression coefficient for Group 2 is 0.992, which supports the existence of an increasing correlation between the independent and the dependent variable.
- The value of the regression coefficient for Group 3 is 0.987, which supports the existence of an increasing correlation between the independent and the dependent variable.
- The value of Pearson’s Correlation Coefficient for Group 1 is 0.998. The positive value signifies that the correlation is increasing (direct relation) in nature, and such a high value (close to 1) indicates that the correlation is strong.
- The value of Pearson’s Correlation Coefficient for Group 2 is 0.996. The positive value signifies that the correlation is increasing (direct relation) in nature, and such a high value (close to 1) indicates that the correlation is strong.
- The value of Pearson’s Correlation Coefficient for Group 3 is 0.993. The positive value signifies that the correlation is increasing (direct relation) in nature, and such a high value (close to 1) indicates that the correlation is strong.
- The percentage of employees affected by cancer is lowest, or very nearly lowest, at Byron Nuclear Power Station in all three groups, with values ranging between 10% and 16%.
- The percentage of employees affected by cancer is highest at Vogtle Nuclear Power Station in all three groups, with values ranging between 22% and 34%.
- The percentage of affected individuals is lowest in the first age group (50 to 60 years). This may be because of the stronger immunity of younger employees. Another reason might be advances in radiation protection techniques, which protect employees of this generation more effectively than earlier ones.
- The percentage of affected individuals is highest in the third age group (70 to 80 years). This may be because of the weakened immunity of retired employees. Another reason might be the smaller number of former employees still alive at the time of data collection; with fewer retired employees counted, the percentage is higher.
- The trendline predicts that if the total number of employees in the age group 50 to 60 years is about 228, there will be no case of cancer.
- Similarly, if the total number of employees in the age group 60 to 70 years is about 161, the trendline predicts no case of cancer.
- Similarly, if the total number of employees in the age group 70 to 80 years is about 143, the trendline predicts no case of cancer.
- The χ² test evaluates the hypothesis and concludes that the alternate hypothesis is true.

In this investigation, several processes and mathematical tools have been used to find the correlation and its strength. The choice of nuclear power plants is one of the most important strengths of this investigation, as it provided a data sheet with accurate observations of employee counts. Internationally acclaimed newspapers have also contributed to this data. The use of two different correlation coefficients, the regression coefficient and Pearson’s correlation coefficient, has provided the strength and nature of the correlation. Furthermore, calculating the percentage of employees affected by cancer has enabled the investigation to analyse the variation in the number of employees affected by cancer (the dependent variable) in the observed data sheet. Lastly, the use of the χ² test has provided the conclusion regarding the correlation.
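
As an illustration of the last tool, the following is a minimal sketch of a test of independence, assuming the test referred to is the χ² test listed in the bibliography. The observed table is made up for illustration and is not the data of this investigation; the function and variable names are hypothetical. The critical value 3.841 is the standard tabulated value for one degree of freedom at the 5% significance level.

```python
def chi_square_statistic(observed):
    """Chi-square statistic and degrees of freedom for a contingency table.

    `observed` is a list of rows, each row a list of observed frequencies.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected frequency if the two classifications were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected

    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Hypothetical observed frequencies (NOT from this investigation):
# rows = two made-up groups of employees,
# columns = [affected by cancer, not affected].
observed = [
    [20, 80],
    [40, 60],
]

statistic, dof = chi_square_statistic(observed)

# Tabulated critical value for 1 degree of freedom at the 5% level.
CRITICAL_5_PERCENT_DOF_1 = 3.841
reject_null = statistic > CRITICAL_5_PERCENT_DOF_1

print(f"chi-square = {statistic:.3f}, degrees of freedom = {dof}")
print("reject the null hypothesis" if reject_null else "do not reject the null hypothesis")
```

For this made-up table the statistic is about 9.524, which exceeds 3.841, so the null hypothesis of independence would be rejected in favour of the alternative hypothesis.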

However, a few weaknesses have been observed during this mathematical investigation. The immunity of the human body is very uncertain and cannot be generalised. Moreover, cancer is a disease on which research is still ongoing, and there are many gaps and open questions, such as the causes of cancer, which govern the rate at which cancer spreads. As many variables other than the total employee count affect the dependent variable, the correlation study cannot be carried out efficiently. To carry out an efficient correlation analysis on the research question, all of these parameters must be controlled or kept constant.

- What Is Cancer? - National Cancer Institute. 17 Sept. 2007, https://www.cancer.gov/about-cancer/understanding/what-is-cancer.
- Risk Factors: Radiation - National Cancer Institute. 29 Apr. 2015, https://www.cancer.gov/about-cancer/causes-prevention/risk/radiation.
- ‘UV Radiation’. The Skin Cancer Foundation, https://www.skincancer.org/risk-factors/uv-radiation/. Accessed 22 Nov. 2020.
- Nuclear Power Plants - U.S. Energy Information Administration (EIA). https://www.eia.gov/energyexplained/nuclear/nuclear-power-plants.php. Accessed 22 Nov. 2020.
- ‘Electromagnetic Radiation - Gamma Rays’. Encyclopedia Britannica, https://www.britannica.com/science/electromagnetic-radiation. Accessed 22 Nov. 2020.
- Correlation. http://www.stat.yale.edu/Courses/1997-98/101/correl.htm. Accessed 22 Nov. 2020.
- Data Analysis - Pearson’s Correlation Coefficient. http://learntech.uwe.ac.uk/da/default.aspx?pageid=1442. Accessed 22 Nov. 2020.
- Chi Square Statistics. https://math.hws.edu/javamath/ryan/ChiSquare.html. Accessed 23 Nov. 2020.
- Table: Chi-Square Probabilities. https://people.richland.edu/james/lecture/m170/tbl-chi.html. Accessed 23 Nov. 2020.
- ‘Nuclear Workers May Face Higher Cancer Risk’. WebMD, https://www.webmd.com/cancer/news/20050628/nuclear-workers-may-face-higher-cancer-risk. Accessed 22 Nov. 2020.
- Parthasarathy, K. S. ‘Is Working in a Nuclear Power Plant Risky?’ The Hindu, 1 Jan. 2014. www.thehindu.com, https://www.thehindu.com/sci-tech/science/is-working-in-a-nuclear-power-plant-risky/article5526497.ece.
- Accidents at Nuclear Power Plants and Cancer Risk - National Cancer Institute. 19 Apr. 2011, https://www.cancer.gov/about-cancer/causes-prevention/risk/radiation/nuclear-accidents-fact-sheet.
- Exelon. https://www.exeloncorp.com:443/locations/power-plants/byron-generating-station. Accessed 25 Nov. 2020.
- Peach Bottom Atomic Power Station Receives Approval to Operate an Additional 20 Years | Transmission Intelligence Service. https://www.transmissionhub.com/articles/2020/03/peach-bottom-atomic-power-station-receives-approval-to-operate-an-additional-20-years.html. Accessed 25 Nov. 2020.
- NRC: Oconee Nuclear Station, Unit 1. https://www.nrc.gov/info-finder/reactors/oco1.html. Accessed 25 Nov. 2020.
- ‘Braidwood Generating Station | Braceville, Ill.’ Nuclear Powers IL, https://www.nuclearpowersillinois.com/braidwood_generating_station. Accessed 25 Nov. 2020.
- NRC: South Texas Project, Unit 1. https://www.nrc.gov/info-finder/reactors/stp1.html. Accessed 25 Nov. 2020.
- NRC: Susquehanna Steam Electric Station, Unit 1. https://www.nrc.gov/info-finder/reactors/susq1.html. Accessed 25 Nov. 2020.
- Energy, Duke. ‘McGuire Nuclear Station Focuses on Operational Excellence and Community Outreach’. Duke Energy | Nuclear Information Center, https://nuclear.duke-energy.com/2013/06/25/mcguire-nuclear-station-focuses-on-operational-excellence-and-community-outreach. Accessed 25 Nov. 2020.
- ‘Browns Ferry Nuclear Plant’. TVA.Com, https://www.tva.com/energy/our-power-system/nuclear/browns-ferry-nuclear-plant. Accessed 25 Nov. 2020.
- ‘Aps – Arizona Public Service Electric’. Aps, https://www.aps.com/en/About/Our-Company/Clean-Energy/Nuclear-generation. Accessed 25 Nov. 2020.
- ‘Vogtle 3 and 4’. Georgia Power, http://www.georgiapower.com/company/plant-vogtle.html. Accessed 25 Nov. 2020.