Mathematics AI SL's Sample Internal Assessment

To what extent does age influence National Basketball Association players’ performance level?

Candidate Name: N/A
Candidate Number: N/A
Session: N/A
Word count: 1,910

Introduction

The National Basketball Association (NBA) is widely considered the best basketball league in the world, containing the world's most skilled and competent players of the sport (Klinzing, 2023). As someone who plays basketball at a high level, watching the NBA is one of my favorite pastimes, and playing in the league was my biggest dream, as it is for many other young basketball players. Because of this love for the sport, I have long been interested in the best players in the league. A commonly used method of ranking players is by their per-game statistics, because these quantify a player's on-court performance. While examining these statistics, I noticed a pattern: the vast majority of players' per-game statistics seemed to improve in the early stages of their careers, reach a peak, and then decline as the players became older and approached retirement.

Once I had noticed this, I began to ponder the question: "To what extent does age influence National Basketball Association players' performance level?" I decided that doing a statistical exploration of this topic would be appropriate for attempting to quantify how an NBA player's performance declines with age. The statistical analyses that are ideal for quantifying player performance regression are part of the statistics and probability portion of the International Baccalaureate mathematics curriculum which is the basis for this exploration (IBO, 2023).

Investigation

Methodology

In order to complete this exploration, I plan to use the most commonly referenced NBA player statistics, which are points per game (PPG), rebounds per game (RPG), and assists per game (APG). These are the statistics I will use to quantify player performance. For each season of each player's career, I will add these three statistics together, giving one number per season per player: points per game + rebounds per game + assists per game. For clarity within the statistical analysis, I will call this number "total production per game" or TPPG.
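As a minimal sketch of the definition above, TPPG for one season could be computed as follows. The function name `tppg` and the sample season line are illustrative assumptions and are not taken from the data set.

```python
def tppg(ppg: float, rpg: float, apg: float) -> float:
    """Total production per game: points + rebounds + assists per game."""
    return ppg + rpg + apg


# Hypothetical season line, chosen only to illustrate the sum.
example_season = tppg(ppg=20.0, rpg=5.0, apg=5.0)
print(round(example_season, 1))
```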

 

My source for this data will be www.basketball-reference.com (Basketball Reference), one of the most extensive basketball statistics repositories available online. It compiles publicly available game and player statistics, which makes it a reliable secondary source for this analysis. The players will be chosen by criterion sampling, where the criterion is the number of NBA seasons played. In order to cover the widest range of ages and obtain the largest number of data points per player, I will use statistics from 6 of the players who have played the largest number of seasons in the National Basketball Association. These players are Vince Carter, Robert Parish, Kevin Willis, Kevin Garnett, Dirk Nowitzki, and LeBron James (Wikipedia, 2023). For each of these players, statistics from ages 23 to 38 will be used, as this provides 16 seasons' worth of data points. In addition, all of the selected players played in the NBA from ages 23 to 38, so the impact of age on performance over this age range can be explored adequately.

Calculating statistical improvement and regression

The statistical concepts from the curriculum that will be used in this analysis are the mean, correlation coefficients, and lines of best fit from y-on-x regression. First, each player's statistics will be combined into a single number per season, TPPG, as described previously. Once this has been done for every season, a line of best fit will be drawn for each player by plotting TPPG (y) against the player's age in that season (x). In addition, I will calculate each player's mean TPPG in order to obtain a value against which every season can be compared. Each season can then be expressed as a ratio of the player's mean career performance, which shows the extent to which later performances are affected by age. Expressing performances as ratios also allows me to present all players' data in one graph, because it removes differences in raw production between players (e.g. if a player averages 30 TPPG for his career and has a season averaging 25, that season is expressed as \(25 / 30 \approx 0.83\)). Once this is complete, I will be able to identify potential outliers in the data (seasons showing very large or very small changes relative to the average regression or improvement).

Data

Raw Data and Graphs

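Table 1. Total production per game (TPPG) for each player at each age.

| Age | V. Carter | R. Parish | K. Willis | K. Garnett | D. Nowitzki | L. James |
|-----|-----------|-----------|-----------|------------|-------------|----------|
| 23 | 35.4 | 17.2 | 21.4 | 39.7 | 35.7 | 45.1 |
| 24 | 37.0 | 22.0 | 27.4 | 38.1 | 38.0 | 43.2 |
| 25 | 33.9 | 30.8 | 19.3 | 38.5 | 33.2 | 45.6 |
| 26 | 28.3 | 29.6 | - | 42.4 | 38.9 | 41.2 |
| 27 | 32.1 | 30.2 | 21.1 | 42.9 | 38.4 | 41.2 |
| 28 | 33.9 | 32.5 | 23.0 | 41.4 | 36.9 | 42.1 |
| 29 | 34.3 | 31.7 | 35.9 | 38.6 | 35.7 | 40.3 |
| 30 | 36.0 | 31.4 | 32.9 | 39.3 | 36.7 | 38.7 |
| 31 | 32.4 | 29.8 | 33.0 | 31.4 | 35.4 | 39.5 |
| 32 | 30.6 | 27.4 | 29.4 | 26.8 | 32.6 | 43.7 |
| 33 | 23.6 | 30.3 | 19.8 | 24.3 | 30.5 | 45.2 |
| 34 | 19.8 | 24.4 | 19.6 | 26.2 | 26.6 | 44.2 |
| 35 | 15.7 | 33.3 | 25.5 | 26.9 | 30.6 | 43.3 |
| 36 | 19.5 | 27.1 | 21.9 | 24.9 | 25.1 | 40.5 |
| 37 | 18.0 | 26.3 | 14.3 | 14.6 | 26.6 | 44.7 |
| 38 | 9.0 | 23.9 | 16.7 | 15.1 | 22.2 | 44.0 |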

Figure 1. Vince Carter's total production plotted vs. age. Data taken from Table 1.

Figure 2. Robert Parish's total production plotted vs. age. Data taken from Table 1.

Figure 3. Kevin Willis' total production plotted vs. age. Data taken from Table 1.

Figure 4. Kevin Garnett's total production plotted vs. age. Data taken from Table 1.

Figure 5. Dirk Nowitzki's total production plotted vs. age. Data taken from Table 1.

Figure 6. LeBron James' total production plotted vs. age. Data taken from Table 1.

When this data is plotted as scatter charts, some immediate observations can be drawn from the distribution of data points and the trendlines. While the majority of players (4 out of 6) show a clear downward trend in production, two players, Robert Parish and LeBron James, have very low coefficients of determination \(\left(R^{2}\right)\). This indicates that age explains very little of the variation in their production over the course of their careers, most notably for LeBron James, whose trendline is nearly flat \(\left(R^{2}=0.002\right)\). All trendlines were obtained with Microsoft Excel's trendline function, which automatically calculates the equation and \(R^{2}\) value of the line of best fit for a data set. The data entered into Microsoft Excel was identical to Table 1 in this report.
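The trendline calculation can also be sketched outside of Excel. The example below assumes a linear trendline fitted by ordinary least squares (y-on-x regression), and the function name `linear_fit` is an illustrative choice. It uses Vince Carter's TPPG values from Table 1 and is not the procedure actually used to produce the figures.

```python
def linear_fit(xs, ys):
    """Least-squares line y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a straight-line fit, R^2 equals the squared correlation coefficient.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


ages = list(range(23, 39))  # ages 23 to 38 inclusive
carter = [35.4, 37.0, 33.9, 28.3, 32.1, 33.9, 34.3, 36.0,
          32.4, 30.6, 23.6, 19.8, 15.7, 19.5, 18.0, 9.0]

slope, intercept, r2 = linear_fit(ages, carter)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r2:.3f}")
```

A negative slope corresponds to a downward trend with age, and an \(R^{2}\) close to 0 means the straight line explains little of the variation in TPPG.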

Calculations/analysis

In order to analyze all players' statistics within one graph, individual season averages must be expressed as a ratio to each player's mean statistics. This is important because certain players have a higher total production per game than others, so comparing them all in one graph based on their raw numbers would not give a fair comparison. In order to calculate mean TPPG for each player, the following formula was used:

\[\bar{x}=\frac{1}{n} \sum_{i=1}^{n} x_{i}\]

Here, \(\sum_{i=1}^{n} x_{i}\) represents the sum of all values, \(n\) represents the number of values, and \(x_{i}\) represents the \(i\)-th value in the data set. Thus, to determine mean TPPG for each player, the following calculations were performed (sample calculation shown with Vince Carter's statistics; all mean calculations followed the same pattern):

\[\bar{x}=\frac{35.4+37.0+33.9+28.3+32.1+33.9+34.3+36.0+32.4+30.6+23.6+19.8+15.7+19.5+18.0+9.0}{16}=\frac{439.5}{16} \approx 27.47\]

Using this equation, we determine each player's seasonal mean for total production per game:

 

Vince Carter: \(\bar{x}=27.47\); Robert Parish: \(\bar{x}=27.99\); Kevin Willis: \(\bar{x}=24.08\);
Kevin Garnett: \(\bar{x}=31.94\); Dirk Nowitzki: \(\bar{x}=32.69\); LeBron James: \(\bar{x}=42.66\)
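The mean calculation can be checked with a short sketch. The helper name `mean_tppg` is an illustrative assumption, and treating the missing age-26 season for Kevin Willis (the "-" entry in Table 1) as excluded from the count is also an assumption, although it is consistent with the mean of 24.08 reported above.

```python
def mean_tppg(seasons):
    """Arithmetic mean of the seasons that have a value (None marks a missing season)."""
    played = [value for value in seasons if value is not None]
    return sum(played) / len(played)


# TPPG for ages 23 to 38, copied from Table 1.
carter = [35.4, 37.0, 33.9, 28.3, 32.1, 33.9, 34.3, 36.0,
          32.4, 30.6, 23.6, 19.8, 15.7, 19.5, 18.0, 9.0]
willis = [21.4, 27.4, 19.3, None, 21.1, 23.0, 35.9, 32.9,
          33.0, 29.4, 19.8, 19.6, 25.5, 21.9, 14.3, 16.7]

print(round(mean_tppg(carter), 2))
print(round(mean_tppg(willis), 2))
```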

 

Using these numbers, we can express each season as a ratio to each player's mean seasonal statistics, simply by dividing each season's TPPG by the player's mean. For example, Vince Carter's age 23 season can be expressed as 35.4 (season TPPG) / 27.47 (his mean career TPPG) \(=\) 1.289. If we apply this to each player's statistics (from Table 1), we obtain an expanded table (Table 2). An additional column has been added to show the mean of the players' ratios at each age:

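Table 2. Ratio of season TPPG to career mean TPPG for each player at each age, with the mean ratio across players.

| Age | V. Carter | R. Parish | K. Willis | K. Garnett | D. Nowitzki | L. James | Mean (x̄) |
|-----|-----------|-----------|-----------|------------|-------------|----------|-----------|
| 23 | 1.289 | 0.614 | 0.889 | 1.243 | 1.092 | 1.057 | 1.031 |
| 24 | 1.347 | 0.786 | 1.138 | 1.193 | 1.162 | 1.013 | 1.106 |
| 25 | 1.234 | 1.100 | 0.801 | 1.205 | 1.015 | 1.069 | 1.071 |
| 26 | 1.030 | 1.057 | - | 1.327 | 1.190 | 0.966 | 1.114 |
| 27 | 1.169 | 1.079 | 0.876 | 1.343 | 1.175 | 0.966 | 1.101 |
| 28 | 1.234 | 1.161 | 0.955 | 1.296 | 1.129 | 0.987 | 1.127 |
| 29 | 1.249 | 1.132 | 1.491 | 1.208 | 1.092 | 0.945 | 1.186 |
| 30 | 1.311 | 1.122 | 1.366 | 1.230 | 1.123 | 0.907 | 1.176 |
| 31 | 1.180 | 1.065 | 1.370 | 0.983 | 1.083 | 0.926 | 1.101 |
| 32 | 1.114 | 0.979 | 1.221 | 0.839 | 0.997 | 1.024 | 1.029 |
| 33 | 0.859 | 1.082 | 0.822 | 0.761 | 0.933 | 1.060 | 0.920 |
| 34 | 0.721 | 0.872 | 0.814 | 0.820 | 0.814 | 1.036 | 0.846 |
| 35 | 0.572 | 1.190 | 1.059 | 0.842 | 0.936 | 1.015 | 0.936 |
| 36 | 0.710 | 0.968 | 0.909 | 0.779 | 0.768 | 0.949 | 0.847 |
| 37 | 0.655 | 0.939 | 0.594 | 0.457 | 0.814 | 1.048 | 0.751 |
| 38 | 0.328 | 0.854 | 0.694 | 0.473 | 0.679 | 1.032 | 0.676 |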
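The two steps behind the ratio table, dividing a season by the career mean and then averaging the ratios across players, can be sketched as follows. The function names are illustrative assumptions. Skipping the missing value when averaging is also an assumption, but it reproduces the age-26 mean of 1.114 in the table.

```python
def season_ratio(season_tppg, career_mean):
    """Season TPPG expressed as a ratio of the player's career mean TPPG."""
    return season_tppg / career_mean


def mean_ignoring_missing(values):
    """Mean of the available ratios; None marks a player with no season at that age."""
    present = [value for value in values if value is not None]
    return sum(present) / len(present)


# Vince Carter, age 23: season TPPG 35.4 and career mean 27.47.
carter_age_23 = season_ratio(35.4, 27.47)

# Age-26 ratios in table order; Kevin Willis has no value at this age.
age_26_ratios = [1.030, 1.057, None, 1.327, 1.190, 0.966]
age_26_mean = mean_ignoring_missing(age_26_ratios)

print(round(carter_age_23, 3), round(age_26_mean, 3))
```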

The numbers in this table are important because they represent the exact extent to which a player is performing above or below career standards. For example, a ratio of 0.328 represents total production that is only \(32.8\%\) of the player's average, or a \(67.2\%\) regression. Plotting the mean TPPG / career TPPG ratio against age gives us the following graph:

Figure 7. Mean ratio of season TPPG to career mean TPPG, plotted against player age. Obtained by plotting the Age column against the Mean (x̄) column of Table 2.

Based on Figure 7, NBA players at age 23 perform, on average, approximately at their career average, and this is followed by a period of steadily improving production. The improvement continues until players reach their most productive years between the ages of 28 and 30, which is generally in line with the concept of the "athletic prime" as it relates to basketball (Chomik & Jacinto, 2021). Within these years, players' mean performance is between \(12.7\%\) and \(18.6\%\) better than their mean career production. In addition, there is a clear downward trend in production after the players' prime, towards the later years of their careers, with a noticeable drop below average career production occurring around age 33.

Conclusion

Based on the player statistics collected and analyzed, it is clear that as National Basketball Association players approach the later stages of their careers, their performance begins to drop following their athletic prime. In particular, age 33 is when these players began to consistently perform below their career averages in terms of total production per game. In terms of the degree of regression, players on average produced \(8.0\%\) less at age 33 than their mean career performance. At age 34, they produced \(15.4\%\) less than their career mean. At age 35, performances were, on average, only \(6.4\%\) lower than the career average; however, this number is inflated by uncharacteristically high-performing seasons from 2 of the 6 players sampled (Kevin Willis and Robert Parish) at age 35. Following age 35, mean production relative to career means declined sharply: production at ages 36, 37 and 38 was \(15.3\%\), \(24.9\%\), and \(32.4\%\) below career means, respectively. Of course, certain players do not follow this exact pattern; for example, LeBron James had his poorest seasons relative to his career average between the ages of 26 and 31, and has shown little regression after age 35. Players like this are not the norm, as evidenced by the difference between James' older-age performances and the mean of all the players analyzed.
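The percentages quoted above follow directly from the mean ratios in Table 2. As a small sketch (the function name `percent_vs_career` is an illustrative assumption), a ratio is converted to a percentage difference from the career mean by subtracting 1 and multiplying by 100:

```python
def percent_vs_career(ratio):
    """Percentage difference from the career mean; negative means below the mean."""
    return (ratio - 1.0) * 100.0


# Mean ratios for ages 33 to 38, copied from the Mean column of Table 2.
mean_ratios = {33: 0.920, 34: 0.846, 35: 0.936, 36: 0.847, 37: 0.751, 38: 0.676}

for age, ratio in mean_ratios.items():
    print(age, round(percent_vs_career(ratio), 1))
```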

 

Thus, this exploration has produced numbers which answer the research question, and the results show the extent to which age affects NBA player performance. The results support data collected in previous studies, such as that of the Harvard Sports Analysis Collective (Brady, 2017), whose data shows sharp declines in player production after age 32 in additional statistical categories. In addition, the average age of the NBA's Most Valuable Player (MVP) award recipient is 27.9 years (U. of Washington, 2020). This is consistent with the second conclusion that can be drawn from this statistical analysis: NBA player performance is at its best around ages 28-30. At this end of the age range, the effect of age is positive, as ages 28-30 tended to be the most productive years, even more productive than younger ages. Thus, increasing age is beneficial to NBA players until about age 30, after which performance starts to return to a player's mean. Following this, age 33 is when a sharp decline in performance begins for an average National Basketball Association player.

Reflection

There were limitations to this exploration, most notably in the data selected for analysis as well as in a portion of the analysis. If possible, I would like to extend this study to include more measurable statistics for NBA players. For example, there are statistics related to player efficiency beyond the three major counting statistics (points/rebounds/assists per game), such as field goal percentage, player efficiency rating, and assist-to-turnover ratio. Statistics like these can also help paint a picture of player performance and how it shifts with age - for example, older players may shoot more efficiently and turn the ball over less, as additional experience could reduce errors common in young players.

In addition, if I were to replicate this exploration, I would analyze the statistics of more players - this would result in more data points, which would provide a more precise picture of the impact of age on performance. With more data points available, the mean would be more representative. More data points would also allow the standard deviation to be calculated meaningfully. With the standard deviation available to compare each value against the mean, outliers could be identified more systematically. This study could also have implications for other sports; perhaps other sports could be analyzed in the same way to examine the effects of age on general athletic performance.
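One possible way to carry out the outlier check described above is sketched here. The threshold of 2 standard deviations, the use of the population standard deviation, and the function name `flag_outliers` are all assumptions made for illustration; this check was not part of the exploration itself. The sample values are the age-35 ratios from Table 2.

```python
from statistics import mean, pstdev


def flag_outliers(values, k=2.0):
    """Return the values lying more than k population standard deviations from the mean."""
    centre = mean(values)
    spread = pstdev(values)
    return [value for value in values if abs(value - centre) > k * spread]


# Age-35 ratios for the six players, copied from Table 2.
age_35_ratios = [0.572, 1.190, 1.059, 0.842, 0.936, 1.015]
print(flag_outliers(age_35_ratios))
```

With only six values per age, a single extreme season strongly inflates the standard deviation itself, which is one reason a larger sample would make this kind of check more useful.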

Bibliography

Klinzing, M. (2023, February 23). World's Top 4 Basketball Leagues to Look Out for in 2023. Headstart Basketball. https://headstartbasketball.com/worlds-top-4-basketball-leagues-to-look-out-for-in-2023/

IBO. (2023, June 6). Maths in the DP. International Baccalaureate Programme. https://www.ibo.org/programmes/diploma-programme/curriculum/mathematics/

Players. (2023, August 1). Basketball Reference. https://www.basketball-reference.com/

List of National Basketball Association seasons played leaders. (2023). Wikipedia. https://en.wikipedia.org/wiki/List_of_National_Basketball_Association_seasons_played_leaders

Chomik, R., & Jacinto, M. (2021, August). Peak Performance Age in Sport. ARC Centre of Excellence in Population Ageing Research. https://cepar.edu.au/sites/default/files/peak-performance-age-sport.pdf

Brady, B. (2017, November 17). What Happens to NBA Players When They Age? Harvard Sports Analysis Collective. https://harvardsportsanalysis.org/2017/11/what-happens-to-nba-players-when-they-age/

University of Washington. (2020). Analysis of Peak Age and Effects of Aging on NBA Players. https://courses.cs.washington.edu/courses/cse163/20su/files/project/archive/nba.pdf
