As someone who both plays football and avidly supports the English Premier League club Arsenal, I have developed an interest in investigating the relationship between the amount of money a team spends on buying players during the summer (June-August) transfer window and its position in the table at the end of the season. This investigation aims to determine whether a team's input (money spent) in the transfer window is a determining factor in the club's overall performance in the league. Exploring this topic will allow me to examine the link between expenditure and standings in the English Premier League (EPL); the approach could be extended to other football leagues and potentially used as a predictor of performance.
I expect a weak negative relationship between the variables (amount of money spent and final league position) because I have noticed that bigger clubs, which realistically spend more, usually finish higher up the table than smaller clubs, which understandably spend less. I expect the relationship to be negative because moving up the table is represented by a decreasing position number, with 1st being the best position and 20th the worst. I mention "bigger clubs", but the question is, what makes a club "big" in the EPL? In the EPL, a club is considered a "big club" based on several factors:
Despite the common use of these factors to determine a "big" club, they can be subjective and vary over time (Das, 2023). Hence, a "small" club can be considered one that ranks low on the factors stated above.
However, to simplify the process and test my hypothesis, I am going to use the famous "Big Six" of the EPL (Arsenal, Liverpool, Manchester City, Manchester United, Chelsea, and Tottenham) as representatives of the term "big clubs", and compare their expenditure and standings with those of the rest of the clubs in the EPL (Kelly, 2021).
To conduct my investigation, I used a reliable website, transfermarkt.com, which contains up-to-date documentation of transfer information for clubs and leagues worldwide. I looked at two Premier League seasons, 20/21 and 21/22, so that I could conduct two separate investigations and compare their findings before drawing a conclusion.
Position | Club | Amount spent | Club Value | Major trophies won |
---|---|---|---|---|
1 | Man City | €167.40m | €1.06bn | 22 |
2 | Man Utd | €62.50m | €758.10m | 42 |
3 | Liverpool | €79.70m | €1.03bn | 43 |
4 | Chelsea | €247.20m | €880.10m | 25 |
5 | Leicester | €59.40m | €420.30m | 5 |
6 | West Ham | €27.70m | €295.15m | 4 |
7 | Tottenham | €110.50m | €721.55m | 17 |
8 | Arsenal | €84.00m | €599.35m | 31 |
Position | Club | Amount spent | Club Value | Major trophies won |
---|---|---|---|---|
1 | Man City | €117.50m | €1.04bn | 23 |
2 | Liverpool | €40.00m | €879.50m | 45 |
3 | Chelsea | €118.00m | €881.50m | 25 |
4 | Tottenham | €66.90m | €697.00m | 17 |
5 | Arsenal | €165.60m | €548.50m | 31 |
6 | Man Utd | €142.00m | €937.25m | 42 |
7 | West Ham | €74.50m | €354.75m | 4 |
8 | Leicester | €67.60m | €550.10m | 5 |
The linear correlation is a measure of the linear relationship between two variables represented in a scatter plot. In my investigation, the two variables are the amount of money spent by a club and the position of the team at the end of the season. When the data are plotted on a scatter diagram, how closely the points cluster around a straight line determines the strength of the linear relationship. The strength of the relationship is quantified using a coefficient r, the Pearson Product-Moment Correlation Coefficient (PPMCC).
The numerical value, r, always lies within the range of -1 to 1, where:
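For reference, the coefficient is commonly written as below. The notation is my own choice for illustration rather than something stated elsewhere in this report: \(x_i\) is a club's final position, \(y_i\) is the amount it spent, \(\bar{x}\) and \(\bar{y}\) are the respective means, and \(n\) is the number of clubs.

\[r=\frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2\,\sum_{i=1}^{n}(y_i-\bar{y})^2}}\]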
The strength of the correlation varies with the proximity of r to either -1 or 1 whereby when:
The above also applies when the value is on the negative scale.
Where \(r=-0.577\) (See Appendix C)
Because the r value lies between -0.5 and -0.75, the result for the 20/21 season suggests a moderate negative correlation between the amount spent and the position of the team in the EPL. By observing the scatter diagram above, I was able to deduce a negative relationship from the gradient of the trendline. The moderate PPMCC value suggests that clubs that spent less tended to finish lower down the table (representing poorer performance). However, to confirm whether this is a recurring trend over seasons, I conducted the same test on the 21/22 season to compare the coefficients.
Where \(r=-0.547\) (See Appendix D)
The r value again lies between -0.5 and -0.75, suggesting a moderate negative correlation between the variables. Although this coefficient is slightly weaker than the previous season's, both suggest that lower spending in the summer transfer window is associated with poorer performance, represented by a lower standing in the EPL table.
Having tested the relationship between the variables in the 20/21 and 21/22 EPL seasons, I can deduce that the trend is a recurring one and that a team's standing shows some dependence on the money spent on transfers, which supports my hypothesis. However, I will need to test this dependence to confirm the observation.
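The two coefficients can be checked with a short script. This is a minimal sketch rather than the method used for the report: it assumes position is taken as the x variable and summer spending (in millions of euros) as the y variable, with the spending figures copied from the season tables in the appendices. The function name `pearson_r` and the list names are my own.

```python
from math import sqrt

# Summer spending in millions of euros, ordered by final league position (1st to 20th).
# Values are copied from the season tables in the appendices of this report.
SPEND_2020_21 = [167.40, 62.50, 79.70, 247.20, 59.40, 27.70, 110.50, 84.00, 127.80, 74.37,
                 85.50, 39.00, 84.59, 18.90, 37.30, 21.90, 1.10, 37.25, 40.45, 62.70]
SPEND_2021_22 = [117.50, 40.00, 118.00, 66.90, 165.60, 142.00, 74.50, 67.60, 57.00, 32.30,
                 29.40, 73.44, 38.20, 99.80, 63.40, 2.00, 61.05, 31.90, 18.80, 63.55]


def pearson_r(xs, ys):
    """Return the Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of the deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


positions = list(range(1, 21))  # 1 = champions, 20 = bottom of the table
r_2020_21 = pearson_r(positions, SPEND_2020_21)
r_2021_22 = pearson_r(positions, SPEND_2021_22)
print(f"20/21: r = {r_2020_21:.4f}")
print(f"21/22: r = {r_2021_22:.4f}")
```

Because position 1 is the best finish, a negative r here means that higher spending goes with a better (numerically smaller) position.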
The χ² test for independence is used to determine whether two categorical variables are independent of each other or not. I had earlier observed some dependence between the variables of my investigation and chose the χ² test for independence as a suitable method to test and verify my observation. The χ² test was also used to further examine the relationship between the variables.
To conduct this test, I categorized the EPL clubs in terms of performance and spending: performance was classified as good, moderate, or poor, and spending as high, moderate, or low.
Below are the conditions a team had to meet to fall into each category.
For performance:
For spending:
I used a 1% (0.01) level of significance to minimize the chance of incorrectly rejecting the null hypothesis. The corresponding critical value for 4 degrees of freedom, 13.277, was obtained from a Chi-Square distribution table by Turney (2022).
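The tabulated critical value can be cross-checked numerically. The sketch below is an illustration under stated assumptions, not the procedure used in the report: it assumes 4 degrees of freedom, which is (3 - 1) × (3 - 1) for a table with three performance groups and three spending groups, and it uses the closed-form upper-tail probability of the chi-square distribution that holds for that particular number of degrees of freedom. The function names are hypothetical.

```python
from math import exp


def chi2_sf_df4(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 4 degrees of freedom."""
    return exp(-x / 2) * (1 + x / 2)


def critical_value_df4(alpha, lo=0.0, hi=100.0, tol=1e-9):
    """Find x with P(X > x) = alpha by bisection (the tail probability decreases as x grows)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if chi2_sf_df4(mid) > alpha:
            lo = mid  # tail is still larger than alpha: the critical value lies further right
        else:
            hi = mid
    return (lo + hi) / 2


critical = critical_value_df4(0.01)
print(f"critical value at the 1% level, df = 4: {critical:.3f}")
```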
The null hypothesis and alternative hypothesis:
\(H_0\): The performance of a team is independent of its spending
\(H_1\): The performance of a team is not independent of its spending
Performance | High Spending | Moderate Spending | Low Spending | Total |
---|---|---|---|---|
Good Performance | 2 | 3 | 1 | 6 |
Moderate Performance | 2 | 4 | 2 | 8 |
Poor Performance | 0 | 1 | 5 | 6 |
Total | 4 | 8 | 8 | 20 |
I calculated the expected values using the formula \(E=\frac{\text{row total}\times\text{column total}}{\text{grand total}}\); for example, for good performance and high spending, \(E=\frac{6\times 4}{20}=1.2\).
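As a worked illustration for the table above, the usual rule for a test of independence takes each expected frequency as the row total times the column total divided by the grand total, and then sums the squared differences between observed and expected frequencies. The symbols \(E_{ij}\), \(O_{ij}\), \(R_i\), \(C_j\), and \(N\) are my own notation. With row totals 6, 8, 6, column totals 4, 8, 8, and \(N=20\), this reproduces the statistic of 7.083 reported for this season:

\[\begin{aligned}&E_{ij}=\frac{R_i\times C_j}{N}\quad\Rightarrow\quad E=\begin{pmatrix}1.2&2.4&2.4\\1.6&3.2&3.2\\1.2&2.4&2.4\end{pmatrix}\\&\chi^2=\sum_{i,j}\frac{(O_{ij}-E_{ij})^2}{E_{ij}}=\underbrace{1.500}_{\text{good}}+\underbrace{0.750}_{\text{moderate}}+\underbrace{4.833}_{\text{poor}}\approx 7.083\end{aligned}\]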
Below are the results of the χ² test that I got using my GDC:
\[\begin{aligned}&\chi^2=7.083\\&p=0.132\\&df=4\end{aligned}\]
Following the result, I did not reject \(H_0\) because \(7.083\ (\chi^2\text{ value})<13.277\) (critical value); equivalently, \(p=0.132>0.01\). These results suggest no significant association between the variables in the 20/21 EPL season.
Performance | High Spending | Moderate Spending | Low Spending | Total |
---|---|---|---|---|
Good Performance | 4 | 1 | 1 | 6 |
Moderate Performance | 0 | 5 | 3 | 8 |
Poor Performance | 0 | 3 | 3 | 6 |
Total | 4 | 9 | 7 | 20 |
The results of the χ² test conducted for the 21/22 season were:
\[\begin{aligned}&\chi^2=11.96\\&p=0.018\\&df=4\end{aligned}\]
Following the results above, I did not reject \(H_0\) because \(11.96\ (\chi^2\text{ value})<13.277\) (critical value); equivalently, \(p=0.018>0.01\). This suggests that there is no significant relationship between the two variables in the 21/22 season at the 1% level.
The results of the tests from both EPL seasons suggest that, at the 1% level of significance, there is no significant evidence of dependence between the amount of money spent during a transfer window and a team's overall performance. These findings do not support the observations I had made about the presence of some dependence between the variables.
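The GDC results for both seasons can be reproduced with a few lines of code. This is a hedged sketch, not the calculator's own routine: the function name `chi_square_independence` is hypothetical, the observed counts are copied from the two contingency tables above, and the p-value uses a closed-form expression that is only valid when the number of degrees of freedom is even (it is 4 here).

```python
from math import exp, factorial

# Observed counts: rows = good, moderate, poor performance;
# columns = high, moderate, low spending.
OBSERVED_2020_21 = [[2, 3, 1], [2, 4, 2], [0, 1, 5]]
OBSERVED_2021_22 = [[4, 1, 1], [0, 5, 3], [0, 3, 3]]
CRITICAL_VALUE = 13.277  # 1% level, df = 4, as quoted in the report


def chi_square_independence(observed):
    """Return (statistic, degrees of freedom, p-value) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected frequency under independence.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    if df % 2 != 0:
        raise ValueError("the closed-form p-value below only covers even df")
    half = statistic / 2
    # Upper-tail probability of a chi-square variable with even df.
    p_value = exp(-half) * sum(half ** k / factorial(k) for k in range(df // 2))
    return statistic, df, p_value


for season, table in (("20/21", OBSERVED_2020_21), ("21/22", OBSERVED_2021_22)):
    stat, df, p = chi_square_independence(table)
    decision = "reject H0" if stat > CRITICAL_VALUE else "do not reject H0"
    print(f"{season}: chi2 = {stat:.3f}, df = {df}, p = {p:.3f} -> {decision}")
```

Comparing the statistic with the critical value and comparing the p-value with 0.01 lead to the same decision, so either check can be used.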
The statistical tests used to investigate the relationship between the variables were strengths of my investigation: the PPMCC is well suited to assessing linear relationships and is widely used, making it versatile for exploring relationships between two variables. The Chi-Square test of independence played a vital role in checking the relationship suggested by the PPMCC and provided results that are relatively easy to understand.
On the other hand, my investigation was limited since I only took into account two English Premier League seasons; hence, the representativeness of the findings is limited, and a larger sample, such as 5 seasons, would need to be tested. However, this was impractical due to the vast amount of data collection and testing that it would require.
The expenditure data collected on the clubs may be inaccurate because clubs might not disclose the actual amount of money spent, which raises concerns about the accuracy of my results. Therefore, I cannot generalize my findings to other seasons of the EPL or to other leagues.
Overall, my initial hypothesis, in which I expected a weak negative relationship, was not supported by my investigation. The first test, the PPMCC, showed a moderate negative correlation between the variables; however, the second test, the Chi-Square test for independence, did not confirm this relationship and suggested that, at the 1% level of significance, there is no substantial evidence of an association between a club's spending and its placement in the English Premier League table. Furthermore, it indicates that the amount of money spent in the summer transfer window, which is the longest window, cannot reliably be used as a predictor of a team's performance at the end of the season. This investigation is relevant to football fanatics such as me and can be applied to our everyday lives: understanding that the money a team spends does not guarantee its performance helps reduce assumptions and false predictions. The findings suggest that the common link drawn between a club's spending and its performance is a stereotype rather than the reality of the sport. At the end of this investigation, my interest in the relationship between expenditure and performance has been satisfied, and I am now able to keep my expectations for my team (Arsenal) realistic by looking at other predictors, such as squad depth and strength, rather than making assumptions based on transfer spending.
Das, P. (2023). What makes an English Premier League (EPL) club "big"? Quora. https://www.quora.com/What-makes-an-English-Premier-League-EPL-club-big
Kelly, R. (2021, April 21). Who are the Premier League 'big six'? Top English clubs & nickname explained. Goal.com. https://www.goal.com/en-ke/news/who-are-premier-league-big-six-top-english-clubs-nickname-explained/130iokmi8t8dt1k3kudou73s1k
Neuenhaus, M. (n.d.). Football (Running Total of Trophies). KryssTal. http://www.krysstal.com/trophies.html
Premier League - Transfers 20/21. (n.d.). Transfermarkt. https://www.transfermarkt.com/premier-league/transfers/wettbewerb/GB1/plus/?saison_id=2020&s_w=s&leihe=1&intern=0&intern=1
Premier League - Transfers 21/22. (n.d.). Transfermarkt. https://www.transfermarkt.com/premier-league/transfers/wettbewerb/GB1/plus/?saison_id=2021&s_w=s&leihe=1&intern=0&intern=1
Turney, S. (2022, May 31). Chi-Square (X²) Table | Examples & Downloadable Table. Scribbr. https://www.scribbr.com/statistics/chi-square-distribution-table/
Position | Club | Amount spent | Club Value | Major trophies won |
---|---|---|---|---|
1 | Man City | €167.40m | €1.06bn | 22 |
2 | Man Utd | €62.50m | €758.10m | 42 |
3 | Liverpool | €79.70m | €1.03bn | 43 |
4 | Chelsea | €247.20m | €880.10m | 25 |
5 | Leicester | €59.40m | €420.30m | 5 |
6 | West Ham | €27.70m | €295.15m | 4 |
7 | Tottenham | €110.50m | €721.55m | 17 |
8 | Arsenal | €84.00m | €599.35m | 31 |
9 | Leeds | €127.80m | €128.05m | 7 |
10 | Everton | €74.37m | €411.05m | 15 |
11 | Aston Villa | €85.50m | €232.70m | 20 |
12 | Newcastle | €39.00m | €228.45m | 11 |
13 | Wolves | €84.59m | €318.80m | 9 |
14 | Crystal Palace | €18.90m | €188.50m | 0 |
15 | Southampton | €37.30m | €211.20m | 1 |
16 | Brighton | €21.90m | €209.05m | 0 |
17 | Burnley | €1.10m | €154.78m | 1 |
18 | Fulham | €37.25m | €144.25m | 0 |
19 | West Brom | €40.45m | €69.00m | 7 |
20 | Sheff Utd | €62.70m | €137.95m | 5 |
Position | Club | Amount spent | Club Value | Major trophies won |
---|---|---|---|---|
1 | Man City | €117.50m | €1.04bn | 23 |
2 | Liverpool | €40.00m | €879.50m | 45 |
3 | Chelsea | €118.00m | €881.50m | 25 |
4 | Tottenham | €66.90m | €697.00m | 17 |
5 | Arsenal | €165.60m | €548.50m | 31 |
6 | Man Utd | €142.00m | €937.25m | 42 |
7 | West Ham | €74.50m | €354.75m | 4 |
8 | Leicester | €67.60m | €550.10m | 5 |
9 | Brighton | €57.00m | €248.10m | 0 |
10 | Wolves | €32.30m | €391.30m | 9 |
11 | Newcastle | €29.40m | €242.90m | 11 |
12 | Crystal Palace | €73.44m | €239.45m | 0 |
13 | Brentford | €38.20m | €167.85m | 0 |
14 | Aston Villa | €99.80m | €406.80m | 20 |
15 | Southampton | €63.40m | €241.30m | 1 |
16 | Everton | €2.00m | €461.75m | 15 |
17 | Leeds | €61.05m | €250.80m | 7 |
18 | Burnley | €31.90m | €145.30m | 1 |
19 | Watford | €18.80m | €133.80m | 0 |
20 | Norwich | €63.55m | €189.55m | 2 |
The value of R is -0.5775.
This is a moderate negative correlation, which means there is a tendency for high X variable scores to go with low Y variable scores (and vice versa).