Areas of knowledge, when combined, give us a more useful and productive understanding. The use of mathematical and statistical tools in science has always exemplified this claim. During the global pandemic, when collecting primary data in a laboratory setup turned out not to be feasible, the focus of my investigation was on exploring how mathematics is used to produce scientific knowledge and to evaluate its progression. With this in mind, I explored various areas of chemistry where this happens. As a part of mathematics, I have studied how to determine the correlation between variables when a broad set of values is provided and how identifying the outliers in a data set can point to new lines of inquiry. This led me to the idea of studying the correlation between two variables of significant importance and trying to explain it using the facts and principles of chemistry. While searching for an appropriate database, I was drawn to amino acids because of the reading I have done about them, which stems from my interest in the folding and unfolding of proteins as a part of research in biomolecules. Identifying amino acids after the hydrolysis of proteins is done to elucidate the structure of a new protein, and chromatography is one of the most significant analytical methods for that. Here, the retention factors of the amino acids are measured experimentally and compared with literature values to identify them. This brought me to the question: what factors does the value of the retention factor depend on? Like the retention factor, another vital chemical property used to categorise or recognise amino acids is the pH at the iso-electric point. This made me think that there may be a connection between these two quantities. To make the investigation more comprehensive and conclusive, I have also added the molar mass of the amino acids to the list.
This investigation will enable us to explore a new chemical relationship exploiting the facts and theories already known to us.
To what extent is there a correlation between the magnitude of the retention factor of amino acids and the molar mass of the acid or the pH at the iso-electric point, determined using regression analysis?
In paper chromatography, a baseline is drawn in pencil on a strip of chromatography paper. Using a capillary tube, a spot of the mixture is placed at the centre of the baseline. The paper is then suspended so that its bottom edge touches the solvent (usually water, ethanol or ethyl acetate) kept inside a beaker. The solvent is allowed to rise vertically up the paper, and the mixture rises with it. After some time, when the solvent has travelled to the top of the paper, the components of the mixture have separated and travelled to different heights. If the spots are colourless, a colouring agent like ‘ninhydrin’ is used to make them visible. The distance travelled by the solvent and that travelled by each spot are measured using a ruler, and the retention factor is calculated using the formula given below:
\(\text{Retention factor } (R_f) = \frac{\text{Distance travelled by the spot } (Z_x)}{\text{Distance travelled by the solvent } (Z_f - Z_o)}\)
Zf = Distance travelled by the solvent
Zx = Distance travelled by the sample
Zo = Difference of height between the solvent line and the sample line
Retention factor is a unitless quantity, and the number of spots indicates the number of pure components in the mixture.
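The formula above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original method: the function name `retention_factor` and the ruler readings are hypothetical, and it assumes that Zf - Zo is the distance the solvent front has moved beyond the baseline, with Zx measured from the same baseline.

```python
def retention_factor(z_x: float, z_f: float, z_o: float) -> float:
    """Return Rf = Z_x / (Z_f - Z_o); all distances must share one unit."""
    solvent_run = z_f - z_o  # distance covered by the solvent above the baseline
    if solvent_run <= 0:
        raise ValueError("the solvent front must lie beyond the baseline")
    if not 0 <= z_x <= solvent_run:
        raise ValueError("the spot must lie between the baseline and the solvent front")
    return z_x / solvent_run


# Hypothetical ruler readings in cm (illustrative only, not measured data).
rf_example = retention_factor(z_x=2.7, z_f=11.0, z_o=1.0)
print(f"Rf = {rf_example:.2f}")
```

Because the two distances are divided, the unit cancels, which is why the retention factor is unitless and can never exceed 1.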
The retention factor depends on the affinity of a particular component of the mixture for the solvent that has been used. The greater the solubility of the component in the solvent, or its affinity for it, the longer the distance it travels along with the solvent, the higher the value of Zx and the higher the magnitude of the retention factor. Thus, the stronger the intermolecular force of attraction between a particular amino acid and the solvent, the further it travels and the greater the retention factor.
Amino acids, in general, are represented as
It has a central (alpha) C atom bonded to a side-chain alkyl group (R), an amino group (NH2), a carboxylic acid group (COOH) and an H atom. The side chain may also be an H atom, as is the case with glycine. The side chain can be a simple hydrocarbon chain, as in alanine, or a substituted alkyl group, as in lysine, phenylalanine or aspartic acid.
At an acidic pH (pH < 7.00), the amino group behaves as a Brønsted-Lowry base and accepts a hydrogen ion, so that it is converted into -NH3+. This gives the amino acid a positive charge, so it exists as a cation and migrates towards the cathode if electrolysed.
NH2-RCH-COOH (base) + H+ → NH3+-RCH-COOH (conjugate acid)
Similarly, at an alkaline pH (pH > 7.00), the amino acid behaves as an Arrhenius acid, and the COOH group loses one hydrogen ion to exist as COO- in its conjugate base form. This imparts a negative charge on the amino acid, so it behaves as an anion and migrates towards the anode if electrolysed.
NH2-RCH-COOH (acid) + OH- → NH2-RCH-COO- (conjugate base) + H2O
At a certain pH, both the amino and the carboxylic acid groups react simultaneously. The amino group gets protonated while the carboxylic acid group (COOH) gets deprotonated at the same time. This value of pH is known as the iso-electric point of that amino acid. The value of the iso-electric point differs from one amino acid to another.
At the isoelectric point, the amino acid exists in this form:
NH3+-RCH-COO-
The same molecule has both a positive and a negative charge at the same time but at different locations. Overall, it behaves as a neutral molecule and thus does not migrate towards either the cathode or the anode during electrolysis. Structures of this kind are known as zwitterions.
The presence of an extra amino group or an extra carboxylic acid group in the side chain (R) of the amino acid may interfere with its behaviour as a zwitterion.
The values of retention factor were obtained from three different sources:
The values of pH at iso-electric point were obtained from three different sources:
The sources I used include research papers, e-textbooks and a commercial website. The research papers are credible and reliable as they have been accepted and published by international science journals. Moreover, they have enough citations, which makes their credibility evident. The e-textbooks are written purely for academic purposes and are thus reliable sources. The commercial website used provides only factual information and thus has no apparent motive to present manipulated data.
Amino acids | Molecular formula | Relative molecular mass |
---|---|---|
Alanine (Ala) | C3H7NO2 | 89.0935 |
Arginine (Arg) | C6H14N4O2 | 174.2017 |
Asparagine (Asn) | C4H8N2O3 | 132.1184 |
Aspartic acid (Asp) | C4H7NO4 | 133.1032 |
Cysteine (Cys) | C3H7NO2S | 121.159 |
Glutamine (Gln) | C5H10N2O3 | 146.1451 |
Glutamic acid (Glu) | C5H9NO4 | 147.1299 |
Glycine (Gly) | C2H5NO2 | 75.0669 |
Histidine (His) | C6H9N3O2 | 155.1552 |
Isoleucine (Ile) | C6H13NO2 | 131.1736 |
Leucine (Leu) | C6H13NO2 | 131.1736 |
Lysine (Lys) | C6H14N2O2 | 146.1882 |
Methionine (Met) | C5H11NO2S | 149.2124 |
Phenylalanine (Phe) | C9H11NO2 | 165.19 |
Proline (Pro) | C5H9NO2 | 115.131 |
Serine (Ser) | C3H7NO3 | 105.093 |
Threonine (Thr) | C4H9NO3 | 119.1197 |
Tryptophan (Trp) | C11H12N2O2 | 204.2262 |
Tyrosine (Tyr) | C9H11NO3 | 181.1894 |
Valine (Val) | C5H11NO2 | 117.1469 |
Values of relative molecular mass were taken from WebQC.
They can also be calculated using the relative atomic masses given in the IB chemistry data booklet.
Calculation
For glycine (C2H5NO2) :
Molar mass = (2 × 12.01) + (5 × 1.01) + (1 × 14.01) + (2 × 16.00) = 24.02 + 5.05 + 14.01 + 32.00 = 75.08
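The same arithmetic can be repeated for any formula in the table with a short script. This is a minimal sketch: the function name `molar_mass` is hypothetical, the values for C, H, N and O are those used in the glycine calculation above, and the value 32.07 for sulfur is an assumed two-decimal data-booklet value that does not appear in the calculation above. Because two-decimal atomic masses are used, the results differ slightly from the WebQC values in the table.

```python
import re

# Relative atomic masses to two decimal places. C, H, N and O match the
# glycine calculation; S = 32.07 is an assumption added for Cys and Met.
ATOMIC_MASS = {"C": 12.01, "H": 1.01, "N": 14.01, "O": 16.00, "S": 32.07}


def molar_mass(formula: str) -> float:
    """Sum the atomic masses in a simple formula such as 'C2H5NO2'."""
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", formula):
        raise ValueError(f"cannot parse formula: {formula!r}")
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"no atomic mass stored for {symbol!r}")
        # A missing count (as for N in C2H5NO2) means one atom.
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return total


print(f"Glycine: {molar_mass('C2H5NO2'):.2f}")
```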
The retention factor of an amino acid depends on the solvent used in the chromatography. All the retention factor data used here are values obtained with ethanol as the solvent. The same amino acid may have a different retention factor if the solvent used in the chromatography is changed. For example, the retention factor of glycine is 0.17 in a mixture of n-butanol, ethanoic acid and water in the ratio 4:1:5, and 0.40 in a mixture of phenol and water in the ratio 4:1.
Chromatography comes in various forms – liquid chromatography, gas chromatography and thin-layer chromatography. This investigation deals with paper chromatography. As the type of chromatography changes, the method of calculating and interpreting the magnitude of retention factor differs.
Amino acid | Molar mass | Retention factor ±0.01 | pH at the iso-electric point |
---|---|---|---|
Alanine (Ala) | 89.0935 | 0.27 | 6.07 |
Arginine (Arg) | 174.2017 | 0.15 | 10.76 |
Asparagine (Asn) | 132.1184 | 0.17 | 5.41 |
Aspartic acid (Asp) | 133.1032 | 0.21 | 2.87 |
Cysteine (Cys) | 121.159 | 0.26 | 5.08 |
Glutamine (Gln) | 146.1451 | 0.23 | 5.65 |
Glutamic acid (Glu) | 147.1299 | 0.26 | 3.12 |
Glycine (Gly) | 75.0669 | 0.22 | 6.03 |
Histidine (His) | 155.1552 | 0.13 | 7.62 |
Isoleucine (Ile) | 131.1736 | 0.55 | 6.03 |
Leucine (Leu) | 131.1736 | 0.59 | 6.02 |
Lysine (Lys) | 146.1882 | 0.12 | 9.56 |
Methionine (Met) | 149.2124 | 0.46 | 5.73 |
Phenylalanine (Phe) | 165.19 | 0.61 | 5.71 |
Proline (Pro) | 115.131 | 0.25 | 6.36 |
Serine (Ser) | 105.093 | 0.23 | 5.68 |
Threonine (Thr) | 119.1197 | 0.25 | 5.69 |
Tryptophan (Trp) | 204.2262 | 0.57 | 5.88 |
Tyrosine (Tyr) | 181.1894 | 0.49 | 5.64 |
Valine (Val) | 117.1469 | 0.44 | 6.00 |
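The strength of each relationship in the table can be checked with the Pearson correlation coefficient. The sketch below is only an illustration of that calculation using the tabulated values: the names `DATA` and `pearson_r` are hypothetical, and the choice of Pearson's r (rather than another measure of correlation) is an assumption about the regression analysis intended here.

```python
from math import sqrt

# (molar mass, retention factor, pH at iso-electric point) copied from the table.
DATA = {
    "Ala": (89.0935, 0.27, 6.07), "Arg": (174.2017, 0.15, 10.76),
    "Asn": (132.1184, 0.17, 5.41), "Asp": (133.1032, 0.21, 2.87),
    "Cys": (121.159, 0.26, 5.08), "Gln": (146.1451, 0.23, 5.65),
    "Glu": (147.1299, 0.26, 3.12), "Gly": (75.0669, 0.22, 6.03),
    "His": (155.1552, 0.13, 7.62), "Ile": (131.1736, 0.55, 6.03),
    "Leu": (131.1736, 0.59, 6.02), "Lys": (146.1882, 0.12, 9.56),
    "Met": (149.2124, 0.46, 5.73), "Phe": (165.19, 0.61, 5.71),
    "Pro": (115.131, 0.25, 6.36), "Ser": (105.093, 0.23, 5.68),
    "Thr": (119.1197, 0.25, 5.69), "Trp": (204.2262, 0.57, 5.88),
    "Tyr": (181.1894, 0.49, 5.64), "Val": (117.1469, 0.44, 6.00),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


masses = [row[0] for row in DATA.values()]
rfs = [row[1] for row in DATA.values()]
pis = [row[2] for row in DATA.values()]

r_mass = pearson_r(masses, rfs)  # molar mass against retention factor
r_pi = pearson_r(pis, rfs)       # pH at iso-electric point against retention factor
print(f"r (molar mass vs Rf) = {r_mass:.2f}, r squared = {r_mass ** 2:.2f}")
print(f"r (pI vs Rf)         = {r_pi:.2f}, r squared = {r_pi ** 2:.2f}")
```

A value of r close to +1 or -1 would indicate a strong linear relationship, while a value close to 0 would indicate that a straight line describes the data poorly.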
Ethanol is the solvent in the chromatography data used here. If we consider the interaction between the solvent (ethanol) and the amino acids to be non-ionic in nature and dictated by intermolecular forces such as hydrogen bonding, van der Waals forces or London dispersion forces, then we can propose certain claims and test them against the pattern in Figure - 8.
Claim 1: As the molar mass of the amino acid increases, the London dispersion forces between the molecules of the amino acid become stronger, and thus the solvent (ethanol)-sample (amino acid) interaction would be weaker than the sample (amino acid)-sample (amino acid) interaction. This would decrease the affinity of the sample for the solvent, and it would consequently travel a shorter distance than the lighter amino acids, which would lower the magnitude of the retention factor. To be precise, the higher the molar mass, the stronger the intermolecular force between the amino acid molecules and the lower the retention factor. Figure - 8 contradicts this statement. It shows that as molar mass increases, the retention factor increases instead of decreasing. This brings us to the conclusion that the predominant interaction between the amino acids is not the van der Waals force of attraction.
Claim 2: If the amino acid can form an intermolecular H bond with the solvent, that would increase the stability of the solvent (ethanol)-sample (amino acid) system in comparison to the sample (amino acid)-sample (amino acid) system. All amino acids are able to form intermolecular H bonds with ethanol due to the presence of the N - H bonds in the amine group and the O - H bond in the carboxylic acid group. This is illustrated in the diagram below:
Formation of these H bonds will increase the affinity of the amino acids towards ethanol and make them travel a longer distance, eventually increasing the retention factor. Thus, the more H bonds an amino acid can form, the higher its retention factor should be. The presence of an extra amine group or an extra carboxylic acid group in the side chain enables the amino acid to form more H bonds with the solvent. Thus, we can claim that basic and acidic amino acids must have a higher retention factor than the neutral ones. But Figure - 8 contradicts that. Aspartic acid and glutamic acid are both acidic amino acids with a lower retention factor than alanine, which is a neutral amino acid.
In Figure - 8, a linear trend line has been shown, and several outliers can be observed. The major outliers are isoleucine, leucine, phenylalanine and tryptophan on the positive side (data points lying above the trend line) and histidine, lysine and arginine on the negative side (data points lying below the trend line). Now, the question is: is there any structural commonality between the outliers? Histidine and arginine, which show retention factors much lower than the trend line, both have side chains that are resonance stabilised. Arginine has a guanidino group and can exhibit resonance. Histidine has a heterocyclic (imidazole) ring and can thus also display resonance stability. If an amino acid is able to display resonance, its stability will increase, and thus it will have a lower tendency to bind to the solvent and will travel a shorter distance in the chromatogram. Thus, we can conclude that the presence of a group in the side chain which enables the amino acid to display resonance may decrease the retention factor for that amino acid.
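The outliers can also be located numerically by fitting a least-squares line and ranking the residuals. The sketch below assumes that Figure - 8 plots retention factor against molar mass and that its trend line is an ordinary least-squares fit; the names `POINTS` and `fit_line` are hypothetical, and the data are the tabulated values.

```python
# (molar mass, retention factor) pairs copied from the data table.
POINTS = {
    "Ala": (89.0935, 0.27), "Arg": (174.2017, 0.15), "Asn": (132.1184, 0.17),
    "Asp": (133.1032, 0.21), "Cys": (121.159, 0.26), "Gln": (146.1451, 0.23),
    "Glu": (147.1299, 0.26), "Gly": (75.0669, 0.22), "His": (155.1552, 0.13),
    "Ile": (131.1736, 0.55), "Leu": (131.1736, 0.59), "Lys": (146.1882, 0.12),
    "Met": (149.2124, 0.46), "Phe": (165.19, 0.61), "Pro": (115.131, 0.25),
    "Ser": (105.093, 0.23), "Thr": (119.1197, 0.25), "Trp": (204.2262, 0.57),
    "Tyr": (181.1894, 0.49), "Val": (117.1469, 0.44),
}


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


xs = [mass for mass, _ in POINTS.values()]
ys = [rf for _, rf in POINTS.values()]
slope, intercept = fit_line(xs, ys)

# Residual = observed Rf minus the Rf predicted by the trend line.
residuals = {
    name: rf - (slope * mass + intercept) for name, (mass, rf) in POINTS.items()
}
ranked = sorted(residuals, key=residuals.get)  # most negative residual first

print("Furthest below the line:", ranked[:3])
print("Furthest above the line:", ranked[-3:][::-1])
```

Ranking the residuals gives an objective way to decide which amino acids count as outliers, instead of judging the distances from the trend line by eye.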
Although no definite correlation can be predicted between the retention factor and the pH at the iso-electric point, a discussion of the outliers can still tell us more about how the pH at the iso-electric point may depend on the structural features of the amino acid. As indicated in Figure - 3, amino acids can be classified as neutral, acidic or basic according to the nature of the side chain. Aspartic acid and glutamic acid are acidic amino acids as they have a COOH group in the side chain, while lysine and arginine are basic amino acids as they have a basic nitrogen-containing group in the side chain that can be protonated. Amino acids like glycine or alanine are neutral amino acids as they have neither a basic amino group nor an acidic COOH group in the side chain.
As indicated in Figure - 5, it can be generalised that the acidic amino acids have lower values of pH at the iso-electric point, and the basic amino acids have higher values, in comparison to the neutral ones. Let us consider three examples to keep the discussion simple: alanine as a neutral amino acid with a pH at the iso-electric point of 6.07, aspartic acid as an acidic amino acid with a pH at the iso-electric point of 2.87, and lysine as a basic amino acid with a pH at the iso-electric point of 9.56. As the acidic amino acids have an additional COOH group, in a basic medium there are two negative charges on the molecule: one on the deprotonated COOH group attached to the central carbon atom and one on the deprotonated COOH group of the side chain. Thus, a lower pH is required to make the molecule neutral, which decreases the pH at the iso-electric point. Similarly, the presence of an extra amine group in the side chain of basic amino acids imparts an extra positive charge on the molecule, which causes the molecule to be neutral only at a higher value of pH.
But the concern is whether these behavioural properties of the molecule have any correlation with the retention factor. The retention factor of any amino acid depends on how strongly it can bind with the solvent used in chromatography. This in turn depends mostly on the electrostatic force of interaction between the sample and the solvent molecules. The stronger the interaction between the sample and the solvent, the longer the distance travelled by the sample and the greater the retention factor. If we consider the interaction between the amino acid and ethanol as purely ionic, we can claim that the presence of charges will make the interaction stronger. In that case, the retention factor for basic or acidic amino acids should be higher than for the neutral ones, as they have an extra amine or COOH group that can carry an extra positive or negative charge. Figure - 12 does not support this claim. Alanine is a neutral amino acid with a retention factor of 0.27, while aspartic acid and lysine, an acidic and a basic amino acid, have retention factors of 0.21 and 0.12 respectively. This brings us to the conclusion that the interaction between the amino acids and the solvent (ethanol) is not ionic or electrostatic in nature. Thus, the charges on the amino acid molecules would have no effect on the retention factor of the amino acids. We can conclude that the presence of a group in the side chain which enables the amino acid to display resonance may decrease the retention factor for that amino acid, as seen in the case of histidine or arginine, although the data for phenylalanine and tryptophan run counter to this.
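The comparison between acidic, basic and neutral amino acids can be extended from single examples to the whole data set by averaging the retention factor within each class. The sketch below is illustrative only: the grouping follows the classification described earlier (aspartic and glutamic acid as acidic, lysine and arginine as basic), while placing histidine in the basic group is an assumption based on the conventional classification, and every other amino acid is treated as neutral. The names `RF`, `ACIDIC`, `BASIC` and `side_chain_class` are hypothetical.

```python
from statistics import mean

# Retention factors copied from the data table.
RF = {
    "Ala": 0.27, "Arg": 0.15, "Asn": 0.17, "Asp": 0.21, "Cys": 0.26,
    "Gln": 0.23, "Glu": 0.26, "Gly": 0.22, "His": 0.13, "Ile": 0.55,
    "Leu": 0.59, "Lys": 0.12, "Met": 0.46, "Phe": 0.61, "Pro": 0.25,
    "Ser": 0.23, "Thr": 0.25, "Trp": 0.57, "Tyr": 0.49, "Val": 0.44,
}

ACIDIC = {"Asp", "Glu"}
BASIC = {"Arg", "Lys", "His"}  # including His here is an assumption


def side_chain_class(name: str) -> str:
    """Classify an amino acid by the nature of its side chain."""
    if name in ACIDIC:
        return "acidic"
    if name in BASIC:
        return "basic"
    return "neutral"


groups = {"acidic": [], "basic": [], "neutral": []}
for name, rf in RF.items():
    groups[side_chain_class(name)].append(rf)

mean_rf = {label: mean(values) for label, values in groups.items()}
for label, value in mean_rf.items():
    print(f"{label:8s} n = {len(groups[label]):2d}  mean Rf = {value:.2f}")
```

If charged side chains strengthened the interaction with ethanol, the acidic and basic groups would be expected to show the higher mean retention factors, so the class means give a quick check of that claim across all twenty amino acids.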
The investigation aimed to answer the question “To what extent is there a correlation between the magnitude of the retention factor of amino acids and the molar mass of the acid or the pH at the iso-electric point, determined using regression analysis?”
As the solvent plays a major role in chromatography, the effect of the nature of solvents on the values of the retention factor can be studied. If glycine is taken and chromatography is performed with a solvent of aqueous ethanol, the percentage of ethanol can be varied and the magnitude of retention factor can be calculated against that. Graphing will allow us to interpret how the magnitude of retention factor would depend on the percentage concentration of the ethanol in the solvent.