Enrichment and rearrangement of isotopic distribution in mass spectrometry


I am currently working on a project with some emphasis on the physics behind peptide mass spectrometry (a peptide being a fragment of a protein) and would like some help with creating a general formula for the isotopic distribution rearrangement upon isotope enrichment and fragmentation (I will explain what this means shortly). I have already got some help from this wonderful community, so I hope that you will find this problem interesting too.

As I am not a mathematician, I am likely to write things in a manner which may not be correct. In that case you are very welcome to correct me, and to point out anything else that is unclear or simply wrong.

In mass spectrometry one measures the mass (really the mass-to-charge ratio, but let's say mass for our purposes) of molecules, and their intensity. Because the elements occur naturally as several isotopes that differ in their number of neutrons, and thus in mass (see table 1), the measured signals form clusters of peaks for each molecule (in this case a peptide) that follow a beautiful symmetry according to the molecule's composition:

Figure 1: The black cluster of peaks is from one peptide, and the red from another.

In the left graph (spectrum) of the figure the black cluster of peaks is from one peptide, and the red from another. The first peak is called M+0, the second M+1 and so on, as in "mass plus 1 neutron" for M+1. The M+1 peak thus consists of peptides that contain 1 more neutron than peptides in the M+0 peak, while the M+2 peak consists of peptides that contain 2 more neutrons than peptides in the M+0 peak, and so on. The isotopes to consider can be seen in table 1 below, where the number inside the brackets is the combined number of protons and neutrons in the atom:

Table 1:

Isotope Additional neutrons
C[12]   0
H[1]    0
N[14]   0
O[16]   0
S[32]   0
C[13]   1
H[2]    1
N[15]   1
O[17]   1
S[33]   1
O[18]   2
S[34]   2

The important thing to remember is that a peptide has a specific chemical composition ($C_{34}H_{53}N_{7}O_{15}$ for the peptide with the amino acid chain "PEPTIDE", for example) but exists in a variety of combinations of stable isotopes of carbon (C), oxygen (O), nitrogen (N), hydrogen (H) and sulfur (S). The peaks in figure 1 thus consist of peptides whose combination of isotopes gives them the corresponding peak number. For example, M+2 is made up of peptides with two C[13], or one C[13] and one N[15], or one O[18], etc., while the M+0 peak only consists of peptides built from the isotopes with 0 additional neutrons in table 1. See table 2 for combinations of heavy isotopes in peptides constituting peaks M+0 to M+2:

Table 2:

Of course, there are also peaks M+3, M+4, and so on. If we consider only one element (carbon, for example) the peptide can be seen as a chain of black and white beads where black beads symbolize C[13] and white beads C[12]. The different combinations of beads constitute the peaks as follows for chains with five beads (or a peptide with five carbon atoms):

w = white (C[12])
b = black (C[13])

M+0: wwwww
M+1: bwwww, wbwww, wwbww, wwwbw, wwwwb
M+2: bbwww, bwbww, bwwbw, bwwwb, wbbww, wbwbw, wbwwb, wwbbw, wwbwb, wwwbb
M+3: bbbww, bbwbw, bbwwb, bwbbw, bwbwb, bwwbb, wbbbw, wbbwb, wbwbb, wwbbb
M+4: wbbbb, bwbbb, bbwbb, bbbwb, bbbbw
M+5: bbbbb
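
The listing above can be reproduced with a few lines of Python. This is only a minimal sketch of the bead picture; the function name `chains_by_peak` is my own and not part of any library. It also checks that the number of chains in peak M+$l$ is the binomial coefficient ${5 \choose l}$.

```python
from itertools import product
from math import comb


def chains_by_peak(length):
    """Group every white/black bead chain of the given length by its number of black beads."""
    peaks = {l: [] for l in range(length + 1)}
    for beads in product("wb", repeat=length):
        chain = "".join(beads)
        peaks[chain.count("b")].append(chain)
    return peaks


peaks = chains_by_peak(5)
for l, chains in peaks.items():
    # The number of chains in peak M+l is "5 choose l".
    assert len(chains) == comb(5, l)
    print(f"M+{l}: {', '.join(chains)}")
```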

The relative distribution between peaks is called the isotopic distribution, but the term may also refer to the peptide/chain sequences themselves (wwwww, bwwww, etc.). Abundance refers to the fraction of molecules that constitutes a peak, or the fraction of molecules with a certain sequence. The sum of all abundances is of course 1. In peptide mass spectrometry there are two steps that change the isotopic distribution:

  1. Isolation of certain peaks in the peptide cluster.
  2. Fragmentation of the peptide molecules.

For a certain peptide, before these steps we have a relative distribution between peaks M+0, M+1, M+2, M+3, etc as in figure 1.

Step 1: We isolate peak M+0 and additional peaks in a continuous window, that is, peaks in consecutive order (never M+0 and M+2 without also including M+1). We always include M+0 in the window. In this example, though, only the M+0 peak is isolated:

Figure 2:


Step 2: The peptide is fragmented, producing fragments (duh...) that are measured. Each fragment, being shorter than its parent peptide, has fewer atoms and thus fewer "positions" (atoms) that can be occupied by heavy isotopes, so the isotopic distribution changes.

Now to what you have all been waiting for: the math. Credit to Michael Seifert, who helped me out immensely with this.

For now we are only considering one element (let's say carbon). Consider one particular peptide with $N$ carbon atoms that has produced a fragment with $M$ carbon atoms, where $M<N$. After isolation we have included peaks ranging from M+0 to M+$n$; $n$ is thus the maximum number of additional neutrons, compared to the M+0 peak, among the peptide isotopes that we isolated. Now consider one of the peaks in the fragment peak cluster with $m$ additional neutrons. Each fragment has a set of possible parent peptides; let's make an example with the beads on the chain as before:

w = white (C[12])
b = black (C[13])

N = 5 (chain length)
M = 3 (fragment length)
n = 2 (highest number of black beads)

M+0 (m=0): 
Fragments:
www

Possible parent chains:
www: wwwww, wwwbw, wwwwb, wwwbb

M+1 (m=1):
Fragments:
bww, wbw, wwb

Possible parent chains:
bww: bwwww, bwwbw, bwwwb
wbw: wbwww, wbwbw, wbwwb
wwb: wwbww, wwbbw, wwbwb

Notice that we did not include for example parent chain bwwbb for fragment bww as $n=2$, which limits parents to have 2 black beads maximum. As can be seen, a parent chain/peptide can be formed by adding a sequence of $N-M$ beads/atoms of carbon to the given fragment with no more than $n-m$ black beads/heavy isotopes of carbon. The number of parents to form a given fragment is thus: $$\sum_{k = 0}^{n - m} {N - M \choose k}$$ Furthermore, though each peak consists of fragments with the same number of additional neutrons, their sequences are different (bww, wbw, wwb in M+1 in the previous example). The number of fragments with $M$ carbon atoms and $m$ carbon 13 is ${M \choose m}$. The number of parent peptide-fragment pairs is thus: $${M \choose m}\sum_{k = 0}^{n - m} {N - M \choose k}$$
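
To double-check the counting argument, here is a small brute-force sketch. It assumes, as in the bead example above, that the fragment is the first $M$ beads of the parent chain; the function names `parents_of` and `count_pairs` are made up for this illustration.

```python
from itertools import product
from math import comb


def parents_of(fragment, N, n):
    """List parent chains of length N that start with `fragment` and hold at most n black beads."""
    m = fragment.count("b")
    extra = N - len(fragment)  # number of beads appended to the fragment
    parents = []
    for tail in product("wb", repeat=extra):
        if m + tail.count("b") <= n:
            parents.append(fragment + "".join(tail))
    return parents


def count_pairs(N, M, n, m):
    """Number of (parent, fragment) pairs for fragment peak M+m, from the closed formula."""
    parents_per_fragment = sum(comb(N - M, k) for k in range(n - m + 1))
    return comb(M, m) * parents_per_fragment


N, M, n = 5, 3, 2
for fragment in ["www", "bww", "wbw", "wwb"]:
    m = fragment.count("b")
    found = parents_of(fragment, N, n)
    # Brute force agrees with the sum of binomial coefficients for a single fragment.
    assert len(found) == sum(comb(N - M, k) for k in range(n - m + 1))
    print(fragment, found)
```

For $N=5$, $M=3$, $n=2$ this gives 4 parents for www and 3 parents for each M+1 fragment, so `count_pairs(5, 3, 2, 1)` is $3 \cdot 3 = 9$.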

Considering one element (let's say carbon again) with the natural abundances $\sigma_0$ (C[12]) and $\sigma_1$ (C[13]) the abundance $I_l$ of one particular parent peptide (a specific sequence) with $l$ additional neutrons (M+$l$, or $l$ black beads in the chain example) can be calculated as: $$I_l = {\sigma_{0}}^{(N-l)}{\sigma_1}^l$$
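
As a sanity check of this formula, the sketch below evaluates $I_l$ for carbon and confirms that summing over all ${N \choose l}$ sequences of every peak gives 1. The abundance values are approximate natural abundances that I have filled in myself, and the function names are only for illustration.

```python
from math import comb, isclose

SIGMA_0 = 0.9893  # approximate natural abundance of C[12] (assumed value)
SIGMA_1 = 0.0107  # approximate natural abundance of C[13] (assumed value)


def sequence_abundance(N, l, s0=SIGMA_0, s1=SIGMA_1):
    """Abundance I_l of one specific sequence with N atoms, l of them heavy."""
    return s0 ** (N - l) * s1 ** l


def peak_abundance(N, l, s0=SIGMA_0, s1=SIGMA_1):
    """Abundance of the whole peak M+l: one I_l for each of the 'N choose l' sequences."""
    return comb(N, l) * sequence_abundance(N, l, s0, s1)


N = 34  # number of carbon atoms in the "PEPTIDE" example
total = sum(peak_abundance(N, l) for l in range(N + 1))
assert isclose(total, 1.0)  # binomial theorem: (sigma_0 + sigma_1)^N = 1
```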

For a certain element $e$ with natural abundances $\sigma_{0,e}$ and $\sigma_{1,e}$, the same formula is:

$$I_{l,e} = {\sigma_{0,e}}^{(N_e-l)}{\sigma_{1,e}}^l$$

where $N_e$ is the number of atoms of element $e$ in the peptide. I believe, though, that this only holds for elements that have a single stable heavy isotope with one additional neutron. With this in mind, we can conclude that, at least when considering only carbon, the abundance of a fragment peak M+$m$ is:

$$A = {M \choose m}\sum_{k = 0}^{n - m} {N - M \choose k} {I_{l}}$$

where $l = m + k$ is the peak number of the specific fragment's parent peptide, which is also its number of additional neutrons:

$$A = {M \choose m}\sum_{k = 0}^{n - m} {N - M \choose k} {I_{m+k}}$$
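
The single-element formula above can be evaluated directly. The following is a sketch under my own assumptions (approximate carbon abundances, a made-up function name), not a verified implementation. One consistency check it includes: summed over all fragment peaks $m$, the values of $A$ add up to the total abundance of the isolated parent peaks M+0 to M+$n$, not to 1, so they would presumably need to be divided by that sum before being compared with relative intensities.

```python
from math import comb, isclose

SIGMA_0 = 0.9893  # approximate natural abundance of C[12] (assumed value)
SIGMA_1 = 0.0107  # approximate natural abundance of C[13] (assumed value)


def fragment_peak_abundance(N, M, n, m, s0=SIGMA_0, s1=SIGMA_1):
    """A for fragment peak M+m: parent with N atoms, fragment with M atoms, isolation up to M+n."""
    total = 0.0
    for k in range(n - m + 1):  # k heavy atoms in the part of the parent that was lost
        l = m + k               # heavy atoms in the whole parent
        total += comb(N - M, k) * s0 ** (N - l) * s1 ** l
    return comb(M, m) * total


N, M, n = 5, 3, 2
A = [fragment_peak_abundance(N, M, n, m) for m in range(M + 1)]

# Total abundance of the isolated parent peaks M+0 .. M+n.
isolated = sum(comb(N, l) * SIGMA_0 ** (N - l) * SIGMA_1 ** l for l in range(n + 1))
assert isclose(sum(A), isolated)
```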

As heavy isotopes of a specific element can only occupy "slots" in the peptide that are atoms of that particular element, the total peak abundance is a product of the abundances calculated for each element separately: $$A_{tot} = \prod_{e} A_e = A_CA_OA_HA_NA_S$$

The abundance considering all elements in the peptide composition for isotopes of one additional neutron is thus:

$$A_{tot} = \prod_{e} \left[ {M_e \choose m}\sum_{k = 0}^{n - m} {N_e - M_e \choose k} {I_{m+k,e}} \right]$$

where

$N_e$ is the number of atoms of element $e$ in the peptide,

$M_e$ is the number of atoms of element $e$ in the fragment,

$n$ is the maximum number of additional neutrons in the isolated peptide cluster peaks,

$m$ is the number of additional neutrons compared to the M+0 peak in the fragment cluster,

$I_{m+k,e}$ is the abundance of one specific parent peptide sequence (with $m+k$ additional neutrons) that generated the fragment.


First of all, does this seem right to you? And how can I correct the formula so that it is also valid for isotopes with 2 additional neutrons (O[18] and S[34])?

Please don't be afraid to criticize or ask questions. I know this got a bit technical so if you want me to formulate the problem in a more direct way, let me know and I will do so.

Thanks!

Max