Fitting the OPLS forcefield dihedral to ab initio data

52 Views Asked by At

Background

I am trying to fit a set of energy vs. dihedral angle data obtained from the quantum mechanical software Gaussian. The data was obtained by rotating a certain group of atoms 10 degrees at a time relative to another certain group of atoms and calculating the energy at each rotation. The result is two peaks (corresponding to the symmetric, highest energy conformations) separated by a minimum (corresponding to the lowest energy conformation).

Data

The data are given below:

Angle Energy
-45.29984 -0.50819
-35.29987 -0.51017
-25.29983 -0.51172
-15.29985 -0.51276
-5.29986 -0.5133
4.70011 -0.51335
14.7001 -0.51291
24.70008 -0.51196
34.70007 -0.51049
44.69999 -0.50855
54.70017 -0.50628
64.70004 -0.50395
74.70002 -0.50194
84.70007 -0.50069
94.70016 -0.50054
104.70017 -0.50151
114.70015 -0.50332
124.70016 -0.50558
134.70014 -0.50789
144.70018 -0.50995
154.70013 -0.51159
164.70015 -0.51271
174.70014 -0.51329
-175.29989 -0.51334
-165.29991 -0.51286
-155.29992 -0.51184
-145.29994 -0.51028
-135.29997 -0.50826
-125.29997 -0.50596
-115.29995 -0.50365
-105.29993 -0.50173
-95.29982 -0.50061
-85.29985 -0.5006
-75.29985 -0.50171
-65.29985 -0.50362
-55.29986 -0.50591
-45.29987 -0.50819

Note: By symmetry, in this case, I could just as easily take only the positive angles.

Fitting

The molecular modeling software LAMMPS uses the following expression to model dihedral energies for the OPLS forcefield (Optimized Potentials for Liquid Simulations; see Watkins and Jorgensen, J Phys Chem A, 105, 4118-4125, 2001) (https://docs.lammps.org/dihedral_opls.html):

$$E = \frac 12 K_1[1+cos(\phi)] +\frac 12 K_2[1-cos(2\phi)] + \frac 12 K_3[1+cos(3\phi)] + \frac 12 K_4[1-cos(4\phi)]$$

The expression requires the 4 K-values to be identified. Some may be zero.

I have tried using the Matlab curve fitting toolbox, but the straightforward approach of using sum of sines has not worked.

What is the best way to fit this data to this expression in order to identify the K-values?

Please note: This is my first time posting on math stackexchange. If this question is not considered sufficiently math-related, just let me know.

1

There are 1 best solutions below

4
On BEST ANSWER

This problem is just a multivariable linear regression.

Being myself a quantum mechanicist, I can tell you that, in the model, the term $K_0$ is missing. It should be $$E =K_0+ \frac 12 K_1[1+\cos(\phi)] +\frac 12 K_2[1-\cos(2\phi)] + \frac 12 K_3[1+\cos(3\phi)] + \frac 12 K_4[1-\cos(4\phi)]\tag 1$$ If you look at my papers in the mid $70$'s, you will see that I proposed and used $$E=\sum_{n=0}^p a_n\,\cos(n \phi)$$ $p$ being determined by stepwise regression procedures.

What I suspect is that, as written, the model is for $\Delta E$ and not for $E$.

Anyway, using your data and $(1)$ we have $$\begin{array}{clclclclc} \text{} & \text{Estimate} & \text{Standard Error} & \text{Confidence Interval} \\ K_0 & -0.513347 & 0.000051 & \{-0.513451,-0.513243\} \\ K_1 & -0.000198 & 0.000048 & \{-0.000295,-0.000101\} \\ K_2 & +0.012812 & 0.000048 & \{+0.012714,+0.012910\} \\ K_3 & +0.000203 & 0.000048 & \{+0.000106,+0.000300\} \\ K_4 & -0.001285 & 0.000047 & \{-0.001381,-0.001190\} \\ \end{array}$$ all coefficients being highly significant.

The fit is almost perfect : maximum absolute error $0.0001715$ and mean absolute error $0.000083$.

Edit

If I may suggest, you can improve your model from a physical point of view. After this analysis (which must be done), you find a maximum at $\phi=90.6443^{\circ}$; mother nature does not like such kind of numbers except in some sterically hindered molecules.

So for your case, there two parameters which must be non fitted : $K_0$ which is $E$ for $\phi=0^{\circ}$ and $K_3=\frac 2 3 K_1$. This means three parameters to be adjusted instead of five.

Ignoring the value of $K_0$ ($\phi=0^{\circ}$ is absent from the table), I refitted the model with $K_3=\frac 2 3 K_1$. This gives $$\begin{array}{clclclclc} \text{} & \text{Estimate} & \text{Standard Error} & \text{Confidence Interval} \\ K_0 & -0.513303 & 0.0000734 & \{-0.513453,-0.513152\} \\ K_1 & -0.000045 & 0.0000585 & \{-0.000165,+0.000074\} \\ K_2 & +0.012812 & 0.0000704 & \{+0.012669,+0.012956\} \\ K_3 & -0.000030 & & \\ K_4 & -0.001300 & 0.0000685 & \{-0.001439,-0.001160\} \\ \end{array}$$ which are quite different from the previous ones for almost the same quality of fit.