Monomial Symmetrization for Potential Energy Surface (PES) Fitting - MSA
The Software
MSA performs a linear least-squares fit, optionally weighted, of electronic energies and gradients using fitting bases that are invariant with respect to permutations of like atoms. These bases are called permutationally invariant polynomials (PIPs). The software also provides the gradient of the fitted potential. Energies are supplied in a standard format at nuclear configurations given in Cartesian coordinates. Typical data sets contain roughly 10 000 to 100 000 energies, but there is no restriction on the size of the data set. As mentioned in the video, the energies can be weighted and the Morse range parameter can be changed. Default values are in the file param.inp; they can be used as is or changed in the interactive script shown in the video below. We suggest that users experiment with both parameters to optimize the precision of the fit.
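The idea behind a permutationally invariant basis can be sketched in a few lines of Python. This is a minimal illustration, not the MSA implementation: it assumes Morse variables of the form exp(-r/a), where a is the Morse range parameter, and it symmetrizes a single monomial by summing it over all permutations of a set of like atoms. The function names, the three-atom geometry, and the exponents are hypothetical, and the sketch assumes Python 3.8 or later.

```python
import itertools
import math


def morse_variables(coords, a=2.0):
    """Return {(i, j): exp(-r_ij / a)} for every atom pair i < j.

    coords and the range parameter a must use the same length unit.
    """
    y = {}
    for i, j in itertools.combinations(range(len(coords)), 2):
        r = math.dist(coords[i], coords[j])
        y[(i, j)] = math.exp(-r / a)
    return y


def symmetrized_monomial(coords, exponents, like_atoms, a=2.0):
    """Sum one monomial in Morse variables over all permutations of like_atoms.

    exponents maps an atom pair (i, j) with i < j to its integer power.
    """
    y = morse_variables(coords, a)
    total = 0.0
    for perm in itertools.permutations(like_atoms):
        relabel = dict(zip(like_atoms, perm))
        term = 1.0
        for (i, j), power in exponents.items():
            pi, pj = relabel.get(i, i), relabel.get(j, j)
            term *= y[(min(pi, pj), max(pi, pj))] ** power
        total += term
    return total


# Hypothetical H, H, O geometry in bohr (illustrative values only).
geometry = [(1.43, 1.11, 0.0), (-1.60, 1.00, 0.2), (0.0, 0.0, 0.0)]
swapped = [geometry[1], geometry[0], geometry[2]]  # exchange the two H atoms

monomial = {(0, 2): 2, (0, 1): 1}  # y_02^2 * y_01
p_original = symmetrized_monomial(geometry, monomial, like_atoms=[0, 1])
p_swapped = symmetrized_monomial(swapped, monomial, like_atoms=[0, 1])
print(abs(p_original - p_swapped) < 1e-12)
```

Exchanging the two H atoms leaves the symmetrized value unchanged, whereas the bare monomial generally changes. A fit that is linear in such symmetrized terms therefore gives the same energy for any relabeling of like atoms.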
A data set for CH4 is provided in the ‘MSA.zip’ file available for download on GitHub. A sample configuration is shown below. Each configuration gives the number of atoms, the energy in hartree, and then one line per atom with the atom label, the Cartesian coordinates in angstroms, and the Cartesian components of the gradient in hartree/bohr (left blank if gradients are not provided). Based on the order of the atoms, the symmetry label is 4 1, which indicates that the full permutation group of the 4 H atoms will be used. A reduced symmetry, for example 2 2 1, could also be used, but that is not done in this case.
5
-40.48132472
H -0.10095840 -0.41955010 -1.31205540 0.00475800 -0.00753200 0.00876200
H -0.33382290 -1.64227710 -0.08089010 0.00501000 -0.01045300 -0.00392400
H 0.27898620 0.01183520 0.37889600 0.00242100 0.01037400 0.00814600
H -1.41320490 -0.14046190 -0.03074290 -0.00415000 0.00917100 0.01147400
C -0.41724430 -0.55439810 -0.28616510 -0.00804000 -0.00155500 -0.02444600
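For readers who want to inspect such a data set before fitting, the layout described above can be read with a short Python function. This is a sketch under stated assumptions, not part of MSA: it assumes configurations are simply concatenated in one file, that blank lines carry no meaning, and that a missing gradient means the atom line has only four fields. The function name and the returned dictionary layout are hypothetical.

```python
SAMPLE = """\
5
-40.48132472
H -0.10095840 -0.41955010 -1.31205540 0.00475800 -0.00753200 0.00876200
H -0.33382290 -1.64227710 -0.08089010 0.00501000 -0.01045300 -0.00392400
H 0.27898620 0.01183520 0.37889600 0.00242100 0.01037400 0.00814600
H -1.41320490 -0.14046190 -0.03074290 -0.00415000 0.00917100 0.01147400
C -0.41724430 -0.55439810 -0.28616510 -0.00804000 -0.00155500 -0.02444600
"""


def parse_configurations(text):
    """Parse blocks of: atom count, energy, then one line per atom."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    configs = []
    pos = 0
    while pos < len(lines):
        natoms = int(lines[pos])
        energy = float(lines[pos + 1])  # hartree
        atoms = []
        for ln in lines[pos + 2 : pos + 2 + natoms]:
            fields = ln.split()
            values = [float(v) for v in fields[1:]]
            coords = tuple(values[:3])  # angstroms
            # Gradient columns (hartree/bohr) are optional.
            grad = tuple(values[3:6]) if len(values) >= 6 else None
            atoms.append((fields[0], coords, grad))
        configs.append({"energy": energy, "atoms": atoms})
        pos += 2 + natoms
    return configs


configs = parse_configurations(SAMPLE)
labels = [label for label, _, _ in configs[0]["atoms"]]
print(len(configs), configs[0]["energy"], labels)
```

The order of the labels (four H atoms followed by C) is what the symmetry label 4 1 refers to.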
A Short Video on Creating a PIP Basis and Fitting
In the video we take you through the process of using the MSA software, with an example fit of the H3O2– potential (ref. 4). The driver is “msa.py”. The wall clock time is about 2 minutes to generate the basis and several minutes to do the fit on a single 2018 Intel® Core™ i7-8750H processor. Note that the default value of the Morse range parameter used in the fit in the video is 2 bohr. A value of 3 bohr was used in ref. 4, and this gave smaller root-mean-square fitting errors.
What is Needed in Order to Run the Codes
- Fortran 90 compiler. We used the Intel® Fortran Compiler (“ifort 15.0.0 20140723”) in the example. gfortran is also included as an option in the makefile.
- The “dgelss” subroutine from LAPACK, which is embedded in Intel® Math Kernel Library (Intel® MKL). Freely available.
- C++ compiler. We used the GNU Compiler Collection on our Linux cluster (“GCC 4.4.7 20120313 (Red Hat 4.4.7-16)”) in the example. Freely available.
- Perl. We used Perl v5.10.1 (*) built for x86_64-linux-thread-multi. Freely available.
- Python. We used Python 2.6.6.
- Users have to provide the data set of electronic energies.
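A quick way to see which of these tools are already on the PATH is a small standalone Python check. This is only a sketch: the executable names below are assumptions that may differ on your system, the script is not part of MSA, and it does not verify that LAPACK or MKL is installed.

```python
import shutil

# Executable names are assumptions; adjust them to match your system.
REQUIRED_TOOLS = {
    "Fortran 90 compiler": ["ifort", "gfortran"],
    "C++ compiler": ["g++"],
    "Perl": ["perl"],
    "Python": ["python", "python3"],
}


def find_tools(required=REQUIRED_TOOLS):
    """Map each requirement to the first matching executable path, or None."""
    found = {}
    for name, candidates in required.items():
        paths = (shutil.which(exe) for exe in candidates)
        found[name] = next((p for p in paths if p), None)
    return found


for name, path in find_tools().items():
    print(f"{name}: {path or 'not found on PATH'}")
```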
References About the Software
1. Xie, Z., Bowman, J.M. Permutationally Invariant Polynomial Basis for Molecular Energy Surface Fitting via Monomial Symmetrization. J. Chem. Theory Comput. 2010, 6, 26-34. Please cite this as the primary reference to the MSA software.
2. PL Houston, et al., Permutationally invariant polynomial regression for energies and gradients, using reverse differentiation, achieves orders of magnitude speed-up with high precision compared to other machine learning methods, J. Chem. Phys. 156, 044120 (2022).
3. PL Houston, C Qu, Q Yu, R Conte, A Nandi, JK Li, JM Bowman, PESPIP: Software to fit complex molecular and many-body potential energy surfaces with permutationally invariant polynomials, J. Chem. Phys. 158, 044109 (2023).
4. P Pandey, M Arandhara, PL Houston, C Qu, JM Bowman, SG Ramesh, Assessments of PIP and sGDML Potential Energy Surfaces for Hydrated Hydroxide H3O2–, to be published.
Contact Information
Joel Bowman: jmbowma at emory.edu
Funding
Funding from the National Science Foundation, Army Research Office and NASA is acknowledged.