MOKIT Doc
for MOKIT version 1.2.7
MOKIT developers
Feb 9, 2025
Contents
- Introduction and how-to-cite
- Installation
- Quick Start
- User's Guide
- Syntax rule, Memory and parallel settings
- Methods and keywords in
automr
- Utilities
- APIs
- Methods and keywords in
autosr
- Examples
- Developer's Guide
- Papers citing MOKIT
- Appendix:
- FAQs, limitations and bug report
Hot Topics
How to perform GKS-EDA/SAPT How to transfer MOs
News
Release v1.2.6 1000 downloads on anaconda
Tips
You can use the 🔍 tool in the left upper corner of this page to search the whole manual, or simply press s
on your keyboard.
Introduction
1.1 Introduction to MOKIT
The full name of MOKIT is Molecular Orbital KIT. MOKIT offers various utilities
and modules to transfer MOs among various quantum chemistry software packages.
Besides, the automr
program in MOKIT can set up and run common multi-reference
calculations in a black-box way.
A list of important utilities along with their functions are shown below

With MOKIT, one can perform multi-reference calculations in a quite simple way, and utilize the best modules of each program, e.g.
UHF(UNO) -> CASSCF -> CASPT2
Gaussian PySCF OpenMolcas
or
UHF(UNO) -> GVB -> CASSCF -> NEVPT2
Gaussian GAMESS PySCF PySCF
or
RHF -> GVB -> CASSCF -> ic-MRCISD+Q
Gaussian GAMESS PySCF OpenMolcas
Negligible energy loss (usually < 1e-6 a.u., for the same wave function method in two programs) are ensured during transferring MOs, since the basis order of angular momentum up to H(i.e. l=5) is explicitly considered.
Note that although MOKIT aims to make the multi-reference calculations black-box, the users are still required to have practical experiences of quantum chemistry computations (e.g. familiar with routine DFT calculations in Gaussian). You are encouraged to learn how to use Gaussian first if you are a fresh hand. See Appendix A2 for more information.
1.2 How to Cite
If you use any module or utility of MOKIT in your work, you should cite MOKIT in the main body of your paper. Three examples are shown below:
E1. The HF/DFT calculations are performed using Gaussian 16[1] software package, and the MC-PDFT calculation is performed by OpenMolcas[2]. The utility fch2inporb in MOKIT[3] is used to transfer molecular orbitals from Gaussian to OpenMolcas.
E2. The GKS-EDA computations are performed by the XEDA[1] module interfaced with the GAMESS[2] program. For better SCF convergence and accelerating the EDA computation, converged molecular orbitals of each monomer and the complex are taken and converted from Gaussian[3] DFT calculations. This procedure is automatically conducted by the MOKIT[4] package.
E3. The GVB, CASSCF and CASPT2-K computations are performed using the GAMESS[1], PySCF[2] and ORCA[3] packages, respectively. All the multi-configurational computations are performed in a black-box way with the help of MOKIT[4].
Currently we have not published the paper of MOKIT program. As for the content of the reference, you should cite like
[4] Jingxiang Zou, MOKIT program, https://gitlab.com/jxzou/mokit (accessed Apr 13, 2024).
where the "Apr 13" should be modified as the version you used. The version number can be found by running automr --version
, or simply automr -v
. Note that citing MOKIT only in the supporting information of a paper is insufficient, since such citation can hardly be gathered by search engine or database. Citation files of MOKIT can be found in the $MOKIT_ROOT/doc/
directory, you can import them into EndNote. See published papers which cite MOKIT.
If any of the keyword ist=0,1,3
(see Section 4.4.4) in program automr
is used, citing the following two papers would be appreciated
[2] J. Chem. Theory Comput. 2019, 15, 141–153. DOI: 10.1021/acs.jctc.8b00854
[3] J. Phys. Chem. A 2020, 124, 8321–8329. DOI: 10.1021/acs.jpca.0c05216.
Besides, you need to CITE all quantum chemistry software packages called in your automr
job(s). For example, the file examples/automr/00-h2o_cc-pVDZ_1.5.gjf
is a CASSCF job. 3 software packages will be called:
(1) Gaussian will be called to perform RHF/UHF;
(2) GAMESS will be called to perform GVB;
(3) PySCF will be called to perform CASSCF.
Therefore, in this case you should also cite corresponding references of Gaussian, GAMESS and PySCF, as well as possible references of electronic-structure methods and basis sets.
You can use MOKIT to perform computations for other people. But remember to remind him/her that MOKIT should be properly cited in the mainbody of the paper. And of course, all quantum chemistry software packages used in the calculations should be properly cited (您可以使用MOKIT为他人做计算(包括代算),但务必提醒他/她在发表文章时在正文中恰当引用MOKIT。同时,计算中所用到的量子化学软件也应当在发表的文章中恰当地引用之).
Installation
The installation of MOKIT does not require any root privilege. So there is no need to install MOKIT using the root account under Linux.
- 2.1 Windows Pre-built
- 2.2 Linux Pre-built & MacOS brew build
- 2.3 Linux: Build from Source
- 2.4 Installation on Cluster
- 2.5 Dependency on QC Programs
2.1 Windows Pre-built
If you only want to use some utilities of MOKIT (e.g. transfer MOs among various programs, generate input files of various programs), the most convenient way is probably to use the pre-built (or pre-compiled) Windows executables. Note that the version of pre-built Windows executables is usually older than that of Linux pre-built executables, so you are recommended to download the latter one.
Only 20+ out of all utilities in MOKIT are provided. These executables are compressed into .7z file and can be downloaded at Releases. Download, uncompress it, and set the environment variables, then these utilities can used in any directory.
If you are only interested in the usage of the utility frag_guess_wfn
under Windows, please read here for a quick guide.
To set the environment variables, search 'environment' ("环境变量" in Chinese) in the Windows search bar, then press Enter
and click Edit
to edit the PATH
variables. Create a new path and type the path of MOKIT binary directory. For example, on my computer the path is D:\360Downloads\mokit\bin
.

Press the combination keys Win+R
on the keyboard, type cmd
to prompt CMD, and change into the directory where your .fch(k) file holds, simply run like

2.2 Linux Pre-built and MacOS brew Build
This section goes over several installing approaches that, unlike Section 2.3, does not require building from source manually.
-
Install from conda (for Linux only). Binary package is provided in this approach.
-
Use homebrew build (for MacOS only). Although there's no binaries for MacOS yet, this approach will build it automatically.
-
Manually download pre-built Linux binaries (it's not a conda package).
Unlike conda, this approach will not take care of dependency versions (like numpy) for you, and you need to set environment variables manually. So this approach is usually not recomended.
However, this approach does not require network, which may be a reason for choosing it.
-
Even simpler way to use pre-built Linux binaries.
Setups after installation
Like Section 2.3, there is still something to do after "installation".
- Setup the environment variables of MOKIT itself. Except conda users, all other users need to set that, which is mentioned at the end of each approaches.
- Setup the environment variables of dependencies, like Gaussian, GAMESS, PySCF, etc. Please read Section 2.5 to determine which dependencies are necessary for you and read Section 2.3.4 to set up them.
2.2.1 Online Installation
You can choose option 1 or 2 below. After mokit is successfully installed, if you want GAMESS to be called by automr
, you need to install GAMESS properly and write related environment variables.
Option 1: Install from conda (for Linux only)
This is the easiest way, but network is required to auto-download the requirements (like Intel MKL).
If you have no access to network, but still do not want to compile MOKIT manually, you can try options in Section 2.2.2.
use MOKIT with default channel
Creating a new environment before installing is highly recommended, to avoid changing your base environment.
conda create -n mokit-py39 python=3.9 # 3.8-3.11 are available
conda activate mokit-py39
conda install mokit -c mokit
Each time when you login onto the machine, you need to activate the virtual environment by conda activate mokit-py39
and then you can use MOKIT.
If you have not installed PySCF and you want to install it now, you can choose to
(1) run pip install pyscf
in this virtual environment.
(2) follow the instruction below to install pyscf and MOKIT with conda-forge channel.
Update MOKIT with conda
Usually conda update mokit -c mokit
works. Sometimes it may fail to find the latest version of MOKIT. In this case, a workaround can be
conda uninstall mokit mkl
conda install mokit -c mokit
use MOKIT with conda-forge channel
The MOKIT installed following instructions above, which is from mokit
channel by default, is hardly compatible with conda-forge environment. If you have to enable conda-forge, an alternative way is to install from mokit/label/cf
If you have enabled conda-forge (by conda config --add channels conda-forge
or modifying condarc), you can directly do
conda install mokit -c mokit/label/cf
If not, it's recommended to create an environment first before installing.
conda create -n mokit-cf python=3.9 -c conda-forge # 3.9-3.11 are available
conda activate mokit-cf
conda install mokit -c mokit/label/cf -c conda-forge
When pyscf is also needed, it can be installed in the conda-forge channel at the same time
conda install mokit pyscf -c mokit/label/cf -c conda-forge
or separately
conda install pyscf -c conda-forge
conda install mokit -c mokit/label/cf -c conda-forge
Option 2: Use homebrew-toolchains (for MacOS only)
- Prerequisites:
- You need to install homebrew on your mac
If you are mainland China user, follow brew mirrors help doc to install prerequisites
- A detailed brew-tap install guideline is located in homebrew-mokit github repo
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Release Installation
Assume you will use mokit in a python 3.9 environment (3.8-3.11 are all available)
brew install ansatzx/homebrew-mokit/mokit --with-py39
or
brew tap ansatzx/homebrew-mokit
brew install mokit --with-py39
However, the release version is not up-to-date, it's highly recommended to install the latest commit.
Latest Commit Installation
Also assume you use mokit in a py-39 environment.
brew install ansatzx/homebrew-mokit/mokit --with-py39 --HEAD
Finally, follow caveats guides, add these commmand in your ~/.zshrc
(or bash/fish etc. profile).
You can run brew info mokit
to check details.
export MOKIT_ROOT="$(brew --prefix mokit)"
export PATH=$MOKIT_ROOT/bin:$PATH
export PYTHONPATH=$MOKIT_ROOT:$PYTHONPATH
export LD_LIBRARY_PATH=$MOKIT_ROOT/mokit/lib:$LD_LIBRARY_PATH
Upgrade Mokit
release upgrade
brew update -f
Run the command below if there's a newer mokit release.
brew upgrade mokit # or `brew upgrade ` to upgrade everything
latest commit upgrade
brew upgrade --fetch-HEAD mokit
2.2.2 Pre-built Linux Executables and Libraries
Unlike the conda install approach, using pre-built MOKIT in this subsection do not require network. If you want full functionality of MOKIT, you still need to have necessary dependencies: Python3 environment and NumPy, which can be achieved by anaconda/miniconda (Read here for installing anaconda offline). If you only want part of MOKIT, especially some certain binary utilities, see Section 2.2.3 for a simpler instruction.
Download
centos7_conda_py38
centos7_conda_py39
py39_gcc10
py310_gcc10
After downloading the pre-built artifacts, you need to set the following environment
variables (assuming MOKIT is put in $HOME/software/mokit
) in your ~/.bashrc
:
export MOKIT_ROOT=$HOME/software/mokit
export PATH=$MOKIT_ROOT/bin:$PATH
export PYTHONPATH=$MOKIT_ROOT:$PYTHONPATH
export LD_LIBRARY_PATH=$MOKIT_ROOT/mokit/lib:$LD_LIBRARY_PATH
export GMS=$HOME/software/gamess/rungms
The LD_LIBRARY_PATH
is needed since the OpenBLAS dynamic library is put there.
Remember to modify the GMS
path to suit your local environment.
Note that you need to run source ~/.bashrc
or exit the terminal as well as
re-login, in order to activate newly written environment variables.
How to choose Anaconda Version if installing offline
Installing prebuilt MOKIT offline usually means the user cannot get NumPy with network. So it's important to get a certain version of anaconda which provides proper version of NumPy needed by MOKIT. The recommended versions of anaconda for some prebuilts are listed below (compatible miniconda version is also listed but it does not mean miniconda comes with NumPy). See Anaconda release note for more information.
If your linux kernel is roughly as old as Centos7's, choose the one started with centos7_
.
Artifacts | Python version | Compatible Anaconda version | Compatible Miniconda version | NumPy version |
---|---|---|---|---|
centos7_conda_py38 | 3.8 | 2021.05 | py38_4.10.3 | 1.20 |
centos7_conda_py39 | 3.9 | 2022.10 | py39_22.11.1 | 1.21 |
py39_gcc10 | 3.9 | 2022.10 | py39_22.11.1 | 1.21 |
py310_gcc10 | 3.10 | 2023.03 | py310_23.3.1 | 1.23 |
How to download certain version of anaconda?
For example, if you are using TUNA Mirror, you can go to mirrors.tuna.tsinghua.edu.cn/anaconda/archive/ and find
Anaconda3-[some version]-Linux-x86_64.sh
. The page address is similar in other mirror site, like NJU Mirror.
Detailed Compatibility Note
Artifacts | Compatible OS | Maybe Compatible | Python version | GCC version | NumPy version |
---|---|---|---|---|---|
centos7_conda_py38 | Centos 7 | 3.8 | 4.8.5 | 1.20 | |
centos7_conda_py39 | Centos 7 | 3.9 | 4.8.5 | 1.21 | |
py39_gcc10 | Debian 11, Ubuntu 20.04 | SUSE 15 | 3.9 | 10.2 | 1.21 |
py310_gcc10 | Debian 11 | Ubuntu 22.04 | 3.10 | 10.2 | 1.23 |
Tips:
- Do not extract the zip with right-click (like the one in KDE)! Use
unzip
in command line. - The artifacts started with 'centos7_conda' need to be used with Anaconda3/Miniconda3, and the rest works with system-provided python (conda is also OK).
- We cannot list every supported linux distribution here, especially those similar to listed ones: Rocky Linux, OpenEuler, etc. More compatibility tests and reports are welcome.
- The GCC and NumPy version listed refer to the version used to compile the artifacts.
- NumPy can be sensitive to version sometimes. Try upgrade (or downgrade) numpy if your python complained about version when
import
. - The NumPy version for each prebuilt is fixed, because offline anaconda users cannot update anything and have to follow the anaconda version recommendation above. We may switch to newer miniconda image when the current ones are considered too old. The NumPy version for
py*_gcc*
prebuilts used to be latest but after 1.2.6rc26 they are also fixed.
- NumPy can be sensitive to version sometimes. Try upgrade (or downgrade) numpy if your python complained about version when
2.2.3 Only want frag_guess_wfn
?
If you do not need full functionality of MOKIT and only want frag_guess_wfn
for generating the input file of various EDA methods (or other binary utilities, like fch2mkl
), the easiest way is to download the pre-compiled MOKIT in Section 2.2.2. There is no need to install Miniconda/Anaconda Python in this case, and no need for conda install
.
Firstly, download a pre-compiled MOKIT package according to your OS (e.g. CentOS 7), and change the directory name
unzip mokit-master_linux_centos7_conda_py39.zip
rm -f mokit-master_linux_centos7_conda_py39.zip
mv mokit-master_linux_centos7_conda_py39 mokit
Then write proper environment variables in your ~/.bashrc
file,
export MOKIT_ROOT=$HOME/software/mokit
export PATH=$MOKIT_ROOT/bin:$PATH
export LD_LIBRARY_PATH=$MOKIT_ROOT/mokit/lib:$LD_LIBRARY_PATH
Remember to run source ~/.bashrc
to make the environment variables valid. If you are working on a Cluster(集群) and using a script to submit any computational task, you should write the environment variables into your script.
2.3 Linux: Build from Source
The latest version of MOKIT source code can be downloaded via mokit-master.zip or mokit-master.tar.gz.
2.3.1 Prerequisite
- Fortran compiler:
ifort
(>=2017) orgfortran
(>=4.8.5) - Intel MKL(recommended) or OpenBLAS
f2py
: Anaconda Python3(recommended) or Miniconda + Numpy
It is recommended to install the Intel compiler and Anaconda Python3 on your computer/node. Although these two packages may be large, they meet all prerequisites of compiling MOKIT. Note that the Intel compiler is free of charge for academic use. You can download Anaconda Python3 from the NJU mirror website, e.g the package Anaconda3-2024.02-1-Linux-x86_64.sh
.
Currently python >= 3.12 is not supported due to the new f2py backend has not been fully supported yet. Please use python <= 3.11 (create such an environment with network, or download an old Anaconda3 (2024.02 and earlier)).
If you do not have the ifort
compiler or Intel MKL on your computer, you may want to download and install them. There are several versions recommended, you can choose any one of them:
(1) Intel Parallel Studio XE 2017~2020 are all OK (2019 or 2020 preferred).
(2) Since 2021, there is no Parallel Studio XE, but only the Intel OneAPI. You should download and install both HPC Toolkit and MKL if you want to use 2021 version.
If you want to use other compilers, please read Section 2.3.5.
2.3.2 Using make to compile
Assuming you have already downloaded the MOKIT .zip package, just run
unzip mokit-master.zip
to uncompress it. Note that DO NOT unzip the package in Windows OS, but unzip it in Linux OS. Then enter the source code directory
mv mokit-master mokit
cd mokit/src
and run
make all
to compile MOKIT. This will take about 2 minutes. There is no make install
step.
If you want to use gfortran+OpenBLAS instead of ifort+MKL, you should run
make all -f Makefile.gnu_openblas
to compile MOKIT.
If you do not need automr
for automatic multireference calculations and only want to compile one or several modules, e.g. fch2inp
(for Gaussian -> GAMESS orbital transferring), then you simply need to run
make fch2inp
Be careful with hints on the screen, some modules depend on other modules, thus a compilation of two or three modules is necessary sometimes.
2.3.3 Using fpm to compile
If you prefer to use fpm
rather than make
, you are supposed to run
unzip mokit-master.zip
mv mokit-master mokit
cd mokit/
fpm build --compiler=ifort --flag="-O2 -fpp -fPIC -qopenmp"
fpm install --compiler=ifort --flag="-O2 -fpp -fPIC -qopenmp" --prefix .
This will use ifort+MKL to compile all binary executables of MOKIT, i.e. no Python dynamic libraries will be compiled by fpm
. In such case you can use MOKIT utilities to transfer MOs among quantum chemistry packages. If you also want Python dynamic libraries, you should enter mokit/src/
and run make pymodules
.
If you prefer gfortran+MKL, you are supposed to run
fpm build --flag="-O2 -cpp -fPIC -fopenmp"
fpm install --flag="-O2 -cpp -fPIC -fopenmp" --prefix .
If you prefer gfortran+OpenBLAS, you are supposed to run
mv fpm.toml tpm_mkl.toml
mv fpm_openblas.toml fpm.toml
fpm build --flag="-O2 -cpp -fPIC -fopenmp" --link-flag="-L$HOME/software/openblas-0.3.26/lib64 -lopenblas"
fpm install --flag="-O2 -cpp -fPIC -fopenmp" --link-flag="-L$HOME/software/openblas-0.3.26/lib64 -lopenblas" --prefix .
Here the OpenBLAS is assumed to be already installed in $HOME/software/openblas-0.3.26
.
One can use fpm to compile the latest MOKIT source code under Windows OS, e.g. in msys2.
2.3.4 Environment variables
After successful compilation, you need to add the following environment variables into your ~/.bashrc
file:
export MOKIT_ROOT=$HOME/software/mokit
export PATH=$MOKIT_ROOT/bin:$PATH
export PYTHONPATH=$MOKIT_ROOT:$PYTHONPATH
export GMS=$HOME/software/gamess/rungms # optional
export PSI4=$HOME/psi4conda/bin/psi4 # optional
export BDF=$HOME/software/bdf-pkg/sbin/run.sh # optional
Please modify the above paths to suit your situations. Since PySCF is run by python
, OpenMolcas is run by pymolcas
, Molpro is run by molpro
, PSI4 is run by psi4
and Dalton is run by dalton
, there is no extra environment variable to be exported here. But if you define export PSI4=...
, then this variable has priority to the path found by which psi4
.
For Dalton, you can install either MKL or MPI version since both are supported, and it would be automatically detected by MOKIT. The environment variable BDF
are optional, there is no need to write it if you do not want to use BDF.
The configuration file 'program.info' is no longer used since MOKIT 1.2.1. After writing environment variables, a logout and re-login on your terminal is strongly recommended.
Setup GAMESS
Note that the original GAMESS source code can only deal with GVB up to 12 pairs. To go beyond that (which is routine type of calculation in automr
of MOKIT), please read GVB_prog carefully. After re-compiling GAMESS (followed by instructions in Section 4.4.10), an executable gamess.01.x
will be generated. This is the executable to be called by automr
.
Setup scratch directory
The scratch directory (in Chinese, 临时文件存放目录) is not determined by MOKIT or automr
, but by each quantum chemistry package (Gaussian, OpenMolcas, GAMESS, etc). Please do not submit two computations with the same input filename (even in different work directories) at the same time, since their temporary files may be overwritten by each other. For example, GAMESS, OpenMolcas and Molpro jobs are sensitive to the files in the scratch directory, while Gaussian, PySCF and ORCA are not that sensitive.
2.3.5 Test
To see the version and help information, you can run
automr -v (or -V, --version)
automr -h (or -help, --help)
To test whether the automr program works, please change into the example directory, and pick up a .gjf file. For example,
cd $MOKIT_ROOT/examples/automr/
automr 00-h2o_cc-pVDZ_1.5.gjf >00-h2o_cc-pVDZ_1.5.out 2>&1 &
If you encounter any errors in output, please search your problem in Section A1 FAQ in Appendix firstly. Some common questions have been listed. For bug reporting, please read Section A3.
It is strongly recommended to create a new directory and put in the input file. Because during computation many files will be generated by automr
and those common software packages.
2.3.6 Compiler notes
If you want to use any Fortran compiler other than ifort
(e.g. gfortran
), you need to install the MKL library or OpenBLAS on your node. The recommended version is gfortran>=4.8.5
. Note that gfortran<=4.7
is outdated and thus not recommended. Even if you can successfully compile all utilities using gfortran<=4.7
, they may not work normally or correctly.
Option 1: gfortran + Intel MKL
make all -f Makefile.gnu_mkl
Option 2: gfortran + OpenBLAS
make all -f Makefile.gnu_openblas
Option 3: gfortran + OpenBLAS for Arch/Manjaro distributions
make all -f Makefile.gnu_openblas_arch
If you had compiled MOKIT before, and assuming now you want to use another compiler, REMEMBER to run make distclean
before re-compiling. Note that if you change the version of Python on your node/machine, the dynamic library files *.so
in $MOKIT_ROOT/mokit/lib/
may become unrecognized, in which case you have to re-compile MOKIT.
If you want to use Miniconda instead of Anaconda Python3, you should install numpy
since numpy
is not in Miniconda by default. f2py
will be installed along with numpy
. If your f2py
comes from psi4conda/bin/f2py
, then errors may occur when compiling MOKIT. In such case you are recommended to comment environment variables of PSI4 and use your previous python, e.g. Anaconda Python 3. MOKIT can still call PSI4 even if PSI4 environment variables commented (see Section 2.2.3). If you encounter compilation errors when using Intel 2017 and Anaconda Python 3.8, you can try to use gfortran+MKL instead of Intel compiler by running make all -f Makefile.gnu_mkl
.
For trouble shooting during compiling, please read Section A1.
2.3.7 Notes on Quantum chemistry packages
The automr
program in MOKIT is an integrated interface program connecting common quantum chemistry software packages. The automr
itself does not contain any code for computing electron integrals currently, it just transfers MOs among software, sometimes read one-electron integrals from some package, and call various packages to do specified computations. Therefore, the users are assumed to have successfully installed some of these common software packages, which are Gaussian, PySCF, GAMESS, OpenMolcas, Molpro, ORCA, BDF, PSI4, Dalton and QChem. Some recommended versions are shown below
Program | Recommended Version |
---|---|
BDF | >= 0.9.8 |
CFOUR | >= v2.1 |
Dalton | >= 2020.0 |
GAMESS | >= 2017 (>= 2019 preferred) |
Gaussian | >= g09 (>=g16 preferred) |
Molpro | >= 2015 (>= 2019 preferred) |
OpenMolcas | >= v21.02 |
ORCA | >= 4.2.1 (>= 5.0 preferred) |
PSI4 | >= 1.3.2 (>= 1.4 preferred) |
PySCF | >= 1.7.4 |
QChem | >= 5.0 |
For DMRG related packages:
Program | Recommended Version |
---|---|
(py)block2 | >= preview-0.5.0 |
Block | >= 1.5.3 |
QCMaquis | >= 3.0.3 |
CheMPS2 | >= 1.8.9 |
The (py)block2 the most recommended one among these DMRG packages.
For MC-PDFT related packages:
pyscf-forge latest
Older versions are not recommended, since (1) they are possibly not tested by the developers, (2) they had been tested but some functionality had not been (correctly) implemented at that time. So that they may work or may not. OpenMolcas-v18.09 has also been tested, but you need to take care of a problem.
Of course, not all packages will be called in an automr
job. It depends on the job type and the program specified by users (see Section 2.5 for details). Usually three software packages (Gaussian, PySCF and GAMESS) are extensively used in routine computations. Thus you are recommended to install at least these 3 packages.
Installation tips and instructions
If you have any difficulty in installing software, please read their corresponding manuals carefully. Or you can find answers from their official websites, forums, and GitHub/GitLab pages:
- Gaussian
- PySCF issues
- ORCA forum
- Molcas forum, OpenMolcas issues
- Molpro group
- GAMESS issues
- PSI4
- Dalton
If you can read Chinese, the following installation instructions or tutorials of common software packages on the WeChat Official Accounts 'quantumchemistry'(微信公众号"量子化学") are strongly recommended:
Linux下安装Intel oneAPI
Linux下Gaussian 16安装教程
ORCA 5.0安装及运行
GAMESS简易编译教程
PSI4程序安装及运行
离线编译OpenMolcas+QCMaquis
安装基于openmpi的mpi4py
自动做多参考态计算的程序MOKIT
also these installation instructions on GitLab
离线安装PySCF-2.x
离线安装PySCF-2.x-extensions (for pyscf[dmrgscf], etc.)
离线安装OpenMolcas-v22.06
编译MPI并行版OpenMolcas
block2的编译和安装
离线安装量子化学软件Dalton
2.4 Installation on Cluster
2.4.1 Compile MOKIT on Cluster(集群)
If you already install MOKIT using conda
, or you prefer using pre-built packages of MOKIT, you can skip this subsection. This is for compiling MOKIT from the source code on a Cluster. If you can write environment variables freely into ~/.bashrc
, then there is nothing different with installation on your local Linux computer.
However, sometimes you are not allowed, or you do not want to modify ~/.bashrc
. And usually environment variables of various packages (e.g. ifort, python, etc) are controlled by the module
function. For example, if you need some version of python (other than system default), you have to run module load python/...
to make it accessible. In such case, you have to firstly load proper versions of Intel compiler and Python before installing MOKIT. For example,
module avai # find which compilers/versions you can use
module load intel/2019u5 # load the Intel version you want to use
module load anaconda/2024.2 # load the Python version you want to use
cd $MOKIT_ROOT/src # enter MOKIT source code directory
make all # compile MOKIT
The first line finds which versions of software packages you can choose. The 2nd and 3rd lines load the corresponding software. The 4th line is that you enter the MOKIT source code directory. And the 5th line is compiling MOKIT. This is just a simple example telling you how to load modules on a Cluster. For detailed questions, please consult the administrator of your Cluster.
2.4.2 Use MOKIT on Cluster
Usually you are not allowed to submit any job in the main node or login node. And there exists a queue/batch system on your cluster, e.g. bsub
of LSF, qsub
of PBS, or sbatch
of SLURM, which distributes your job to a certain computation node. Taking bsub
as an example, you should write a Shell script like
#!/bin/sh
#BSUB -q ... # queue name
#BSUB -n 4 # number of cores
module load intel/2019u5
module load anaconda/2024.2
gjfname=$1
outname=${gjfname%.*}.out
automr $gjfname > $outname 2>&1
Then you can run
bsub run.sh 00-h2o_cc-pVDZ_1.5.gjf
to submit the job.
If you are using the SLURM, the Shell script would look like
#!/bin/bash
#SBATCH -J automr
#SBATCH -e automr.err.%j
#SBATCH -p ... # queue name
#SBATCH -N 1
#SBATCH -n 1 # number of MPI processors
#SBATCH -c 32 # number of OpenMP threads
#SBATCH --mem=64G
gjfname=$1
outname=${gjfname%.*}.out
automr $gjfname > $outname 2>&1
And you need to run
sbatch run.sh 00-h2o_cc-pVDZ_1.5.gjf
to submit the job. This job will call the quantum chemistry packages Gaussian, GAMESS and PySCF in succession, and all three packages are of OpenMP parallelism. Note that before you submit the calculation job, you are supposed to be in the appropriate Python virtual environment. For example, the Bash in the terminal on my computer looks like
(mokit-py39) [jxzou@mu012 src]$
(mokit-py39) [jxzou@mu012 src]$ sbatch run_automr.sh
which means that I am in the mokit-py39
virtual environment currently. Some users may want to write conda init
and conda activate mokit-py39
in the SLURM script, but it often does not work since SLURM does not support conda
commands properly. So it is recommended that you are already in an appropriate virtual environment.
If some MPI programs (e.g. ORCA) would be called during computations, remember to swap the parameters of -n
and -c
. For example, if mokit{CASSCF_prog=ORCA}
is specified in your input file, you will need a Shell script like
#!/bin/bash
#SBATCH -J automr
#SBATCH -e automr.err.%j
#SBATCH -p ... # queue name
#SBATCH -N 1
#SBATCH -n 32 # number of MPI processors
#SBATCH -c 1 # number of OpenMP threads
#SBATCH --mem=64G
gjfname=$1
outname=${gjfname%.*}.out
automr $gjfname > $outname 2>&1
Do not worried about -c 1
. OpenMP programs can also run parallelly.
If you need to run a small MOKIT job on the current node without modifying your ~/.bashrc
, you should create a Shell script, e.g. run.sh
, with
module load intel/2019u5
module load anaconda/2024.2
gjfname=$1
outname=${gjfname%.*}.out
automr $gjfname > $outname 2>&1
written in this script. Then you can run
chmod +x run.sh
./run.sh 00-h2o_cc-pVDZ_1.5.gjf &
If you still encounter problems of installing or running MOKIT on your cluster, please consult the administrator of your Cluster. This is beyond the scope of this manual.
Dependency on QC Programs
Minimal requirements
Dependencies on quantum chemistry packages are different for each executable or module. Here the minimum requirements for binary executables automr
, autosr
, frag_guess_wfn
and other various are listed:
automr
: PySCF, GAMESSautosr
: Gaussianfrag_guess_wfn
: Gaussian- Most of the utilities do not depend on quantum chemistry packages except that the modules
py2gau
,py2orca
,py2molpro
, etc, work with PySCF installed.
For people who cannot use Gaussian
The MOKIT developers understand that some people cannot (or do not want to) use Gaussian. You can set HF_prog=PySCF
to call PySCF to perform the RHF/UHF calculations, e.g.
%mem=4GB
%nprocshared=2
#p CASSCF/cc-pVDZ
mokit{HF_prog=PySCF}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
This does not require Gaussian installed on your computer.
Setup Gaussian and PySCF
Usually there are no special requirements on setup Gaussian.
When installing PySCF and MOKIT from conda, please read this instruction and this.
Read Section 2.3.7 and FAQs if you have trouble installing them or calling them with MOKIT.
Modify and Setup GAMESS
Note that the original GAMESS code can only deal with GVB <=12 pairs. But nowadays we can do hundreds of pairs. It is required by MOKIT to modify GAMESS to go beyond 12 pairs. See instructions in Section 4.4.10.
Also, you need to set the environment variables as mentioned in Section 2.3.4.
Additional requirements for using specific methods
Some methods supported by MOKIT cannot be used with minimal requirements above. Please read the documentation of "XXXX_prog" keywords of automr
or autosr
for more information. For example, DMRGSCF_prog of automr
and CC_prog of autosr
.
Here we provide a list for common used prog keywords and corresponding dependencies.
automr methods | additional dependencies | instructions |
---|---|---|
DMRGCI/DMRGSCF | pyscf[dmrgscf] and Block | here |
CASPT2 | OpenMolcas/Molpro/ORCA | here |
MRCISD | See keyword documentation | |
MC-PDFT | pyscf-forge |
autosr methods | additional dependencies | instructions |
---|---|---|
DLPNO-CC | ORCA |
Setups for calling other QC programs
See Section 2.3.4 for setting environment variables for those programs. See Section 2.3.7 for installation tips and instructions.
3 Quick Start
- 3.1 Quick Start with Examples
- 3.2 Which method should I use?
- 3.3 Which basis set should I use?
- 3.4 Do I need to specify the active space?
- 3.5 Are my computation results reasonable?
3.1 Quick Start with Examples
3.1.1 Automatic multiconfigurational computations
The input syntax of the automr
program is the same to Gaussian gjf. For example, the input file 00-h2o_cc-pVDZ_1.5.gjf
of the water molecule at d(O-H) = 1.5 A is shown below
%mem=4GB
%nprocshared=4
#p CASSCF/cc-pVDZ
mokit{}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
Run the following command
automr 00-h2o_cc-pVDZ_1.5.gjf >00-h2o_cc-pVDZ_1.5.out 2>&1
The automr
program will successively perform HF, GVB, and CASSCF computations by calling Gaussian, GAMESS and PySCF, respectively. The active space will be automatically determined as (4,4) during computations. Detailed instructions for automr
input can be found in Section 4.1 - 4.4.
See more examples in Section 5.1.
3.1.2 Automatic multireference computations
The automr
program supports automatic computation of many multireference methods, e.g. NEVPT2/CASPT2/MRCISD/MC-PDFT.
%mem=8GB
%nprocshared=4
#p NEVPT2/cc-pVTZ
mokit{}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
3.1.3 Automatic single-reference computations
There are many factors to be considered in some advanced single-reference computations: the auxiliary basis set for RIJK, the auxiliary basis set for correlated methods, the complementary auxiliary basis sets for F12 calculations, SCF convergence, HF wave function stability, etc. These factors can be automatically dealt with by using autosr
. For example,
%mem=8GB
%nprocshared=4
#p CCSD(T)-F12/cc-pVTZ-F12
mokit{CC_prog=ORCA}
0 1
O 0.00000000 0.00000000 0.06200700
H 0.00000000 -0.78397600 -0.49205200
H 0.00000000 0.78397600 -0.49205200
Run the following command to submit the autosr
job.
autosr h2o.gjf >h2o.out 2>&1 &
autosr
will call Gaussian to perform HF/cc-pVTZ-F12 calculations, transfer MOs to ORCA, and call ORCA to perform the CCSD(T)-F12/cc-pVTZ-F12 calculation. cc-pVTZ-F12 is not a built-in basis set in Gaussian, but autosr
will deal with this problem automatically since MOKIT has included this basis set.
3.1.4 Using one utility
Running the following command
fch2inp h2o.fch
will generate the GAMESS input file h2o.inp
, in which the Cartesian coordinates, basis set data, molecular orbitals(MOs) and some necessary keywords are written. If the MOs in h2o.fch
are RHF/ROHF/UHF MOs, you can simply submit the GAMESS job and you will find the SCF is converged in ~1 cycle in GAMESS. If you want to perform other types of calculation, remember to modify h2o.inp
. Note that keywords nosymm int=nobasistransform
should be written in h2o.gjf
before submitting the Gaussian job, since this facilitates the orbital transferring and SCF convergence.
Some commonly used utilities and their functionalities are listed below
Utility name | Functionality |
---|---|
fch2amo | Gaussian -> AMESP |
fch2bdf | Gaussian -> BDF |
fch2cfour | Gaussian -> CFOUR |
fch2com | Gaussian -> Molpro |
fch2dal | Gaussian -> Dalton |
fch2inp | Gaussian -> GAMESS |
fch2inporb | Gaussian -> (Open)Molcas |
fch2mkl | Gaussian -> ORCA |
fch2psi | Gaussian -> PSI4 |
fch2tm | Gaussian -> Turbomole |
mkl2fch | ORCA -> Gaussian |
molden2fch | others -> Gaussian |
You can search a utility and read the documentations in Section 4.5 or 4.6.
3.2 Which method should I use?
For wave function analysis or visualization of active orbitals, GVB and CASSCF methods are sufficient.
For practical use, e.g. to properly compare results with those from experiments, NEVPT2, CASPT2, MC-PDFT or MRCISD are recommended. The GVB and CASSCF methods are only qualitatively correct. Additional general recommendations are provided below:
(1) NEVPT2 is free of the intruder-state problem, thus is preferable to CASPT2.
(2) If you find NEVPT2/CASPT2 is too expensive for your system, consider the MC-PDFT method.
(3) If you find the computation cost of NEVPT2/CASPT2 is acceptable for your system and you want higher accuracy, consider ic-MRCISD in OpenMolcas/Molpro. For very small systems such as two or three atoms, the uncontracted MRCISD in ORCA is a good choice.
3.3 Which basis set should I use?
For testing or debug, you can use any type of basis sets. You can, of course, use a small basis set like def2-SVP
to test whether automr
works. However, for practical use or to obtain publishable data, please use at least triple-zeta basis set, e.g. cc-pVTZ
, def-TZVP
(.ie. TZVP
in Gaussian), def2-TZVP
or def2-TZVPP
. Results obtained from combinations like MRCISD/def2SVP
, NEVPT2/6-31G(d)
are almost useless. If, unfortunately, you have very limited computational resource, e.g. less than 8 CPU cores, then the cc-pVDZ
basis set is recommended. Additional general recommendations are provided below:
(1) If your system is large, or has complicated electronic structure and you want to see whether automr
works, or you just want to see the workflow of automr
, you can use a small basis set like def2SVP
to go through the computation. After it normally terminates, you can switch to using a larger basis set.
(2) If your system is large (>600 basis functions), you should consider properly simplify your system or use mixed basis sets for different elements/atoms. For example, replace unimportant methyl group(s) by hydrogen atoms, use cc-pVTZ
for important atoms and cc-pVDZ
(or even 6-31G(d)
) for other atoms, etc. Also/Alternatively you can turn on the RI approximation (see Section 4.4.29) in CASSCF and CASSCF-NEVPT2 computations.
(3) Pople-type basis set like 6-311G(d,p)
or 6-31G(d,p)
is less recommended, but can be used for unimportant atoms.
(4) If pseudopotential (PP) is desired, better to use cc-pVTZ-PP
, def2-TZVP
. Be careful with the built-in basis set (with PP) SDD in Gaussian software for elements >= the 4th period. Often there is no d, f polarization functions in the built-in SDD basis set of Gaussian (although the PP part seem pretty good), but the d, f polarization functions are usually important for high-accuracy computations. If you insist on using SDD, you should search in previous papers the data of d, f polarization functions of SDD for your atoms, and add the data to the .gjf file manually.
(5) If your molecule is an anion, e.g. MnO4-, it is strongly recommended to use basis set with diffuse functions, such as aug-cc-pVTZ
, ma-def2-TZVP
. Similar to (1), if your system is large, you can use basis set with diffuse functions for atoms with significant negative charges, and use no diffuse functions for other atoms. If the atoms involving significant negative charges are not so important in your study, you can just use aug-cc-pVDZ
or ma-def2-SVP
.
(6) If all-electron relativistic calculations are desired (DKH2 or X2C), do remember to use basis sets like cc-pVTZ-DK
, x2c-TZVPall
or ANO-RCC
series. All-electron relativistic computations using CASSCF/cc-pVTZ
or cc-pVTZ-PP
with DKH2 is almost non-sense.
(7) Remember that the two basis sets def-TZVP
and def2-TZVP
are used formally in papers, but TZVP
and def2TZVP
are used as the names of basis set in Gaussian syntax.
(8) If you want to compute the NMR shielding constants, basis sets like pcSseg-1
or pcSseg-2
are strongly recommended. Large basis sets like pcSseg-3
, def2-QZVP
or cc-pVQZ
would be better but they are very time-consuming.
(9) If you only want to obtain the radical indices printed by automr
, and accurate electronic energies are not desired, then basis sets such as cc-pVDZ
is sufficient. There is no need to use triple-zeta basis set like def2TZVP
. If your studied molecule is very small (e.g. <10 atoms), then you can still use any triple-zeta basis set.
(10) If you only want to perform energy decomposition analysis, e.g. GKS-EDA, usually def2TZVP
basis set is sufficient. If there is any anion in your studied molecule(s), it is recommended to use diffuse functions only for anion-related atoms. It is not recommended to apply ma-TZVPP for all atoms (which will probably cause basis set linear dependence problems and make SCF in GAMESS converges slowly, even with MOs written in the .inp file). Also, it is recommended to use the implicit solvent model PCM rather than SMD, since PCM is frequently used in original papers of GKS-EDA.
(11) Some special basis sets can be used in the .gjf file of MOKIT, although these basis sets are not included in the Gaussian program. This is because MOKIT put these basis data in $MOKIT_ROOT/mokit/basis/
and can import them when necessary. For example, the following syntax will immediately cause errors in Gaussian, but are supported in MOKIT
Example 1. DKH2 scalar relativistic Hamiltonian with the ANO-RCC-VDZP basis set
#p CASSCF/ANO-RCC-VDZP
mokit{DKH2,CASSCF_prog=ORCA}
Example 2. CASSCF NMR calculation with the pcSseg-1 basis set
#p CASSCF/pcSseg-1
mokit{NMR}
3.4 Do I need to specify the active space?
The automr
program can automatically determine the active space, thus you need not specify that. If you do not want the automatically determined one, you can manually specify it, such as CASSCF(m,n) or NEVPT2(m,n), with m, n being the number of active electrons and active orbitals, respectively. Currently only m=n is supported (this is very reasonable, see DOI: 10.1021/acs.jctc.0c00123). While for GVB, you may specify GVB(n), where n is the number of pairs. Manually specifying is only recommended for experienced users.
Note that usually there is no need to write DMRGCI or DMRGSCF, the users can just write CASCI or CASSCF. Once automr
detects the size of active space is larger than (15,15), it will switch from CASCI/CASSCF to DMRGCI/DMRGSCF automatically. Similarly, the NEVPT2/CASPT2/MC-PDFT will be automatically switched into DMRG-NEVPT2/DMRG-CASPT2/DMRG-PDFT when the active space is larger than (15,15). The only exception is that the users just want to perform a DMRG calculation with active space smaller than (15,15), then the DMRGCI or DMRGSCF must be specified.
Usually the automatically determined active space is reasonable. The algorithm in MOKIT is designed to automatically find the minimum active space for a given molecule. However, when you are studying a potential energy curve/surface, the automatically determined active space may be not the same size for each geometry. For example, for N2 molecule at d(N-N) = 1.15 Å, the CAS(4,4) may be automatically determined by automr
, but for for d(N-N) = 4.0 Å, the active space turns into CAS(6,6). Thus, if you want to keep the size to be CAS(6,6), you need to specify CASSCF(6,6), NEVPT2(6,6), MRCISD(6,6), etc.
3.5 Are my computation results reasonable?
(1) You should make sure that your CASSCF initial orbitals and final(i.e. converged) orbitals contain proper active orbitals. Here are two examples:
(i) Assuming you are studying the bond breaking of a C-C bond, you use CAS(2,2) at least. And you should make sure that your CASSCF initial orbitals and final orbitals contain the bonding and anti-bonding orbitals this C-C bond. Note that the RHF orbitals or MP2 natural orbitals are usually poor to be used as the initial guess when studying bond breaking or transition-metal-containing molecules.
(ii) Assuming you are studying the \( \pi \) -> \( \pi \)* excited energies, then your active space is expected to contain desired \( \pi \) and \( \pi \)* orbitals.
(2) If you perform the CASSCF (or CASSCF-based methods like NEVPT2, CASPT2, etc), there would be a file with suffix *CASSCF_NO.fch
generated, in which the CASSCF natural orbitals and corresponding occupation numbers are held. You can use GaussView or Multiwfn to open this file and visualize whether these active orbitals are reasonable, if you know your molecule fairly well.
If you perform CASCI (or CASCI-based jobs), the CASCI NOs are held in a *CASCI_NO.fch
file. If you only perform GVB computation, the GVB NOs are held in the *gvb2_s.fch
file.
(3) If you are comparing relative energies of two target molecules (e.g. two local minima of the same molecule), be sure that you are using the same size of active space for two molecules. For example, it is incorrect(or unfair) to compare two NEVPT2 energies if one is based on CAS(4,4) and the other one is based on CAS(8,8). If, unfortunately, you encounter such special case, you have two options:
(i) explicitly specify the active space in the former calculation, i.e. NEVPT2(8,8) in the input file.
(ii) if you find there are 4 orbitals not so active in the latter calculation (meaning occupation numbers close to 0 or 2), you can specify NEVPT2(4,4) in the input file.
(4) If you are calculating the bond dissociation energy, you should check whether the sum or combination of active space from two fragments equals to the active space of the original molecule. You are supposed to analyze main components of the active space. For example, assuming the original molecule has a CAS(6,6) active space, and assuming it contains 2 singly occupied orbitals from fragment 1, two bonding orbitals and two anti-bonding orbitals constructed by two fragments. Then quintuplet CAS(4,4) should be used for fragment 1 and triplet CAS(2,2) for fragment 2.
(5) For the rigid/unrelaxed scanning of a chemical bond, you are always recommended to scan the bond from long distance to short distance, rather than from short to long. And you are recommended to write the keyword scan
in the Gaussian keyword #p
line. This will perform the rigid/unrelaxed scan automatically. It is not recommended to calculate each scanned geometry manually. Note: the scan
functionality in MOKIT has not been implemented yet.
User's Guide
- 4.1 Syntax Rules of automr
- 4.2 Memory and Parallel Settings
- 4.3 Keywords of supported methods in automr
- 4.4 List of automr Keywords
- 4.5 List of Utilities in MOKIT
- 4.6 APIs in MOKIT
- 4.7 Methods and Keywords in
autosr
4.1 Syntax Rules of automr
Syntax rules of the input file of automr
is almost identical to those of Gaussian software. Therefore, it takes little time to become familiar with the usage of automr
. A simple input (.gjf) example is shown below
%mem=4GB
%nprocshared=2
#p CASSCF/cc-pVDZ
mokit{}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
This is a water molecule with two O-H bonds stretched. CAS(4,4) active space will be automatically determined during computations. One can also provide a .fch(k) file if he/she already performed a UHF job. An example is shown below
%mem=4GB
%nprocshared=4
#p CASSCF/cc-pVTZ
mokit{ist=1,readuhf='N2_cc-pVTZ_4.0_uhf.fchk'}
For the memory and parallel settings, please read Section 4.2. For the theoretical method and basis set, please read Section 4.3. Here we focus on the keywords in mokit{}
(also known as "automr
keywords", see Section 4.4 for details).
-
All
automr
keywords must be written in the curly bracket ofmokit{}
in the Title Card line of a .gjf file. -
Using upper or lower case of keywords in
automr
makes no difference. For example, the following two lines have identical meanings:MOKIT{readuhf='a.fchk',ist=1,LocalM=Boys}
mokit{readuhf='a.fchk',ist=1,localm=boys}
-
If
readrhf
,readuhf
, orreadno
keyword is used, a filename of the provided .fch(k) file must be included in a pair of single quotation marks''
. Do not use double quotation marks""
or no quotation marks. -
If pure Cartesian functions of basis set are used in the provided .fch(k) file, you need to write keyword
cart
. The default is pure spherical harmonic functions. And you do not need to write any keyword about this if you provide a .fch(k) file in which pure spherical harmonic functions are used. -
Use a comma to separate different keywords. Do not use other symbols like spacing or dot.
4.2 Memory and Parallel Settings
The settings of memory and the number of processors are identical to that in Gaussian. For example, the following syntax
%nprocshared=8
%mem=16GB
requests a automr
calculation use 16GB memory and 8 processors. Please specify reasonable/proper computational resource. For example, %mem=1GB
with %nprocshared=8
is usually unreasonable. Note that only %nprocshared
and the unit GB
is supported. Some DON'T things are listed below:
(1) Do not use units MB, MW, GW, and do not use %cpu
to specify the number of processors.
(2) Do not write %chk=xxx.chk
since it is useless for automr
.
Multireference computations usually require a large memory, so it is recommended to use at least %mem=1GB for even a small molecule. It is also recommended to set mem (in GB) ≥ nproc. Note that the memory and parallel settings of OpenMolcas in your ~/.bashrc
(MOLCAS_NPROCS
and MOLCAS_MEM
) will be overridden by settings in the .gjf file, and the OpenMP version or MPI version of OpenMolcas can be automatically detected. The memory and parallel settings of the BDF program is controlled by variables in the Shell script bdf-pkg/sbin/run.sh, you should properly modify the script before doing computations.
The memory and parallel settings are passed into various programs called by automr
(e.g. PySCF, Gaussian, etc). The automr
itself only needs negligible memory and usually run in a serial mode.
4.3 Keywords of supported methods in automr
Currently, the keywords of all supported methods in automr
are:
GVB, CASCI, CASSCF, DMRGCI, DMRGSCF,
NEVPT2, CASPT2, CASPT2K, CASPT3, SDSPT2, MRMP2, OVBMP2, MRCISD, MRCISDT,
MCPDFT, DFTCI, MRCC, BCCC2b, BCCC3b.
More multi-configurational and multi-reference methods will be supported in the future.
These terms should be written in the #p ...
line in the .gjf file.
There must be a '/' symbol between the method and the basis set, e.g. CASSCF/cc-pVTZ
.
automr
does not allow the use of a spacing to separate the method and basis set (which is allowed in Gaussian).
When readrhf
, readuhf
, or readno
is used in mokit{}
, the basis set after '/' symbol is usually useless, since the geometry and basis set data will be read from the given .fch(k) file.
Note that, however, the user still needs to provide a basis set name, although it is not used in this case.
There is also an exception that the basis set after '/' symbol matters.
If RI approximation is turned on, the auxiliary basis set will be automatically determined (by automr
) according to the basis set.
And if F12 is turned on, the F12-CABS will be automatically determined (by automr
) according to the basis set, too.
4.3.1 GVB
Generalized Valence Bond theory.
Note that the original GAMESS source code can only deal with GVB up to 12 pairs. To go beyond that (which is routine type of calculation in automr
of MOKIT), you need to modify the GAMESS source code, please read GVB_prog carefully.
If a high-spin GVB computation is performed (it can be the target computation, or an intermediate step in a CASSCF computation) and there are more than 1 singly occupied orbital, the singly occupied orbitals will be localized and saved into xxx_s.fch
. The corresponding orbital localization method is Pipek-Mezey and it can be controlled by LocalM.
4.3.2 CASSCF
Complete Active Space Self-consistent Field.
Currently excited state computations are not supported. Please read Section A2.3 for comments on excited state computations. Please read related keyword CASSCF_prog in Section 4.4.
4.3.3 CASCI
Complete Active Space Configuration Interaction.
CASCI can be viewed as the 0-th step of the CASSCF step, i.e. CASSCF without orbital optimization. It is always recommended to perform CASSCF rather than CASCI, expect for those who wish to compare the quality of different initial guesses, and/or do not want the CASSCF result. Please read related keyword CASCI_prog in Section 4.4.
4.3.4 DMRGSCF
Here the keyword DMRGSCF actually means the DMRG-CASSCF method. Please also read DMRGSCF_prog for related information.
4.3.5 DMRGCI
Here the keyword DMRGCI actually means the DMRG-CASCI method. Please also read DMRGCI_prog for related information.
DMRGCI can be viewed as the 0-th step of the DMRGSCF step, i.e. DMRGSCF without orbital optimization. It is always recommended to perform DMRGSCF rather than DMRGCI, expect for those who wish to compare the quality of different initial guesses, and/or do not want the DMRGSCF result.
4.3.6 NEVPT2
Second order N-Electron Valence state Perturbation Theory based on the CASSCF reference.
Please read related keyword NEVPT2_prog in Section 4.4.
4.3.7 CASPT2
Second order Perturbation Theory based on CASSCF reference.
Generally speaking, NEVPT2 is more recommended than CASPT2 since there is no need for IP-EA shift, real or imaginary shift in NEVPT2. If you are really a fan of CASPT2, it is recommended to use CASPT2K instead. Please read related keyword CASPT2_prog in Section 4.4.
4.3.8 CASPT2K
Second order Perturbation Theory based on CASSCF reference.
This is a new feature since ORCA 5.0. A revised zeroth order Hamiltonian is applied to alleviate the intruder state problem of CASPT2 method. No IP-EA shift is needed in this method.
Note here you can write this keyword as either CASPT2K or CASPT2-K in .gjf file,
but you'd better use the method name CASPT2-K in official writing or publishing.
The keyword CASPT2_prog
will automatically be set as ORCA, since this method is
only supported in ORCA currently.
4.3.9 CASPT3
Third order Perturbation Theory based on CASSCF reference.
Only Molpro program can be called to perform CASPT3, and it is default.
4.3.10 SDSPT2
Static-dynamic-static multi-state multi-reference second-order perturbation theory (SDS-MS-MRPT2, or SDSPT2 for short).
There are several restrictions when you use this method:
(1) only the single (electronic) state version can be used in automr, since only the ground state is calculated.
(2) SDSPT2 based on the CASCI reference is not supported currently. CASSCF-SDSPT2 is supported.
(3) background point charges are not supported.
This version of SDSPT2 is performed by the Xi'an CI module of BDF program. So you are assumed to have successfully installed BDF. You should cite this paper DOI: 10.1080/00268976.2017.1308029
if you use this method.
4.3.11 MRMP2
Second order Multi-reference Perturbation Theory based on CASSCF reference.
After CASSCF converged, automr
will call the GAMESS program to perform MRMP2 calculation. No other program is supported. Also, DMRG-MRMP2 is not supported and automr
will abort in that case.
4.3.12 OVBMP2
Orthogonal valence bond Møller-Plesset 2 based on CASSCF reference.
After CASSCF converged, automr
will call the Gaussian program to perform OVB-MP2 calculation. No other program is supported. Also, DMRG-OVB-MP2 is not supported and automr
will abort in that case.
Note that OVBMP2 is a keyword in automr
but OVB-MP2 is the method name. Do not mix them up. Some literature might call this method as 'CASSCF MP2' but that is actually misleading (those authors were fooled by keywords "CASSCF MP2" in Gaussian input file).
4.3.13 MRCISD
Multi-reference Configuration Interaction Singles and Doubles, based on the CASCI/CASSCF reference.
Please read related keyword MRCISD_prog in Section 4.4.
4.3.14 MRCISDT
Multi-reference Configuration Interaction Singles, Doubles and Triples, based on the CASCI/CASSCF reference.
Please read related keyword MRCISDT_prog in Section 4.4.
4.3.15 MCPDFT
Multi-configurational Pair Density Functional Theory, based on CASSCF reference.
Note that MCPDFT is a keyword in automr
but MC-PDFT is the method name. Do not mix them up.
Supported programs are OpenMolcas(default)/PySCF/GAMESS. If you use MCPDFT_prog=PySCF
, you need to install pyscf-forge.
Note that if the active space is larger than (15,15), the MC-PDFT will be automatically switched to DMRG-PDFT. In this special case you need to install the QCMaquis package (interfaced with OpenMolcas) or Block (interfaced with PySCF). DMRG-PDFT is not supported in GAMESS currently.
Please read related keyword MCPDFT_prog in Section 4.4.
4.3.16 DFTCI
The DFT/MRCI method by Stefan Grimme, Mirko Waletzke, and Martin Kleinschmidt et. al.
Note that the name of this method is DFT/MRCI, but you should write the keyword DFTCI in the input file of automr
, in order to perform DFT/MRCI computations.
Currently, to perform DFT/MRCI computations, you should have ORCA and DFT/MRCI program installed on your node/machine.
Note: full implementation of this keyword has not been finished yet.
4.3.17 FIC-MRCCSD
Multi-reference Coupled Cluster theory, based on the CASCI/CASSCF reference.
Currently only the FIC-MRCCSD method in ORCA(>=5.0.0) is supported.
4.3.18 BCCC2b
Block-correlated coupled cluster theory based on the GVB reference.
This is in fact a multi-reference coupled cluster theory based on GVB wave function, where 2b means only the two-block excitation operators \( {\hat{T_2}} \) are considered in the cluster expansion. Moreover, this method is a spin-pure coupled-cluster method.
Currently only spin singlet is supported. This program is developed by jxzou during his Ph.D. in Prof. Shuhua Li's research group. Currently this program has not been released yet, but will probably be released after its corresponding paper published.
Currently only correlations between two pairs are taken into consideration (i.e. occ->pair, occ->vir, pair->vir not considered so far). So BCCC2b is just a rough theory. Note that the intra-pair excitation operator \( \hat{{T_1}} \) plays little role, so the BCCC2b (i.e. only \( \hat{{T_2}} \) ) is extremely close to BCCC2 (i.e. \( \hat{{T_1}} + \hat{{T_2}} \) ). For GVB(2), the BCCC2 is equivalent to CASCI(4,4) using GVB orbitals, and thus BCCC2b is extremely close to CASCI(4,4). For GVB(n), n>2, the GVB(n)-BCCC is an approximation method to CASCI(2n,2n) using GVB orbitals.
This program scales as O(n4), where n is the number of GVB pairs. But the integral transformation needed for the BCCC2b scales as O(n5), so the overall scaling might behave as O(n5) for large number of pairs.
4.3.19 BCCC3b
Block-correlated coupled cluster theory based on the GVB reference, where 3b means \( \hat{{T_2}} + \hat{{T_3}} \).
This means two-pair and three-pair correlations are considered based on the GVB reference. Also, this method is spin-pure. Currently only spin singlet is supported.
For practical computations with desired accuracy, you should use BCCC3b rather than BCCC2b. This program scales as O(n5), where n is the number of GVB pairs. As stated in 4.3.18 BCCC2b, this program has not been released yet.
4.4 List of automr
Keywords
Here are the list of all automr
keywords, grouped by category.
For program specification | |||
---|---|---|---|
HF_prog | GVB_prog | CASCI_prog | CASSCF_prog |
DMRGCI_prog | DMRGSCF_prog | CASPT2_prog | NEVPT2_prog |
MRCISD_prog | MRMP2_prog | MCPDFT_prog |
For workflow specification | |||
---|---|---|---|
Input wavefunction | readrhf | readuhf | readno |
Workflow settings | ist (most important!) | LocalM | CIonly |
ON_thres | UNO_thres | Skip_UNO | |
excludeXH, OnlyXH | HFonly |
For method details | ||||
---|---|---|---|---|
General | Cart | hardwfn, crazywfn | charge | DKH2, X2C |
For MR methods | CtrType for MRCI | MaxM for DMRG | OtPDF | FIC for NEVPT2 |
For GVB | GVB_conv | Inherit | Npair | FcGVB |
Wavefunction settings | noDMRGNO |
Others | |||
---|---|---|---|
Acceleration technique | RI, RIJK_bas | F12, F12_cabs for NEVPT2 | DLPNO for NEVPT2 |
For additional properties | Force | NMR | ICSS |
For excited states | Nstates | Mixed_Spin | Root, Xmult |
If any of the readrhf
, readuhf
, and readno
keywords is used, there is no need
to write Cartesian coordinates in .gjf file, since the geometry will be read from
the specified .fch(k) file.
4.4.1 readrhf
Read RHF/ROHF orbitals from a specified .fch file. Do not provide a UHF-type .fch
file using this keyword. This keyword is usually used along with another keyword
ist=3
(see ist).
4.4.2 readuhf
Read UHF orbitals from a specified .fch(k) file. automr
will firstly check the difference between alpha and beta MOs. If the difference is tiny, the wave function in .fch file will be identified as a RHF one and will call utility fch_u2r to generate a RHF-type .fch file (in which all beta information is deleted). Otherwise (i.e. truly UHF), automr
will generate UHF natural orbitals (UNO) using input UHF orbitals. It is strongly recommended to check the stability of UHF wave function (using keyword 'stable=opt' in Gaussian). An instable or not-the-lowest UHF solution sometimes leads to improper GVB or CASSCF results.
It is not recommended to provide UDFT orbitals.
This keyword is usually used along with another keyword ist=1 or 2 (see ist). Note that do not provide a UNO .fch file with this keyword. If you want to use any type of NOs as the initial guess, see readno
in the following section.
4.4.3 readno
Read natural orbital occupation numbers (NOON) and natural orbitals (NO) from a specified .fch file. In this case, you must ensure that the 'Alpha Orbital Energies' section in the .fch file contains occupation numbers, not energy levels or something else, since the size of the active space will be determined according to the (threshold of the) occupation numbers.
With this keyword, you may try different NOs as the initial guess, like MP2 NOs, CCSD NOs, or some type of NOs generated by your own program. You may also use UNOs from UHF as the initial guess. In this case, it is equivalent to use the keyword readuhf
to read in the UHF orbitals and generate UNOs by MOKIT.
This keyword is usually used along with another keyword ist=5 (see ist).
4.4.4 ist
Request the use of the i-th strategy. Default is 0. This means: (1) if the spin of the target molecule is singlet, MOKIT will call the Gaussian software to perform RHF and UHF computations, then determine whether to change ist
to 1 or 3. If the EUHF = ERHF, ist will be changed to 3. If EUHF <ERHF, ist will become 1. (2) if not singlet, ist will be changed to 1 immediately.
For simple organic molecules which have multireference characters (like diradicals), the UHF performed by MOKIT (calling Gaussian) can always find the lowest (and stable) UHF solution. But for complicated systems like binuclear transition metal complex, there often exist multiple UHF solutions. And the UHF solution found by MOKIT is not necessarily the lowest one. In this case you are recommended to do UHF computations by yourself and use ist=1 to read in the Gaussian .fch file. See a practical guide for advanced UHF computations on http://gaussian.com/afc. If you can read Chinese, you are recommended to read Sobereva's blog.
Currently, there are 7 allowed values for ist:
0: meaning that if RHF wave function is stable, use strategy 3; otherwise use strategy 1
1: UHF -> UNO -> associated rotation -> GVB -> CASCI/CASSCF -> ...
2: UHF -> UNO -> (associated rotation ->) CASCI/CASSCF -> ...
3: RHF -> virtual orbital projection -> localization -> pairing -> GVB -> CASCI/CASSCF -> ...
4: RHF -> virtual orbital projection -> CASCI/CASSCF -> ...
5: NOs -> CASCI/CASSCF -> ...
6: minimal basis GVB -> target basis GVB -> CASCI/CASSCF -> ...
The value 0 (default) is recommended, if you do not know which one to choose.
4.4.5 LocalM
Specify the orbital localization method. Only the Boys (also called Foster-Boys) localization and PM (Pipek-Mezey) localization method are supported. The corresponding keywords are LocalM=Boys
and LocalM=PM
(default), respectively.
Note that there are 3 localization steps controlled by this keyword: (1) the orbital localization during constructing GVB initial guess; (2) the orbital localization of singly occupied orbitals (only if you perform a high-spin GVB calculation, here high-spin means triplet or higher); (3) the orbital localization of doubly occupied orbitals (only if you specify the keyword LocDocc).
Note: the Boys method will mix \( \sigma \) and \( \pi \) orbitals, while the PM method tends to keep them separated. These two methods make no difference when the target molecule contains only \( \sigma \) bonds (and possibly a few isolated \( \pi \) bonds). But if you are dealing with multiple \( \pi \) bonds or conjugated \( \pi \) systems like oligoacene (benzene, naphthalene, etc), or if you want the active space to contain only \( \pi \) orbitals, better use the PM method. The GVB and CASSCF optimized orbitals will be affected by the localization method sometimes. If you explicitly specify the size of active space which is equal to the \( \pi \) space (note that frontier natural orbitals are usually \( \pi \) orbitals), then using LocalM=Boys
is OK since Boys localization among pure \( \pi \) orbitals is safe (no sigma orbital is in the set).
For people who are keen on comparing initial guesses generated from different methods/algorithms, LocalM=Boys
is strongly recommended to be taken into consideration, to see whether a lower GVB, CASCI or CASSCF energy occurs.
4.4.6 CIonly
Skip the CASSCF orbital optimization in the CASPT2 or NEVPT2 job. Obviously, this keyword only applies to CASPT2 or NEVPT2 case. Writing CIonly means a CASCI -> CASPT2 or CASCI -> NEVPT2 job. In fact, the CASSCF orbital optimization is always recommended to be performed, unless it is too time-consuming, or you happen to want this type of result.
Note: if you simply need a CASCI computation, do not use CIonly in a CASSCF job, but simply write CASCI/basis_set in the keyword line of .gjf file.
4.4.7 Force
Request a calculation of the analytical nuclear gradient. Currently this keyword only applies to CASSCF. Geometry optimization is currently not supported, but you can use the generated files to perform optimization using corresponding software. Numerical gradient is not supported.
Note that this keyword do not have any attribute value, i.e. do not write 'force=.True.' but only force
in {}. Also keep in mind that force is negative gradient.
4.4.8 Cart
Request the use of Cartesian type atomic basis functions. The default basis type in automr
(i.e. spherical harmonic functions) will then be disabled. These two types of basis functions correspond to '6D 10F' (Cartesian functions) and '5D 7F' (spherical harmonic functions) in Gaussian. It is strongly recommended to use spherical harmonic functions, especially in all-electron relativistic computations (DKH2, X2C, etc).
Note that for computations involving the ORCA program, this keyword cannot be used since ORCA only support spherical harmonic functions.
If you do not know the meaning of 5D or 6D, you are referred to Schlegel and Frisch's paper (DOI: 10.1002/qua.560540202), and a good explanation from Chemissian. If you can read Chinese, you are recommended to read Sobereva's blog.
4.4.9 HF_prog
Specify the program for performing Hartree-Fock (HF) calculations. Supported programs are Gaussian(default), PySCF, PSI4 and ORCA. If the input molecule is singlet, RHF and broken symmetry UHF (plus wave function stability analysis) will be successively performed. If not singlet, only UHF (plus wave function stability analysis) will be performed.
It is strongly recommended to use the default HF_prog (i.e. Gaussian), since the SCF algorithm in Gaussian is the most effective/robust in almost all cases, among all quantum software packages. The only case when you are recommended to use PySCF/PSI4/ORCA, is that if your studied molecule is large and cannot be further simplified (Nbasis>1500), you can consider turn on the RI approximation of HF to accelerate computations in PySCF, PSI4 or ORCA. For example, in this special case you are supposed to write
mokit{RI,HF_prog=PySCF} # (or PSI4, ORCA if you wish)
in .gjf file. For people who cannot (or are not allowed to) use Gaussian, HF_prog=PySCF
is a recommended choice.
4.4.10 GVB_prog
Specify the GVB program. Supported programs are GAMESS(default), Gaussian and QChem. The second-order SCF (SOSCF) algorithm in GAMESS is used to converge the GVB wave function. The GAMESS version >=2017 is strongly recommended. Older versions of GAMESS may work or may not, since they are not tested by the developers. It is always recommended to use the default program GAMESS, rather than Gaussian (due to poor convergence for transition metal).
Note the original GAMESS can only do GVB up to 12 pairs. Nowadays we can do a black-box GVB computation with hundreds of pairs. So, to go beyond 12 pairs, you need to modify and re-compile the source code of GAMESS.
MOKIT offers a Shell script to help you automatically handle this. Assuming you have compiled GAMESS before (i.e. all *.o
files are still in gamess/object/
directory of GAMESS), now you simply need to copy two files (modify_GMS1.sh
and modify_GMS2.f90
) from mokit/src/
into gamess/source/
directory, and run ./modify_GMS1.sh
. For example, these Linux commands on my computer are
cp $MOKIT_ROOT/src/modify_GMS1.sh ~/software/gamess/source/
cp $MOKIT_ROOT/src/modify_GMS2.f90 ~/software/gamess/source/
cd ~/software/gamess/source/
./modify_GMS1.sh
You may need to type y
after running ./modify_GMS1.sh
. The script will modify source code and re-compile GAMESS, taking about 2 minutes. The linked GAMESS executable will be gamess.01.x
. If you already had an executable named gamess.01.x
, please rename it to another filename. Otherwise it will be destroyed and replaced during re-compilation.
Note for conda users: the MOKIT conda package does not contain src/
. An alternative way is to go to GitLab repo and download those two files.
Besides, for GAMESS earlier than version 2021-R1, it only supports 32 CPU cores. If your machine has more cores (and if you want to use >32 cores), you need to modify the variable MAXCPUS in file gamess/ddi/compddi. See a simple guide in file src/modify_GMS_beyond32CPU.txt
. Since GAMESS 2021, the MAXCPUS is already set to 128, so modification is not needed.
It is strongly recommended to check whether the GAMESS runs normally after modifying. Such check can be done by running ./runall 01
in gamess/ directory, where 01
means checking the executable gamess.01.x
. This is same as checking for version 00
. All tests are supposed to be passed successfully.
If a high-spin GVB computation is performed (it can be the target computation, or an intermediate step in a CASSCF computation) and there are more than 1 singly occupied orbital, the singly occupied orbitals will be localized and saved into xxx_s.fch
. The corresponding orbital localization method is Pipek-Mezey and it can be controlled by LocalM.
4.4.11 CASCI_prog
Specify the program for performing the CASCI calculation, e.g. CASCI_prog=PySCF
. Supported programs are PySCF(default), Molpro, GAMESS, OpenMolcas, Gaussian, ORCA, BDF, PSI4 and Dalton.
If you use some old versions of BDF as the CASCI_prog, you may find the NOONs are zero. In that case you should update to the latest version of BDF.
4.4.12 CASSCF_prog
Specify the program for performing the CASSCF calculation, e.g. CASSCF_prog=PySCF
. Supported programs are PySCF(default), Molpro, GAMESS, OpenMolcas, Gaussian, ORCA, BDF, PSI4 and Dalton. If you want to use a CASSCF program instead of PySCF, I recommend Molpro, OpenMolcas or GAMESS.
If you use some old versions of BDF as the CASSCF_prog, you may find the NOONs are zero. In that case you should update to the latest version of BDF. For excited state calculations. Please read related comments in A2.3.
4.4.13 DMRGCI_prog
The usage of this keyword is the same as DMRGSCF_prog.
4.4.14 DMRGSCF_prog
Specify the program for performing DMRG-CASSCF calculation, e.g. DMRGSCF_prog=PySCF
. Currently only PySCF is supported and it is the default program. According to the computational experience of jxzou
, the Block-1.5/Block2 are the fastest DMRG program currently.
Note that in fact the DMRG-CASSCF calculations are performed by Block and PySCF, not only by PySCF. Thus you should install the Block-1.5 or Block2 program. And you should also cite corresponding reference of the Block program. For PySCF >= 2.0, you also need to install the pyscf[dmrgscf] extension.
Also note that the MPI version used by Block is probably contradicted with MPI version used by ORCA, thus you have to comment one version of MPI environment variables at a time. Or you can use a Shell script to submit the automr job, in which your desired MPI environment variables are written.
By default, the OpenMP version of Block would be called during performing DMRG calculations. But if you want to use MPI version of Block (and you have installed it), you need to write mokit{block_mpi}
.
4.4.15 CASPT2_prog
Specify the program for performing CASPT2 calculation, e.g. CASPT2_prog=OpenMolcas
.
Currently only OpenMolcas(default), Molpro and ORCA are supported. All core orbitals
are not frozen. Note that a default IP-EA shift 0.25 a.u. will be applied and it
cannot be modified. If your CASPT2 results are sensitive to the IP-EA shift, it
implies that CASPT2 is not suitable to your problem. Another two types of shift
(the real or imaginary shift) is not supported.
Generally speaking, NEVPT2 and CASPT2-K are more recommended than CASPT2 since there is no need for IP-EA shift, real or imaginary shift in NEVPT2 or CASPT2-K methods.
4.4.16 NEVPT2_prog
Specify the program for performing NEVPT2 calculation, e.g. NEVPT2_prog=PySCF
. Currently supported programs are PySCF(default), Molpro, ORCA, OpenMolcas and BDF. All core orbitals are not frozen.
Note that there exist at least two variants of the NEVPT2: SC-NEVPT2 and FIC-NEVPT2
(aka PC-NEVPT2). By default, for NEVPT2_prog=PySCF/Molpro/ORCA/OpenMolcas, SC-NEVPT2
is chosen; while for NEVPT2_prog=BDF
, FIC-NEVPT2 is chosen. To turn on the FIC-NEVPT2
when using PySCF/Molpro/ORCA/OpenMolcas, please read FIC. Also
note that
(1) When you specify NEVPT2_prog=Molpro
or OpenMolcas, both of the SC-NEVPT2 and
FIC-NEVPT2 energies are actually printed in Molpro/OpenMolcas output (.out file).
You can open the output file and read it manually, if you need that energy.
(2) If you specify NEVPT2_prog=OpenMolcas
, it actually turns into a DMRG-NEVPT2
computation, no matter how large/small the size of active space is. In this special
case, you need to install the QCMaquis package (interfaced with OpenMolcas) for DMRG
computations.
4.4.17 MRCISD_prog
Specify the program for performing MRCISD calculation. By default, MRCISD_prog=OpenMolcas
. You MUST also specify a contraction type, please read CtrType carefully. Currently, automr
supports the interfaces of three MRCISD variants:
(1) uncontracted MRCISD (uc-MRCISD)
(2) internally contracted MRCISD (ic-MRCISD)
(3) fully internally contracted MRCISD (FIC-MRCISD)
where the computational cost and accuracy is (1)>(2)>(3). The ic- and FIC-MRCISD are both approximations of uncontracted MRCISD. If the Davidson size-consistency correction energy is added, then the method should be denoted as MRCISD+Q. It is recommended to use ic-MRCISD+Q or FIC-MRCISD+Q in practical calculations since the uncontracted MRCISD is usually too expensive.
automr
is able to call OpenMolcas/Molpro/ORCA/Gaussian/GAMESS/PSI4/Dalton programs to perform MRCISD. Currently all core orbitals are not frozen. automr
is also able to call PySCF+Block2 to perform the DMRG-FIC-MRCISD calculation.
If MRCISD_prog=OpenMolcas
, only variants (1) and (2) are supported. The Davidson correction can be provided for both methods.
If MRCISD_prog=ORCA
, only (1) and (3) are supported. The Davidson correction can be provided for both methods. However, only spherical harmonic functions of basis sets are supported in ORCA. But this is often not a problem, since it is recommended to use spherical harmonic functions than Cartesian functions.
Note that you should use ORCA>=5.0.0 for FIC-MRCISD+Q computations since older versions have a tiny bug in the Davidson correction.
If MRCISD_prog=Gaussian
or Dalton, only (1) is supported and no Davidson correction is given.
If MRCISD_prog=GAMESS
or PSI4, only (1) is supported. The Davidson correction energy will also be printed.
If MRCISD_prog=Molpro
, only (2) is supported. The Davidson correction energy will also be printed. Note that the MRCIC
program of Molpro will be called to perform ic-MRCISD. This ic-MRCISD is not exactly identical to that of OpenMolcas, so their electronic energies are different with (but close to) each other. If you want to compare (relative) electronic energies of two molecules using ic-MRCISD method, please choose the same type, i.e. both using Molpro or both using OpenMolcas.
4.4.18 MRCISDT_prog
Specify the program for performing an MRCISDT calculation. MRCISDT_prog
can be OpenMolcas(default), Dalton, PSI4 and GAMESS. No Davidson correction will be provided, and only uncontracted MRCISDT is supported. The Davidson size-consistency correction is not supported.
4.4.19 MRMP2_prog
Specify the program for performing an MRMP2 calculation. Only GAMESS is supported and this is the default.
4.4.20 MCPDFT_prog
Specify the program for performing a MC-PDFT calculation. Supported programs are OpenMolcas(default)/PySCF/GAMESS. If you use MCPDFT_prog=PySCF
, you need to install pyscf-forge.
Note that if the active space is larger than (15,15), the MC-PDFT will be automatically switched to DMRG-PDFT. In this special case you need to install the QCMaquis package if you use the default MCPDFT_prog
, i.e. OpenMolcas. If you use MCPDFT_prog=PySCF
in this case, you need to install Block. DMRG-PDFT is not supported in GAMESS currently.
Also note that in GAMESS, the MC-PDFT is only supported for version >= 2019(R2), and currently it can only be run in serial.
4.4.21 CtrType
Specify the contraction type of the MRCISD method. The default value is 0. When you specify the MRCISD method and the MRCISD_prog
, you must assign an integer for this variable, where
1 for uc-MRCISD
2 for ic-MRCISD
3 for FIC-MRCISD.
Generally, the ic- and FIC-MRCISD methods are recommended. See here for more explanations of MRCISD computations. If your calculation involves Cartesian-type basis functions, then you cannot use FIC-MRCISD in ORCA. In such case, you can choose ic-MRCISD in OpenMolcas instead.
4.4.22 MaxM
Specify the bond dimension MaxM in DMRG-related calculations. The default values is 1000 (e.g. MaxM=1000
). When maxM increases, the DMRG-CASCI energy will become closer to the CASCI energy, but the computational cost increases as well. The value 1000 is suitable for common cases. But do check whether it is valid for your system. For example, three computations using different MaxM (e.g. 500, 1000, 1500) may be conducted to study whether the energy converges with MaxM.
4.4.23 hardwfn
This option can only be applied to CASCI/CASSCF calculations using PySCF, OpenMolcas,
GAMESS or PSI4. By specifying hardwfn
, automr
will add extra keywords into the
CAS input files to ensure a better convergence. Note that normally you do not need
this keyword, and it is useless if you specify other programs as the CAS solver.
4.4.24 crazywfn
This option can only be applied to CASCI/CASSCF calculations using PySCF, OpenMolcas,
GAMESS or PSI4. By specifying crazywfn
, automr
will add extra keywords (more than
those of hardwfn
) into the CAS input files to ensure a better convergence. Note that
usually you do not need this keyword, and it is useless if you specify other programs
as the CAS solver.
For example, when the N2 molecule is stretched to d(N-N) = 4.0 Å, this is
a system which features strong correlation and requires a CAS(6,6) active space.
The Davidson iterative diagonalization in determinant CASCI (using GAMESS) may
not find the singlet state in the lowest 5 states. In this case, specifying crazywfn
will increase the NSTATE to 10, so that the singlet state can be found.
4.4.25 charge
This keyword has identical meaning with the same keyword in Gaussian software, i.e. including background point charges in calculations. This keyword is supported for almost all methods in automr
. Methods which are incompatible with background point charges will signal errors immediately. The charge-charge and charge-nuclei interaction energies are both included in all electronic energies printed (UHF, GVB, CASSCF, NEVPT2, etc).
The including of background point charges is useful for QM/MM calculations or fragmentation-based linear scaling methods (like GEBF, Many-body expansion, etc).
Note: please write this keyword in mokit{}
. DO NOT WRITE this keyword in the Route
Section of .gjf file (i.e. '#p ...' line).
4.4.26 OtPDF
The choice of the on-top pair density functional. This keyword has identical meaning with the keyword KSDFT in (Open)Molcas software. Currently available functionals are tPBE(default), tBLYP, tLSDA, trevPBE, tOPBE, ftPBE, ftBLYP, ftLSDA, ftrevPBE and ftOPBE. For more details please refer to the Molcas manual. Note that the available functionals depends on your version of OpenMolcas or GAMESS. Old versions may not support some of the functionals.
Note that for Openmolcas >= v22.02, the on-top pair density functional keywords in .input file of OpenMolcas have been changed to T:PBE, FT:PBE, etc. The user need not worry about this problem in MOKIT, since automr
will automatically detect the version of OpenMolcas and change the keyword tPBE into T:PBE if needed.
Note that in GAMESS, the MC-PDFT is only supported for version >= 2019(R2).
4.4.27 DKH2
Request the 2nd order scalar relativistic Douglas–Kroll–Hess (DKH2) correction to the one-electron Hamiltonian. Note that: (1) The two keywords DKH2 and X2C are mutually exclusive, i.e. you can only write one of them. (2) You should use all-electron basis sets like 'cc-pVTZ-DK', 'x2c-TZVPall' or 'ANO-RCC-VDZ' for all-electron relativistic calculations. Pseudopotential should not be used. (3) It is strongly not recommended to use Cartesian-type function of basis set (severe numerical instability observed), please just use the default spherical harmonic functions.
Currently only the point nuclei charge distribution is supported. The DKH0 Hamiltonian is rough and thus not supported. These programs support the DKH2 Hamiltonian: Dalton, Gaussian, GAMESS, OpenMolcas, ORCA, Molpro, PSI4.
4.4.28 X2C
Request the scalar (i.e. spin-free) relativistic X2C (eXact-two-Component) corrections to the one-electron Hamiltonian. Note that: (1) The two keywords DKH2 and X2C are mutually exclusive, i.e. you can only write one of them. (2) You should use all-electron basis sets like 'cc-pVTZ-DK', 'x2c-TZVPall' or 'ANO-RCC-VDZ' for all-electron relativistic calculations. Pseudopotential should not be used. (3) It is strongly not recommended to use Cartesian-type function of basis set (severe numerical instability observed), please just use the default spherical harmonic functions.
Also note that by default, the RHF/UHF is performed using Gaussian called by automr
, and GVB is performed using GAMESS called by automr
. When you specify mokit{X2C}
, the HF_prog
will be switched to PySCF automatically. Since GAMESS does not support X2C currently, in this step the X2C will be replaced by DKH2. According to the limited tests of the developer jxzou, MOs resulting from DKH2 and X2C are similar and often converge in few cycles.
Currently only the point nuclei charge distribution is supported. These programs support the X2C Hamiltonian: BDF, OpenMolcas, Molpro, PSI4, PySCF. X2C will be supported in ORCA in the near future.
4.4.29 RI
Request to turn on the RI-JK approximation for two-electron integrals in CASSCF. Default is off. Please just write mokit{RI}
, do not write 'mokit{RI=True}' or 'mokit{RI on}'. The other two types of RI approximations RI-J and RIJCOSX are not supported.
This option currently can only be used in CASSCF computations conducted by PySCF, OpenMolcas, Molpro, ORCA, or PSI4 programs. To learn more about the auxiliary basis set used in RI-JK approximation, see the following section.
4.4.30 RIJK_bas
Specify an auxiliary basis set for RI-JK approximation in CASSCF computations conducted by ORCA. Usually you do not need to specify this, since the automr program will automatically assign a proper auxiliary basis set according to the basis set (e.g. def2/JK for def2TZVP, cc-pVTZ/JK for cc-pVTZ). You can simply open the output file of automr and see what auxiliary basis set is assigned.
In the current version of OpenMolcas, there is no auxiliary basis set in it. The
automr
program in MOKIT will automatically transformed the needed basis set file
and put that into $MOLCAS/basis_library/jk_Basis/
.
4.4.31 F12
Request to turn on the F12 technique in NEVPT2 computations conducted by ORCA. F12 is not used by default. But if you turn on F12, RI will be turned on as a byproduct.
This option currently can only be used in CASSCF and CASSCF-NEVPT2 computations conducted by ORCA program, i.e. you need to specify CASSCF_prog=ORCA,NEVPT2_prog=ORCA,F12
in mokit{}. To learn more about the near-complete auxiliary basis set used in F12 technique, see the following section.
4.4.32 F12_cabs
Specify a near-complete auxiliary basis set for the F12 technique in NEVPT2 computations conducted by ORCA. Usually you do not need to specify this, since the automr program will automatically assign a proper auxiliary basis set according to the basis set (e.g. cc-pVTZ-F12-CABS for cc-pVTZ-F12). You can simply open the output file of automr and see what CABS is assigned.
4.4.33 DLPNO
Request to turn on the DLPNO technique in NEVPT2 computations conducted by ORCA. DLPNO is not used by default. But if you turn on DLPNO, RI (see 4.4.28) and FIC (see 4.4.33) will be turned on as byproducts.
This option currently can only be used in CASSCF and CASSCF-NEVPT2 computations conducted by ORCA program, i.e. you need to specify CASSCF_prog=ORCA,NEVPT2_prog=ORCA,DLPNO
in mokit{}. Of course it can be combined with F12 to perform RI-DLPNO-FIC-NEVPT2-F12 computations for large systems, where the keywords should be mokit{CASSCF_prog=ORCA,NEVPT2_prog=ORCA,DLPNO,F12}
.
4.4.34 FIC
Request the FIC- variant of NEVPT2 (i.e. FIC-NEVPT2) to be used. By default SC-
NEVPT2 is invoked if you specify NEVPT2 in route section #p NEVPT2/...
and use
PySCF/Molpro/OpenMolcas/ORCA program as NEVPT2_prog
. But if you specify NEVPT2_prog=BDF
,
this option is turned on as a byproduct and FIC-NEVPT2 will then be performed.
SC-NEVPT2 is not supported in BDF. FIC-NEVPT2 is not supported in PySCF.
4.4.35 ON_thres
When ist=5, this parameter is the threshold of natural orbital occupation numbers (NOON) for determining the number of active orbitals. Default value is 0.02, which means orbital occupation numbers 0.02~1.98 will be considered as active orbitals in subsequent CAS/DMRG calculations.
Note that when using ist=5, you should provide a .fch(k) file which contains paired natural orbitals, for example, UNO, UDFT NO, SUHF NO, etc. This is to ensure occupation numbers occur in a pairwise way (1-x, 1+x). Better not modify the default value unless you are an experienced user.
4.4.36 UNO_thres
When ist=1 or 2, this parameter is the threshold of UNO occupation numbers for determining the number of pairs in GVB computation. The default value is 0.00001 and it means all unoccupied UNOs with occupation numbers >1e-5 will be chosen (as well as their corresponding occupied UNOs) for subsequent orbital localization. The default value often corresponds to a full-valence computation of GVB. Similarly, setting UNO_thres=0.02 corresponds to the fact that any unoccupied UNOs with occupation numbers >0.02 will be chosen.
DO NOT modify the default value unless you are an experienced user.
4.4.37 excludeXH
Request the exclusion of inactive X-H bonds after normal GVB computation finished. For example, a normal GVB computation of the benzene molecule using cc-pVDZ basis set will lead to 15 pairs in total, which contains 9 pairs of C-C bonds and 6 pairs of C-H bonds. If the keyword excludeXH
is specified in mokit{}
, then a GVB(9) computation containing only C-C bonds will be automatically performed after GVB(15).
4.4.38 NMR
Request the calculation of nuclear shielding constants. Currently only the CASSF method is supported. Gauge-Independent Atomic Orbital (GIAO) method is used to compute the NMR shielding tensors. Note that:
(1) This keyword should be written in mokit{}, not in the Gaussian keyword line #p ...
.
(2) The chemical shift of an atom or element is the difference of nuclear shielding constants between the studied molecule and the standard reference molecule. For example, hydrogen atoms in tetramethylsilane(TMS) is usually used as the reference for chemical shifts in 1H-NMR.
(3) For practical computations of chemical shifts, it is recommended to use specially designed basis sets like pcSseg-1 or pcSseg-2. Larger basis sets like pcSseg-3, def2-QZVP or cc-pVQZ would be better but often too expensive.
(4) If you want to calculate NICS, please read 5.4.3 NICS of cyclobutadiene. If you want ICSS, please read 5.4.1 ICSS of the ground state of cyclobutadiene.
(5) You should use the OpenMP version of Dalton for this functionality, do not use MPI-parallelized version of Dalton.
(6) This functionality cannot be used for active space >(14,14), since there is no DMRG-GIAO implementation.
(7) MOKIT assumes the user uses 32-bit integer Dalton. The maximum memory allowed for 32-bit integer Dalton is around 16GB. So MOKIT will automatically reduce the memory to 16GB if the user specify >16GB in a NMR job.
4.4.39 ICSS
Request the calculation ICSS (Isochemical Shielding Surfaces). Currently only the
CASSF method is supported. Gauge-Independent Atomic Orbital (GIAO) method is used
to compute the NMR shieldings. See the example $MOKIT_ROOT/examples/automr/16-C4H4.gjf
.
Note that:
(1) This keyword should be written in mokit{}
, not in the Gaussian keyword line (#p ...).
(2) This is an extremely time-consuming job. So you'd better have a large node/machine to compute all the generated files. 6-31G(d) or 6-31+G(d) basis set is recommended since larger basis set requires very long time.
(3) After the ICSS computation is accomplished, there would be a file like *_ICSS.cub
generated. You can visualized it via GaussView, Multiwfn or VMD.
(4) You should use the OpenMP version of Dalton for this functionality, do not use MPI-parallelized version of Dalton.
(5) This functionality cannot be used for active space >(14,14), since there is no DMRG-GIAO implementation.
(6) MOKIT will submit 2, 3 or 4 Dalton jobs in parallel to accelerate the ICSS computation. The number of Dalton job is determined as three cases: (i) if %nproc is multiples of 4, then 4 jobs will be submitted; (ii) otherwise if %nproc is multiples of 3, then 3 jobs will be submitted; (iii) otherwise 2 jobs will be submitted. In these cases, the allowed maximum total memory is 64GB, 48GB and 32GB, respectively. This is because MOKIT assumes the user uses 32-bit integer of Dalton, which can only utilize up to 16GB. See a detailed example in 5.4 ICSS and NICS.
4.4.40 Nstates
Specify the number of roots to be averaged in SA-CASSCF computations. For example, Nstates=2 stands for 3 electronic states (ground state + two excited states). The default is to average states with the same spin. If you want to average S0/T1, you should also write the keyword Mixed_Spin.
4.4.41 Mixed_Spin
Specifying this keyword means that allowing electronic states with different spin multiplicities to be averaged in SA-CASSCF. If you want to write this keyword, just write Mixed_Spin. Do not write Mixed_Spin=.True. or Mixed_Spin=True. If this keyword is not specified, the default setting is to average states with the same spin.
4.4.42 Root
Specify the root which you are interested in State-Specific CASSCF (SS-CASSCF) calculations. Default value is 0 (ground state). Root=1
stands for the first excited state. For example, if the ground state is S0, then Root=1
stands for the S1 state. Note that this keyword is mutually exclusive to the keyword Nstates
in Nstates, since the latter one is used for SA-CASSCF. Currently dynamic correlation based on SS-CASSCF is not supported. The SS-CASSCF calculation is performed after the ground state CASSCF calculation.
4.4.43 GVB_conv
Modify/Set the density matrix convergence criterion in GAMESS GVB to be a desired threshold. The default threshold is different for two cases:
(1) 5D-4 (i.e. 0.0005 a.u.) for CASSCF and post-CASSCF calculations;
(2) 1D-5 for GVB and GVB-BCCC calculations.
Usually there is no need to modify the default value. But if you want to use a less tight threshold, 1D-4 ~ 5D-4 is recommended, e.g. GVB_conv=5D-4
. Note that only 4 characters are allowed for this parameter. Do not write 5.0D-4 since it exceeds the length limit. This keyword is equivalent to the keyword CONV in GAMESS (please read the documentation file docs-input.txt in GAMESS package if you want know more details).
There are two possible cases in which you may want to change the default GVB convergence threshold:
(1) When dealing with molecules which have many conjugated pi bonds (e.g. acene, zethrene), although the default orbital localization method Pipek-Mezey provides \( \sigma \) - \( \pi \) separated initial guess orbitals for GVB computation, the converged GVB orbitals may still be \( \sigma \) - \( \pi \) mixed. In that case you can specify keywords mokit{LocalM=Boys,GVB_conv=5d-4}
and specify exactly the number of \( \pi \) bonds n in Route Section as GVB(n) (this is just a combination of keywords which have been explored by the developer jxzou and found often useful). One may wonder why the Boys localization method is used here. This is because we have explicitly specified GVB(n), the n pairs of UNOs near HONO are usually pure pi orbitals and it is safe to use Boys localization among \( \pi \) orbitals.
(2) When dealing with d or f transition metal (e.g. Fe) molecules, the GVB orbital optimization often takes many cycles to converge or even diverge in the end, but the first ~30 cycles are often reasonable, so we can use a less tight threshold to converge the GVB wavefuntion.
4.4.44 Skip_UNO
Specify the number of pairs of UNOs to be skipped during orbital localization. Default is 0. For example, Skip_UNO=1
means that the HONO and LUNO will be kept unchanged when localizing UNOs. And Skip_UNO=2
means that the HONO-1, HONO, LUNO and LUNO+1 will be kept unchanged when localizing UNOs. This is useful when GVB exists multiple SCF solutions. Using Skip_UNO=1
you can probably obtain a biradical-like GVB solution (if the molecule indeed has significant biradical characters). It is recommended to choose the solution with the lowest GVB electronic energies for subsequent post-GVB
computations.
This keyword is invalid for keyword ist=3,4,5. It is also invalid when mokit{ist=6}
is specified. But it is valid for keywords mokit{ist=6,inherit}
since the keyword inherit
will force skip_UNO=N to be inherited in the GVB/STO-6G computation.
4.4.45 Inherit
Request to inherit keywords and the number of GVB pairs (if explicitly specified) in GVB/STO-6G calculation from the target calculation. This keyword can only be used when ist=6. Default is not to inherit keywords. If you want to write this keyword, just write Inherit. Do not write Inherit=.True. or Inherit=True.
4.4.46 Npair
Specify the number of GVB pairs in a non-GVB (e.g. CASSCF) calculation. For example, the following input file will lead a GVB(24) -> CASSCF(10,10) calculation, where 24 means 16 C-C bonds and 8 C-H bonds:
%mem=48GB
%nprocshared=48
#p CASSCF/cc-pVDZ
mokit{ist=1,readuhf='naphthalene_cc-pVDZ_uhf.fch'}
But if you want to perform a GVB(5) -> CASSCF(10,10) calculation, which would save some time and obtain the same CASSCF result, you can modify the keywords into
mokit{ist=1,readuhf='naphthalene_cc-pVDZ_uhf.fch',Npair=5}
4.4.47 FcGVB
Request to freeze all doubly occupied orbitals in GVB calculations. Do not write FcGVB=.T., FcGVB=.True., or FcGVB=True. Just specifying FcGVB will work, i.e. mokit{FcGVB,...}
. This keyword is useful for obtaining the GVB solution with pure \( \pi \) orbitals in calculations of non-planar polycyclic hydrocarbons. If you want to calculate the S-T gap, remember to specify FcGVB in both singlet and triplet cases.
By default, for CASCI/CASSCF and post-CAS calculations, FcGVB is automatically enabled in MOKIT. While for GVB and GVB-BCCC calculations, FcGVB is automatically disabled. The keyword NoFcGVB
prevents freezing doubly occupied orbitals.
4.4.48 OnlyXH
Request to keep only X-H bonds after a normal GVB computation finished. For example, a normal GVB computation of the benzene molecule using cc-pVDZ basis set will lead to 15 pairs in total, which contains 9 pairs of C-C bonds and 6 pairs of C-H bonds. If the keyword OnlyXH
is specified in mokit{}
, then a GVB(6) computation containing only C-H bonds will be automatically performed after GVB(15). This keyword can be viewed as an opposite option of excludeXH.
4.4.49 Xmult
Specify the spin multiplicity of the target excited state in a SS-CASSCF calculation. This keyword is usually used along with root
in Section Root. If the spin multiplicity in the input file is 1 (i.e. assuming a singlet ground state), here are some examples of calculating the target excited state:
(1) mokit{root=1}
means the S1 state;
(2) mokit{root=1,Xmult=1}
is identical to (1);
(3) mokit{root=1,Xmult=3}
means the T1 state;
(4) mokit{root=2,Xmult=1}
means the S2 state.
4.4.50 noDMRGNO
Request not to generate natural orbitals in a DMRG-CASCI calculation. Only valid for a DMRG-CASCI job. This keyword would save some time when one wants to check whether the DMRG-CASCI electronic energy converges with maxM. After the user confirms a suitable maxM
, he/she can remove this keyword and perform a single point calculation to generate DMRG NOs.
4.4.51 HFonly
Request the automr
program to terminate after the HF calculations are finished. This is useful when one wants to perform the sfX2C-UHF calculation using fragment guess. An input example is shown below
%mem=200GB
%nprocshared=48
#p GVB/gen guess(fragment=5)
mokit{HF_prog=PySCF,DKH2,HFonly}
...
automr
will firstly call Gaussian to generate the fragment guess, then transfer initial guess orbitals to PySCF and call PySCF to perform the sfX2C-UHF calculation. After UHF is finished, automr
will terminate and no GVB calculation will be performed. Here the keyword DKH2
is for GVB calculation and will not be used in the UHF calculation.
4.4.52 block_mpi
Request the MPI version of Block program to be used in DMRG calculations. By default automr
would call the OpenMP version of the Block2 program since it is faster than MPI version in a single node. Note that this keyword only changes the parallelism of DMRG-CASCI and/or DMRG-CASSCF calculations, but the NEVPT2 calculation after DMRG always utilize MPI parallelism (which requires mpi4py
to be intalled). Usually you do not need this keyword unless you want to perform tests or debug.
4.4.53 LocDocc
Request to perform the orbital localization upon the doubly occupied orbitals in a GVB job. This GVB job can be a target calculation, or just an intermediate step in a CASCI/CASSCF job. Note that this localization does not change the electronic energy of GVB/CASCI/CASSCF. It is just of convenient usage for some users, or for convenience of identifying core orbitals. The localization method is PM by default, and it is controlled by LocalM.
4.5 List of Utilities in MOKIT
The utilities for transferring MOs are summarized in the following figure.
For detailed explanations of all utilities, please read the following subsections. Clicking any utility label (in red color) -- for example, fch2py
-- in the figure will redirect you to the corresponding subsection.
Other useful utilities
Pre-processing | add_bgcharge_to_inp | replace_xyz_in_inp | |
Read/write fch | extract_noon2fch | fch_mo_copy | fch_u2r |
Generating guess | frag_guess_wfn | ||
Math operation | mo_svd | solve_ON_matrix | gvb_sort_pairs |
4.5.1 add_bgcharge_to_inp
This utility is designed to add background charges to the input file of various software packages. If you do not use background charges in your computation, you can skip this section. But if you do use them (e.g. in subsystems of fragmentation-based or embedding methods), they are not recorded in any .fch(k) file. This can be viewed as a defect of the .fch(k) file. Therefore, the generated input file by utilities fch2com
, fch2inp
, fch2iporb
, fch2psi
or bas_fch2py
will contain no background charges either.
To add background charges into the input file, you have to provide a .chg file which contains information of those charges. An example of such file is shown below
2
4.0 0.0 0.0 0.1
-4.0 0.0 0.0 0.1
The first line holds the number of point charges. While the charges are written starting from the second line, with format x, y, z, charge.
Note that in all computations of the automr
program, this situation is explicitly considered. This utility will be automatically called if needed. The only situation when you need this utility is merely using utilities fch2com
, fch2inp
, fch2iporb
, fch2psi
or bas_fch2py
(and of course, with background charges).
4.5.2 addH2singlet
Add hydrogen atoms onto the principle axes of a high-spin open-shell molecule with a distance around 100.0 Å. The number of added atoms \( n_{H} \) = 1, 2, 3, ... for doublet, triplet, quartet, ..., respectively. And the basis set of added hydrogen atoms are automatically set to STO-2G. This utility is designed to transform a high-spin open-shell molecule into a singlet complex, then one can use the singlet GVB-BCCC code to calculate the complex. The electronic energy of the open-shell molecule is obtained by \( E_{\text{target}} \) = \( E_{\text{complex}} \) - \( n_{H}*E_{H} \), where \( E_{H} \) is the ROHF/STO-2G energy of an H atom, i.e. -0.454397402 a.u. An example of using this utility is shown below
addH2singlet benzene_triplet_gvb14_s.fch
4.5.3 bas_fch2py
Generate a PySCF .py file from a Gaussian .fch file. The Cartesian coordinates, basis set data and keywords to read MOs from the .fch file are written in the .py file. Please do not delete these basis set data and do not use PySCF built-in basis set name, since the auto-generated basis set data ensures an exact correspondence of MOs in two programs. Two examples are shown below
(1) transfer RHF/ROHF/UHF/GHF/CAS MOs from Gaussian to PySCF
bas_fch2py h2o.fch
This will generate a Python input script h2o.py
. You can find that the last 3 lines in h2o.py
are commented
#dm = mf.make_rdm1()
#mf.max_cycle = 10
#mf.kernel(dm0=dm)
If you want to perform a HF calculation, remember to uncomment these lines. If you want to directly perform a CASSCF calculation or perform any orbital localization, just ignore or delete these lines.
(2) transfer DFT MOs from Gaussian to PySCF
bas_fch2py h2o.fch -dft
Assuming the file h2o.fch
contains the information of a B3LYP calculation, you can find that the last few lines in h2o.py
are
dm = mf.make_rdm1()
mf = dft.RKS(mol)
mf.xc = 'b3lypg'
mf.grids.atom_grid = (99,590)
mf.verbose = 4
mf.max_cycle = 128
mf.kernel(dm0=dm)
The functional b3lypg
corresponds to B3LYP
in Gaussian, and the grid (99,590)
corresponds to ultrafine
DFT grid of Gaussian. PySCF usually starts an SCF calculation using the density matrix, so mf.make_rdm1()
requires to construct the density matrix using input MOs. And mf.kernel(dm0=dm)
means that the constructed density matrix would be used as the initial guess.
This utility is in fact a wrapper of two utilities fch2inp
and bas_gms2py
. So if you only want to use this utility, you need to make sure that fch2inp
and bas_gms2py
are available (i.e. also be compiled).
Note that if you use background charges in your studied system, the background charges are not recorded in the .fch(k) file, which might be viewed as a shortcoming of .fch file. So there would be no background charges in the generated .py file. To add background charges, you need to use the utility add_bgcharge_to_inp.
See relevant Python modules fch2py, py2fch, and fchk.
4.5.4 bas_gau2molcas
Transform a basis set file in Gaussian format to another in (Open)Molcas format. If you turn on RI and use OpenMolcas as the CASSCF_prog
, there is no RI-JI auxiliary basis set file in current version of OpenMolcas package. Therefore, this utility will be called automatically to transform the auxiliary basis set file in $MOKIT_ROOT/mokit/basis/
directory to the (Open)Molcas syntax. And the transformed file will normally be in $MOLCAS/basis_library/jk_Basis/
.
You can, of course, use this utility by yourself. An example is shown below
bas_gau2molcas def2-universal-jkfit
Assuming the basis set file def2-universal-jkfit (in Gaussian format) has been put in the current directory, this command will generate a basis set file named DEF2-UNIVERSAL-JKFIT (in (Open)Molcas format).
4.5.5 bas_gms2bdf
Generate two BDF files (_bdf.inp
and .BAS) from a GAMESS .inp/.dat file. Cartesian coordinates are written in the _bdf.inp
file, while the basis set data is held in .BAS file. Note that BDF does not support Cartesian-type basis functions, so only spherical harmonic functions will be used. The 'ISPHER' keyword in .inp/.dat file (if any) will be ignored. One example is shown below
bas_gms2bdf a.inp
Generate a_bdf.inp and A.BAS files, for R(O)HF or UHF type wave function.
4.5.6 bas_gms2dal
Generate Dalton .dal and .mol files from a GAMESS .inp/.dat file. The Cartesian coordinates and basis set data are written in the .mol file. Two examples are shown and explained below
(1) bas_gms2dal a.inp
Generate .mol and .dal file, in which the keywords 'Cartesian' for all atom are written (in order to use pure Cartesian type of basis functions)
(2) bas_gms2dal a.inp -sph
Generate .,mol and .dal file, in which the keywords 'Cartesian' of each atom are not written (in order to use pure spherical harmonic type of basis functions).
4.5.7 bas_gms2molcas
Generate an (Open)Molcas .input file from a GAMESS .inp/.dat file. The Cartesian coordinates and basis set data are written in the .input file. Two examples are shown and explained below
(1) bas_gms2molcas a.inp
Generate .inp file, in which the keywords 'Cartesian' of each atom are written (in order to use pure Cartesian type of basis functions)
(2) bas_gms2molcas a.inp -sph
Generate .inp file, in which the keywords 'Cartesian' of each atom are not written (in order to use pure spherical harmonic type of basis functions).
4.5.8 bas_gms2molpro
Generate a Molpro .com file from a GAMESS .inp/.dat file. The Cartesian coordinates and basis set data are written in the .com file. Two examples are shown and explained below
(1) bas_gms2molpro h2o.inp
Generate h2o.com
, in which the keyword 'Cartesian' is written (in order to use pure Cartesian type of basis functions)
(2) bas_gms2molpro h2o.inp -sph
Generate h2o.com
, in which the keyword 'Cartesian' is not written (in order to use pure spherical harmonic type of basis functions).
(3) bas_gms2molpro h2o.inp -sph -m15
Generate h2o.com
for Molpro 2015. Only useful when you are using Molpro 2015 and dealing with ROHF/UHF high-spin wave functions.
4.5.9 bas_gms2psi
Generate a PSI4 _psi.inp
file from a GAMESS .inp/.dat file. The Cartesian coordinates and basis set data are written in the _psi.inp
file.
Two examples are shown and explained below
(1) bas_gms2psi a.inp
Generate a_psi.inp file, in which the keyword 'cartesian' is written (in order to use pure Cartesian type of basis functions)
(2) bas_gms2psi a.inp -sph
Generate a_psi.inp file, in which the keyword 'spherical' is written (in order to use pure spherical harmonic type of basis functions).
4.5.10 bas_gms2py
Generate a PySCF .py file from a GAMESS .inp/.dat file. The Cartesian coordinates and basis set data are written in the .py file. Two examples are shown and explained below
(1) bas_gms2py a.inp
Generate .inp file, in which the keyword 'mol.cart = True' is written (in order to use pure Cartesian type of basis functions)
(2) bas_gms2py a.inp -sph
Generate .inp file, in which the keyword 'mol.cart = True' is not activated (in order to use pure spherical harmonic type of basis functions).
4.5.11 bdf2fch
Transfer MOs from BDF (.scforb/.inporb/.casorb, etc) to Gaussian .fch file. Two examples are shown and explained below
(1) bdf2fch a.scforb a.fch
This is used for transferring R(O)HF, UHF or CASSCF orbitals.
(2) bdf2fch a.casorb a.fch -no
This is used for transferring CASCI or CASSCF NOs and NOONs.
bdf2fch
cannot generate a fch file from scratch, and a fch file must be provided and the MOs in it will be replaced. You should firstly use Gaussian to generate a .fch file (with keywords 'nosymm int=nobasistransform'), then generate the *_bdf.inp
file using fch2bdf
. After BDF computations finished, you can transfer MOs from .scforb/.casorb back to .fch file using bdf2fch
. This procedure seems a little bit tedious, but it ensures an exact reproduce of energy.
This utility supports only spherical harmonic functions. To transfer MOs from Gaussian to BDF, see fch2bdf.
4.5.12 bdf2mkl
Transfer MOs from BDF (.scforb/.inporb/.casorb, etc) to ORCA. The ORCA .inp and .mkl file will be generated. This utility is actually a wrapper of two utilities bdf2fch
and fch2mkl
. Thus bdf2mkl
cannot generate a .fch file from scratch, either. The user must provide a .fch(k) file, and MOs in that file will be replaced. Two examples are shown and explained below
(1) bdf2mkl a.scforb a.fch
This is used for transferring RHF, ROHF, UHF or CASSCF orbitals.
(2) bdf2mkl a.casorb a.fch -no
This is used for transferring CASCI or CASSCF NOs and NOONs.
NOTE: it is recommended to firstly use Gaussian to generate a .fch file (with keywords 'nosymm int=nobasistransform'), then generate the _bdf.inp
file using fch2bdf
. After BDF computations finished, you can transfer MOs from .scforb/.casorb of BDF to ORCA. This procedure seems a little bit tedious, but it ensures an exact reproduce of energy.
4.5.13 chk2gbw
Convert one (or more) .chk file(s) into .gbw file(s). The Gaussian utility formchk and ORCA utility orca_2mkl will be called automatically. Multiple chk files are supported. For example,
(1) chk2gbw a.chk
Convert a.chk to a.gbw.
(2) chk2gbw *.chk
Convert all chk files in the current directory into corresponding gbw files.
4.5.14 dal2fch
Transfer MOs from Dalton (i.e. DALTON.MOPUN file) to Gaussian .fch(k) file. Two examples are shown and explained below
(1) dal2fch a.dat a.fch
This is used for transferring R(O)HF or CASSCF orbitals.
(2) dal2fch a.dat a.fch -no
This is used for transferring CASCI/CASSCF natural orbitals and corresponding natural orbital occupation numbers. To transfer MOs from Gaussian to Dalton, see fch2dal.
NOTE: dal2fch
cannot generate a fch file from scratch, and a fch file must be provided and the MOs in it will be replaced. You should firstly use Gaussian to generate a .fch file (with keywords 'nosymm int=nobasistransform'), then generate the *.dal
and *.mol
files using fch2dal
. After Dalton computations finished, you can transfer MOs from Dalton back to Gaussian using dal2fch
. This procedure seems a little bit tedious, but it ensures an exact reproduce of energy.
4.5.15 dat2fch
Transfer MOs from GAMESS/Firefly (.inp/.dat file) to Gaussian .fch file. Five examples are shown and explained below
(1) dat2fch a.dat
This is used for transferring R(O)HF, UHF or CASSCF orbitals. If the file a.fch
exists, it will be used and MOs are written into that file. If a.fch
does not exist, the utility dat2fch
would try to create one such file from scratch. And this requires that the 1st line contains a $CONTRL
section in file a.dat
, so that charge and spin multiplicity can be read, e.g.
$CONTRL SCFTYP=ROHF RUNTYP=ENERGY ICHARG=0 MULT=2 ISPHER=1 $END
Usually we have the corresponding .fch file in hand. Creating a .fch file from scratch is only recommended for users who cannot use the Gaussian software, and he/she may have performed the calculation using user-defined basis set in GAMESS/Firefly.
(2) dat2fch a.dat b.fch
This is the same as (1), but transferring MOs to the specified file b.fch
.
(3) dat2fch a.dat a.fch -gvb 5
This is used for transferring GVB orbitals for spin singlet molecule. The order of GVB orbitals is different between Gaussian and GAMESS. Thus you must specify -gvb [npair]
to tell the utility the number of GVB pairs, so that dat2fch can adjust the order of MOs.
(4) dat2fch a.dat a.fch -gvb 5 -open 1
This is used for transferring GVB orbitals for non-singlet molecule. In this way, you tell the utility the number of GVB pairs and singly-occupied orbitals, so that dat2fch can adjust the order of MOs.
(5) dat2fch a.dat a.fch -no 10
This is used for transferring natural orbitals (NOs) of CASCI/CASSCF. In this way, you tell dat2fch
the number of NOs such that it would work correctly.
This utility supports two types of basis functions: (1) pure spherical harmonic functions; (2) pure Cartesian functions. But for creating a .fch file from scratch, only the former is supported currently. To transfer MOs from Gaussian to GAMESS/Firefly, see fch2inp.
4.5.16 extract_noon2fch
Extract natural orbital occupation numbers (NOONs) from the following types of files
(1) .out file of PySCF
(2) .dat file of GAMESS
(3) .gms file of GAMESS
(4) .out file of ORCA
(5) .out file of PSI4
and write NOONs (as the 'Alpha Orbital Energies' section) into a given .fch file. This is for the convenience of visualizing orbitals using GaussView or Multiwfn+VMD.
4.5.17 fch2amo
Generate Amesp files (.aip and .amo) from a Gaussian .fch(k) file. Cartesian coordinates are written in the .aip file, while the basis set data is written both in the .aip and .amo files (in different formats). Both spherical harmonic type and Cartesian type basis functions are supported. An example is shown below
fch2amo h2o.fch
a2m h2o.amo
The first line will generate h2o.aip
and h2o.amo
files. h2o.aip
is the input file of AMESP, and h2o.amo
is the wave function file. The second line converts the ASCII text file h2o.amo
into a binary file h2o.mo
. This procedure is very similar to the file operations in Gaussian, where the correspondence is
h2o.gjf <-> h2o.aip
h2o.fch <-> h2o.amo
h2o.chk <-> h2o.mo
formchk <-> m2a
unfchk <-> a2m
To transfer MOs from Amesp back to Gaussian, you can use the amo2fch
utility.
4.5.18 fch2bdf
Generate three BDF files (_bdf.inp
, .BAS, .scforb/.inporb) from a Gaussian .fch(k) file. Cartesian coordinates are written in the _bdf.inp
file, while the basis set data is held in .BAS file. The molecular orbitals are written in the .scforb/.inporb file. Note that BDF does not support Cartesian-type basis functions, so only spherical harmonic functions will be used. If there exists any Cartesian-type basis function in .fch(k) file, this utility will signal errors. Two examples are shown and explained below
(1) fch2bdf a.fch
This is used for transferring RHF/ROHF/UHF orbitals. Three files will be generated: a_bdf.inp, A.BAS and a.scforb. The data in 'Alpha Orbital Energies' and 'Beta Orbital Energies' sections in .fch file will be read and printed into the a.scforb file.
(2) fch2bdf a.fch -no
This is used for transferring NOs. Three files will be generated: a_bdf.inp, A.BAS and a.inporb. Note the data of 'Alpha Orbital Energies' section in .fch file (assumed occupation numbers) will be read and printed into the a.inporb file, but the occupation numbers do not affect subsequent computations.
This utility will call another two utilities fch2inp
and bas_gms2bdf
. So if you want to compile fch2bdf, you have to compile fch2inp
and bas_gms2bdf
additionally.
Note that to transfer HF orbitals, the data in 'Alpha Orbital Energies' and 'Beta Orbital Energies' section should be genuine orbital energies, since BDF program will use these values. Random values (like zero) will affect SCF computations and thus it cannot converge in 1 cycle (or even fails to converge). This is totally different with other quantum chemistry software packages where only orbitals are useful and orbital energies are useless. When transferring NOs, the occupation numbers in 'Alpha Orbital Energies' section do not affect subsequent computations.
To transfer MOs from BDF back to Gaussian, see bdf2fch.
4.5.19 fch2cfour
Generate CFOUR files (ZMAT, OLDMOS, GENBAS and ECPDATA, if ECP is used). One example is shown below
fch2cfour h2o.fch
This is used for transferring R(O)HF or UHF orbitals. One should first read the Cartesian coordinates from the CFOUR output and write a .gjf file to perform the single point calculation. Then use fch2cfour
to generate CFOUR files, otherwise the molecular orientations of two programs will not be the same. Please use CFOUR v2.1 (or higher), since older versions of CFOUR (like v1.0 or v2.00beta) could not correctly read MOs from the file OLDMOS.
4.5.20 fch2com
Generate a Molpro .com file from a Gaussian .fch(k) file, with alpha MOs written in a *.a
file. Two examples are shown below
(1) general use
fch2com h2o.fch
Files h2o.com
and h2o.a
would be generated. If UHF-type wave function is involved, h2o.b
woud also be generated. This is used for transferring R(O)HF, UHF or CASSCF orbitals.
(2) a special case
fch2com h2o.fch -m15
This is only useful when you are using Molpro 2015 and dealing with ROHF/UHF non-singlet wave function. Please do not use it for general purpose.
This utility will call another two utilities fch2inp
and bas_gms2molpro
. So if you want to compile fch2com, you have to compile fch2inp
and bas_gms2molpro
additionally.
Note that in Windows OS, any file with .com suffix/extension may be automatically associated with system, in which case double click of the mouse to open this file does not work. You have to right click on the .com file and choose 'open with'. You can modify the suffix/extension to .inp if you do not like that.
Note that if you use background charges in your studied system, the background charges are not recorded in the .fch(k) file. So there are no background charges in the generated .com file, either. To add background charges, you need to use the utility add_bgcharge_to_inp. To transfer MOs from Molpro back to Gaussian, see xml2fch.
4.5.21 fch2dal
Generate Dalton .dal and .mol files from a Gaussian .fch(k) file, where MOs are written in .dal file, and the Cartesian coordinates as well as basis set data are written in .mol file. One example is shown below
fch2dal a.fch
This is used for transferring R(O)HF or CASSCF orbitals. Note that there is no UHF method (or any methods based on UHF), thus you cannot use a .fch file containing UHF orbitals to transfer orbitals. To transfer MOs from Dalton back to Gaussian, see dal2fch.
4.5.22 fch2inp
Generate a GAMESS .inp file from a Gaussian .fch(k) file, with Cartesian coordinates, basis set data and MOs written in. The keywords in .inp file is already suitable for common simple calculations, but do check or modify it if you have additional requirements.
Note that due to the different types of MOs (R(O)HF, UHF, GVB, and CASSCF orbitals), the fch2inp offers different options. Five examples are shown and explained below
(1) fch2inp a.fch
This is used for transferring R(O)HF, UHF or CASSCF orbitals.
(2) fch2inp a.fch -gvb 5
This is used for transferring GVB orbitals for spin singlet molecule. The order of GVB orbitals is different between Gaussian and GAMESS. Thus you must specify -gvb [npair]
to tell the utility fch2inp the number of GVB pairs, so that fch2inp can adjust the order of MOs.
(3) fch2inp a.fch -gvb 5 -open 1
This is used for transferring GVB orbitals for non-singlet molecule. In this way, you tell the utility fch2inp the number of GVB pairs and singly-occupied orbitals, so that fch2inp can adjust the order of MOs.
(4) fch2inp high_spin.fch -sf
This is used to transfer RODFT/UDFT orbitals for the subsequent SF-TDDFT calculation in GAMESS. This spin multiplicity in high_spin.fch is supposed to be >=3, i.e. triplet or higher.
(5) fch2inp triplet.fch -mrsf
This is used to transfer RODFT orbitals for the subsequent MRSF-TDDFT calculation in GAMESS. This spin multiplicity in triplet.fch is supposed to be 3.
This utility supports two types of basis functions: (1) pure spherical harmonic functions; (2) pure Cartesian functions. To transfer MOs from GAMESS back to Gaussian, see dat2fch.
Note that if you use background charges in your studied system, the background charges are not recorded in the .fch(k) file. So there are no background charges in the generated .inp file, either. To add background charges, you need to use the utility add_bgcharge_to_inp.
4.5.23 fch2inporb
Transfer MOs from Gaussian to (Open)Molcas. A .input file and a .INPORB file will be generated. The .input file is the input file of (Open)Molcas, and it contains the geometry, basis set data and keywords. The .INPORB file contains the MOs. Two examples are shown and explained below
(1) fch2inporb a.fch
This is used for transferring R(O)HF, UHF or CASSCF orbitals.
(2) fch2inporb a.fch -no
This is used for transferring NOs.
The option '-no' means reading NOONs and NOs from .fch(k) file and printing them into the .INPORB file. The NOONs written in the .INPORB file does not affect the calculations in OpenMolcas at all. It just make the file appear more readable.
This utility will call another two utilities fch2inp
and bas_gms2molcas
. So
if you want to compile fch2inporb
, you have to compile fch2inp
and bas_gms2molcas
additionally.
Note: if you use background charges in your studied system, the background charges are not recorded in the .fch(k) file. So there are no background charges in the generated .input file, either. To add background charges, you need to use the utility add_bgcharge_to_inp.
Note: if you provide a .fch file generated by G09 and this .fch file is generated with DKH2 Hamiltonian in its corresponding .gjf file, the DKH2 information is not recorded in the .fch file. In such case, if you run fch2inporb xxx.fch, the generated xxx.input file will not include keyword Relativistic = R02O, you need to manually add this keyword into the .input file. This is not a problem for G16 since the Route section is recorded in .fch file for G16 so that it can be recognized/identified by fch2inporb.
To transfer MOs from (Open)Molcas back to Gaussian, see orb2fch.
4.5.24 fch2mkl
Transfer MOs from Gaussian to ORCA. One example is shown below
fch2mkl h2o.fch
This is used for transferring R(O)HF/UHF/CASSCF orbitals. Two files will be generated: h2o_o.inp
and h2o_o.mkl
. The _o
characters are added to avoid file overwritten in case that there is already a xxx.inp file in the current directory (for example, generated by the utility fch2inp
). See more examples below:
(1) Common DFT calculations
fch2mkl h2o.fch -dft wB97M-V
fch2mkl h2o.fch -dft 'b3lyp d3bj'
fch2mkl h2o.fch -dft 'HSE06 D3zero'
If h2o.fch
includes a DFT wave function, you can use the commands shown above to transfer orbitals. The density functional name is supposed to be provided by the user. If there is also dispersion correction to be added, they need to written together in single quotes '' or double quotation marks "".
(2) Double hybrid functional calculations
Assuming you want firstly to perform the SCF part of the double hybrid functional PWPB95 in Gaussian using def2TZVPP, the Gaussian keywords are
#p PW91B95/def2TZVPP nosymm int(nobasistransform) IOp(3/76=1000005000,3/77=0000005000,3/78=0731007310)
And assuming next you want to perform a PWPB95-D3(BJ) job using ORCA, then you can run
fch2mkl h2o.fch -dft 'PWPB95 D3BJ'
There is no need to write the orbital basis set def2-TZVPP. The keywords in the generated file h2o_o.inp
would be
! RKS PWPB95 D3BJ def2-TZVPP/C def2/J TightSCF noTRAH defgrid3
(3) DLPNO double hybrid density functional calculations
Assume that you have performed a wB97X/def2TZVPP calculation using Gaussian, you can run
fch2mkl h2o.fch -dft 'DLPNO-wB97X-2 D3'
to generate the ORCA input file. Keywords are well written in the input file, e.g.
! RKS DLPNO-wB97X-2 D3 def2-TZVPP/C TightPNO def2/J TightSCF noTRAH defgrid3
Usually there is no need to modify the keywords above. But if you use def2QZVPP in Gaussian, you need to modify def2-TZVPP/C to def2-QZVPP/C in the ORCA input file. Similarly, if you have performed a B3LYP/def2TZVPP calculation using Gaussian, you can run
fch2mkl h2o.fch -dft 'DLPNO-B2PLYP D3BJ'
to generate the ORCA input file.
(4) SF-TDDFT calculations
fch2mkl O2_UDFT.fch -sf
Theoretically, the SF-TDDFT method can be either based on ROHF/ROKS, or based on UHF/UDFT reference. But ORCA only accepts the UHF/UDFT reference. So please provide a UHF/UDFT .fch file.
(5) DLPNO-CCSD(T) calculations
Assuming you have perfored an RHF/cc-pVTZ job for H2O in Gaussian, and next you want to perform a DLPNO-CCSD(T)/cc-pVTZ computation in ORCA, then you can open the generated file h2o_o.inp
and modify the keywords as
! RHF TightSCF DLPNO-CCSD(T) TightPNO RIJK cc-pVTZ/JK cc-pVTZ/C
The RHF calculation would be accelerated by the RIJK approximation. If you prefer RIJCOSX than RIJK, you can modify keywords as
! RHF TightSCF DLPNO-CCSD(T) TightPNO RIJCOSX defgrid3 cc-pVTZ/C
Again, there is no need to write the orbital basis set cc-pVTZ
since the detailed basis set data was already written in the file h2o_o.inp
. Of course, you need to modify the parallel and memory settings to suit your computer. If you want more accurate results, you can modify DLPNO-CCSD(T)
to DLPNO-CCSD(T1)
.
To transfer MOs from ORCA to Gaussian, see mkl2fch and mkl2gjf. To transfer MOs from ORCA to other software packages, you can use mkl2amo
,mkl2cfour
,mkl2com
,mkl2dal
,mkl2inp
,mkl2inporb
,mkl2psi
,mkl2py
,mkl2qchem
.
Note 1: There is no need to write IOp(3/32=2)
in Gaussian .gjf file. And it is always not recommended to write this keyword.
Note 2: Assuming your ORCA input file is h2o_o.inp
, you need to run orca_2mkl h2o_o -gbw
to generate the h2o_o.gbw
file since ORCA cannot read h2o_o.mkl
file directly.
Note 3: When calculating relative energies (e.g. the energy barrier), the computation results must share the same theory level. For example, the RIJCOSX approximation is used in all files.
Note that if you use background charges in your studied system, the background charges are not recorded in the .fch(k) file. So there are no background charges in the generated .inp or .mkl file, either. To add background charges, you need to use the utility add_bgcharge_to_inp.
4.5.25 fch2psi
Transfer MOs from Gaussian to PSI4. Two or three files will be generated: *_psi.inp
, *.A
, and *.B
(if Beta MOs exist). The _psi.inp
file of PSI4 holds the Cartesian coordinates, basis set data and keywords. The .A (and .B) file contains the Alpha (Beta) MOs. Two examples are shown below
(1) For R(O)HF, UHF or CASSCF orbitals
fch2psi a.fch
(2) For DFT orbitals
fch2psi a.fch -dft 'wB97M-D3BJ'
This will write keywords about the density functional wB97M-D3BJ into the generated input file a_psi.inp
.
Please note that
(i) There is no need to write IOp(3/32=2)
in Gaussian .gjf file. And it is always not recommended to write this keyword.
(ii) This utility will call another two utilities – fch2inp
and bas_gms2psi
, remember to compile them additionally.
(iii) If you use background point charges in your studied system, the point charges are not recorded in the .fch(k) file. So there is no point charge in the generated .inp file. To add background point charges, you need to use the utility add_bgcharge_to_inp.
PSI4 can generate a .fch(k) file after calculation finished, which is equivalent to transferring MOs from PSI4 back to Gaussian.
4.5.26 fch2qchem
Transfer MOs from Gaussian to Q-Chem. For R(O)HF, UHF and DFT methods, two files will be generated: .in
and 53.0
. The molecular orbitals are stored in the file 53.0
. For GVB, three files will be generated: .in
, 53.0
, and 169.0
. The GVB orbitals are stored both in files 53.0
and 169.0
(the same content). Three examples are shown below
(1) fch2qchem h2o.fch
This is used for transferring R(O)HF, UHF or CASSCF orbitals.
(2) fch2qchem h2o.fch -gvb 2
This is used for transferring GVB orbitals. The orbitals in h2o.fch must be ordered in Gaussian GVB preference (bonding1, bonding2, anti-bonding2, anti-bonding1). And -gvb 2
tells fch2qchem to write GVB-PP related keywords into the .in file.
(3) fch2qchem high_spin.fch -sasf
This is used to transfer RODFT orbitals for the subsequent SA-SF-DFT calculation in Q-Chem. The spin multiplicity in high_spin.fch is supposed to be >=3, i.e. triplet or higher.
After running any of these commands, a Q-Chem input file h2o.in
and a scratch directory h2o
will be generated. The orbital file(s) is put in h2o/
. Run
qchem h2o.in h2o.out h2o
to perform the subsequent Q-Chem job. If the environment variable QCSCRATCH
is already defined in your node/computer, the scratch directory h2o
will be automatically put into $QCSCRATCH/
; otherwise it will be put in the current directory.
You can read Q-Chem Manual for more information.
4.5.27 fch2qm4d
Transfer MOs from Gaussian to QM4D. Input files for QM4D (.xyz, .inp, .xml and basis files for each element) will be generated. One example is shown below
fch2qm4d a.fch
This is used for transferring RHF or UHF orbitals.
Note 1: Currently QM4D supports only Cartesian-type basis sets. Thus this utility will signal error if you provide a .fch(k) file which has spherical harmonic functions (i.e. 5D 7F).
Note 2: QM4D supports ECP, but currently it seems to use spherical harmonic functions in ECP, so even if you specify 6D 10F
in Gaussian, the electronic energies of Gaussian v.s. QM4D will not be equal to each other. This tiny defect does not affect transferring MOs, and SCF can be converged in several cycles with almost no energy change.
Note that if you use background charges in your studied system, the background charges are not recorded in the .fch(k) file. So there are no background charges in the generated .inp file, either. To add background charges, you need to use the utility add_bgcharge_to_inp.
4.5.28 fch2tm
Transfer MOs from Gaussian to Turbomole, i.e. generate files control
and mos
from a .fch(k) file. An example is shown below
fch2tm h2o.fch
Using this utility, you can skip the interactive module define
of Turbomole, and obtain files control
and mos
directly. Default keywords written in control
are for RHF/ROHF/UHF calculations. If you want to perform other calculations, further modification of control
is needed. Currently only spherical harmonic functions are supported for this utility, so please do not use Cartesian-type functions (i.e. 6D 10F
) in the generation of .fch file.
For transferring MOs from Turbomole back to Gaussian, see molden2fch. Using utilities like fch2qchem, fch2inporb, fch2com, etc, you can transfer MOs from Turbomole to many other quantum chemistry programs.
4.5.29 fch2wfn
Generate .wfn file from a specified .fch(k) file. Two examples are shown below
(1) fch2wfn a.fch
For a .fch file in which ground state HF/DFT orbitals are recorded, generate the corresponding a.wfn file.
(2) fch2wfn a.fch -no
For a .fch file in which natural orbitals are recorded, generate the corresponding a.wfn file.
4.5.30 fch_mo_copy
Copy MOs from one .fch file into another .fch file. An example is shown below
fch_mo_copy a.fch b.fch
The default is to copy Alpha MOs in a.fch to Alpha MOs in b.fch. There are 4 optional parameters -aa
, -ab
, -ba
, -bb
. For example, -ab
means copying Alpha MOs from a.fch to Beta MOs in b.fch.
4.5.31 fch_u2r
Transform a UHF-type .fch file into a RHF-type one. Only alpha MOs are retained.
fch_u2r h2o_uhf.fch
4.5.32 frag_guess_wfn
This utility is designed to generate various EDA input files, e.g. GKS-EDA, SAPT, LMO-EDA and Morokuma-EDA. The Cartesian coordinates, basis set data, converged MO coefficients and necessary keywords will be written in the generated input file. See examples.
The Morokuma-EDA is supported by the GAMESS-US program; the GKS-EDA and LMO-EDA are supported by the XEDA program. These methods are very useful but often suffer from SCF convergence failure due to the "just so-so" SCF convergence techniques in GAMESS. Even if all SCF converge, their wave function stabilities cannot be checked in GAMESS.
The SAPT is supported by the PSI4 program. When the molecule is large or the basis set includes diffuse functions, SCF often fails to converge.
This utility will call Gaussian to perform SCF calculations for each fragment (and maybe also for the complex), then write all MOs into the input file, so that all SCF steps during an EDA calculation will converge immediately.
If UHF/UDFT method is specified, wave function stability of each fragment (and of total system in GKS-EDA) will be checked. This is very important for biradical and transition-metal-containing systems. If any fragment is singlet and UHF/UDFT method is specified, broken symmetry initial guess (i.e., guess=mix) will be automatically applied.
If you want to use the Windows pre-built executable frag_guess_wfn
, you need to define the environment variables of MOKIT and Gaussian. Assuming your input file nh3_h2o.gjf
is put in D:360Downloads\
, then you can run the following commands in CMD
D:
cd 360Downloads
set PATH=D:\360Downloads\mokit\bin;%PATH%
set PATH=D:\Program files\G09W;%PATH%
set GAUSS_EXEDIR=D:\Program files\G09W
frag_guess_wfn nh3_h2o.gjf >nh3_h2o.out
If you use G16W rather than G09W, remember to modify paths showns above.
4.5.33 gvb_sort_pairs
Sort (part of) MOs in descending order of the pair coefficients of the 1st natural orbital in each pair. This utility is designed only for the GAMESS .dat file. A new .dat file will be generated, in which the sorted MOs and pair coefficients are held.
4.5.34 mkl2com
Transfer MOs from ORCA .mkl file into Molpro (.com, .a and possibly .b files). Two examples are shown and explained below
(1) mkl2com h2o.mkl
This is used for transferring HF/DFT orbitals. h2o.com
and h2o.a
files will be generated. If the wavefucntion in .mkl is UHF-type, then h2o.b
will also be generated.
(2) mkl2com h2o.mkl water.com
Also used for transferring HF/DFT orbitals, but the input filename is specified by the user.
4.5.35 mkl2fch
Transfer MOs from ORCA .mkl file into Gaussian .fch(k) file. Four examples are shown and explained below
(1) mkl2fch a.mkl
This is used for transferring HF/DFT orbitals. If the file a.fch
exists, orbitals will be directly exported into it. If it does not exist, mkl2fch
will try to create one such file from scratch.
(2) mkl2fch a.mkl b.fch
Also used for transferring HF/DFT orbitals. But the filename b.fch
is specified. mkl2fch
will firstly try to find whether the file b.fch
exists. If it exists, orbitals will be directly exported into it. If it does not exist, mkl2fch
will try to create one such file from scratch.
(3) mkl2fch a.mkl a.fch -no
This is used for transferring CAS NOs, MP2 NOs, CCSD NOs, etc.
(4) mkl2fch a.mkl a.fch -nso
This is used for transferring Natural Spin Orbitals(NSO), e.g. UCCSD NSOs.
NOTE: If no .fch(k) file is provided, then mkl2fch
will try to generate one from scratch. This is usually O.K. but remember that there is no ECP/PP information in .mkl file. So if you use ECP/PP, the generated .fch file will not include ECP/PP data, and you need to add them into .gjf file if you want to perform further calculations using Gaussian.
Note that the number of digits in .mkl file is only 7, and the scientific notation is not used. Therefore, the transferred MOs will not be very accurate. This should be sufficient for visualizing orbitals and common wavefunction amalysis, but may cause up to 1e-5 a.u. error on electronic energy in further computations. This can be viewed as a defect of the orca_2mkl
utility in ORCA program. You are recommended to bring up an issue or request in the ORCA forum, to suggest ORCA developers to fix this. If the scientific notation is not used, then 10 digits of MO coefficients should be sufficient for further computations.
You can also transfer MOs from ORCA to Gaussian via mkl2gjf. To transfer MOs from Gaussian to ORCA, see fch2mkl.
4.5.36 mkl2gjf
Generate the Gaussian .gjf file from the ORCA .mkl file. The Cartesian coordinates, basis set data and MOs will be printed into the .gjf file. An example is shown and explained below
mkl2gjf a.mkl
this will generate the file a.gjf
. Note that the ORCA .mkl file has a defect: it does not contain ECP/PP information. Therefore, if you did not use ECP in your ORCA calculations, there would be no problem. But in case that you used ECP, there would be no ECP data in the generated .gjf file (you must add them manually), and there would be a warning printed on the screen
Warning in subroutine mkl2gjf: element(s)>'Ar' detected.
NOTE: the .mkl file does not contain ECP/PP information. If you use ECP/PP
(in ORCA .inp file), there would be no ECP in the generated .gjf file. You
should manually add ECP data into .gjf, and change 'gen' into 'genecp'. If
you are using an all-electron basis set, there is no problem.
4.5.37 molden2fch
Convert a .molden file into a .fch file. Note that many quantum chemistry packages do not strictly obey the molden format standard, so a .molden file generated by different programs usually have different basis function order and positive/negative signs of MO coefficients. This utility takes care of these details one case by one.
Example 1 If you have an ORCA wave function file h2o.gbw
, you can run
orca_2mkl h2o -molden
molden2fch h2o.molden -orca
to convert a ORCA-type h2o.molden into h2o.fch.
Example 2 If you have Turbomole files control
and mos
, you can run
tm2molden
h2o.molden
[Enter]
[Enter]
molden2fch h2o.molden -tm
convert a Turbomole-type h2o.molden into h2o.fch. Here tm2molden
is a built-in utility of Turbomole. Note that there is no charge and spin multiplicity in the .molden file. The utility molden2fch
will guess the charge and spin multiplicity and write them into the .fch file. Usually the guessing is right for HF/DFT jobs. But if you conducted complicated jobs, or stored natural orbitals in .molden, be careful about the charge and spin multiplicity in .fch. Currently only spherical harmonic functions are supported, so please do not use Cartesian-type basis functions in Turbomole (if you want to use molden2fch
).
Example 3 OpenMolcas
molden2fch h2o.molden -molcas
Example 4 CFOUR
molden2fch h2o.molden -c4
Note that only the molden file generated by CFOUR v2.1 is supported currently, since different released versions of CFOUR program may have different conventions of molden.
Example 5 CP2K
molden2fch h2o.molden -cp2k
convert a CP2K-type molden file into h2o.fch. A molden generated only by a gamma-point PBC calculation is supported. Becareful that lattice vectors and ECP/PP information are not recoreded in molden, so they would be missing in the .fch file, either. One can further use the bas_fch2py
utility to generate a PySCF PBC gamma-point Python script, or use load_cell_from_fch
to load a Cell object directly from the .fch file, which saves much time compared to writing a Python script from scratch. We are working on implementing better APIs for PBC calculations with less information missing of molden.
.molden file does not include any ECP/PP data, so the generated .fch file would not include that data, either. If you use ECP/PP in your calculations, be careful about this problem.
To transfer MOs from ORCA to Gaussian, also see mkl2fch and mkl2gjf. To transfer MOs from Gaussian to Turbomole, see fch2tm.
4.5.38 orb2fch
Transfer MOs from (Open)Molcas *Orb
file (e.g. .ScfOrb, .RasOrb.1, .UnaOrb, .UhfOrb, etc) to Gaussian .fch(k) file. 5 examples are shown and explained below
(1) orb2fch a.ScfOrb a.fch
This is used for transferring RHF orbitals.
(2) orb2fch a.RasOrb a.fch
This is used for transferring CASCI/CASSCF or RASCI/RASSCF (pseudo)canonical orbitals.
(3) orb2fch a.UnaOrb a.fch -no
This is used for transferring UNOs.
(4) orb2fch a.UhfOrb a.fch
This is used for transferring UHF alpha and beta MOs.
(5) orb2fch a.RasOrb.1 a.fch -no
This is used for transferring CASCI/CASSCF or RASCI/RASSCF NOs.
NOTE: orb2fch
cannot generate a fch file from scratch, and a fch file must be provided and the MOs in it will be replaced. You should firstly use Gaussian to generate a .fch file (with keywords 'nosymm int=nobasistransform'), then generate the *.input
and *.INPORB
files using fch2inporb
. After OpenMolcas computations finished, you can transfer MOs from OpenMolcas back to Gaussian using orb2fch
. This procedure seems a little bit tedious, but it ensures an exact reproduce of energy.
To transfer MOs from Gaussian to (Open)Molcas, see fch2inporb.
4.5.39 mo_svd
Output the singular values of the overlap matrix between two sets of MOs. This utility will first read the atomic overlap matrix from a given file, then calculate the overlap matrix between two sets of MOs. Finally, perform singular value decomposition (SVD) on the molecular orbital overlap matrix and print information about singular values. The first two command line arguments can be both Gaussian .fch files, or both OpenMolcas orbital files (.INBORB, .RasOrb, etc). The third argument is a Gaussian .log file or a OpenMolcas output file, in which the atomic overlap matrix is written.
The singular values can be used to measure the overlap or similarity of two sets of MOs. If all singular values are close to 1, the compared two sets of MOs are very similar. Any singular value being close to 0 means there exists at least 1 distinct orbital.
4.5.40 solve_ON_matrix
Compute the occupation number matrix (a non-diagonal matrix) of a set of MOs. Assuming you have a .fch(k) file which holds some kind of MOs, and another .fch(k) file which holds some kind of NOs and corresponding NOONs, then this utility can compute the occupation numbers of this set of MOs (transformed from NOs and NOONs). Of course, in this case the occupation number matrix of MOs is not a diagonal matrix, only the diagonal elements are written into the 'Alpha Orbital Energies' section of the file which holds MOs.
4.5.41 replace_xyz_in_inp
Replace the Cartesian coordinates in an input file, by coordinates from a given .xyz file or an output file. For the input/output file, currently only (Open)Molcas and Molpro formats are supported. Other formats will be supported in the near future. Four examples are shown below
(1) replace_xyz_in_inp a.xyz b.input -molcas
Replace the Cartesian coordinates in b.input file, by coordinates from the a.xyz file. A file named b_new.input will be generated.
(2) replace_xyz_in_inp a.out b.input -molcas
Replace the Cartesian coordinates in b.input file, by coordinates from an (Open)Molcas output file. A file named b_new.input will be generated.
(3) replace_xyz_in_inp a.xyz b.com -molpro
Replace the Cartesian coordinates in b.com file, by coordinates from the a.xyz file. A file named b_new.com will be generated.
(4) replace_xyz_in_inp a.out b.com -molpro
Replace the Cartesian coordinates in b.com file, by coordinates from a Molpro output file. A file named b_new.com will be generated.
4.5.42 xml2fch
Transfer MOs from Molpro .xml file to Gaussian .fch(k) file. 2 examples are shown and explained below
(1) xml2fch a.xml a.fch
This is used for transferring R(O)HF or UHF orbitals.
(2) xml2fch a.xml a.fch -no
This is used for transferring NOs.
NOTE: xml2fch
cannot generate a fch file from scratch, and a fch file must be provided and the MOs in it will be replaced. You should firstly use Gaussian to generate a .fch file (with keywords 'nosymm int=nobasistransform'), then generate the *.com
file using fch2com
. After Molpro computations finished, you can transfer MOs from Molpro back to Gaussian using xml2fch
. This procedure seems a little bit tedious, but it ensures an exact reproduce of energy.
To transfer MOs from Gaussian to Molpro, see fch2com.
The modules/utilities below are compiled by f2py
, which is a Fortran to Python interface generator. These utilities are not binary executable files, but dynamic libraries *.so
in $MOKIT_ROOT/mokit/lib
. They can only be imported in a Python script. And this is the reason that why one of the environment variables of MOKIT is PYTHONPATH
, not LD_LIBRARY_PATH
. These modules can also be viewed as APIs (Application Programming Interface), but they are described in this section not in 4.6, just for better comparison with other utilities of transferring MOs.
4.5.43 fch2py
Transfer MOs from Gaussian to PySCF. By importing fch2py, PySCF Python script is able to read alpha and/or beta MOs from a provided .fch file. All attributes are shown below
fch2py(fchname, nbf, nif, ab, mf.mo_coeff)
There are 4 parameters required by fch2py: (1) filename of .fch(k) file; (2) the number of basis functions; (3) the number of molecular orbitals; and (4) a character 'a'/'b' for reading alpha/beta orbitals. You should write these contents into a Python script, and run that script using the python executable. If you want a more detailed example, just run any automr
job and see all *.py
files generated.
See relevant utilities/modules: bas_fch2py and load_mol_from_fch.
4.5.44 py2fch
Export MOs from PySCF to a Gaussian .fch file. Note that py2fch
cannot generate a .fch file from scratch. The user must provide one .fch file, in which the 'Alpha Orbital Energies' and 'Alpha MOs' sections (or beta orbital counterpart) will be replaced by occupation numbers and MOs from PySCF. If you want to generate a .fch file from scratch, see the fchk module.
The provided .fch file must contain exactly the same geometry and basis set data as those data of PySCF. To obtain a valid .fch file, it is strongly recommended to use the bas_fch2py
utility to generate a .py file from a .fch file first, and then import py2fch
in this generated .py file. These descriptions seem a bit of tedious. But a rigorous transferring of MOs between two programs or two files is ensured. All attributes of py2fch
are shown below
py2fch('a.fch', nbf, nif, mf.mo_coeff, ab, ev, gen_density)
The first few parameters are identical to those in fch2py. The parameter ev
means eigenvalues, it is supposed to contain orbital energies or orbital occupation numbers. The last parameter gen_density
is a bool variable, where True/False meaning whether or not generating Total SCF Density (as well as writing density into .fch file) using mf.mo_coeff and ev. If gen_density=True
, the parameter ev
must contain orbital occupation numbers, not orbital energies. Because density would be computed using MOs and occupation numbers. If gen_density=False
, you can assign proper values to ev
as your wish.
This module is usually used along with these utilities/modules: bas_fch2py, fch2py and load_mol_from_fch.
4.5.45 py2fch_cghf
Export complex gHF MOs from PySCF to a Gaussian .fch file. Note that py2fch_cghf
cannot generate a .fch file from scratch. The user must provide one .fch file. This module is similar to py2fch
(which is usually used for R(O)HF, UHF and DFT methods). The difference is that this module is only designed for complex gHF/gDFT methods. All attributes of py2fch_cghf
are shown below
py2fch_cghf(fchname, nbf, nif, coeff, ev, gen_density)
4.5.46 fchk
This is a powerful module since it can directly export/generated a Gaussian .fch(k) file (i.e. no need to provide one .fch file in advance). And this module is the basis of many other modules like py2molpro, py2molcas, etc, see Section 4.6.5. A few examples are shown below:
(1) transfer HF/DFT MOs from PySCF to Gaussian
from pyscf import gto
from mokit.lib.py2fch_direct import fchk
mol = gto.M(atom='O 0.0 0.0 0.1; H 0.0 0.0 1.0',
basis='cc-pvtz',
charge=-1,
).build()
mf = mol.RHF().run()
fchk(mf, 'test.fch')
Options
- To write the HF/DFT Total SCF Density into the generated fch file, you should use the
density
argument
fchk(mf, 'test.fch', density=True)
- To specify another orbital coefficient (instead of
mf.mo_coeff
), use
fchk(mf, 'test.fch', mo_coeff=some_other_mo)
After MOKIT version 1.2.6rc5, a lazy import of this function is enabled. from mokit.lib.py2fch_direct import fchk
can be replaced by from mokit.lib import *
or from mokit.lib import fchk
.
(2) transfer CASSCF MOs from PySCF to Gaussian
from pyscf import gto, mcscf
from mokit.lib.py2fch_direct import fchk
mol = gto.M(
atom = 'O 0 0 0; O 0 0 1.2',
basis = 'ccpvdz',
spin = 2)
mf = mol.RHF().run()
mc = mcscf.CASSCF(mf, 6, 8).run() #CAS(6o,8e)
fchk(mc, 'O2_cas6o8e.fch')
(3) transfer CASSCF NOs from PySCF to Gaussian
from pyscf import gto, mcscf
from mokit.lib.py2fch_direct import fchk
mol = gto.M(
atom = 'O 0 0 0; O 0 0 1.2',
basis = 'ccpvdz',
spin = 2)
mf = mol.RHF().run()
mc = mcscf.CASSCF(mf, 6, 8) #CAS(6o,8e)
mc.natorb = True
mc.kernel()
fchk(mc, 'O2_cas6o8e_NO.fch', density=True)
By default, fchk
will dump mc.mo_coeff
and mc.mo_occ
(if present) into the 'Alpha MO' and 'Alpha Orbital Energies' section in .fch file. It's also possible to manually dump any orbitals you want
fchk(mc, 'O2_cas6o8e_some_other_orb.fch', mo_coeff=some_other_mo, mo_occ=some_other_occ)
(4) write UNOs into fch
By default, the fchk(mf, fchname)
for UHF/UKS object dumps the orbital energies which prevents we writing the UNO occupation numbers. So another function fchk_uno
can be used
from pyscf import gto, mcscf
from mokit.lib.py2fch_direct import fchk_uno
mol = gto.M(
atom = 'O 0 0 0; O 0 0 1.2',
basis = 'ccpvdz',
spin = 2)
mf = mol.UHF().run()
unoon, uno = mcscf.addons.make_natural_orbitals(mf)
fchk_uno(mf, 'O2_UNO.fch', uno, unoon)
There is a module named fchk()
in PSI4, which is used to export/generated a .fch file. Here we use the same name for easy memorizing. You can also use its alias py2gau
, e.g.
py2gau(mf, 'test.fch')
if you think py2gau
is more easy for you to memorize.
4.6 APIs in MOKIT
Currently only Python APIs are provided. C/C++ APIs may be provided in the future.
4.6.1 APIs from gaussian
and rwwfn
module
gaussian
module provides some APIs for manipulating .fch(k) files while rwwfn
provides some APIs for read/write .fch(k) or .log files. Some commonly used APIs in this module are shown below.
- load_mol_from_fch(fchname)
- loc(fchname, method, idx)
- uno(fchname)
- permute_orb(fchname, orb1, orb2)
- gen_fcidump(fchname, nacto, nacte, mem=4000, np=None)
- read_int1e_from_gau_log(logname, itype, nbf)
- read_mo_from_fch(fchname, nbf, nif, ab)
- read_density_from_gau_log(logname, itype, nbf)
- read_density_from_fch(fchname, itype, nbf)
- write_pyscf_dm_into_fch(fchname, nbf, dm, itype, force)
- gen_ao_tdm(logname, nbf, nif, mo, istate)
- export_mat_into_txt(txtname, n, mat, lower, label)
- pairing_open_with_vir
Meanings of frequently appeared arguments are
fchname
: filename of .fch(k) file
logname
: filename of .log/.out file
nbf
: the number of basis functions
nif
: the number of (linear-independent) MOs
Meanings of other arguments are shown in below subsections.
4.6.1.1 load_mol_from_fch
Load a PySCF mol object from a Gaussian .fch(k) file. For example, run python in Shell
from pyscf import scf
from mokit.lib.gaussian import load_mol_from_fch
mol = load_mol_from_fch(fchname='00-h2o_cc-pVDZ_0.96_rhf.fchk')
mf = scf.RHF(mol).run()
where the module gaussian is actually the file $MOKIT_ROOT/mokit/lib/gaussian.py
.
4.6.1.2 loc
Perform orbital localization for a given .fch(k) file. Two localization methods are supported: Foster-Boys or Pipek-Mezey. For example, run python in Shell
from mokit.lib.gaussian import loc
loc(fchname='benzene_5D7F_rhf.fchk', method='pm', idx=range(6,21))
Only 'boys' or 'pm' is allowed for the 2nd argument. You should specify the orbital indices or range in the 3rd argument. Note that the starting integer index is 0 in Python convention, and the final index cannot be reached. So range(6,21) means orbitals 7-21, which are valence occupied orbitals for benzene. The localized orbitals will be exported/written into a new file with suffix *_LMO.fch
(in this case, it is benzene_5D7F_rhf_LMO.fch).
For system containing only \( \sigma \) bonds, these two methods make little difference. While for system containing \( \sigma \) and \( \pi \) bonds, the Boys localization method tends to mix \( \sigma \) and \( \pi \) bonds, which leads to 'banana' bonds. The PM localization method tends to keep \( \sigma \) and \( \pi \) separated. If you want to analyze localized \( \pi \) bonds after localization, you should choose method='pm'
.
4.6.1.3 uno
Generate UHF natural orbitals(UNOs) from a given Gaussian .fch(k) file. For example
from mokit.lib.gaussian import uno
uno(fchname='benzene_uhf.fch')
4.6.1.4 permute_orb
Swap/exchange two orbitals in a given .fch(k) file. For example
from mokit.lib.gaussian import permute_orb
permute_orb('ethanol_rhf_proj_loc_pair2gvb8_s.fch',6,13)
then the MO 6 and MO 13 will be swapped/exchanged. The index of the first orbital starts from 1.
4.6.1.5 gen_fcidump
Generate a FCIDUMP file which contains the effective 1e integrals and 2e integrals from a given Gaussian .fch(k) file. Such a FICUDMP file can be used in CASCI, DMRG-CASCI, GVB-BCCC, or any post-CASCI calculations. For example
from mokit.lib.gaussian import gen_fcidump
gen_fcidump(fchname='anthracene_cc-pVDZ_uhf_uno_asrot2gvb7_s.fch',nacto=14,nacte=14)
where the arguments nacto
and nacte
are the number of active orbitals and the number of active electrons, respectively. This module requires the PySCF installed.
4.6.1.6 read_int1e_from_gau_log
Read various one-electron integral matrices from a Gaussian output file. For example, read AO-basis overlap
from mokit.lib import rwwfn
S = rwwfn.read_int1e_from_gau_log(logname='00-h2o_cc-pVDZ_1.5.log',itype=1,nbf=24)
Argument itype
: The type of one-electron integral matrices. Allowed values are 1,2,3,4 for Overlap, Kinetic Energy, Potential Energy, and Core Hamiltonian, respectively.
4.6.1.7 read_mo_from_fch
Read MOs from a Gaussian .fch(k) file. For example
from mokit.lib import rwwfn
mo = rwwfn.read_mo_from_fch(fchname='00-h2o_cc-pVDZ_1.5.fchk',nbf=24,nif=24,ab='a')
Argument ab
: character with length=1, 'a'/'b' for reading alpha/beta orbitals.
See also 4.6.4.1 to read/write MOs from files of other programs.
4.6.1.8 read_density_from_gau_log
Read various types of density matrix from a Gaussian output file. For example, read the Alpha Density Matrix from a .log file
from mokit.lib import rwwfn
den = rwwfn.read_density_from_gau_log(logname='00-h2o_cc-pVDZ_1.5.log', itype=2, nbf=24)
argument itype
:
1/2/3 for Total/Alpha/Beta Density Matrix.
4.6.1.9 read_density_from_fch
Read various types of density matrix from a Gaussian .fch(k) file. For example, read the Total SCF Density from a .fchk file
from mokit.lib import rwwfn
den = rwwfn.read_density_from_fch(fchname='00-h2o_cc-pVDZ_1.5.fchk',itype=1,nbf=24)
itype | type of density | itype | type of density |
---|---|---|---|
1 | Total SCF Density | 6 | Spin MP2 Density |
2 | Spin SCF Density | 7 | Total CC Density |
3 | Total CI Density | 8 | Spin CC Density |
4 | Spin CI Density | 9 | Total CI Rho(1) |
5 | Total MP2 Density | 10 | Spin CI Rho(1) |
See also 4.6.4.4 for other operations on density matrix.
4.6.1.10 write_pyscf_dm_into_fch
Write a PySCF density matrix into a given Gaussian .fch(k) file. This module does two things: (1) deal with the order of basis functions and their normalization factors, (2) then export density matrix into a given .fch(k) file. All arguments of this module are shown below
write_pyscf_dm_into_fch(fchname, nbf, dm, itype, force)
where dm
is the PySCF density matrix with dimension (nbf,nbf). itype
has the same meaning with that in read_density_from_fch. force
is a bool variable, and its meaning is:
(1) If the string corresponding to itype
(e.g. 'Total SCF Density') can be found/located in the given .fch file, this parameter will not be used, i.e. setting True/False does not matter.
(2) If the string corresponding to itype
cannot be found, setting force=True
will enforce writing density matrix into the given file; setting force=False
will stop/abort the program and signal errors immediately (this can be used to check whether desired strings exists in the specified file).
You should use this module via
from mokit.lib.py2fch import write_pyscf_dm_into_fch
Note that this module cannot generate a .fch(k) file from scratch, the user must provide one such file. The recommended approach is firstly using the utility bas_fch2py
to generate PySCF input file or using the Python module load_mol_from_fch
to a generate proper PySCF object, then do computations in PySCF. Finally using the module write_pyscf_dm_into_fch
to export desired density matrix.
4.6.1.11 gen_ao_tdm
Generate AO-basis ground->excited state Transition Density Matrix for CIS/TDHF/TDDFT from a Gaussian output file. Currently only closed shell is taken into consideration. For example, read the S0->S1 Transition Density Matrix from a Gaussian .log file (with iop(9/40=5) specified in .gjf file)
from mokit.lib import rwwfn, excited
mo = rwwfn.read_mo_from_fch(fchname='00-h2o_cc-pVDZ_0.96_rhf.fchk',nbf=24,nif=24,ab='a')
tdm = excited.gen_ao_tdm(logname='00-h2o_cc-pVDZ_0.96_rhf.log',nbf=24,nif=24,mo=mo,istate=1)
rwwfn.export_mat_into_txt(txtname='h2o_tdm.txt',n=24,mat=tdm,lower=False,label='transition density matrix')
Argument istate
: the i-th excited state. For example, i=1 for the first excited state.
The last python statement means exporting the transition density matrix into a plain text file, see the API below.
4.6.1.12 export_mat_into_txt
Export a square matrix into a plain text file. The example of exporting transition density matrix is shown in Section 4.6.11. Here I offer one more example - export the lower triangle of a symmetric AO-basis overlap matrix
from mokit.lib import rwwfn
S = rwwfn.read_int1e_from_gau_log(logname='00-h2o_cc-pVDZ_1.5.log',itype=1,nbf=24)
rwwfn.export_mat_into_txt(txtname='ovlp.txt',n=24,mat=S,lower=True,label='Overlap')
Argument | Explanation |
---|---|
txtname | file name |
mat | the square matrix with dimension (n,n) |
lower | True/False for whether or not printing only the lower triangle part of a matrix |
label | a string, the meaning of exported matrix (provided by yourself) |
4.6.1.13 pairing_open_with_vir
Paring each singly occupied orbital with a virtual one, for a high spin GVB wave function. These new pairs would be put after all normal GVB pairs. Such type of .fch file may be used for high-spin GVB-BCCC calculations. The order of MOs before calling this function is expected to be docc-socc-pair-vir. The order of MOs after calling this function would be docc-pair-(socc-vir1)-vir2. The input file fchname
will be updated.
from mokit.lib.rwwfn import pairing_open_with_vir
pairing_open_with_vir(fchname='ben_triplet_uhf_uno_asrot2gvb2.fch')
The input filename is supposed to end with gvbN.fch
. For example, gvb2.fch
will be identified as 2 GVB pairs. The number of doubly and singly occupied orbitals are automatically determined from information in .fch file. If a singlet .fch file is provided (i.e. no singly occupied orbital), a warning would be printed and fchname
would be left unchanged. The Alpha Orbital Energies
section in .fch file would be updated as: doubly occupation orbitals with 2.0; normal GVB pairs with 1.8/0.2; singly occupied orbitals (paired with virtual MOs) with 1.0/0.0; and remaining virtual orbitals with 0.0. These occupation numbers are not real occupation numbers from any GVB calculation, but just some numerical values for visualizing orbitals conveniently (i.e. to remind the user that different ranges for various types of orbitals).
If you do not have such a .fch file and want to create one, you can run the following Shell commands after an automr
GVB or CASSCF calculation
cp ben_triplet_uhf_uno_asrot2gvb2_s.fch ben_triplet_uhf_uno_asrot2gvb2.fch
dat2fch ben_triplet_uhf_uno_asrot2gvb2.dat ben_triplet_uhf_uno_asrot2gvb2.fch
the filenames above come from a triplet CASSCF(6,6) calculation for benzene.
4.6.1.14 reorder2dbabasv
Reorder MOs in the file gvbN_s.fch
. A new file gvbN_new.fch
would be generated. The order of MOs in gvbN_s.fch
is expected to be docc-bonding-socc-antibonding-vir. The order of MOs in gvbN_new.fch
would be docc-bonding1-antibonding1-bonding2-antibonding2-...-socc-vir.
from mokit.lib.rwwfn import reorder2dbabasv
reorder2dbabasv(fchname='ben_triplet_uhf_uno_asrot2gvb2_s.fch')
4.6.2 APIs related to PDB file
- read_natom_from_pdb(pdbname, natom)
- read_nframe_from_pdb(pdbname, nframe)
- read_iframe_from_pdb(pdbname, iframe, natom, cell, elem, resname, coor)
- write_frame_into_pdb(pdbname, iframe, natom, cell, elem, resname, coor, append)
4.6.2.1 read_natom_from_pdb
Read the number of atoms from a given pdb file. If there exists more than one frame in the pdb file, only the 1st frame will be detected. See an example in Section 4.6.2.4.
4.6.2.2 read_nframe_from_pdb
Read the number of frames from a given pdb file. See an example in Section 4.6.2.4.
4.6.2.3 read_iframe_from_pdb
Read the i-th frame from a given pdb file. See an example in Section 4.6.2.4.
4.6.2.4 write_frame_into_pdb
Write a frame into the given pdb file. A python script is shown below, to illustrate how to use these pdb-related APIs:
import mokit.lib.rwgeom as rg
natom = rg.read_natom_from_pdb(pdbname='test.pdb')
cell, elem, resname, coor = rg.read_iframe_from_pdb('test.pdb', 10, natom)
for i in range(0,natom):
s1 = elem[i][0].decode('utf-8')
s2 = elem[i][1].decode('utf-8')
print(s1, s2)
rg.write_frame_into_pdb('a.pdb', 10, natom, cell, elem, resname, coor, False)
4.6.3 Other wavefunction APIs
4.6.3.1 calc_unpaired_from_fch
Calculate the Yamaguchi's unpaired electrons and Head-Gordon's unpaired electrons using the provided .fch(k) file. Biradical and tetraradical indices are printed as well. This .fch file must include natural orbitals and their corresponding occupation numbers. For example, a UNO, GVB or CASSCF NO .fch file is expected. DO NOT use a UHF .fch file. A python script is shown below:
from mokit.lib.wfn_analysis import calc_unpaired_from_fch
calc_unpaired_from_fch(fchname='00-h2o_cc-pVDZ_1.5_uhf_uno_asrot2gvb4_s.fch', wfn_type=2, gen_dm=False)
The argument wfn_type
has values 1/2/3 for UNO/GVB/CASSCF NOs, respectively. The example above will print the following
----------------------- Radical index -----------------------
biradical character (2c^2) y0= 0.155
tetraradical character(2c^2) y1= 0.155
Yamaguchi's unpaired electrons (sum_n n(2-n) ): 1.183
Head-Gordon's unpaired electrons(sum_n min(n,(2-n))): 0.639
Head-Gordon's unpaired electrons(sum_n (n(2-n))^2 ): 0.328
-------------------------------------------------------------
If the argument gen_dm
is set to True
, then a file like *_unpaired.fch
will also be generated, in which the unpaired electron density is stored. The unpaired density can be visualized by GaussView or Multiwfn+VMD.
Yamaguchi's biradical index for UHF: \( t = (n_{\text{HONO}} - n_{\text{LUNO}})/2, y = 1 - 2t/(1 + t^2) \)
Yamaguchi's biradical index for CAS(2,2): \( y = 2{c_2}^{2} = n_{\text{LUNO}} \)
where \( c_2 \) is the CI coefficients of the 2nd configurations in CASCI wave function (assuming natural orbitals are used). Note that there is no unique way to define biradical (or tetraradical, etc) index for general cases including CASSCF (m,m) where m>=4, or GVB(n) where n>=2. The \( y_i = n_{\text{LUNO}+i} \) formula is adopted for these general cases.
There are also modules which calculate the number of unpaired electrons of GVB by reading information from GAMESS .dat/.gms file:
from mokit.lib.wfn_analysis import calc_unpaired_from_dat
calc_unpaired_from_dat(datname='00-h2o_cc-pVDZ_1.5_uhf_uno_asrot2gvb4_s.fch',mult=1)
It requires the user to input the spin multiplicity since there is no spin information in .dat file. Or you can use
from mokit.lib.wfn_analysis import calc_unpaired_from_gms_out
calc_unpaired_from_gms_out(outname='00-h2o_cc-pVDZ_1.5_uhf_uno_asrot2gvb4.gms')
Reference for these two types of unpaired eletrons:
[1] Theoret. Chim. Acta (Berl.) 48, 175-183 (1978). DOI: 10.1007/BF00549017.
[2] Chemical Physics Letters 372 (2003) 508–511. DOI: 10.1016/S0009-2614(03)00422-6.
4.6.3.2 get_gvb_bond_order_from_fch
Perform GVB bond order analysis using _s.fch
and _s.dat
files.
from mokit.lib.wfn_analysis import population
population.get_gvb_bond_order_from_fch('ben_uhf_uno_asrot2gvb15_s.fch')
The _s.fch
and _s.dat
files can be obtained after a GVB or CASSCF job.
4.6.3.3 gen_no_using_density_in_fch
Generate natural orbitals using specified density in a .fch file. The density is read from the specified section of .fch file and the AO-basis overlap is obtained by calling Gaussian. An example is shown below
from mokit.lib.lo import gen_no_using_density_in_fch
gen_no_using_density_in_fch('ben.fch', 1)
where 1 means reading density from "Total SCF Density" section.
4.6.3.4 gen_cf_orb
Generate Coulson-Fischer orbitals using GVB natural orbitals from a GAMESS .dat file. The Coulson-Fischer orbitals are non-orthogonal and the GVB natural orbitals are orthogonal, so this module actually performs an orthogonal -> non-orthogonal orbital transformation. This transformation requires the information of GVB pair coefficients so currently only the GAMESS .dat file is accepted while the .fch file is not supported. An example is shown below
from mokit.lib.lo import gen_cf_orb
gen_cf_orb(datname='naphthalene_gvb5.dat',ndb=29,nopen=0)
where ndb
is the number of doubly occupied orbitals and nopen
is the number of singly occupied orbitals. A new file named naphthalene_gvb5_new.dat will be generated. If you want to visualize the Coulson-Fischer orbitals, you need to use the dat2fch
utility to transfer orbitals from .dat to .fch.
4.6.3.5 make_orb_resemble
Make a set of target MOs resembles the reference MOs. Ddifferent basis set in two .fch files are allowed, but their geometries and orientations should be identical or very similar. An example is shown below
from mokit.lib.gaussian import make_orb_resemble
make_orb_resemble(target_fch, ref_fch, nmo=None, align=False)
where
target_fch
: the .fch file which holds MOs to be updated
ref_fch
: the .fch file which holds reference MOs
nmo
: indices 1~nmo MOs in ref_fch
will be set as reference MOs. If nmo
is not given, it will be set as na (number of alpha electrons) for RHF-type wave function, and set to na/nb respectively for UHF-type wave function.
align
: whether to align two molecules in files target_fch
and ref_fch
. The default is False
, i.e. assuming the geometries are already in a best alignment. If this is not your case, you need to set align=True
.
4.6.3.6 proj2target_basis
Project MOs of the original basis set onto the target basis set. cart
is True/False for Cartesian-type or spherical harmonic type functions, respectively. target_basis
can be smaller than, equal to, or larger than the basis set in fchname
, although usually a larger basis set would be used.
from mokit.lib.gaussian import proj2target_basis
proj2target_basis(fchname, target_basis='cc-pVTZ', cart=False)
Note: use this module to projecting MOs between two all-electron basis sets, or between two basis sets with the same ECP/PP. For example, the following 3 cases are recommended
(1) cc-pVDZ -> cc-pVTZ;
(2) cc-pVDZ -> def2TZVP (if the studied molecule include no element >Kr);
(3) def2SVP -> def2TZVP.
The basis sets def2SVP and def2TZVP have the same ECP/PP data, and they differ only in orbital basis data.
The following 3 cases are not recommended
(1) LANL2DZ -> cc-pVTZ;
(2) cc-pVDZ -> LANL2TZ(f);
(3) LANL2DZ -> def2TZVP.
if the studied molecule include any element >Ne. This is because the number of doubly occupied core MOs are not consistent among these basis sets (with ECP/PP). If you use projected MOs to perform SCF calculations, SCF will probably be oscillating.
4.6.3.7 lin_comb_two_mo
Perform \(\sqrt{2}/2 \) (mo1+mo2) and \(\sqrt{2}/2 \) (mo1-mo2) unitary transformation for two specified MOs in a Gaussian .fch file. When the \( \sigma \) and \( \pi \) orbitals of a double bond are mixed (i.e. a banana bond), this can be used to make them separated.
from mokit.lib.gaussian import lin_comb_two_mo
lin_comb_two_mo(fchname, orb1, orb2)
orb1
/orb2
are in Python convention (starts from 0). You need to call this module 2 times for a double bond, for example
lin_comb_two_mo('polyacene.fch', 30, 31) # for bonding orbitals
lin_comb_two_mo('polyacene.fch', 50, 51) # for anti-bonding orbitals
4.6.3.8 find_antibonding_orb
Construct antibonding orbitals according to bonding orbitals provided. Sometimes the users already have a set of bonding orbitals, e.g. constructed by IAO, Boys/PM localization as well as manual selection, etc. And they want to obtain the corresponding antibonding orbitals without changing any bonding orbital. Then this module would be best choice. The orbital index range [i1,i2]
contains the bonding orbitals, the index range [i3,nif]
contains orbitals which can be optimized/updated to obtain antibonding orbitals. i3>i2
is required. Once the antibonding orbitals are generated, they will be stored in index range [i3,i3+i2-i1]
. The orbitals [1,i3-1]
will not be changed by this module, which is useful when the user wants to keep some orbitals fixed. All MOs are orthonormal before and after using this module.
from mokit.lib.wfn_analysis import find_antibonding_orb
find_antibonding_orb(fchname, i1, i2, i3)
4.6.4 Other functions in rwwfn
4.6.4.1 read/write mo
read_mo_from_chk_txt(txtname, nbf, nif, ab, mo)
read_mo_from_orb(orbname, nbf, nif, ab, mo)
read_mo_from_xml(xmlname, nbf, nif, ab, mo)
read_mo_from_bdf_orb(orbname, nbf, nif, ab, mo)
read_mo_from_dalton_mopun(orbname, nbf, nif, coeff)
write_mo_into_fch(fchname, nbf, nif, ab, mo)
write_mo_into_psi_mat(matfile, nbf, nif, mo)
copy_orb_and_den_in_fch(fchname1, fchname2, deleted)
See also 4.6.1.7.
4.6.4.2 read/write eigenvalue or occupation number
read_eigenvalues_from_fch(fchname, nif, ab, noon)
read_on_from_orb(orbname, nif, ab, on)
read_on_from_dat(datname, nmo, on, alive)
read_on_from_xml(xmlname, nmo, ab, on)
read_on_from_bdf_orb(orbname, nif, ab, on)
read_ev_from_bdf_orb(orbname, nif, ab, ev)
read_on_from_dalton_mopun(orbname, nif, on)
write_eigenvalues_to_fch(fchname, nif, ab, on, replace)
write_on_to_orb(orbname, nif, ab, on, replace)
4.6.4.3 read/write basic information
modify_IROHF_in_fch(fchname, k)
read_mult_from_fch(fchname, mult)
read_charge_and_mult_from_fch(fchname, charge, mult)
read_charge_and_mult_from_mkl(mklname, charge, mult)
read_na_and_nb_from_fch(fchname, na, nb)
read_nbf_and_nif_from_fch(fchname, nbf, nif)
read_nbf_and_nif_from_orb(orbname, nbf, nif)
read_nbf_from_dat(datname, nbf)
read_ncontr_from_fch(fchname, ncontr)
read_shltyp_and_shl2atm_from_fch(fchname, k, shltyp, shl2atm)
4.6.4.4 read/write density and other matrices
read_ovlp_from_molcas_out(outname, nbf, S)
write_density_into_fch(fchname, nbf, total, dm)
detect_spin_scf_density_in_fch(fchname, alive)
add_density_str_into_fch(fchname, itype)
update_density_using_mo_in_fch(fchname)
update_density_using_no_and_on(fchname)
read_ao_ovlp_from_47(file47, nbf, S)
See also 4.6.1.6, 4.6.1.8, 4.6.1.9, 4.6.1.10.
4.6.4.5 read energy and other results
read_npair_from_uno_out(nbf, nif, ndb, npair, nopen, lin_dep)
read_gvb_energy_from_gms(gmsname, e)
read_cas_energy_from_output(cas_prog, outname, e, scf, spin, dmrg, ptchg_e, nuc_pt_e)
read_cas_energy_from_gaulog(outname, e, scf)
read_cas_energy_from_pyout(outname, e, scf, spin, dmrg)
read_cas_energy_from_gmsgms(outname, e, scf, spin)
read_cas_energy_from_molcas_out(outname, e, scf)
read_cas_energy_from_orca_out(outname, e, scf)
read_cas_energy_from_molpro_out(outname, e, scf)
read_cas_energy_from_bdf_out(outname, e, scf)
read_cas_energy_from_psi4_out(outname, e, scf)
read_cas_energy_from_dalton_out(outname, e, scf)
read_mrpt_energy_from_pyscf_out(outname, ref_e, corr_e)
read_mrpt_energy_from_molcas_out(outname, itype, ref_e, corr_e)
read_mrpt_energy_from_molpro_out(outname, itype, ref_e, corr_e)
read_mrpt_energy_from_orca_out(outname, itype, ref_e, corr_e)
read_mrpt_energy_from_gms_out(outname, ref_e, corr_e)
read_mrpt_energy_from_bdf_out(outname, itype, ref_e, corr_e, dav_e)
read_mrcisd_energy_from_output(CtrType, mrcisd_prog, outname, ptchg_e, nuc_pt_e, davidson_e, e)
read_mcpdft_e_from_output(prog, outname, ref_e, corr_e)
find_npair0_from_dat(datname, npair, npair0)
find_npair0_from_fch(fchname, nopen, npair0)
read_no_info_from_fch(fchname, nbf, nif, ndb, nopen, nacta, nactb, nacto, nacte)
4.6.4.6 miscellaneous
determine_sph_or_cart(fchname, cart)
check_cart(fchname, cart)
check_sph(fchname, sph)
check_if_uhf_equal_rhf(fchname, eq)
get_no_from_density_and_ao_ovlp(nbf, nif, P, ao_ovlp, noon, new_coeff)
get_1e_exp_and_sort_pair(mo_fch, no_fch, npair)
4.6.5 The py2xxx modules
There are a few py2xxx modules. The functions provided are as follows.
py2amesp(mf, aipname)
py2bdf(mf, inpname, write_no=None)
py2cfour(mf)
py2dalton(mf, inpname)
py2gms(mf, inpname, npair=None, nopen=None)
py2molcas(mf, inpname)
py2molpro(mf, inpname)
py2orca(mf, inpname)
py2psi(mf, inpname)
py2qchem(mf, inpname, npair=None)
Taking py2qchem
as an example, it exports MOs from PySCF to Q-Chem. An input file (.in) and a directory containing orbital files will be generated. The restricted/unrestricted-type (i.e. R(O)HF or UHF) can be automatically detected. Two examples are shown below:
(1) Transfer RHF, ROHF or UHF orbitals. Create/Write a PySCF input file, e.g. h2o.py
from pyscf import gto, scf
from mokit.lib.py2qchem import py2qchem
mol = gto.M(atom = '''
O -0.49390246 0.93902438 0.0
H 0.46609754 0.93902438 0.0
H -0.81435705 1.84396021 0.0
''', basis = 'cc-pVDZ')
mf = scf.RHF(mol).run()
py2qchem(mf, 'h2o.in')
After MOKIT version 1.2.6rc5, a lazy import of functions in this subsection is enabled. from mokit.lib.py2qchem import py2qchem
can be replaced by from mokit.lib import *
or from mokit.lib import py2qchem
.
Run it using python and then a Q-Chem input file h2o.in
and a scratch directory h2o
will be generated. The orbital file(s) is put in h2o/
. If you run
qchem h2o.in h2o.out h2o
you will find RHF in Q-Chem converges in 2 cycles. If the environment variable QCSCRATCH
is already defined in your node/computer, the scratch directory h2o
will be automatically put into $QCSCRATCH/
; otherwise it will be put in the current directory.
(2) Transfer GVB orbitals If you have performed GVB calculations and stored GVB orbitals in the object mf, then you can write in python
py2qchem(mf, 'h2o.in', npair=2)
to generate the GVB-PP input file of Q-Chem, where npair
tells py2qchem
how many pairs you want to calculate.
4.6.6 The qchem module
qchem2amesp(fchname, aipname)
qchem2bdf(fchname, inpname)
qchem2cfour(fchname)
qchem2dalton(fchname, dalname)
qchem2gms(fchname, inpname)
qchem2molcas(fchname, inpname)
qchem2molpro(fchname, inpname)
qchem2psi(fchname, inpname)
qchem2orca(fchname, inpname)
standardize_fch(fchname)
Taking qchem2molpro
as an example, it will first standardize your provided .fch(k) file and then export MOs from Q-Chem to Molpro. The restricted/unrestricted-type (i.e. R(O)HF or UHF) can be automatically detected. Start Python and run
from mokit.lib.qchem import qchem2molpro
qchem2molpro('water.FChk','h2o.com')
Two files(h2o.com
and water_std.a
) will be generated. Keywords for reading MOs from water_std.a
have been written in h2o.com
.
If you only want to standardize a Q-Chem .fch(k) file and do not want to transfer MOs. Then you only need
from mokit.lib.qchem import standardize_fch
standardize_fch('water.FChk')
A file named water_std.fch
will be generated.
4.6.7 The mirror_wfn module
rmsd_wrapper(fname1, fname2)
mirror_wfn(fchname)
mirror_c2c(chkname)
rotate_atoms_wfn(fchname, coor_file)
permute_atoms_wfn(fchname, coor_file)
geom_lin_intrplt(gjfname1, gjfname2, n)
The rmsd_wrapper
function is able to calculate the RMSD value between two molecules. The input files can be any type of xyz/gjf/fch. For exemple,
from mokit.lib.mirror_wfn import rmsd_wrapper
rmsd_v = rmsd_wrapper('h2o_1.gjf','h2o_2.gjf')
print(rmsd_v)
Note that two molecules can be in different orientations, since RMSD calculation will firstly rotate molecules to achieve maximum overlap. But two molecules must have one-to-one correspondence of atomic labels. e.g. H1-H1, C2-C2. An automatic adjustment of atomic labels to achieve maximum overlap has not been implemented yet.
The mirror_wfn
function transforms a molecule into its mirror image by multiplying all z-components of Cartesian coordinates with -1. Besides, the MO coefficients as well as density matrix in the .fch(k) file are updated. For exmaple, if we provide a chiral molecule with R-chirality on the carbon center,
from mokit.lib.mirror_wfn import mirror_wfn
mirror_wfn('CHFClBr_R.fch')
the function mirror_wfn
will return a file named CHFClBr_R_m.fch
, where _m
means mirror. This new file includes new Cartesian coordinates with S-chirality, updated MO coefficients, total density matrix and spin density matrix (if it was in the CHFClBr_R.fch file).
If you start with the file CHFClBr_R.chk
, not CHFClBr_R.fch
, and you want to get the .chk file of the mirror image, you can use the function mirror_c2c
, which is a wrapper of formchk -> mirror_wfn -> unfchk
. For exemple,
from mokit.lib.mirror_wfn import mirror_c2c
mirror_c2c('CHFClBr_R.chk')
then you can directly obtain the file CHFClBr_R_m.chk
without running formchk
or unfchk
manually. If you write/create a .gjf file to read MOs from CHFClBr_R_m.chk
, the SCF will be converged in 1 cycle if you use the same computational level, because the mirror transformation here is exact.
The function rotate_atoms_wfn
generates the MO coefficients of a translated and rotated molecule. For example
from mokit.lib.mirror_wfn import rotate_atoms_wfn
rotate_atoms_wfn('h2o.fch','h2o_new.gjf')
The original geometry and MO coefficients are stored in h2o.fch
, while the geometry after translation and rotation is stored in h2o_new.gjf
. The xyz format is supported as well, e.g. you can provide h2o_new.xyz
instead. A file named h2o_r.fch
will be generated, which includes new Cartesian coordinates and MO coefficients. Again, if you read MOs from h2o_r.fch
and use the same computational level, the SCF will be converged in 1 cycle.
Note that the one-to-one atom correspondence in two files h2o.fch
and h2o_new.gjf
must be ensured by the user, this function will check whether the number of atoms in two files are equal to each other, but it will not check the atom correspondence.
The function permute_atoms_wfn
generates the MO coefficients of a molecule where some atoms are exchanged/permuted. For example
from mokit.lib.mirror_wfn import permute_atoms_wfn
permute_atoms_wfn('h2o.fch','h2o_new.gjf')
The original geometry and MO coefficients are stored in h2o.fch
, while the geometry after exchanging/permutations are stored in h2o_new.gjf
. The xyz format is supported as well, e.g. you can provide h2o_new.xyz
instead. A file named h2o_p.fch
will be generated, which includes the Cartesian coordinates of h2o_new.gjf
and generated MO coefficients. Again, if you read MOs from h2o_p.fch
and use the same computational level, the SCF will be converged in 1 cycle.
Note that only exchanging/permutations of atoms are allowed, while translation or rotation of the molecule are not allowed for this function. In the latter case, you should use rotate_atoms_wfn
above.
The function geom_lin_intrplt
generates a series of Cartesian coordinates by linear interpolation between the initial and final geometry. For example, if we have the geometries of the reactant and product in a Diels-Alder reaction, we can use this function to generate geometries connecting the reactant and product
from mokit.lib.mirror_wfn import geom_lin_intrplt
geom_lin_intrplt('DA_reactant.gjf','DA_product.gjf',7)
The integer 7 means generating seven geometries along the linear interpolation path. These geometries can be used in subsequent (un)relaxed scan or NEB calculations. Also, you may visualize the generated geometries and choose one as the initial geometry of transition state. The one-to-one atom correspondence in two files should be ensured by the user, since this function will not check that.
Methods and Keywords in autosr
4.7.1 Methods in autosr
4.7.1.1 MP2, RI-MP2
%mem=4GB
%nprocshared=2
#p MP2/cc-pVTZ
mokit{noRI}
0 1
O 0.000000 0.000000 0.062007
H 0.000000 -0.783976 -0.492052
H 0.000000 0.783976 -0.492052
Different keywords in the Title Card Line can be applied:
(1) RI-MP2: mokit{}
.
(2) MP2 NO using unrelaxed density: mokit{noRI,NO}
.
(3) MP2 NO using relaxed density: mokit{noRI,NO,relaxed_dm}
.
(4) RI-MP2 NO using unrelaxed density: mokit{NO}
.
(5) RI-MP2 NO using relaxed density: mokit{NO,relaxed_dm}
.
(6) MP2 analytical gradients: mokit{noRI,force}
.
(7) RI-MP2 analytical gradients: mokit{force}
.
4.7.1.2 CCD, CCSD, CCSD(T), CCSD(T)-F12
%mem=4GB
%nprocshared=2
#p CCSD(T)/cc-pVTZ
mokit{noRI}
0 1
O 0.000000 0.000000 0.062007
H 0.000000 -0.783976 -0.492052
H 0.000000 0.783976 -0.492052
Other cases like:
#p CCSD(T)/cc-pVTZ
mokit{noRI,CC_prog=Molpro,NO}
#p CCSD/cc-pVTZ
mokit{noRI,CC_prog=PySCF,force}
4.7.1.3 DLPNO methods
%mem=4GB
%nprocshared=2
#p DLPNO-CCSD(T1)/cc-pVTZ
mokit{}
0 1
O 0.000000 0.000000 0.062007
H 0.000000 -0.783976 -0.492052
H 0.000000 0.783976 -0.492052
Other cases like:
#p MP2/cc-pVTZ
mokit{DLPNO,NO,relaxed_dm}
#p MP2/cc-pVTZ
mokit{DLPNO,force}
4.7.1.4 EOM-CCSD, IP-EOM-CCSD, EA-EOM-CCSD
To be documented.
4.7.2 Keywords in autosr
There are quite a few keywords that do the same thing here as in automr
, as follows
readrhf
, readuhf
, Cart
, DKH2
, X2C
, charge
Search them in Section 4.4 for their usage.
4.7.2.1 Sepcify programs for each step
HF_prog
Supported programs are Gaussian/PySCF/PSI4/ORCA.
MP2_prog
Supported programs are Molpro/Gaussian/PySCF/PSI4/ORCA/QChem/OpenMolcas.
CC_prog
Supported programs are Molpro/Gaussian/PySCF/PSI4/ORCA/QChem/OpenMolcas.
feature | supported programs |
---|---|
CCSD energy | all |
CCSD unrelaxed DM | Molpro/PySCF/PSI4/ORCA/QChem |
CCSD force | Molpro/PySCF/Gaussian/PSI4/QChem |
CCSD relaxed DM | Molpro/Gaussian/PSI4/QChem |
CCSD(T) energy | all |
CCSD(T) unrelaxed DM | Molpro |
CCSD(T) force & relaxed DM | Molpro/PSI4 |
ADC_prog
To be documented.
EOM_prog
Supported programs are Molpro/Gaussian/PSI4/ORCA/QChem.
4.7.2.2 Nstates
To be documented.
4.7.2.3 RI settings
NoRI
Request to turn off the RI acceleration technique.
RIJK_bas
Specify the RI-JK auxiliary basis set.
RIC_bas
Specify the RI-C auxiliary basis set.
4.7.2.4
F12
To be documented.
F12_cabs
To be documented.
4.7.2.5 DLPNO
To be documented.
4.7.2.6
NO
This keyword requests the generation of Natural Orbitals(NOs) in a post-HF calculation. The NOs will be stored in a file like xxx_NO.fch
, which can be visualzied using GaussView or Multiwfn. By default NOs will not be generated. If you want NOs, do not write mokit{NO=True}
but simply write mokit{NO}
. See the next keyword for using relaxed or unrelaxed density to generate NOs.
Relaxed_DM
Generate NOs using the relaxed density or unrelaxed density.
4.7.2.7 Force
This keyword requests the calculation of analytical nuclear gradients.
4.7.2.8 Miscellaneous
FC
Control the number of frozen core orbitals in a post-HF calculation. By default, the number of frozen core orbitals is equal to settings in ORCA manual 9.11 Frozen Core Options
. For example, if we perform a post-HF calculation for a water molecule, FC=1
will be automatically set and the user does not need to take care of this. If you want no frozen core, write mokit{FC=0}
.
5 Examples
- 5.1 Examples of automr
- 5.2 Examples of Transferring MOs
- 5.3 Examples of frag_guess_wfn
- 5.4 ICSS and NICS
- 5.5 Manually making consistent active space
- 5.6 Generating different GVB solutions
- 5.7 Examples of dinuclear transition metal molecules
- 5.8 Examples of autosr
- 5.9 Visualization of upaired electrons
5.1 Examples of automr
5.1.1 Single point calculation of the ground state
The input file of automr
is just the Gaussian .gjf file. For example, the ground state CASSCF calculation of a stretched molecule is shown as follows (h2o.gjf)
%mem=8GB
%nprocshared=4
#p CASSCF/def2TZVP
mokit{}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
Submit the job
automr h2o.gjf >h2o.out 2>&1 &
After the calculation is accomplished, you can find in the output file that an active space CAS(4,4) is automatically determined. And the CASSCF natural orbitals (NOs) can be found in h2o_uhf_gvb4_CASSCF_NO.fch
. The calculation will also print RHF, UHF, GVB and CASCI electronic energies since they are supposed to be intermediate steps in a CASSCF job.
The GVB NOs can be found in h2o_uhf_uno_asrot2gvb4_s.fch
, in which the GVB active space contains 4 pairs of orbitals (O 2s lone pair, O 2p lone pair, and two O-H bonding orbitals as well as two O-H anti-bonding orbitals). The two pairs of O-H orbitals are of significant multireference characters, so they are automatically chosen as the active space for the further CASSCF(4,4) calculation. That is to say, by default the important GVB orbitals are used as the initial guess orbitals of CASCI/CASSCF calculations. The threshold for determining important GVB orbitals is ni >= 0.02, where ni is the occupation number of the 2nd natural orbital in a GVB pair.
If you want to use other methods instead of CASSCF, just replace the method name as GVB, CASCI, CASPT2, NEVPT2, MRMP2, MRCISD, MCPDFT, etc. To see all methods supported by automr
, run automr -h
.
If you want to use another quantum chemistry program for the CASSCF calculation (say Molpro), you need to write mokit{CASSCF_prog=Molpro}
. Similarly, you can add 'GVB_prog=QChem' to call Q-Chem to perform the GVB calculation if you wish (By default, GVB_prog=GAMESS). For all programs supported, just run automr -h
.
5.1.2 Single point calculation of an excited state
If you are interested only in one or two low-lying excited state, then the State-Specific CASSCF (SS-CASSCF) method is recommended. For example, the input file of optimizing the orbitals of the CASSCF S1 state of a water molecule is
%mem=8GB
%nprocshared=4
#p CASSCF/def2TZVP
mokit{root=1}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
where root=1
means the first excited state which has the same spin as the spin of the ground state. Several electronic states will be calculated and the S1 state can be automatically tracked by automr
, via checking the spin of each state. To calculate the S2 state, just specify mokit{root=2}
.
For T1 state, you need to write mokit{root=1,Xmult=3}
, where Xmult
means the spin multiplicity of the target excited state. Of course, we know that the T1 state can be calculated in an easier way: we can just perform a ground state calculation with spin mmultiplicity 3. So this functionality is useful for S1, S2, T2 or higher states.
5.1.3 Single point calculation of several excited states
If you are interested in many excited states, or there exists energy degeneracy among your interested states, then the State-Averaged CASSCF (SA-CASSCF) method is recommended. This method calculates several electronic states in a single shot. An example is given below
%mem=16GB
%nprocshared=8
#p CASSCF/def2TZVP
mokit{ist=5,readno='h2o_uhf_gvb4_CASSCF_NO.fch',nstates=3}
It is recommended to perform the ground state CASSCF calculation firstly, then use the CASSCF .fch file (which includes CASSCF orbitals) to perform the excited state calculations. The .fch file can be either obtained by a previous MOKIT calculation (recommended) or provided by an experienced user. Here nstates=3
means an average of S0, S1, S2 and S3 states. To mix electronic states with different spins, you need mokit{...,nstates=3,Mix_Spin}
. The output of the input file above would be like
CASCI energies after SA-CASSCF(0 for ground state): E_ex/eV fosc
State 0, E = -75.91837033 a.u. <S**2> = 0.000
State 1, E = -75.68245998 a.u. <S**2> = 0.000 6.419 0.0005
State 2, E = -75.65490596 a.u. <S**2> = 0.000 7.169 0.0158
State 3, E = -75.59084787 a.u. <S**2> = 0.000 8.912 0.0001
where the fosc
is the corresponding oscillator strength of transitions. Besides, hole NTOs and particle NTOs files are generated and stored in .fch files:
h2o_uhf_gvb4_CASSCF_NO_SA-CAS_NTO_H01.fch
h2o_uhf_gvb4_CASSCF_NO_SA-CAS_NTO_H02.fch
h2o_uhf_gvb4_CASSCF_NO_SA-CAS_NTO_H03.fch
h2o_uhf_gvb4_CASSCF_NO_SA-CAS_NTO_P01.fch
h2o_uhf_gvb4_CASSCF_NO_SA-CAS_NTO_P02.fch
h2o_uhf_gvb4_CASSCF_NO_SA-CAS_NTO_P03.fch
where H01
means the hole NTOs of S0 -> S1 transition, and P01
means the particle NTOs of S0 -> S1 transition, respectively. You can use GaussView or Multiwfn+VMD to visualize the NTOs and corresponding eigenvalues.
The default SA-CASSCF program called by MOKIT is PySCF. OpenMolcas can be an alternative choice if you like (i.e. CASSCF_prog=OpenMolcas
). To obtain the dynamic correlation, you can change CASSCF
to NEVPT2
. NEVPT2 will be performed based on each CASCI state obtained after the SA-CASSCF calculation. More advanced methods like QD-NEVPT2 and XMS-CASPT2 interfaces will be supported in the near future.
5.1.4 MRCISD+Q calculation of the ground state
The MRCISD calculations using automr
are worthy of additional explanations. There are at least 3 variants of MRCISD:
(1) uncontracted MRCISD (unc-MRCISD);
(2) internally contracted MRCISD (ic-MRCISD);
(3) fully internally contracted MRCISD (FIC-MRCISD).
All these variants are not size-consistent, therefore, the Davidson size-consistency correction is usually calculated and added into the total electronic energy (e.g. termed as FIC-MRCISD+Q). The computational cost and accuracy is unc-MRCISD > ic-MRCISD > FIC-MRCISD.
The unc-MRCISD is very expensive and usually used for benchmark of small active space of small molecules. The ic-MRCISD+Q and FIC-MRCISD+Q variants are commonly used for routine MRCISD calculations. Some input examples are shown below
(1) unc-MRCISD+Q
%mem=20GB
%nprocshared=4
#p MRCISD/TZVP
mokit{CtrType=1}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
This example will triggers UHF, GVB(4), CASSCF(4,4) and MRCISD+Q based on CAS(4,4) calculations sequentially. By default, MRCISD_prog
is OpenMolcas for unc-MRCISD and thus there is no need to write MRCISD_prog=OpenMolcas
. The +Q
correction energy is calculated by automr
and you can find related energies in the output of automr
Davidson correction= -0.00797009 a.u.
E_corr(MRCISD) = -0.17768879 a.u.
E(MRCISD+Q) = -76.11894741 a.u.
If you want to use another MRCISD_prog
, you can choose ORCA/GAMESS/PSI4/Gaussian/Dalton. But currently for Gaussian and Dalton, automr
cannot provide the +Q
correction energy. Note that you need to install HDF5-enabled OpenMolcas since automr
need to read the .h5 file in order to calculate the +Q
energy.
(2) ic-MRCISD+Q
%mem=20GB
%nprocshared=4
#p MRCISD/TZVP
mokit{CtrType=2}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
By default, MRCISD_prog
is also OpenMolcas for ic-MRCISD+Q. You can also use Molpro.
(3) FIC-MRCISD+Q
%mem=20GB
%nprocshared=4
#p MRCISD/TZVP
mokit{CtrType=3,MRCISD_prog=ORCA}
0 1
O -0.23497692 0.90193619 -0.068688
H 1.26502308 0.90193619 -0.068688
H -0.73568721 2.31589843 -0.068688
For FIC-MRCISD+Q, the only option of MRCISD_prog
is ORCA.
The definition of size-consistency correction is not unique, and automr
adopts the simplest one: \( E_{+Q} = E_{corr}(1-{c_0}^{2}) \). If you are interested in other types of size-consistency correction, you are referred to this paper. If you find your result is sensitive to the type of size-consistency correction, maybe MRCISD+Q is not appropriate for your molecule, and you may want to try more advanced methods like FIC-MRCCSD.
5.1.4 Acceleration techniques
To speed up the calculation for a large molecule and/or a large basis set, one can add RI
to enable the Resolution-of-Identity (RI) techniques whenever possible. For example,
%mem=16GB
%nprocshared=16
#p CASSCF(6,6)/cc-pVTZ
mokit{RI}
0 1
[Cartesian coordinates of benzene]
The RI-JK will be enabled in the CASSCF calculation and the auxiliary basis set are automatically dealt with.
To perform the NEVPT2 calculation for a large molecule, you can use the local correlation method DLPNO-NEVPT2 in ORCA, e.g.
%mem=96GB
%nprocshared=48
#p NEVPT2(6,6)/cc-pVTZ
mokit{CASSCF_prog=ORCA,NEVPT2_prog=ORCA,DLPNO}
...
Note that RI
is useless for DMRG related calculations.
5.2 Examples of Transferring MOs
See examples in $MOKIT_ROOT/examples/utilities/
.
5.3 Examples of frag_guess_wfn
5.3.1 Construct initial guess wave function from fragments
To be added.
5.3.2 Morokuma-EDA
The input file of H2O-NH3 complex is shown below
%mem=32GB
%nprocshared=16
#p RHF/def2TZVP guess(fragment=2)
{morokuma}
0 1 0 1 0 1
N(fragment=1) -1.67861672 -0.49374924 -0.00513612
H(fragment=1) -1.39223602 -1.00824506 0.80312136
H(fragment=1) -1.24821305 0.40883590 0.00449440
H(fragment=1) -1.40093406 -0.98667606 -0.82970600
O(fragment=2) 0.68224018 0.74345713 0.00303850
H(fragment=2) 1.37946396 0.10226230 -0.15300059
H(fragment=2) 0.95548613 1.60139824 -0.32993853
This is clearly a Gaussian .gjf file, and it can be opened by GaussView to show fragments:

(click Tools
, Atom Groups
in the menu of GaussView)
The order of fragments is the same as that required in GAMESS. That is to say, you must define atoms in the order of monomer 1, monomer 2, monomer 3, etc. To conveniently add (fragment=N)
in each row, you may need some advanced text editor like Sublime Text 3, UltraEdit, VSCode, etc.
In the Title Card line, {morokuma}
is written to tell the utility frag_guess_wfn
you want the input file of Morokuma-EDA. Currently Morokuma-EDA in GAMESS has several restrictions:
(1) Only the RHF method can be used. ROHF, UHF or any DFT methods cannot be used. Both the total system and every fragment should be treated with RHF. This means broken bonds cannot be dealt with.
(2) Spherical harmonic functions cannot be used. The utility frag_guess_wfn
will automatically switch to Cartesian fucntions (i.e. 6D 10F) when calling Gaussian to perform SCF computations.
Assuming the filename is nh3_h2o.gjf
, you simply need to run
frag_guess_wfn nh3_h2o.gjf >nh3_h2o.out 2>&1
Since the molecule is small, you can obtain a file named nh3_h2o.inp
in seconds. MOs of two fragments will be written into this file. You can submit the file nh3_h2o.inp
to GAMESS for Morokuma-EDA computations. All SCF of fragments will be converged in 1-2 cycles. The initial guess of orbitals of the total system (i.e. supermolecule) is not written in .inp file, but constructed from fragments within GAMESS, and it is supposed to be converged soon.
Simply replacing {morokuma}
by {gks}
means you want the input file of GKS-EDA. UHF and UDFT methods can be applied in GKS-EDA. Thus it is powerful, especially when dealing with biradical/diradical system, where the symmetry broken UDFT calculations are necessary. See the following section for more examples.
5.3.3 GKS-EDA
To perform GKS-EDA calculations, you need the GAMESS program and xeda-patch script. After you download the required packages, use the xeda-patch script to modify the GAMESS source code. And then compile GAMESS in a usual way. Here is a tutorial written in Chinese 《GKS-EDA计算简介》.
Two examples are shown bloew to illustrate how to use the utility frag_wfn_guess
. The first example: [Ti(H2O)6]3+ cation
%mem=32GB
%nprocshared=16
#p UPBE1PBE/gen guess(fragment=2)
{gks}
3 2 3 2 0 1
Ti(fragment=1) 0.00000000 0.00000000 0.00000000
H(fragment=2) 0.00000000 2.68428109 -0.78751709
H(fragment=2) 2.68128062 0.78751744 0.00000000
H(fragment=2) 0.00000000 -2.68428109 0.78751709
H(fragment=2) 0.78751709 0.00000000 -2.68228109
H(fragment=2) -2.68128062 0.78751744 0.00000000
H(fragment=2) -0.78751709 0.00000000 2.68228109
O(fragment=2) 0.00000000 2.10100000 0.00000000
H(fragment=2) 0.00000000 2.68428109 0.78751709
O(fragment=2) 0.00000000 0.00000000 2.09900000
H(fragment=2) 0.78751709 0.00000000 2.68228109
O(fragment=2) 0.00000000 0.00000000 -2.09900000
H(fragment=2) -0.78751709 0.00000000 -2.68228109
O(fragment=2) 2.09800000 0.00000000 0.00000000
H(fragment=2) 2.68128062 -0.78751744 0.00000000
O(fragment=2) -2.09800000 0.00000000 0.00000000
H(fragment=2) -2.68128062 -0.78751744 0.00000000
O(fragment=2) 0.00000000 -2.10100000 0.00000000
H(fragment=2) 0.00000000 -2.68428109 -0.78751709
Ti 0
def2TZVP
****
H O 0
def2SVP
****
You can use the basis set def2TZVP
for all atoms for this small system. Here is just an example to show that mixed basis set, user-defined basis set or ECP/PP are supported. REMEMBER to type at least 2 blank lines after the basis set data. Assuming the filename is Ti.gjf
, run
frag_guess_wfn Ti.gjf >Ti.out 2>&1 &
to submit the job to the current machine/node.
If the current node cannot be used for computation, for example, you need to submit the job to a queue on a Cluster, then you need to write a script. Here is an example of SLURM script run.sh
#!/bin/bash
#SBATCH -J eda
#SBATCH -e eda.err.%j
#SBATCH -p small_s # queue name
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -c 16
#SBATCH --mem=32GB
#SBATCH --exclusive
frag_guess_wfn Ti.gjf >Ti.out 2>&1
You can run
sbatch run.sh
to submit the job. You can also upload the Ti.inp
to the XACS platform and use their computers.
This example may take about 20 min to generate the desired .inp file. Five SCF jobs will be performed in succession:
(1) monomer 1 in its own basis;
(2) monomer 2 in its own basis;
(3) monomer 1 in all basis;
(4) monomer 2 in all basis;
(5) supermolecule in all basis.
frag_guess_wfn
will automatically construct the initial guess for job (5) using MOs from job (1) and (2). This will save some time. After all 5 jobs normally terminated, the utility fch2inp
will be automatically called to transfer 5 sets of MOs from Gaussian .fch files into one GAMESS .inp file. Proper keywords (including DFT name in DFTTYP, solvent name and radii of solute atoms in $PCM, etc) have been written in the generated .inp file. Usually there is no need to modify this file unless you have other requirements.
For HF methods, all 5 SCF jobs during GKS-EDA will converge in 1-2 cycles. While for UDFT, they may take about 10 cycles since the DFT integral grids (and functional implementation details possibly) in GAMESS are not identical to those in Gaussian. Note that when using DFT methods, grid settings $DFT NRAD=99 NLEB=590 $END
will be automatically added into the .inp file, to mimic the int=ultrafine
in Gaussian. If implicit solvent model is also applied, each SCF will take more cycles, since implementation details in GAMESS have differences with those in Gaussian (but the energy usually steadily decreases).
You can define an environment variable in your ~/.bashrc
for convenience of running XEDA, e.g.
export XEDA=$HOME/software/xeda/rungms
Again, if you are using a script to submit any job, you need to write the definition into your script. Once the calculations are accomplished, you can simply run
$XEDA Ti.inp 00 16 >Ti.gms 2>&1 &
to perform the GKS-EDA calculation, where the 16 is the number of CPU cores for parallel computation.
Then we come to the second example: the ethane molecule. Assuming we want to see the interaction of two \( \cdot \)CH3 radicals, such fragmentation will involve breaking a covalent bond, which leads to an alpha unpaired electron in one \( \cdot \)CH3 radical and a beta unpaired electron in another radical. In this case we need to specify negative spin in the input file, which represents the beta spin:
%mem=4GB
%nprocshared=4
#p UHF/cc-pVTZ guess(fragment=2)
{gks}
0 1 0 2 0 -2
C(fragment=1) -3.42837266 0.18776078 0.00000000
H(fragment=1) -3.07171823 -0.82104923 0.00000000
H(fragment=1) -3.07169982 0.69215897 -0.87365150
H(fragment=1) -4.49837266 0.18777396 0.00000000
C(fragment=2) -2.91503044 0.91371705 1.25740497
H(fragment=2) -1.84503225 0.91200998 1.25838332
H(fragment=2) -3.27008700 1.92309006 1.25642758
H(fragment=2) -3.27329940 0.41044905 2.13105517
It is equivalent to interchange 2/-2. The coupling of an alpha unpaired electron with a beta unpaired electron leads to the total spin singlet. The GKS-EDA supports up to 20 fragments. For convenience only examples containing two fragments are shown above.
5.3.4 LMO-EDA
The LMO-EDA calculations can be performed using HF, MP2, CCSD or CCSD(T) methods for monomers and the complex. This method is used less frequently than GKS-EDA nowadays. If you want to use HF or MP2, then maybe GKS-EDA is a better choice (since DFT is computationally economic). If you want to use CCSD or CCSD(T), the desired computation may be too time-consuming. Anyway, here is one input example of frag_guess_wfn
for LMO-EDA:
%mem=16GB
%nprocshared=8
#p CCSD/cc-pVTZ guess(fragment=2)
{LMO}
0 1 0 1 0 1
O(fragment=1) -1.90937854 1.06610853 -0.05598828
H(fragment=1) -2.23867796 0.17728177 0.09615925
H(fragment=1) -0.94953286 1.05932020 -0.04017097
N(fragment=2) -2.02042680 -1.64298230 0.34416157
H(fragment=2) -2.28323594 -1.98171649 -0.55927105
H(fragment=2) -2.67940800 -1.96304329 1.02482651
H(fragment=2) -1.10944265 -1.98408759 0.57601295
frag_guess_wfn
will call Gaussian to perform HF calculations for monomers and the complex. That is to say, currently only the SCF steps would be accelerated in GAMESS. The CCSD amplitude iteration is not accelerated.
5.3.5 SAPT0
frag_guess_wfn
is also able to generate the input file of SAPT0 computation in PSI4. Currently only SAPT0 (and sSAPT0) is supported. More advanced methods like SAPT2+ and SAPT2+(3)\( \delta \)MP2 will be supported in the future. Similarly, frag_guess_wfn
will first call the Gaussian to perform HF/DFT computations and then generate the input file of SAPT job for PSI4. Be aware that SAPT cannot deal with strong interactions come from two fragments by breaking covalent bonds (in that case you should use the GKS-EDA method). Here is an SAPT example:
%mem=16GB
%nprocshared=8
#p HF/jun-cc-pVDZ guess(fragment=2)
{sapt,bronze}
0 1 0 1 0 1
O(fragment=1) -1.90937854 1.06610853 -0.05598828
H(fragment=1) -2.23867796 0.17728177 0.09615925
H(fragment=1) -0.94953286 1.05932020 -0.04017097
N(fragment=2) -2.02042680 -1.64298230 0.34416157
H(fragment=2) -2.28323594 -1.98171649 -0.55927105
H(fragment=2) -2.67940800 -1.96304329 1.02482651
H(fragment=2) -1.10944265 -1.98408759 0.57601295
Using frag_guess_wfn
to generate the sSAPT0 input file with MOs:
frag_guess_wfn h2o_nh3.gjf >h2o_nh3.log 2>&1 &
If the job above is accomplished successfully, you will find the following information in the output file h2o_nh3.log:
...
i= 1, frags(i)%e = -76.037065444, frags(i)%ssquare= 0.00
i= 2, frags(i)%e = -56.203022091, frags(i)%ssquare= 0.00
i= 3, frags(i)%e = -132.241272191, frags(i)%ssquare= 0.00
All SCF computations finished. Generating PSI4 input file...
Done. Now you can submit file h2o_nh3.inp to PSI4 (DO NOT delete *.A and
*.B files) like
-------------------------------------------------------------------------------
psi4 h2o_nh3.inp h2o_nh3.out -n 8 &
-------------------------------------------------------------------------------
...
And there should be 8 files in the current directory:
001-h2o_nh3.A, 002-h2o_nh3.A, 003-h2o_nh3.A, 003-h2o_nh3.fch, 003-h2o_nh3.log,
h2o_nh3.gjf, h2o_nh3.inp, h2o_nh3.log
The converged MOs are stored in *.A
files. If UHF-based SAPT calculations are performed, there should also be *.B
files generated. Please do not delete these orbital files since they will be needed in SAPT calculations. Now submit the generated h2o_nh3.inp to PSI4:
psi4 h2o_nh3.inp h2o_nh3.out -n 8
Both of SAPT0 and sSAPT0 results can be found in PSI4 output:
...
Total HF -1.16949611 [mEh] -0.73386989 [kcal/mol] -3.07051162 [kJ/mol]
Total SAPT0 -5.33584033 [mEh] -3.34829036 [kcal/mol] -14.00924687 [kJ/mol]
Special recipe for scaled SAPT0 (see Manual):
Electrostatics sSAPT0 -24.50676069 [mEh] -15.37822450 [kcal/mol] -64.34249132 [kJ/mol]
Exchange sSAPT0 31.00203914 [mEh] 19.45407327 [kcal/mol] 81.39584256 [kJ/mol]
Induction sSAPT0 -7.07990345 [mEh] -4.44270649 [kcal/mol] -18.58828396 [kJ/mol]
Dispersion sSAPT0 -4.06339883 [mEh] -2.54982126 [kcal/mol] -10.66845216 [kJ/mol]
Total sSAPT0 -4.64802383 [mEh] -2.91667899 [kcal/mol] -12.20338488 [kJ/mol]
The SAPT paper (JCP 140, 094106 (2014)) recommends three levels of theory to be used:
(1) bronze: sSAPT0/jun-cc-pVDZ;
(2) silver: SAPT2+/aug-cc-pVDZ;
(3) gold: SAPT2+(3)\( \delta\)MP2/aug-cc-pVTZ.
So far the open shell calculations have not been implemented for silver and gold level in PSI4 (it is OK for sSAPT0). Currently only the bronze level is supported in frag_guess_wfn
. {sapt}
or {sapt,bronze}
in the Title Card line will be recognized as the bronze level. You are always recommended to use jun-cc-pVDZ
unless there is some element which is out of the range of jun-cc-pVDZ (in that case def2TZVP is recommended).
5.3.6 Tricks for accelerations
For some simple fragments (water molecules, organic ligands, etc), if you are sure that they have closed-shell wave function, you can append a character r
after the Cartesian coordinates of any atom of the fragment. Again, taking the [Ti(H2O)6]3+ cation as the example
%chk=Ti.chk
%mem=32GB
%nprocshared=16
#p UPBE1PBE/gen guess(fragment=2)
{gks}
3 2 3 2 0 1
Ti(fragment=1) 0.00000000 0.00000000 0.00000000
H(fragment=2) 0.00000000 2.68428109 -0.78751709 r
H(fragment=2) 2.68128062 0.78751744 0.00000000
... (not shown)
This will force the use of R(O)HF calculation for this fragment, thus saving computation time. After R(O)HF computation of this fragment is done, beta MOs will be directly copied from alpha MOs, so the net result is that it appears to perform a UHF computation, but with only R(O)HF cost.
If the r
is not written, UHF and wave function stability test will be performed for this fragment. In this special case we know that UHF is unnecessary for this fragment.
This trick accelerates computations only when UHF/UDFT is required for the supermolecule so that we can force some fragment(s) to use closed-shell methods. If restricted HF or DFT methods are required for the supermolecule, this trick will not reduce computational time.
5.3.7 Other notes
(1) If you want to use the implicit solvent model, PCM is strongly recommended instead of SMD.
(2) Do not use scalar relativistic Hamiltonian like DKH2 since it has not been tested by GKS-EDA users.
(3) An exception of basis set is that definition for each atom tag is unsupported. For example, the following definition
1 0
def2TZVP
****
2 0
def2SVP
****
is currently NOT supported for frag_guess_wfn
.
5.4 ICSS and NICS
5.4.1 ICSS of the ground state of cyclobutadiene
If you are not familiar with ICSS(iso-chemical-shielding surfaces), you are recommended to read Sobereva's blog 《通过Multiwfn绘制等化学屏蔽表面(ICSS)研究芳香性》. That blog contains necessary information of ICSS using HF/DFT methods.
Here we try to compute ICSS using the CASSCF method. The gjf file of the following
example can be found at $MOKIT_ROOT/examples/automr/16-C4H4.gjf
. For convenience,
here is the content:
%mem=64GB
%nprocshared=16
#p CASSCF(4,4)/6-31G(d)
mokit{ICSS}
0 1
C 0.00000000 0.00000000 0.00000000
C 0.00000000 0.00000000 1.34900000
C 1.56200000 0.00000000 1.34894100
C 1.56200000 -0.00023900 -0.00005900
H -0.76022400 -0.00000200 -0.76288200
H -0.76022400 0.00013500 2.11188200
H 2.32222400 0.00000000 2.11182300
H 2.32222400 -0.00037400 -0.76294100
Step 1. Submit the automr job
Run
automr 16-C4H4.gjf >16-C4H4.out 2>&1 &
This computation would take about 11 min. For larger molecules, the computation
is very time-consuming and it might take one day or even more time. You are recommended
to set the memory as large as possible. The basis set 6-31G(d) or 6-31+G(d)
is recommended since larger basis set would cost too much computational time. Check
and make sure the active space is smaller than (15,15), since ICSS is not supported
using the DMRG method. After the computation is accomplished, there would be a
file named 16-C4H4_uhf_gvb10_CASSCF_ICSS.cub
. This file can be visualized by
GaussView, Multiwfn or VMD.
Step 2. Open the .cub file with GaussView
You can simply drag the .cub file into GaussView. Next, click the Results
->
Surfaces/Contours
, set the Density
to an appropriate value, e.g. 0.2. Finally,
click Surface Actions
-> New Surface
. Then a figure like the following picture
would be shown on your screen.

5.4.2 ICSS of S1 state of cyclobutadiene
Assuming you've accomplished the ICSS job of the ground state CASSCF. Now you can study the ICSS at some excited state. Currently only excited states with the same spin as the ground state are supported. For example, if the ground state is S0, then Sn (n>=1) electronic excited states are supported. If you are interested in T1 state, you should read Section 5.4.1, in which you can set the spin multiplicity as triplet in the input file. Tn (n>=2) states are not supported currently. An input example (e.g. s1.gjf) for S1 state is shown below:
%mem=64GB
%nprocshared=16
#p CASSCF(4,4)/6-31G(d)
mokit{ist=5,readno='16-C4H4_uhf_gvb10_CASSCF_NO.fch',Root=1,ICSS}
where the file 16-C4H4_uhf_gvb10_CASSCF_NO.fch
was generated in the ground state
CASSCF job. Root=1
stands for the first excited state. It is recommended to copy
the file 16-C4H4_uhf_gvb10_CASSCF_NO.fch
into another directory and then submit
the excited state calculation since many files will be generated during computation.
Submit the S1 state job
automr s1.gjf >s1.out 2>&1 &
This job will take about 11 min as well. There would be a file named
16-C4H4_uhf_gvb10_CASSCF_NO_CASSCF_ICSS.cub
. Open it with GaussView (Density
isovalue=0.2), you will see

Of course, you can use other isovalues for better visualization, but better to use the same isovalue in a paper.
5.4.3 NICS of cyclobutadiene
If you think ICSS computation is too time-consuming, then NICS is a good choice. It usually takes only several seconds or minutes to obtain one or two points of NICS values. Again, we use the cyclobutadiene as the example. The input file is
%mem=32GB
%nprocshared=16
#p CASSCF(4,4)/def2TZVP
mokit{NMR}
0 1
C 0.00000000 0.00000000 0.00000000
C 0.00000000 0.00000000 1.34900000
C 1.56200000 0.00000000 1.34894100
C 1.56200000 -0.00023900 -0.00005900
H -0.76022400 -0.00000200 -0.76288200
H -0.76022400 0.00013500 2.11188200
H 2.32222400 0.00000000 2.11182300
H 2.32222400 -0.00037400 -0.76294100
Since NICS computation is cheap, you can use a larger basis set like def2TZVP, rather
than 6-31G(d). Note that do not write NICS
keyword in the input file, NMR
is enough.
Firstly, you should submit this file to automr
. Now assuming you've accomplished the
CASSCF computation, and next you want to compute NICS(1)ZZ. Just open the file
16-C4H4_uhf_gvb10_CASSCF_NMR.mol
and add the Cartesian coordinates of dummy atom Bq:
ATOMBASIS
generated by utility bas_gms2dal in MOKIT
Basis set specified with gen
AtomTypes=9 Integrals=1.0D-14 Charge=0 NoSymmetry Angstrom
Charge=6. Atoms=1 Basis=INTGRL Blocks=3 1 1 1
... (abbreviated for brevity)
0.8000000000 1.0000000000
Charge=0. Atoms=1 Basis=INTGRL Ghost
Bq 0.7809234989 -1.0000597432 0.6745590841
What you need to do is two things: (1) modify the AtomTypes from 8 to 9 (because we add one Bq atom); (2) add coordinates of the Bq dummy atom in the end of the file. The modification is shown in bold text above. The coordinates of Bq can be obtained using Multiwfn, see examples. Then submit this job to Dalton, e.g. running the following commands in Shell:
dalton -gb 16 -omp 16 -put "SIRIUS.RST" -noarch -ow 16-C4H4_uhf_gvb10_CASSCF_NMR &
This only takes 1-2 minutes. After it is finished, open the file 16-C4H4_uhf_gvb10_CASSCF_NMR.out
and search @1 Bq
, you can find the chemical shielding of this Bq atom is -11.9931
ppm. Remember that NICS(1)ZZ is the negative of the chemical shielding,
i.e. 11.9931 ppm.
5.4.4 Reproduction of C9H9+ cation ICSS in a paper
Now let's look at a real example: the C9H9+ cation in a paper. Ground and excited state aromaticity are discussed in this paper, in a theoretical pespective. The authors used the basis set 6-311+G(d), but here we only want to do some homework, not to publish a paper, so we use 6-31G(d) for a quick check.
The Cartesian coordinate can be found in Supporting Information of the paper. Here
the automr
input file is
%mem=100GB
%nprocshared=48
#p CASSCF/6-31G(d)
mokit{GVB_conv=5d-4,ICSS}
1 1
C 0. 0. 1.645014194
C 0.4905022324 1.2427684773 1.1554414148
C -0.4905022324 -1.2427684773 1.1554414148
C 0.0100585215 2.0928052569 0.1040785875
C -0.0100585215 -2.0928052569 0.1040785875
C -0.8152087457 1.6410894083 -0.9354914584
C 0.8152087457 -1.6410894083 -0.9354914584
C -0.6569284614 0.2832859329 -1.251946014
C 0.6569284614 -0.2832859329 -1.251946014
H 0. 0. 2.7459418542
H 1.0699783805 1.7858874614 1.9137457041
H -1.0699783805 -1.7858874614 1.9137457041
H 0.12964425 3.1677255378 0.2859658405
H -0.12964425 -3.1677255378 0.2859658405
H -1.5936377795 2.2790433661 -1.3665504299
H 1.5936377795 -2.2790433661 -1.3665504299
H -1.490812806 -0.3325910524 -1.6109751404
H 1.490812806 0.3325910524 -1.6109751404
Note that this molecule is non-planar/twisted, so GVB may converge to a solution
with \( \sigma \) - \( \pi \) orbitals mixed. The keyword GVB_conv=5d-4
is
used to make GVB converge early, which usually lead to a \( \sigma \) - \( \pi \)
separated solution. We can further look into the GVB or CASSCF natural orbitals
during computation, just for a double check:

(Only bonding orbitals of 4 pairs with largest multiconfigurational characters are
shown. Isovalue=0.04)

(Only 4 active orbitals with largest occupation numbers in the active space are shown. Isovalue=0.04)
One can see the active orbitals seem to be \( \pi \) orbitals. You may notice that
automr
recommends a CASSCF(8e,8o) calculation, not the (8e,9o) used by the authors.
But the difference of 1 virtual orbital is not important. This job will take about
6.5 h. The ICSS plot of C9H9+ is shown below
(isovalue=16, the same as that in paper)

("isotropic shielding isosurfaces" called in the paper)
which is similar to the corresponding picture in the
paper. Because automr
uses
coarse grids for ghost atoms by default, and further render (e.g. using VMD) has
not been performed, the plot may not be as beautiful as that in the paper.
5.5 Manually making consistent active space
The active space automatically determined by automr
of MOKIT may not be the same
size for cases involving chemical reactions, different local minima, or different
spin states of the same geometry. Or although the active space size remains unchanged,
but the active orbitals become substantially different among different geometries.
In such cases, if we want to obtain an accurate relative energy, we should pay
attention to the following two points:
(1) Different species are required to have the same size of active space.
(2) Active orbitals of different species are supposed to contain the same main components.
Point (2) is merely recommended by the developer jxzou, not a rule which you must obey. Taking the singlet-triplet gap of a molecule as an example: if MOKIT automatically determined/recommends the active space of singlet to be CAS(2,2), but triplet to be CAS(4,4), then it would be unfair to directly compare the energies of CASSCF(2,2) v.s. CASSCF(4,4). Of course, it would be unfair to simply compare NEVPT2 energies (based on corresponding CASSCF), either. In such cases, the author recommends two approaches:
(i) If we find two orbitals in CAS(4,4) of triplet to be not active, i.e. occupation numbers close to 2 and 0, respectively, we can read CAS(4,4) orbitals, move these two orbitals out of the active space, and perform a CAS(2,2) computation. Then active space size of two spin states are the same.
(ii) If there are similar active orbitals but they are too active to be moved out of the active space, or if we cannot find any similar active orbital between two active space. We should consider to enlarge the size of CAS(2,2) for singlet. The extension target can be CAS(4,4), or can be even larger, depending on the similar extent of two active space.
Obviously the approaches above correspond to intersection set and union set in mathematics. Currently these two approaches have not been implemented in MOKIT, since it requires a lot of techniques to make them become automatic and black-box. Below I will provide two examples to illustrate how to manually make two active space consistent. For comparison of active space from more than two species, the spirit is identical.
5.5.1 The Diels-Alder reaction of 1,3-butadiene and ethane
To be added.
5.5.2 Singlet-Triplet gap of an iron-containing molecule
To be added.
5.5.3 Find a C-C bond for an equilibrium geometry
If one studies the C-C bond dissociation in ethanol using multireference computations
conducted by automr
in MOKIT, he/she may find that when the bond is stretched,
automr
usually provide desired CAS(2,2) results. However, for the equilibrium
geometry (or near that), automr
may provide a CAS(2,2) which does not contain
the desired chemical bond. This is because there is no multireference character
for the equilibrium geometry, automr
cannot locate the chemical bond as desired.
In fact, in such case the word 'desired' has different meanings for different users.
Some other users may want the CAS(2,2) to contain the C-O bond, and some users may
want the C-H bond.
Here I will show how to swap two orbitals to get desired results. Using the ethanol as an example, assuming that we have perform a simple CASPT2(2,2)/cc-pVDZ calculation, the input file ethanol.gjf is shown below
%mem=4GB
%nprocshared=4
#p CASPT2(2,2)/cc-pVDZ
mokit{CASPT2_prog=ORCA}
0 1
C 0.00000000 0.00000000 0.00000000
H 0.00000000 0.00000000 1.09270400
H 1.03726900 0.00000000 -0.34698400
H -0.48033400 0.91954100 -0.34310600
C -0.73649800 -1.21494100 -0.53167000
H -0.24338800 -2.13583800 -0.19123400
H -0.72503400 -1.21412700 -1.63028700
O -2.08207000 -1.16197400 -0.04736100
H -2.56286600 -1.92666800 -0.37754100
And we find that the HONO and LUNO of CASSCF(2,2) correspond to the C-O bond

If we want it to be the C-C bond, we should swap/exchange orbitals in GVB NOs since they are the initial guess of CASSCF. To accomplish that, we start python
from mokit.lib.gaussian import permute_orb
permute_orb('ethanol_rhf_proj_loc_pair2gvb8_s.fch',6,13)
permute_orb('ethanol_rhf_proj_loc_pair2gvb8_s.fch',14,21)
exit()
The orbital indices 13 and 14 are just HONO and LUNO of GVB NOs, while the 6 and 21 are the C-C bonding and anti-bonding orbitals we want. To know the orbital indices, you should visualize MOs in advance. Now the file ethanol_rhf_proj_loc_pair2gvb8_s.fch is updated. We use it as the initial guess of CASSCF(2,2), i.e. submit the job
python ethanol_rhf_gvb8_CASSCF.py >ethanol_rhf_gvb8_CASSCF.out 2>&1
then we re-open the file ethanol_rhf_gvb8_CASSCF_NO.fch and check whether the HONO and LUNO now correspond to the C-C bond. Finally, we update the .gbw file and re- submit the CASPT2 job
fch2gbw ethanol_rhf_gvb8_CASSCF_NO.fch ethanol_rhf_gvb8_CASSCF_CASPT2.gbw
orca ethanol_rhf_gvb8_CASSCF_CASPT2.inp >ethanol_rhf_gvb8_CASSCF_CASPT2.out 2>&1
where running the fch2gbw
utility is to update the file ethanol_rhf_gvb8_CASSCF_CASPT2.gbw
since CASSCF orbitals are stored in this file. Note that you should find CASPT2
energy in file ethanol_rhf_gvb8_CASSCF_CASPT2.out by yourself. It is not in automr
output file currently.
5.6 Generating different GVB solutions
This section is for experienced GVB users. For conjugated molecules, there often exists multiple GVB solutions.
5.6.1 \( \sigma \) - \( \pi \) separated v.s. \( \sigma \) - \( \pi \) mixed solution
By default, the PM (Pipek-Mezey) localization method is used in constructing the initial guess of GVB orbitals. For example, if we calculate a benzene molecule,
%mem=48GB
%nprocshared=48
#p CASSCF/cc-pVDZ
mokit{}
0 1
C -0.90252711 0.46329723 0.00000000
C 0.49263289 0.46329723 0.00000000
C 1.19017089 1.67104823 0.00000000
C 0.49251689 2.87955723 -0.00119900
C -0.90230811 2.87947923 -0.00167800
C -1.59990911 1.67127323 -0.00068200
H -1.45228611 -0.48901977 0.00045000
H 1.04214089 -0.48921577 0.00131500
H 2.28985089 1.67112823 0.00063400
H 1.04271689 3.83170023 -0.00125800
H -1.45243011 3.83176023 -0.00263100
H -2.69951311 1.67145623 -0.00086200
we will obtain GVB(15) and CASSCF(6,6) natural orbitals, in which we can find 3 \( \pi \) bonding orbitals and 3 \( \pi \)* anti-bonding orbitals:
But if we deliberately change the orbital localization method as the Boys method:
mokit{LocalM=Boys}
then the \( \sigma \) - \( \pi \) orbitals are mixed during orbital localization,
and the GVB will converge to a lower-energy solution, but with \( \sigma \) - \( \pi \)
orbitals mixed. And we will probably get the complain information from automr
There is no active orbital/electron. AutoMR terminated.
The reason is this molecule has little multi-configurational or multi-reference
character...
This is because in the lower-energy GVB solution, there are no pure \( \pi \) orbitals, and the corresponding pair coefficients are totally different
CICOEF( 1)= 0.99795122275633, -0.06397934822390
CICOEF( 3)= 0.99795102014144, -0.06398250853672
CICOEF( 5)= 0.99794887464558, -0.06401596358426
CICOEF( 7)= 0.99608018865066, -0.08845483467657
CICOEF( 9)= 0.99607885713120, -0.08846982749052
CICOEF( 11)= 0.99607599571862, -0.08850203812992
CICOEF( 13)= 0.99602509688574, -0.08907303954478
CICOEF( 15)= 0.99602503059542, -0.08907378080778
CICOEF( 17)= 0.99602141489628, -0.08911420239227
CICOEF( 19)= 0.99591536359976, -0.09029168591829
CICOEF( 21)= 0.99591522848545, -0.09029317621383
CICOEF( 23)= 0.99591456661139, -0.09030047625144
CICOEF( 25)= 0.99591440686394, -0.09030223807220
CICOEF( 27)= 0.99591436826541, -0.09030266376197
CICOEF( 29)= 0.99591319014626, -0.09031565585597
It shows the symmetry of 3 C-C \( \sigma \) bonds, 6 C-H \( \sigma \) bonds,
and 6 \( \sigma \) - \( \pi \) mixing bonds ("banana" bonds). These pair coefficients
are of less multi-reference characteristics so that automr
thinks there is no
strongly correlated orbitals to be picked for CASSCF calculations.
5.6.2 closed-shell-like v.s. diradical-like solution
To be added.
5.7 Examples of dinuclear transition metal molecules
5.7.1 [Fe2(OH)2(H2O)8]4+
I downloaded the geometry from some unknown website, so before performing multi- reference computations, the geometry optimization is necessary. If you obtain the geometry from crystal data, you may only need to optimize the positions of hydrogen atoms. If you get the geometry from previous computational literature (with reasonable computations therein), geometry optimization may be unnecessary.
Assuming we are studying the lowest antiferromagnetic singlet, the Gaussian input file (.gjf) for geometry optimization is shown below
%chk=Fe2_ini.chk
%mem=200GB
%nprocshared=48
#p UBP86/TZVP nosymm guess(fragment=5) scf(xqc,maxcycle=500)
title
4 1 3 6 3 -6 -1 1 -1 1 0 1
Fe(Fragment=1) 0.56990000 -0.61710000 -0.77260000
Fe(Fragment=2) 2.46140000 -0.90030000 -3.11630000
O(Fragment=5) 0.83530000 1.50450000 -0.94550000
O(Fragment=3) 2.42280000 -0.91110000 -0.96240000
O(Fragment=5) 0.24940000 -2.71760000 -0.51100000
O(Fragment=5) -1.54430000 -0.30870000 -0.63950000
O(Fragment=5) 0.63850000 -0.41510000 1.35780000
O(Fragment=4) 0.58860000 -0.83710000 -2.91460000
H(Fragment=5) 1.26180000 1.66120000 -1.77480000
H(Fragment=5) -0.63350000 -2.89290000 -0.78630000
H(Fragment=5) -1.70840000 0.57350000 -0.92580000
H(Fragment=5) 0.18020000 0.37680000 1.58020000
H(Fragment=5) 1.41960000 1.77430000 -0.26000000
H(Fragment=3) 2.55880000 -1.80250000 -0.69740000
H(Fragment=5) 0.30060000 -2.88260000 0.41550000
H(Fragment=5) -1.77780000 -0.38080000 0.26940000
H(Fragment=5) 1.54020000 -0.32870000 1.61230000
H(Fragment=4) 0.27980000 -1.70780000 -3.08960000
O(Fragment=5) 2.57760000 1.24250000 -3.13070000
O(Fragment=5) 4.59560000 -0.97250000 -3.27840000
O(Fragment=5) 2.39960000 -3.04000000 -3.18650000
O(Fragment=5) 2.40100000 -0.88560000 -5.25630000
H(Fragment=5) 1.50560000 -0.76300000 -5.51910000
H(Fragment=5) 2.41410000 1.52980000 -4.01150000
H(Fragment=5) 4.91230000 -1.68440000 -2.75140000
H(Fragment=5) 3.25730000 -3.36430000 -2.97650000
H(Fragment=5) 3.46380000 1.46430000 -2.89930000
H(Fragment=5) 4.80310000 -1.14140000 -4.18100000
H(Fragment=5) 2.20300000 -3.27960000 -4.07670000
H(Fragment=5) 2.91120000 -0.15440000 -5.55730000
--Link1--
%chk=Fe2_ini.chk
%mem=200GB
%nprocshared=48
#p UBP86 chkbasis nosymm guess=read geom=allcheck scf(xqc,maxcycle=500) stable=opt
--Link1--
%chk=Fe2_ini.chk
%mem=200GB
%nprocshared=48
#p opt=calcfc freq UBP86 chkbasis nosymm guess=read geom=allcheck scf(xqc,maxcycle=500)
--Link1--
%chk=Fe2_ini.chk
%mem=200GB
%nprocshared=48
#p UBP86 chkbasis nosymm guess=read geom=allcheck stable=opt
This is a 4-step job. The 1st step sets up the fragment information of 5 fragments. This is because we need the antiferromagnetic singlet (5 alpha electrons for Fe1 and 5 beta electrons for Fe2) solution. Such an SCF solution is often difficult to be obtained using conventional HF/DFT initial guess, so here we use the fragment guess.
The 2nd step reads MOs from previous step, converges the UBP86 orbitals and checks the stability of UBP86 wavefunction. If not stable, the keyword stable=opt will automatically optimize the wavefunction to a stable one. The 3rd step reads MOs from previous step and starts geometry optimizations. Note that the wavefunction stability may change during geometry optimizations, so we need the 4th step to ensure that we get a stable wavefunction finally.
If the UBP86 energy in the 4th step becomes lower (than the electronic energy of the final geometry in the 3rd step), it means that the frequency analysis in the 3rd step is useless since it is based on an instable wavefunction. In that case we need to read MOs from the 4th step and continue geometry optimization. Fortunately, in this example, we obtain the stable E(UBP86) = -3289.8489510 a.u. in the 3rd step and it is not lowered in the 4th step.
Then we use the optimized geometry to perform multiconfigurational/multireference
computations. The automr
input file (.gjf) is shown below
%mem=200GB
%nprocshared=48
#p CASSCF/TZVP guess(fragment=5)
mokit{GVB_conv=5d-4,hardwfn}
4 1 3 6 3 -6 -1 1 -1 1 0 1
Fe(Fragment=1) 0.525119 -0.497869 -0.708001
Fe(Fragment=2) 2.614048 -0.959498 -3.224439
O(Fragment=5) 0.629061 1.607749 -0.624032
O(Fragment=3) 2.509495 -0.820203 -1.255415
O(Fragment=5) 0.032247 -2.520107 -0.363336
O(Fragment=5) -1.530291 -0.185439 -0.458412
O(Fragment=5) 0.900278 -0.419588 1.381845
O(Fragment=4) 0.629692 -0.637059 -2.676981
H(Fragment=5) 0.401400 2.259064 -1.321606
H(Fragment=5) -0.381926 -3.161928 -0.979088
H(Fragment=5) -1.996364 0.680813 -0.436459
H(Fragment=5) 0.194596 -0.275371 2.052603
H(Fragment=5) 0.786914 2.117185 0.201537
H(Fragment=3) 3.289968 -0.897141 -0.663082
H(Fragment=5) 0.055815 -2.947638 0.521352
H(Fragment=5) -2.220899 -0.875872 -0.338272
H(Fragment=5) 1.736797 -0.507223 1.890496
H(Fragment=4) -0.151051 -0.561947 -3.269202
O(Fragment=5) 3.107786 1.062670 -3.567445
O(Fragment=5) 4.669288 -1.273093 -3.474047
O(Fragment=5) 2.508898 -3.065004 -3.309713
O(Fragment=5) 2.239245 -1.036321 -5.314310
H(Fragment=5) 1.403078 -0.946473 -5.823151
H(Fragment=5) 3.085985 1.490862 -4.451860
H(Fragment=5) 5.134581 -2.139729 -3.497196
H(Fragment=5) 2.736911 -3.717012 -2.612900
H(Fragment=5) 3.521488 1.703609 -2.950468
H(Fragment=5) 5.360506 -0.583097 -3.593267
H(Fragment=5) 2.349591 -3.573694 -4.135462
H(Fragment=5) 2.944846 -1.181245 -5.984990
This input file will invoke the UHF(fragment guess)->UNO->localization->GVB->CASSCF
route. The keyword GVB_conv=5d-4
means using a less tight GVB convergence threshold
(see the detailed explanations for GVB_conv
in Section 4.4.42). And the keyword
hardwfn
means extra keywords written into PySCF CASSCF input file to ensure spin
pure CASCI wave function since antiferromagnetic singlet with 5 alpha v.s. 5 beta
electrons is challenging.
Once the *_uno.fch
file is generated, you should open it and see whether there
are (at least) 10 UNOs whose occupation numbers are close to 1.0. This is to make
sure that we obtain the desired antiferromagnetic singlet UHF solution. Here are
my calculated UNO, GVB and CASSCF NOONs:

The GVB NOs and CASSCF NOs are stored in *_s.fch
and *_CASSCF_NO.fch
files,
respectively. The figures shown above can be visualized by opening the corresponding
.fch file using GaussView. Note that here they are not orbital energies, but natural
orbital occupation numbers (NOONs). The automatically determined active space is
CAS(10,10), and you can see there are 10 NOs with occupation numbers near 1.0, no
matter for UNO, GVB or CASSCF NOs.
The current CAS(10,10) active space is supposed to be adequate for qualitatively
correct descriptions of the ground state and d->d transitions. If you perform
the SA-CASSCF(10,10) computation based on current CASSCF NOs, you can obtain reasonable
excited energies. For quantitatively accurate excited energies, you should perform
the state-specific NEVPT2 computations based on each CASCI root. For even better
excited energies, QD-NEVPT2 computations can be performed. Here is the automr
input
file for SA-CASSCF and state-specific NEVPT2 computations:
%mem=200GB
%nprocshared=48
#p NEVPT2(10,10)/TZVP
mokit{ist=5,readno='Fe2_uhf_gvb32_CASSCF_NO.fch',nstates=12,hardwfn}
If you want larger active space (e.g. which can describe metal-ligand excitations),
you should look into GVB NOs (i.e. *_s.fch
file) and find which orbitals you want
to add in.
5.8 Examples of autosr
The utility autosr
is designed for automatic single reference calculation. Currently supported methods are MP2, CCSD, CCSD(T), DLPNO-CCSD(T), etc (see the full list in Section 4.7.1). Supported basis sets are cc-pVnZ, aug-cc-pVnZ, and the def2- series. The RI technique is turned on by default in MP2 or CC calculations. See the supported programs for MP2_prog
and CC_prog
in Section 4.7.2.1. By default, HF_prog=Gaussian
.
5.8.1 DF-CCSD(T) calculation using Molpro
See an example
%mem=20GB
%nprocshared=4
#p CCSD(T)/cc-pVTZ
mokit{CC_prog=Molpro}
0 1
O 2.72506079 -0.54744525 0.0
H 3.68506079 -0.54744525 0.0
H 2.40460620 0.35749058 0.0
By default, the density fitting (DF, also called RI) technique is turned on and the auxiliary basis set of cc-pVTZ are automatically dealt with, so this is in fact a DF-CCSD(T) calculation. If you do not want to use RI, you can specfidy mokit{CC_prog=Molpro,noRI}
.
The T1 diagnostic will be printed if you perform CCSD or CCSD(T) calculations. The Force
keyword is supported for calling programs to calculate the analytical nuclear gradients. For example, the CCSD(T) force can be obtained by
#p CCSD(T)/cc-pVTZ
mokit{CC_prog=Molpro,force}
Currently only CC_prog=Molpro
or CC_prog=PSI4
is supported for the CCSD(T) force.
5.8.2 DLPNO-CCSD(T1) calculation using ORCA
The input file of autosr
is just the Gaussian .gjf file. For example, the DLPNO-CCSD(T1) calculation of a water molecule is shown as follows (h2o.gjf)
%mem=20GB
%nprocshared=4
#p DLPNO-CCSD(T1)/cc-pVTZ
mokit{}
0 1
O 2.72506079 -0.54744525 0.0
H 3.68506079 -0.54744525 0.0
H 2.40460620 0.35749058 0.0
Submit the job
autosr h2o.gjf >h2o.out 2>&1 &
By default, the HF calculation will be performed by calling Gaussian, and the DLPNO-CCSD(T1) calculation will be performed by calling ORCA. The auxiliary basis set of cc-pVTZ are automatically dealt with by autosr
. You are supposed to write the memory as large as possible since post-HF jobs usuallty needs a lot of memory. The %maxcore
in ORCA input file will be the total memory multiplied by 0.8 and divided by the number of processors.
5.9 Visualization of upaired electrons
After a multiconfigurational/multireference calculation is accomplished, people often want to know where the unpaired electron is, or at which atom(s) are unpaired electrons located. 3 approaches are introduced below.
5.9.1 Visualization via natural orbitals
Taking the triangulene molecule as an example
The geometry has been optimized at the UCAM-B3LYP/6-31G(d,p) level. This molecule has a triplet ground state (T0). But firstly, let's perform the CASSCF calculation of the lowest singlet (S1), which is expected to be of significant biradical/diradical character. The input file of automr
(say, triangulene_cc-pVDZ_S.gjf) is
%mem=96GB
%nprocshared=48
#p CASSCF/cc-pVDZ
mokit{FcGVB,Npair=11,GVB_conv=5d-4,LocalM=Boys}
0 1
C -6.99392200 -0.69965300 0.03089200
C -6.99037600 -2.08681100 0.00754000
C -5.79692300 -2.79373800 -0.01900700
C -4.55293000 -2.11706000 -0.02294700
C -4.54759300 -0.69024300 0.00094600
C -5.77962500 0.02901000 0.02825000
C -3.32798900 -2.80487500 -0.04954900
C -2.09696100 -2.12716900 -0.05329300
C -2.08976500 -0.70050100 -0.02943100
C -3.31558100 0.01410700 -0.00232100
C -0.85924300 -2.81425700 -0.08011500
C 0.34050600 -2.11749600 -0.08271500
C 0.35636800 -0.73047500 -0.05962300
C -0.85143000 0.00846700 -0.03273200
C -3.30942000 1.43300300 0.02142600
C -4.54239100 2.15073200 0.04875200
C -5.75083800 1.43372600 0.05151200
C -0.86802700 1.41341100 -0.00896700
C -2.07013000 2.14041600 0.01811300
C -2.09382400 3.55612800 0.04213800
C -3.29698300 4.24638100 0.06855300
C -4.50617000 3.56622700 0.07202700
H -7.93196300 -0.15396700 0.05161000
H -7.93251100 -2.62540600 0.01006600
H -5.80270700 -3.87898600 -0.03717500
H -3.33266200 -3.89097400 -0.06780500
H -0.86309200 -3.89951400 -0.09839300
H 1.27782500 -2.66405100 -0.10333100
H 1.29927000 -0.19284000 -0.06197400
H -1.15202100 4.09568800 0.03963300
H -3.29222900 5.33144200 0.08668400
H -5.44315500 4.11372400 0.09272200
H 0.07497000 1.95259200 -0.01139400
H -6.68913500 1.98066500 0.07227600
Keywords in mokit{}
are specified to make the GVB SCF not converged to \( \sigma \) - \( \pi \) mixing solution, so that 11 pure C=C \( \pi \) bonding orbitals and 11 pure \( \pi \)* anti-bonding orbitals will be used in CASSCF calculations. Submit this job
automr triangulene_cc-pVDZ_S.gjf >triangulene_cc-pVDZ_S.out 2>&1
This job will take about 5 hours using 48 CPU cores. Since the active space would be (22,22), the CASSCF would automatically be switched to DMRG-CASSCF during calculation. After the job is accomplished, we obtain 3 fch files which includes natural orbtials
triangulene_cc-pVDZ_S_uhf_uno.fch
triangulene_cc-pVDZ_S_uhf_uno_asrot2gvb11_s.fch
triangulene_cc-pVDZ_S_uhf_gvb11_CASSCF_NO.fch
which include UHF natural orbitals (UNOs), GVB NOs, and CASSCF NOs, respectively. Choose a file, open it using GaussView, click MO Editor
, and click Visualize
(if it is in grey color and cannot be clicked, it implies that you need to install the Gaussian software). By default, the 72th and 73th NOs are supposed to be highlighted. If they are not, you can click/choose them to be highlighted. Change Isovalue
if you want (0.04 will be used in the following figures). Finally, click Update
and wait for a few seconds. Here are the HONO and LUNO of GVB and DMRG-CASSCF calculations, respectively
(NOONs are shown below correponding orbitals)
The natural orbital occupation numbers (NOONs) of HONO and LUNO are close to 1.0, which means 1 unpaired electron is occupied in each of these orbitals. It can be seen that these NOs are somewhat delocalized, but mainly with C_1 and C_18 the largest populations. So the following skeletal formula may represent the molecule better
Moreover, the number of unpaired electrons can be found in output
E(GVB) = -840.26647527 a.u.
----------------------- Radical index -----------------------
biradical character (2c^2) y0= 0.945
tetraradical character(2c^2) y1= 0.038
Yamaguchi's unpaired electrons (sum_n n(2-n) ): 3.086
Head-Gordon's unpaired electrons(sum_n min(n,(2-n))): 2.445
Head-Gordon's unpaired electrons(sum_n (n(2-n))^2 ): 2.051
-------------------------------------------------------------
...
E(CASSCF) = -840.43136870 a.u.
----------------------- Radical index -----------------------
biradical character (2c^2) y0= 1.004
tetraradical character(2c^2) y1= 0.129
Yamaguchi's unpaired electrons (sum_n n(2-n) ): 5.015
Head-Gordon's unpaired electrons(sum_n min(n,(2-n))): 3.572
Head-Gordon's unpaired electrons(sum_n (n(2-n))^2 ): 2.531
-------------------------------------------------------------
The number of unpaired electrons are calculated based on all NOs, so they will be slightly larger than 2 (the definition sum_n (n(2-n))^2 is recommended). Anyway, we know that there exists 2 unpaired eletrons for the S1 state of this molecule.
If you think DMRG-CASSCF calculation is too expensive, you can perform only the GVB calculation, and use GVB NOs in the *_s.fch
file to demonstrate unpaired electrons or diradical characters.
5.9.2 Visualization via localized active orbitals
You might notice that for the example shown above, the CASSCF occupation number of HONO happens to be almost equal to that of LUNO. It can be viewed as degenracy in occupation numbers. So we can perform orbital localization upon these 2 NOs without changing their occupation numbers. Start Python and run
from mokit.lib.gaussian import loc
loc(fchname='triangulene_cc-pVDZ_S_uhf_gvb11_CASSCF_NO.fch',idx=range(71,73))
Then a file named triangulene_cc-pVDZ_S_uhf_gvb11_CASSCF_NO_LMO.fch
is obtained. These two active orbitals will become a little more localized.
This trick can also be applied to high-spin triplet GVB or CASSCF calcuations, in which there will be (at least) two unpaired alpha electrons in GVB or CASSCF NOs. Whenever singly occupied orbitals are delocalized, orbital localization will do much help.
Note that if you apply this trick to NOs which are not degenerate in occupation numbers, their occupation numbers will no longer exist, and only occupation number expectaion values exist (i.e. the occupation number matrix becomes not diagonal).
5.9.3 Visualization via unpaired electron density
In unrestricted DFT (UDFT) calculations, people often visualize the spin density to find where the unpaired electrons are. In multiconfigurational methods, the unpaired electron density (also called odd electron density) is often used to show the spatial distribution of unpaired electrons. Using the CASSCF NOs of the S1 state as an example, start Python and run
from mokit.lib.wfn_analysis import calc_unpaired_from_fch
calc_unpaired_from_fch(fchname='triangulene_cc-pVDZ_S_uhf_gvb11_CASSCF_NO.fch',wfn_type=3,gen_dm=True)
This will generate the unpaired electron density stored in triangulene_cc-pVDZ_S_uhf_gvb11_CASSCF_NO_unpaired.fch
. There are various ways to visualize the unpaired electron density:
(1) Visualizing via GaussView
Run the following command in Shell
cubegen 48 fdensity triangulene_cc-pVDZ_S_uhf_gvb11_CASSCF_NO_unpaired.fch triangulene_unpaired.cub -3 h
which uses 48 CPU cores to generate the cube file. Note here the fdensity
cannot be replaced by density=SCF
.
Open the file triangulene_unpaired.cub
with GaussView, click Results
on the panel, click Surfaces/Contours
, change the value of Density =
to 0.01 a.u., click Surface Actions
and finally click New surface
. The plot would be like
(2) Visualizing via Multiwfn
Start Multiwfn and load the file triangulene_cc-pVDZ_S_uhf_gvb11_CASSCF_NO_unpaired.fch
, then type
200
16
SCF
y
0
5
1
3
-1
The key idea is to make Multiwfn read the Total SCF Density
section in .fch file and generate corresponding cubes. Note that the procedure above (i.e. integers input interactively) may be slightly different for different versions of Multiwfn, you should pay attention to the information/hints shown in the Multiwfn cmd window. Finally, change the Isosurface value
to 0.01 a.u. and the plot would be like
Of course, you can use also Multiwfn + VMD to get even better plots.
6 Developer's Guide
6.1 How to contribute
See CONTRIBUTING. Also, this page contains a few tips for handling the CI/CD.
6.2 Tips for Coding
To be added.
6.3 Tips for CI/CD and packaging
How to write GitLab CI yml file
GitLab CI jobs are triggered by .gitlab-ci.yml
.
The official manual provides a lot of useful materials for writing it, but it's quite long.
A typical job can be
centos7_conda_py37:
image: centos:7
variables:
TARGET: mokit-master_linux_centos7_conda_py37
MCONDA_VER: py37_4.10.3
extends: .setup_system
script:
- pip install numpy==1.18
- cd src
- make all -f Makefile.gnu_openblas_ci
- cd ..
artifacts:
name: $TARGET
paths:
- ./$TARGET/
It means, this job pull a docker image
and run a script
inside it, then zip and upload some artifacts
. A job can extends
another job (actually means "inherit", not "run after"). Here .setup_system
is a virtual job (because it starts with a "."), provides a before_script
and after_scipt
, to be executed before and after the script
.
.setup_system:
except:
- conda
before_script:
- yum install -y epel-release
- yum install -y openblas-devel make gcc gcc-gfortran wget
- wget -q --no-check-certificate https://mirrors.bfsu.edu.cn/anaconda/miniconda/Miniconda3-${MCONDA_VER}-Linux-x86_64.sh
- >
{ echo ${KEYCODE_ENTER};
echo "yes";
echo ${KEYCODE_ENTER};
} | sh ./Miniconda3-${MCONDA_VER}-Linux-x86_64.sh
- /root/miniconda3/bin/conda init bash
- . ~/.bashrc
after_script:
- xxx
Conda packaging
Any modification to the code will trigger CI/CD for normal build, but not all of them trigger the conda build and upload. The pipeline for conda can only be triggered by any modification of mokit/__init__.py
or conda/meta.yaml
, as indicated by this yml.
It means, uploading a new conda release should modify at least one of __version__
(in mokit/__init__.py
) and build number (in meta.yaml
). Build number change usually means that there's no actual change in functionality or program behaviour.
If something went wrong in the conda build, a manual build can be done to debug as follows.
Set up environment by docker first (may require sudo)
docker pull continuumio/miniconda3:latest
docker run -it --name myconda continuumio/miniconda3:latest /bin/bash
Now we enter the shell inside docker container
cd /root
apt update && apt install git vim # and so on
git clone https://gitlab.com/jxzou/MOKIT
cd MOKIT
conda create -n mokit-build python=3.7
conda activate mokit-build
conda install anaconda-client conda-build
conda build --output-folder ./output conda
#anaconda upload ./output/linux-64/*.bz2
Usually no need to run anaconda upload
because it's better to install and test locally. Uploading the debug version to anaconda is still possible, but it's recommended to use anaconda upload xxx --label test
. Thus, the debug version will not be downloaded by conda install mokit -c mokit
but can be downloaded by conda install mokit -c mokit/label/test
.
Next time, we can re-enter the container by
docker start -i myconda
and build a new version
conda activate mokit-build
conda build --output-folder ./output conda
#anaconda upload ./output/linux-64/*.bz2
Other CI/CD
Other CI jobs build the usual pre-built artifacts. They can also be debugged by manual docker pull and run.
For example, to manually debug centos7_conda_py37, you can docker pull centos:7
and follow the steps defined in the corresponding yml.
Of course, debugging can be: simply commit the change, push to GitLab and see what happens in the pipeline log. But this will consume the limited CI job time quota. Another option is to fork the repo to NJU Git and debug, which provides free, unlimited CI runner for NJUers.
6.4 Documentation
The current online documentation is enabled via mdBook, utilizing Yethiel's template and modified Nord Theme.
For configuration tips (rendering, themes, etc.), see mdBook Doc.
All chapters should be registered in SUMMARY.md
, and thus they will appear in the side bar.
Build and preview
We have included the mdbook
linux binary in the source code, so simply do
tar xzvf mdbook-toc*.tar.gz
tar xzvf mdbook-v*.tar.gz
# modify md in src
./mdbook build
./mdbook serve
# open your browser to visit http://localhost:3000
On other platforms, you can download the binaries in the GitHub repo of mdBook and mdbook-toc, or use WSL (you don't need a browser inside WSL), or use cargo install mdbook mdbook-toc
.
Simple modifications can be done on the GitLab website. Clicking the Edit
button or Open the Web IDE
allows you modify and commit online.
Markdown syntax
You should learn some basic Markdown to modify the documentation.
Math
mdBook's math support is a little different from common Markdown syntax.
The inline math should be written as \\( \\)
.
Blocks for note/warning
We can use admonish's note block. It looks like
The type of blocks can be note
, tip
, example
, warning
, danger
, etc. See admonish doc for more information.
Flowchart
We now have a flowchart example at the beginning of Section 4.5, which is an AntV G6 graph embedded in an iframe. The G6 graph can be configured in chap4-5_head.html
and data.json
.
But for simple flowcharts, like A -> B -> C
, it's not necessary to use G6. Instead, Mermaid is suitable, and well supported by Markdown, although we don't have a chance to use it yet.
Papers citing MOKIT
-
On the origin of reactivity variation upon sequential ligation: the [Re(Cl)x]+/CH4 (x=1−3) couples. Phys Chem Chem Phys DOI: 10.1039/d1cp03468e
-
Striking Size and Doping Effects of Ti−Si−O Clusters on Methane Conversion Reactions. Chem Eur J DOI: 10.1002/chem.202201136
-
Energy Decomposition Analysis of the Nature of Coordination Bonding at the Heme Iron Center in Cytochrome P450 Inhibition. Chem Asian J DOI: 10.1002/asia.202200360
-
Ab Initio Methods in First-Row Transition Metal Chemistry. Eur J Inorg Chem DOI: 10.1002/ejic.202200014
-
The Bonding Nature of Fe−CO Complexes in Heme Proteins. Inorg Chem DOI: 10.1021/acs.inorgchem.2c02387
-
Efficient Implementation of Block-Correlated Coupled Cluster Theory Based on the Generalized Valence Bond Reference for Strongly Correlated Systems. J Chem Theory Comput DOI: 10.1021/acs.jctc.2c00445
-
On the distinct reactivity of two isomers of [IrC4H2]+ toward methane and water. Sci China Chem DOI: 10.1007/s11426-022-1342-4
-
The Pincer Ligand Supported Ruthenium Catalysts for Acetylene Hydrochlorination: Molecular Mechanisms from Theoretical Insights. Catalysts DOI: 10.3390/catal13010031
-
Oxygen-containing functional groups enhance uranium adsorption by aged polystyrene microplastics: Experimental and theoretical perspectives. Chem Eng J DOI: 10.1016/j.cej.2023.142730
-
Accurate rate constants for barrierless dissociation of ethanol: VRC-VTST and SS-QRRK calculations with the cheaper DFT method. Chem Phys Lett DOI: 10.1016/j.cplett.2023.140522
-
Assessment of DFT functionals for a minimal nitrogenase [Fe(SH)4H]- model employing state-of-the-art ab initio methods. J Chem Phys DOI: 10.1063/5.0152611
-
Equation-of-Motion Block-Correlated Coupled Cluster Method for Excited Electronic States of Strongly Correlated Systems. J Phys Chem Lett DOI: 10.1021/acs.jpclett.3c01474
-
A call to arms: making the case for more reusable libraries. J Chem Phys DOI: 10.1063/5.0175165
-
Methane activation by [LnO]+: the 4f orbital matters. Sci China Chem DOI: 10.1007/s11426-023-1801-4
-
Interpreting the Cu–O2 Antibonding Nature in Two Cu–O2 Complexes from Cu L-Edge X-ray Absorption Spectra. Inorg Chem DOI: 10.1021/acs.inorgchem.3c01896
-
Methane Activation by [AlFeO3]+: the Hidden Spin Selectivity. ChemPhysChem DOI: 10.1002/cphc.202300603
-
Electronic Structure Determines Geometry: Bond Length Alternating in Cyclo[2n]carbons. J Phys Chem A DOI: 10.1021/acs.jpca.4c01540
Preprint: Chemistry of Cyclo-[2n]-Carbon: A Many-Particle Quantum Mechanics Investigation. DOI: 10.26434/chemrxiv-2023-plj1t-v2 -
Photocatalytic Reduction of CO2 to HCOOH and CO by a Phosphine-Bipyridine-Phosphine Ir(III) Catalyst: Photophysics, Nonadiabatic Effects, Mechanism, and Selectivity. Angew Chem Int Ed DOI: 10.1002/anie.202315300
-
Light- and thermal-driven gold-catalyzed reaction of o-alkynylphenols with aryldiazonium salts: Computational insights into mechanistic similarities and differencesy. Chin. J. Chem. Phys DOI: 10.1063/1674-0068/cjcp2304029
-
Block-Correlated Coupled Cluster Theory with up to Four-Pair Correlation for Accurate Static Correlation of Strongly Correlated Systems. J Phys Chem Lett DOI: 10.1021/acs.jpclett.3c03373
-
Renormalized-Residue-Based Multireference Configuration Interaction Method for Strongly Correlated Systems. J Chem Theory Comput DOI: 10.1021/acs.jctc.3c01247
-
Theoretical Study on Ethylamine Dissociation Reactions Using VRC-VTST and SS-QRRK Methods. J Phys Chem A DOI: 10.1021/acs.jpca.3c08373
-
XMECP: Reaching State-of-the-Art MECP Optimization in Multiscale Complex Systems. J Chem Theory Comput DOI: 10.1021/acs.jctc.4c00033
-
Triel Bonds between BH3/C5H4BX and M(MDA)2 (X = H, CN, F, CH3, NH2; M = Ni, Pd, Pt, MDA = Enolated Malondialdehyde) and Group 10 Transition Metal Electron Donors. Molecules DOI: 10.3390/molecules29071602
-
Accurate ab initio based potential energy surface and kinetics of the Cl + NH3 → HCl + NH2 reaction. J Chem Phys DOI: 10.1063/5.0216562
-
Strong Triel Bonds with Be as Electron Donor. Inorg. Chem. DOI: 10.1021/acs.inorgchem.4c02186
-
A weight growth route from 2-naphthylmethyl radical to tricyclic aromatics. Proc. Combust. Inst. DOI: 10.1016/j.proci.2024.105535
-
Block-Correlated Coupled Cluster Theory Based on the Generalized Valence Bond Reference for Singlet–Triplet Energy Gaps of Strongly Correlated Systems. J Phys Chem Lett DOI: 10.1021/acs.jpclett.4c02362
-
Intensity-dependent three-body Coulomb explosion of methane in femtosecond laser pulses. Phys. Rev. A DOI: 10.1103/PhysRevA.109.023115
-
Theoretical investigation on unimolecular decomposition network of n-propylamine with advanced kinetic methods. Combust. Flame DOI: 10.1016/j.combustflame.2025.113996
-
Preprint: Facile Wet Chemical Synthesis of Dimeric Triangulene Derivatives through Intramolecular Radical-Radical Coupling. DOI: 10.21203/rs.3.rs-3185783/v1
-
Preprint: Endeavoring the First Lanthanide–Carbon Triple-Bond in Fullerene Cage. DOI: 10.21203/rs.3.rs-5351349/v1
Appendix
Here we provide a brief outline of all frequently asked questions, limitations and suggestions.
You can also use the 🔍 tool in the left upper corner of this page to search the whole manual for your questions.
If any of those cannot solve your problem, please consider Bug report.
Note: not all the error messages shows on screen, they may be found in program log files (usually MOKIT will print a message on screen to suggest you checking those files).
Other questions | ||
---|---|---|
pre-compiled version? | why orb2fch requires a fch? | support more programs? |
is density in fch correct? | automr takes more time than manual computation? | my computation so slow! |
Limitations and Suggestions | ||
---|---|---|
Basic knowledge of multi-reference methods | Symmetry | Excited State Calculations |
Possible Multiple Solutions of UHF | Implicit Solvent Model |
A1 Frequently Asked Questions (FAQ)
Q1: command not found
Find errors like xxx: command not found
. What is the solution?
A1: This is simply because you haven't installed the corresponding software, or you didn't write environment variables correctly. Four examples are shown below:
Example 1 (during compilation): error ifort: command not found
means there is no ifort compiler on your computer, see Section 2.3.1 Prerequisite for details.
Example 2 (during compilation): error -bash: f2py: command not found
means there is no f2py on your computer, see Section 2.3.1 Prerequisite for details.
Example 3 (during execution): assuming you've compiled MOKIT successfully, but got the -bash: automr: command not found
error. Then you should check the MOKIT paths in your ~/.bashrc. See Section 2.3.4 Environment variables for details.
Example 4: errors like /usr/bin/ld: cannot find -lmkl_rt
, ld: cannot find -lmkl_intel_lp64
(during compilation) or error while loading shared libraries: libxxx.so: cannot open ...
(during execution) have two possible reasons: (1) there is no MKL installed on your computer; (2) there is MKL on your node but you did not correctly (or you forgot to) write environment variables of MKL. See Section 2.3.1 Prerequisite and Section 2.3.4 Environment variables for details.
Q2: pre-compiled version
Is there any Windows/Mac OS pre-compiled or pre-built version of MOKIT?
A2: The author offers more than 20 utilities of Windows* OS pre-built executables, which are released as a .zip file. See instructions for download and setting environment variables in Section 2.1.
The automr
module is not included in pre-compiled executables, since multi-reference calculations are usually performed on high-performance computers/platforms, which usually contain Unix/Linux systems. However, if you really need all modules or utilities under Windows, you can compile the source code on your laptop/computer/node by yourself.
For Linux/MacOS pre-built version, see Section 2.2
Q3: why orb2fch requires a fch
Why the utilities py2fch
, orb2fch
, xml2fch
, bdf2fch
and bdf2mkl
cannot generate a .fch file from scratch, but require the user to provide one?
A3: Strictly speaking, to correctly transfer MOs between different programs requires int=nobasistransform nosymm
(and also possibly 5D 7F
/6D 10F
) keywords in Gaussian, and equivalent keywords in other quantum chemistry programs. This rule also applies to other programs/software which claim they can transfer MOs or support file transformation.
However, the utility/subroutine (which transfers MOs) cannot detect whether the users have truly specified int=nobasistransform nosymm
(and also possibly 5D 7F
/6D 10F
) keywords, or equivalent keywords in other programs. Then transferring MOs in such case is very dangerous, i.e. correctness cannot be assured. For example, using the built-in basis sets in other programs instead of basis sets generated by MOKIT is dangerous, since the built-in basis sets in other programs are generally not exactly identical to those generated by MOKIT.
Therefore, the developers have to require the users to provide a .fch file, to remind the users that proper keywords must be used in Gaussian. And the most recommended approach is to use utilities in MOKIT (e.g. fch2inp
, bas_fch2py
, etc) to generate input files of other quantum chemistry programs. The users use these generated input files to perform their desired computations. After jobs finished, use dat2fch and py2fch to transfer MOs back to the .fch file. Such an approach is already applied in automr
.
In the future version of MOKIT, the utility xml2fch will not require the user to provide a .fch file.
Q4: support more programs
Can MOKIT support more types of files? Transferring MOs between other programs like NWChem or BAGEL is possible?
A4: The MOKIT developers wish to make the MOKIT recognize all kinds of MO files of quantum chemistry programs, but they are not likely to be familiar with all programs. Therefore, if you are very familiar with any program (other than supported ones in MOKIT), and you happens to need a transferring of MOs, please contact MOKIT developers and tell the information in the MO file of that program. With the help from experienced users, the development will be much easier and quicker.
Q5: OpenMolcas: Error in keyword
Why there is a keyword error in OpenMolcas output when using DKH2 Hamiltonian? Two possible errors are given: (1) ERROR: RELATIVISTIC is not a keyword!, Error in keyword. (2) ERROR: R02O is not a keyword!, Error in keyword.
A5: This is due to a recent update of OpenMolcas. If version <= 18.09, the keyword for DKH2 is simply R02O
; but for version >=20.10, this keyword become RELAtivistic = R02O
. For version >18.09 and <20.10, we have no manual and thus it is not tested (but you can test if you like).
If you encounter any of the two errors, you have two choices: (1) update your OpenMolcas to a newer version, e.g. >=20.10; (2) modify the source code of MOKIT and re-compile it. If you choose (2), you will need to modify the words RELAtivistic = R02O
to R02O
in file $MOKIT_ROOT/src/automr.f90
, and run make automr
in Shell to re-compile the program automr.
Q6: executable paths of Gaussian, etc.
How does automr
read the executable paths of Gaussian, OpenMolcas, PySCF, ORCA, and GAMESS?
A6: For Gaussian, the paths of executable files are read from the environment variables $GAUSS_EXEDIR
. Note: this does not mean that you need to explicitly define the variable $GAUSS_EXEDIR
. A correct example of Gaussian environment variable definitions look like
export g16root=/opt
source $g16root/g16/bsd/g16.profile
export GAUSS_SCRDIR=/scratch/$USER/gaussian
where the $GAUSS_EXEDIR
is already defined in the file g16.profile
.
For PySCF and OpenMolcas, the python
and pymolcas
executable are used directly, assuming the user had installed the corresponding programs correctly. For ORCA, the absolute path is automatically obtained from echo which orca
. For GAMESS, the user must define the $GMS
environment variable in his/her ~/.bashrc
file, such that the automr program can find corresponding paths.
Q7: GAMESS: ERROR DIMENSIONS EXCEEDED
What are the possible reasons and solutions of the following errors
(1) '***** ERROR **** DIMENSIONS EXCEEDED *****'
(2) 'PAIR= xx MAX= 12'
(3) 'DDI Error: Could not initialize xx shared memory segments.'
(4) 'DDI was compiled to support 32 shared memory segments.'
(5) 'The GAMESS executable gamess.01.x or else the DDIKICK executable ddikick.x
could not be found in directory ...'
in GAMESS .gms file (where xx is an integer >12)?
A7: Please read Section 4.4.10 carefully.
Q8: GAMESS: semget errno=ENOSPC
Errors like semget errno=ENOSPC -- check system limit for sysv semaphores
found in the .gms file. Why? How to solve the problem?
A8: Please search ENOSPC
on this FAQ of GAMESS.
Q9: GAMESS: floating point error (SIGFPE)
I found the error DDI Process 0: trapped a floating point error (SIGFPE).
How to solve it?
A9: See this page https://github.com/gms-bbg/gamess-issues/issues/40.
Q10: PySCF: has no attribute mo_occ
I found the following error
AttributeError: 'CASSCF' object has no attribute 'mo_occ'
in PySCF output (e.g. .out file). Why? How to solve the problem?
A10: This is because you were using PySCF version < 1.7.4, which has no mo_occ attribute in CASSCF object. Please update to PySCF >=1.7.4.
Q11: PySCF: No such file block.spin_adapted
I found the following error
FileNotFoundError: [Errno 2] No such file or directory: '/path/to/Block/block.spin_adapted'
in PySCF output (e.g. .out file). Why? How to solve the problem?
A11: This is because you forgot to modify pyscf/dmrgscf/settings.py. You should copy file settings.py.example and rename it as settings.py. Then set the correct path (within the file) for the DMRG solver.
Q12: OpenMolcas: Error detected in HDF5
I found the following error
HDF5-DIAG: Error detected in HDF5 (1.12.0) thread 0:
#000: H5D.c line 435 in H5Dget_type(): invalid dataset identifier
major: Invalid arguments to routine
minor: Inappropriate type
occurs many times in OpenMolcas output, Why? How to solve the problem?
A12: This does not affect the computations. It is just a tiny bug of old versions of OpenMolcas if the QCMaquis support is compiled but not used. You can find this problem here. It is recommended to update your OpenMolcas to the latest version.
Q13: is density in fch correct?
Is 'Total SCF Density' in *_NO.fch
file correct? Can it be used to perform wave function analysis?
A13: Yes. The CASCI/CASSCF total densities are correct in generate *_NO.fch
files. Besides, the GVB density is in *_s.fch
file, and the UHF density are both in *_uhf.fch
and *_uno.fch
files. Other types of *.fch
files (*_asrot.fch
, etc) do not have well-defined density. Density and natural orbitals are not calculated for post-CASSCF methods (NEVPT2, CASPT2, MRCISD, etc).
Q14: Syntax error: Bad fd number
I found the error sh: 1: Syntax error: Bad fd number
on the terminal/screen.
Why? How to solve the problem?
A14: This often occurs on Ubuntu* OS, where the default sh is /bin/dash
, not /usr/bin/bash
.
Please read Section 2.2.4 for details and examples.
Q15: Warning for OMP_STACKSIZE
I found the following warnings on terminal/screen, or output of automr
:
OMP: Warning #181: OMP_STACKSIZE: ignored because KMP_STACKSIZE has been defined
Why? How to make it disappear?
A15: This is simply because both OMP_STACKSIZE and KMP_STACKSIZE environment variables are set. You can comment/delete one of them. Usually you can find these two environment variables in your ~/.bashrc file, or in file bdf-pkg/sbin/run.sh (if you use the BDF program).
Q16: GKS-EDA: Warning for radial grid
When using the utility frag_guess_wfn
to generate GKS-EDA input files, there
is a warning (see below) in the GAMESS output file (e.g. xxx.gms file). And SCF
does not converge. How to deal with that?
**************************************************
* *
* WARNING: QUESTIONABLE SELECTION OF RADIAL GRID *
* *
**************************************************
THIS RUN HAS REQUESTED NRAD= 99 IN $DFT OR $TDDFT
ATOM= 63 HAS LARGE EXPONENT= 565073.253000000026077
RECOMMEND NRAD ABOVE 50 FOR ZETA'S ABOVE 1E+4
RECOMMEND NRAD ABOVE 75 FOR ZETA'S ABOVE 1E+5
RECOMMEND NRAD ABOVE 125 FOR ZETA'S ABOVE 1E+6
A16: As you can see in the screenshot above, GAMESS thinks your radial grid is insufficient for this molecule and recommends the minimum requirement. So, all you need to do is: modify the radial grid in the .inp file to the recommended value. For example, in this example, you should modify NRAD0=99 and NRAD=99 in the .inp file to (at least) NRAD0=126 and NRAD=126,respectively.
Q17: Psi4: h5py Error
I found the following error in output of running automr
File "h5py/h5.pyx", line 183, in h5py.h5.H5PYConfig.default_file_mode.__set__
ValueError: Using default_file_mode other than 'r' is no longer supported. Pass the mode to h5py.File() instead.
How to solve the problem?
A17: Please check whether your current python is the PSI4 python by running which python
in Shell. If yes, and if you do not have to use PSI4 currently, you are recommended
to comment/deactivate PSI4's environment variables and use your previous Python
(e.g. Anaconda Python 3).
Q18: PySCF: No module named h5py
I found the following error in output of running automr
File "/home/jxzou/software/pyscf-2.0.1/pyscf/lib/misc.py", line 31, in <module>
import h5py
ModuleNotFoundError: No module named 'h5py'
How to solve the problem?
A18: If you are using python and f2py from PSI4 package, rather than python from
Anaconda Python 3, you may encounter this PySCF error. One possible solution is
to comment PSI4 variables and write this variable export PSI4=...
(see Section 2.2.3).
Then exit the terminal and re-login, and you now you are using your previous python,
e.g. Anaconda Python 3. Next you should run make distclean and make all to re-compile
MOKIT.
Q19: automr takes more time than manual computation?
Is it possible that automr
takes more (computational) time than that by manually doing multi-reference computations (by chemical intuition, manual inspection, etc) ?
A19: Yes, possible. Note that for some small and well-studied molecules, manually doing multi-reference computations (by chemical intuition, manual inspection, etc) may take less (computational) time than automr
. But for general molecules, especially with dozens/hundreds of possible active orbitals, manual inspection is tedious and sometimes chemical intuition is unreliable. automr
aims at tackling general and complicated cases.
Besides, to better compare the cost (of time) between the human and automatic approach, the cost of time of human operations must be taken into consideration, since the time of manual inspection and permuting orbitals (when >10 orbitals) is not negligible.
Q20: GKS-EDA: SCF fail
Why does the GKS-EDA jobs fail even if I used the utility frag_guess_wfn to generate GAMESS .inp file?
A20: Try to modify 'DIIS=.F. SOSCF=.T.' to 'DIIS=.T. SOSCF=.F.' in the .inp file, and re-submit the job (remember to remove temporary files before re-submitting).
Q21: my computation so slow!
Why does my computation proceed slowly? It took a long time in HF and GVB calculations. Is there any acceleration techniques/tricks?
A21: Firstly, please read Section 3.1 and 3.2 for suggestions on choosing appropriate methods and basis sets. Secondly, assuming your method and basis set is already reasonable, there is no trick to be used currently. The HF step is performed by Gaussian and it does not support RI or density fitting techniques. In the future version of MOKIT, users can specify PSI4 as the HF_prog, where RI is supported. The author jxzou will write a RI-GVB program in a year, then the GVB computation can be greatly accelerated.
Q22: undefined symbol GOMP
I find errors like undefined symbol: GOMP_parallel
or undefined symbol: omp_get_thread_num
when importing some modules in mokit.lib
.
A22: If you are using MOKIT from conda: currently there're two approaches to avoid this error:
(1) If you are using MOKIT from default channel, please check:
- no packages used from
conda-forge
channel. One can doconda list | grep libg
to check if some packages in the environment come fromconda-forge
. Since currently pyscf's conda package needsconda-forge
, you can install pyscf from pip in this case if pyscf is needed. If for some reason you have to enableconda-forge
channel, please go to approach (2).
- package
blas
should be*mkl
instead of*openblas
. Doconda list | grep blas
to check it. See issues/19 to make it right.
(2) See here to install MOKIT with conda-forge channel.
If you are compiling MOKIT from source, there's a workround: add -lgomp
to F2_FLAGS
in the Makefile. See issues/29 for more information.
Q23: difference between GAMESS .dat files
What is the difference between xxx2gvb4.dat
and xxx2gvb4_s.dat
?
Here the integer 2
means to
, 4
means that there are four GVB pairs, and _s
means sorted/sorting. The xxx2gvb4.dat
is generated by GAMESS when performing GVB computations. The order of GVB MOs in this file is
{doubly occupied orbitals, singly occupied orbitals, bonding1, antibonding1, bonding2, antibonding2, ..., virtual orbitals}
Besides, the GVB pairs are not sorted by their pair coefficients (also called CI coefficients).
The xxx2gvb4_s.dat
is generated by MOKIT according to the file xxx2gvb4.dat
. The order of GVB MOs in this file is
{doubly occupied orbitals, bonding1, bonding2, ..., singly occupied orbitals, ... antibonding2, antibonding1, virtual orbitals}
Besides, the GVB pairs in this file are sorted by their pair coefficients |Cu,2|. The order of multiconfiguration/multireference characters is 'HONO/LUNO pair' > 'HONO-1/LUNO+1 pair' > ... . This a commonly used MO order, and this is the order used for later CAS/DMRG computations.
Different people want different orders of MOs for their convenience of usage. A few people may want another type of MO order
{doubly occupied orbitals, bonding1, antibonding1, bonding2, antibonding2, ..., singly occupied orbitals, virtual orbitals}
This can be obtained by using the reorder2dbabasv module.
Q24: difference between xxx2gvb4_s.dat
and xxx2gvb4_s.fch
The MO order in these two files are identical, i.e.
{doubly occupied orbitals, bonding1, bonding2, ..., singly occupied orbitals, ... antibonding2, antibonding1, virtual orbitals}
The MO coefficients in xxx2gvb4_s.dat
/xxx2gvb4_s.fch
are in GAMESS/Gaussian conventions, respectively. Note that the MO coefficients of these two quantum chemistry programs do not obey the same {10F,15G,21H} basis order conventions, no matter you use spherical harmonic functions (5D 7F) or Cartesian functions (6D 10F). We know that there are some packages (not MOKIT) which claim that they can convert MOs between Gaussian<->GAMESS with simply copying MO coefficients back and forth, without any permutations of MO coefficients (unfortunately, this is wrong). This shows the powerfulness and uniqueness of fch2inp
and dat2fch
of MOKIT.
Q25: dimensions of MOs in GAMESS .inp/.dat files
GAMESS sopports both spherical harmonic functions (5D 7F) and Cartesian functions (6D 10F). It is controlled by the ISPHER
keyword in the GAMESS .inp
file. Some people thinks that GAMESS only supports 6D 10F
(unfortunately, this is not the truth). If you perform any ab inito calculation for the water molecule using cc-pVDZ, you will find the number of MOs is 24. The dimension of MOs in .fch file is 24*24, while the the dimension of MOs in .dat file is 24*25. This is because GAMESS always adopts Cartesian-type MO coefficients.
When you use 5D 7F
, GAMESS uses 5D 7F
in the code during computation; but when printing MOs into the .dat file, these MOs will be exapanded to obtain new MOs based on Cartesian-type basis functions. When transforming MOs from .dat to .fch file, the dat2fch
utility needs to contract the Cartesian-type MOs (24*25) back into spherical harmonic type MOs (24*24) and then print them into .fch file. When transforming MOs from .fch to .dat file, the fc2inp
utility needs to expand spherical harmonic type MOs (24*24) to Cartesian-type MOs (24*25) and then print them into .inp file. The permutations of MO coefficients are also needed besides contraction/expansion.
When you use 6D 10F
, GAMESS uses 6D 10F
in the code during computation, and printing the same MOs into the .dat file. The dimension of MOs in .dat/.fch files are both 24*25. In such case, contraction/expansion is not needed for dat2fch
or fch2inp
, but permutations of MO coefficients are still needed.
By default, MOKIT uses 5D 7F
no matter you use any basis set or ECP/PP. And MOKIT will make all quantum chemistry packages use 5D 7F
during computations (unless 5D 7F
is not supported by some rare functionalities). Thus the user does not need to worry about this detail. If you want to (or you have to) use Cartesian functions, you need to specify mokit{Cart}
in the Title Card line. This not recommended since it does not bring any advantage, and it is easy to cause basis set linear dependency.
A2 Limitations and Suggestions
A2.1 Basic knowledge of multi-reference methods
Although MOKIT is a useful tool for black-box multi-reference computations, the users are assumed to know basic knowledge of multi-reference methods. If he/she does not even know the meaning of CAS(m,n), then the results of MOKIT are meaningless to he/she, and even wrong explanations may be made from the results. Therefore, if you know little about the multi-reference methods, I recommend you the following materials:
(1) the ORCA CASSCF-tutorial, which is a quickstart guide (but also detailed) for CASSCF computations. To download this .pdf file, you may need to (register and) login to the forum.
(2) to be added.
If, unfortunately, you are forced by your supervisor/advisor to learn how to perform multireference calculations, but meanwhile you are a totally newbie and you don't even know how to perform routine DFT calculations, it is recommended that you change a task/project as soon as possible (ASAP). Or more radically, you are encouraged to switch to a new supervisor ASAP, since he/she does not know how to teach/train a student properly. This is not your fault, but your supervisor's.
A2.2 Symmetry
Unfortunately, molecular point group symmetry cannot be taken into consideration in any module of MOKIT. This is due to: (1) use of symmetry may change the orientation of the target molecule, and the MO coefficients will be changed accordingly; (2) localized orbitals are used in almost all modules of automr
, this usually contradicts with symmetry.
A2.3 Validity of MOs obtained by automr
for excited state calculations
When you use automr
to perform a ground state CASSCF calculation, the obtained CASSCF MOs (whether pseudo-canonical MOs or NOs) is supposed to be excellent for the ground state electronic structure of the target molecule. And you can use this set of MOs (held in a .fch file) to further conduct excited state calculations like state-averaged CASSCF (SA-CASSCF). And moreover, NEVPT2, CASPT2 or MRCISD, if you wish.
However, the resulting excitation energies and excited state MOs (e.g. state-averaged NOs) are not necessarily excellent. Here 'not necessarily excellent' means for some molecules you may get good results while may be unsatisfactory for some other molecules. The reason is simple: the current algorithms in automr
focus on the multi-reference characters in the ground state of the molecule. Thus automr
'finds' excellent active orbitals of the ground state. But the active orbitals of excited states are not necessarily the same as those of the ground state. And the orbital optimization in SA-CASSCF does not guarantee leading to good MOs for excited states. This problem actually exists in almost all methods/programs which feature black-box or automatic multi-reference calculations.
There is a solution (although not perfect or elegant) to this problem: (visually) inspect the doubly occupied orbitals, pick up important orbitals (usually the lone-pair orbitals) and add them into the active space. For example, if you obtain a CAS(6e,6o) active space from MOKIT, and assuming you find 4 lone pair orbitals among doubly occupied orbitals, then you can combine them (by interchanging or permuting orbitals) to be a CAS(14e,10o) active space. Because 6+2*4=14
active electrons, and 6+4=10 active orbitals. You can do a SA-CASSCF(14e,10o) computation next. This usually converges in several cycles. Finally you can perform a NEVPT2/CASPT2/MRCISD computation based on SA-CASSCF(14e,10o) orbitals to get more accurate excitation energies.
A2.4 Possible multiple solutions of UHF
There may exist multiple UHF solutions when a covalent bond cleavages homolytically, or in a transition-metal-containing molecule. In these special cases, if you use ist=0, the UHF calculated by automr
may be not the lowest UHF solution (but it is stable). You may need to perform several UHF computations (by yourself) using various initial guesses. After you identify the lowest UHF solution, you can use keywords ist=1 and readuhf
to read in the desired UHF .fch file.
Alternatively, you can simply write the fragment guess information into the .gjf file, exactly as the syntax of Gaussian. See an example in file examples/automr/05-N2_cc-pVTZ_4.0.gjf.
Peter Pulay et al suggests that for cases involving multiple UHF solutions, one should use the averaged density (of all possible solutions) to generate UNOs. This approach is not supported in MOKIT currently.
A2.5 Implicit Solvent Model
Currently implicit solvent effect cannot be taken into consideration. But you can use the converged CASSCF wave function *_NO.fch
file as an initial guess to your further calculations which takes implicit solvent effect into consideration.
A3 Bug Report
If you find any bug frequently occurs, please go to MOKIT GitLab to download the latest version of MOKIT and check whether the bug still exists. If it still exists, you can
(1) Open an issue on the GitLab page, or issue on the GitHub page of MOKIT.
(2) Join the Tencent QQ group (Group ID: 470745084) if you can communicate in Chinese, and show your problem in the QQ group. Private messages through QQ might not be replied.
(3) Contact the developer jxzou via E-mail njumath[at]sina.cn, with your input and output files attached. Reply/Answer is not guaranteed since jxzou is very busy.
A4 Acknowledgement
Thanks to all MOKIT developers and users for constructive suggestions, bugs report and contributions.
News
20241119 new release v1.2.6
See release note for new features.
20241127 1000 downloads on anaconda
The download number on anaconda reaches 1000 in Nov 27, 2024.