SSAGES Documentation

Release 0.4.2-alpha
The SSAGES Team
October 14, 2016
CONTENTS
1 Introduction
2 Getting Started
    2.1 Prerequisites
    2.2 Get the source code
    2.3 Build SSAGES
    2.4 Run SSAGES
    2.5 Advanced options
3 Tutorials
    3.1 Basic User Tutorial
    3.2 Method-specific tutorials
4 Meta-Dynamics Methods
    4.1 Adaptive Biasing Force Algorithm
    4.2 Basis Function Sampling
    4.3 Elastic Band
    4.4 Finite Temperature String
    4.5 Forward-Flux
    4.6 Generic Metadynamics
    4.7 Image Method
    4.8 Replica Exchange
    4.9 Swarm of Trajectories
    4.10 Umbrella Sampling
5 Write your own Methods and CVs
    5.1 How to write a new CV
    5.2 How to write a new method
6 Contribute to SSAGES
    6.1 Reporting bugs and wishes
    6.2 Improving the Documentation
    6.3 Adding your method to SSAGES
    6.4 Working on the core classes
7 The SSAGES cookbook
8 Acknowledgments
    8.1 Project Supervisors
    8.2 Project Leads
    8.3 SSAGES Core Development
    8.4 Methods
    8.5 Collective Variables
    8.6 Documentation
    8.7 Driver Hooks
    8.8 All contributors in alphabetical order
9 Copyright
10 License information
11 Indices and tables
SSAGES Documentation, Release 0.4.2-alpha
In their simplest form, particle-based simulations are limited to generating ensembles of configurations (in Monte Carlo
[MC] simulations) or trajectories in time (in molecular dynamics [MD] or Brownian dynamics [BD]). One can then
extract mechanical variables such as the potential energy or pressure and perform ensemble or time averages. There
are two important limitations to such calculations: 1) for complex materials, the time scales available to standard
MD simulations are often insufficient to sample relevant regions of phase space; 2) in order to develop a fundamental
understanding of materials, researchers are primarily interested in calculating the free energy, the entropy, and their
derivatives with respect to various thermodynamic quantities (which lead to material properties such as elastic moduli,
heat capacity, and various other susceptibilities); these quantities are difficult or intractable to obtain in standard MC
and MD simulations. To overcome these limitations, MC and MD simulations must be supplemented with advanced
sampling techniques. These methods are critical for the efficient simulation of complex assembly processes.
SSAGES (Software Suite for Advanced Generalized Ensemble Simulations) is designed to perform these calculations.
The framework is designed to treat molecular simulation routines as a black box, using the coordinates of the system as
evolved by an MC or MD engine to compute collective variables which permit a meaningful reduced-dimensionality
representation of the phase space within a system. This information is then used to define evolving reactive pathways
or to bias the statistics of a simulation for the purposes of computing free energies. The internal structure of the code
has been designed to be simple and extensible to new methods and sampling engines. For further details on examples
and capabilities of SSAGES, see the documentation for the specific methods in the Meta-Dynamics Methods chapter.
CHAPTER ONE
INTRODUCTION
Welcome to SSAGES, our new and shiny Meta-Dynamics package.
Over the past several decades, molecular simulation has emerged as a powerful tool for investigating a wide range of
physical phenomena. Molecular simulation is, in essence, a computational “microscope” whereby computers are used
to “look at” the properties of a system that are difficult to observe or measure through traditional experimental setups.
The comparison between simulations and the corresponding experimental systems can sometimes be challenging,
usually due to factors such as the length and time scales explored. In simulation, a molecular model must have
sufficient temporal and spatial accuracy to resolve the fastest time scales and shortest length scales within a system.
Unfortunately, due to computational constraints, this detailed resolution limits the length of time and the number of
particles that a model can simulate. Simulated systems are typically smaller than the analogous experimental setups
and are followed for much shorter times than the duration of the experiments. However, in recent years,
advancements in computational processing power, including custom-built computer architectures, have continued to
increase the time and length scales accessible to molecular simulation, with current state-of-the-art simulations able
to follow systems for milliseconds (10^-3 s).
Another challenge arises from the difficulty in obtaining good statistics from molecular simulations. Thermal fluctuations dominate motion at the nano-scale and result in motion that appears random (i.e. Brownian), with no two
molecular trajectories being identical. As a result, statistically meaningful averages are necessary in order to calculate
thermodynamic and kinetic quantities of interest in these systems. An incredibly powerful thermodynamic quantity
referred to as the relative free energy of a system can be calculated in this way. The relative free energy can characterize underlying system behavior in the presence of the thermal-induced random noise. Performing this necessary
averaging within simulations is challenging. In essence, the requirement of averaging compounds the issue of time
scales described previously; not only must long simulations be performed, but they must be performed a prohibitively
large number of times in order to extract sufficient statistics. It is therefore necessary to develop efficient techniques
to calculate meaningful averages from simulations.
Advanced sampling methods represent a class of simulation techniques that seek to remedy this insufficient averaging
and accelerate the extraction of useful properties (e.g. free energies, transition paths) from simulations. At the heart
of all advanced sampling methods is statistical mechanics, a field of physics that relates microscopic phenomena (i.e.
the motion of particles) to macroscopic observables (e.g. temperature and pressure). By taking advantage of statistical
mechanics, advanced sampling methods are used to apply a systematic bias to a simulation to speed convergence, and
then mathematically remove this bias to extract the true underlying behavior. Throughout the past decade, advanced
sampling methods have become wildly successful, and have now become an essential component in the toolbox of
molecular simulation.
Despite the demonstrated utility of advanced sampling techniques, they have only been adopted by a fraction of the
scientists working in the field. One explanation for this slow adoption is technical: advanced sampling methods are
complicated, and not all research groups have the expertise required in order to implement these methods themselves.
In the worst case, this leads to long stages of code development, possibly leading to unknown implementation errors
or insufficient validation. Even in cases when advanced sampling methods are implemented, they are typically done so
with a specific problem in mind and are custom-built for a certain model or application. This specificity necessitates
modification of the custom-built advanced sampling code when studying new systems. This prevents the distribution
of code between researchers in the field. As a result, the same methods are implemented again and again by different
members of the community. Sadly, in molecular simulation, it is quite common to “reinvent the wheel”.
SSAGES is an answer to this problem. SSAGES (Software Suite for Advanced Generalized Ensemble Simulations) is a free,
open-source software package that allows users to easily apply advanced sampling techniques to any molecular system
of interest. Simply put, SSAGES is a wrapper that converts a molecular simulation engine (e.g. LAMMPS, NAMD)
into an advanced sampling engine. SSAGES contains a library of widely used enhanced sampling methods that can be
used to calculate everything from free energies to transition pathways. Importantly, SSAGES works with many of the
widely used simulation packages, and can simply be added on top of the simulations a researcher is already running.
SSAGES is implemented in a highly modular way, so it is easily extended to incorporate a new method or to modify
an existing one. It has also been rigorously tested to ensure the accuracy of its calculations.
In short, SSAGES makes advanced sampling methods easy. We hope that it will do just that for your research.
CHAPTER TWO
GETTING STARTED
Prerequisites
Before you try to build SSAGES, make sure that you have the following packages installed:
Package    Required version    Package name in Ubuntu repository
-------    ----------------    ---------------------------------
openmpi    1.8 or higher       openmpi-common, libopenmpi-dev
boost      1.58 or higher      libboost-dev
gcc        4.9 or higher       gcc-4.9
cmake      2.8 or higher       cmake
python     2.7                 python2.7
Get the source code
There are two ways of getting the source code for SSAGES: download a ZIP file from GitHub or clone the git repository.
We strongly recommend the second method as it allows you to easily stay up-to-date with the latest version.
To clone the git repository, call
git clone https://github.com/MICCoM/SSAGES-public.git
Build SSAGES
SSAGES currently works with two simulation engines: LAMMPS and Gromacs (we are striving to add more simulation engines soon). To build SSAGES with LAMMPS, you can use
mkdir build/
cd build/
cmake .. -DLAMMPS=YES
make
or, to build SSAGES with Gromacs,
mkdir build/
cd build/
cmake .. -DGROMACS=YES
make
Either set of commands will automatically download the chosen MD engine and build it together with SSAGES. Alternatively, you
can build SSAGES using a local copy of the MD engine source code (see Advanced options).
Run SSAGES
To run SSAGES, call the executable followed by the input file. For example:
mpiexec -np 6 ./ssages Test.json
Here, the -np flag sets the total number of processors to use and Test.json is the input file. For specific examples,
please see the Tutorials chapter.
More information on how to run SSAGES with a specific simulation engine can be found here:
• How to run SSAGES on LAMMPS
• How to use SSAGES with GROMACS
Advanced options
In case these simple steps do not meet your needs, you can find advanced information on building and running SSAGES
here.
Build SSAGES with local copy of MD source
The standard procedure to build SSAGES is to auto-download the source code for the simulation engine you intend to
use. This is done by providing the option -DLAMMPS=YES (for LAMMPS) or -DGROMACS=YES (for Gromacs) to
cmake. However, in many cases it will be necessary to build SSAGES using your local copy of the MD engine source
code, for example if you have modified it to fit a special need that LAMMPS or Gromacs does not support natively.
If you want to build SSAGES using a local copy of the MD engine source code, modify the cmake call to
cmake .. -DLAMMPS_SRC=/path/to/LAMMPS/src
if you are using LAMMPS and to
cmake .. -DGROMACS_SRC=/path/to/gromacs
if you are using Gromacs.
Warning: The current implementation of SSAGES will patch the Gromacs source. Thus, if you compile the
patched Gromacs source, it will no longer run. We are working to remedy this inconvenience.
SSAGES is not compatible with all versions of LAMMPS and Gromacs. The following versions of LAMMPS have
been tested extensively, but we are confident that SSAGES will also work with most other LAMMPS versions.
• 10 Aug 2015
• 7 Dec 2015
• 14 May 2016
• 15 Jul 2016
• 30 Jul 2016
The following version of Gromacs is supported:
• 5.1.3
In contrast to LAMMPS, we are very confident that SSAGES will not work with other versions of Gromacs out of
the box. We are working hard to make SSAGES compatible with more versions of Gromacs, especially with the new
version 6.
How to run SSAGES on LAMMPS
Using SSAGES with LAMMPS is as simple as running LAMMPS by itself. SSAGES requires the same files and
commands that a stand-alone LAMMPS simulation requires, with one exception. LAMMPS input files usually contain
a ‘run’ command. Instead of a run command, SSAGES requires an extra ‘fix’ in the LAMMPS input file:
fix ssages all ssages
Additionally, SSAGES requires a .json file which specifies the method that will be used, collective variables, etc. The
MD engine should also be specified there:
"type": "LAMMPS"
The number of MD steps of the simulation is specified in “MDSteps” of the .json file. The LAMMPS log file for each
simulation is specified in “logfile” of the .json file. If it is specified as “none”, no LAMMPS log file will be
generated.
All values in the .json file will be in the units specified in the LAMMPS input file.
How to use SSAGES with GROMACS
After compiling GROMACS with SSAGES, you can use all of GROMACS’ available tools to set up systems and
generate input files. The executable (gmx_mpi) is located in ssages/build/gromacs/bin.
As GROMACS has in-depth documentation and a getting started section, we will not dwell much on how to use these
tools to generate systems. For more information on Gromacs, read the Getting Started section of the Gromacs manual and
the official Gromacs documentation.
Briefly, generating a GROMACS input file (.tpr) requires the following three files:
1. A ‘box’ of particles to simulate (.gro file)
2. A topology that describes the forcefield and connectivity (.top file, optionally .itp files)
3. A simulation details file that sets many parameters such as which thermostat and barostat to use if any, timesteps,
integrator, saving frequency and many more (.mdp file)
For example, one can convert a protein .pdb file from an online database using GROMACS tools to generate a .gro
and a .top file. To generate an input file, use the gmx_mpi grompp command:
gmx_mpi grompp -f npt.mdp -p topol.top -c conf.gro -o input.tpr
Note: Currently, the gmx_mpi executable in the SSAGES folder will NOT function normally for running
regular GROMACS simulations via gmx_mpi mdrun.
After an energy minimization and brief NVT and NPT equilibration runs, you should be ready to use SSAGES with
your system. First, generate a .json file for your SSAGES input. If using a single walker, the “inputfile” should be the
same as your .tpr file name. If using multiple walkers, number your input files right before the extension,
include a numberless file, and set the “inputfile” to the name of the numberless file. For example, if using four walkers,
set your “inputfile” to input.tpr and have the following files in your folder:
• input.tpr
• input0.tpr
• input1.tpr
• input2.tpr
• input3.tpr
The numberless input.tpr will not be used. Then, for each walker, set the “type” to “Gromacs”, and define the
number of MPI processes to use for each walker with “number processors”. Finally, define your CV(s) and Methods,
either generally or for each walker. You can start your simulation by calling the ssages executable:
mpirun -np X ./ssages input.json
where X is the total number of MPI processes. For example, for three walkers with “number processors” :
2, X = 3 * 2 = 6.
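The file naming and the process count described above can be checked with a few lines of Python before launching a run. This is only an illustrative sketch: the helper names walker_input_files and total_mpi_processes are hypothetical and are not part of SSAGES.

```python
import os

def walker_input_files(inputfile, n_walkers):
    """Numbered files expected next to the numberless one, e.g. input0.tpr."""
    stem, ext = os.path.splitext(inputfile)
    return [f"{stem}{i}{ext}" for i in range(n_walkers)]

def total_mpi_processes(n_walkers, processors_per_walker):
    """Value of X for 'mpirun -np X': walkers times processes per walker."""
    return n_walkers * processors_per_walker

# Four walkers sharing the numberless name input.tpr
files = walker_input_files("input.tpr", 4)

# Three walkers with two processes each, as in the example in the text
command = f"mpirun -np {total_mpi_processes(3, 2)} ./ssages input.json"
```

The same product rule applies to any number of walkers, provided every walker uses the same number of processes.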
Normally, you can also define an observer in .json to automatically generate backups that will save both simulation
snapshots as well as method-critical data. However, this feature is not yet implemented for GROMACS.
There are example .gro, .mdp, .top, .tpr and .json inputs available in the Examples folder.
CHAPTER THREE
TUTORIALS
Basic User Tutorial
In your SSAGES directory:
cd Examples/User/Umbrella
To run a simulation using SSAGES, a .json file is needed. A .json file will tell SSAGES what method it should
run, what engine it will use, and what parameters to use for the method chosen. It will also tell SSAGES how many
simulations to run (walkers) and what engine-input file it should read. In this particular example, the engine will be
LAMMPS. A butane molecule is used as the example. All the appropriate LAMMPS input files are provided. The
LAMMPS input files contain the necessary information to perform the simulation. In Butane_SSAGES.in, you
will notice that the last line is fix ssages all ssages.
A template .json file (Template_Input.json) is provided which contains the necessary information for a
single umbrella simulation. The Python script provided uses the template to generate the new .json file needed for the
simulation. The template .json file contains the name of the collective variable (CV) we will use, “Torsional”, and
the appropriate atom ids. Run the Python script:
python Umbrella_Input_Generator.py
A new .json file (Umbrella.json) will appear with the correct number of entries. In this particular example, 12
different walkers are generated. Run SSAGES:
mpiexec -np 12 ./ssages Umbrella.json
where 12 is the number of processors. Since Umbrella.json contains 12 walkers, 12 processors should be used.
With that, SSAGES will perform Umbrella sampling on a butane molecule biasing the torsional CV. Output files will
be generated for each one of the walkers containing the iteration number, the target value for the CV, and the CV value
at the iteration number. These values can then be used for further analysis.
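As a starting point for such analysis, the sketch below averages the deviation of the CV from its target for one walker. It assumes that each data row holds three whitespace-separated numbers (iteration, target, CV value) and that the torsion is given in radians, so that differences are wrapped into [-pi, pi); neither assumption is confirmed by this documentation, and the function name is hypothetical.

```python
import math

def mean_cv_deviation(lines):
    """Average wrapped (CV value - target) over the data rows of one walker."""
    total, count = 0.0, 0
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank lines and anything that is not a data row
        try:
            _iteration, target, value = (float(x) for x in fields)
        except ValueError:
            continue  # skip header or comment lines
        delta = value - target
        # Wrap the difference of a periodic torsion into [-pi, pi)
        delta = (delta + math.pi) % (2.0 * math.pi) - math.pi
        total += delta
        count += 1
    if count == 0:
        raise ValueError("no data rows found")
    return total / count
```

A mean deviation far from zero indicates that the walker is being pushed away from its window center by the underlying free energy gradient.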
Method-specific tutorials
• Adaptive Biasing Force
• Basis Function Sampling
• Finite Temperature String
• Forward Flux
• Image Method
• Metadynamics
• Swarm
• Umbrella Sampling
CHAPTER FOUR
META-DYNAMICS METHODS
List of Metadynamics Methods (alphabetical order):
Adaptive Biasing Force Algorithm
Introduction
Adaptive Biasing Force is, at its heart, a flat histogram method. Like many other methods that seek uniform sampling
over CV space such as Metadynamics, it adaptively biases the simulation until such diffusive sampling is achieved.
However, unlike metadynamics, ABF does not estimate the free energy surface. Rather, it directly estimates the
derivative of the free energy in CV directions - the generalized force on that CV by the system.
In practice, this translates to histogramming coordinates in CV space with an instantaneous estimation of the free
energy derivative. This instantaneous estimate fluctuates around the true, global free energy derivative at that point,
but the average quickly converges to the real value. Then, the free energy derivatives can be integrated much like
Thermodynamic Integration to get the free energy surface.
Thus, ABF gives a vector field and not a free energy surface.
An excellent write-up on the method can be found here.
Details on the specific implementation used in SSAGES can be found here.
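The bookkeeping described above can be illustrated with a minimal one-dimensional sketch. It is not the SSAGES implementation: the function names, the synthetic test data and the trapezoidal integration are assumptions made for illustration. Only the ideas come from the text: accumulate instantaneous derivative estimates per bin, average them, scale the estimate down linearly in poorly sampled bins, then integrate.

```python
import numpy as np

def accumulate(cv, deriv, lower, upper, bins):
    """Sum instantaneous dA/dCV estimates and count hits per histogram bin."""
    edges = np.linspace(lower, upper, bins + 1)
    idx = np.digitize(cv, edges) - 1
    inside = (idx >= 0) & (idx < bins)  # samples outside the CV range are ignored
    f_sum = np.bincount(idx[inside], weights=deriv[inside], minlength=bins)
    hits = np.bincount(idx[inside], minlength=bins)
    return f_sum, hits

def ramped_mean(f_sum, hits, minimum_count):
    """Bin average, scaled linearly towards 0 below minimum_count hits."""
    return f_sum / np.maximum(hits, minimum_count)

def integrate_1d(mean_deriv, lower, upper):
    """Trapezoidal integration of the mean derivative over the bin centers."""
    width = (upper - lower) / len(mean_deriv)
    centers = lower + width * (np.arange(len(mean_deriv)) + 0.5)
    steps = 0.5 * (mean_deriv[1:] + mean_deriv[:-1]) * width
    free_energy = np.concatenate(([0.0], np.cumsum(steps)))
    return centers, free_energy - free_energy.min()

# Synthetic data: A(x) = x^2, so dA/dx = 2x, observed with noise
rng = np.random.default_rng(0)
cv = rng.uniform(-1.0, 1.0, 200_000)
deriv = 2.0 * cv + rng.normal(0.0, 1.0, cv.size)
f_sum, hits = accumulate(cv, deriv, -1.0, 1.0, 20)
centers, free_energy = integrate_1d(ramped_mean(f_sum, hits, 200), -1.0, 1.0)
```

Although each instantaneous estimate is noisy, the per-bin averages recover the parabola after integration, which is the behavior the method relies on.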
Options & Parameters
Adaptive Biasing Force Method
• Calculate the generalized force on CVs at each timestep
• Bias with the negative of the estimated generalized force
• Define a CV range. Outside of the CV range, there will be no bias, and no histogram hits will be collected.
• Can optionally define a restraint range. Outside this range, a harmonic restraint of user-chosen spring constant
will drive the CV back into the range. This range should be WIDER than the CV range by at least one bin size
in each direction. To disable restraints, enter a spring constant k equal to or less than zero.
• Currently, CV restraints cannot handle periodicity, but this feature will be implemented soon.
How to define the ABF Method: "type" : "ABF"
CV_lower_bounds
    Array of doubles, (nr of CVs) long. Defines the minimum values of the CVs for the range in which the method will be used, in order.
CV_upper_bounds
    Array of doubles, (nr of CVs) long. Defines the maximum values of the CVs for the range in which the method will be used, in order.
CV_bins
    Array of doubles, (nr of CVs) long. Defines the number of histogram bins in each CV dimension, in order.
CV_restraint_minimums
    Array of doubles, (nr of CVs) long. Defines the minimum values for the CV restraints, in order.
CV_restraint_maximums
    Array of doubles, (nr of CVs) long. Defines the maximum values for the CV restraints, in order.
CV_restraint_spring_constants
    Array of doubles, (nr of CVs) long. Defines the spring constants for the CV restraints, in order. Enter a value equal to or less than zero to turn restraints off.
timestep
    Double. The timestep of the simulation. Units depend on the conversion factor given in unit_conversion.
minimum_count
    Integer. Number of hits in a histogram bin required before the full bias is active for that bin. Below this value, the bias linearly decreases to 0 at hits = 0. Default = 100, but the user should provide a reasonable value for their system.
filename
    String. Name of the file to which the Adaptive Force vector field is saved. This is the main output of the method.
backup_frequency
    Integer. The histogram of generalized force is saved every this many timesteps.
unit_conversion
    Double. Unit conversion from d(momentum)/d(time) to force for the simulation. For LAMMPS using units real, this is 2390.06 (gram.angstrom/mole.femtosecond^2 -> kcal/mole.angstrom). For GROMACS, this is 1.
frequency
    OPTIONAL. Leave at 1.
F
    OPTIONAL. Array of doubles, bins_1 x bins_2 x ... x bins_nCV long. Option to provide an initial starting histogram. This is the summed force component.
N
    OPTIONAL. Array of integers, bins_1 x bins_2 x ... x bins_nCV long. Option to provide an initial starting histogram. This is the number of hits component.
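The value 2390.06 quoted for unit_conversion with LAMMPS real units can be reproduced from SI definitions. The short calculation below is only a cross-check; the constant names are chosen here for illustration, and 1 kcal is taken as 4184 J (the thermochemical calorie).

```python
# SI values of the units involved (the "per mole" cancels on both sides)
GRAM = 1.0e-3            # kg
ANGSTROM = 1.0e-10       # m
FEMTOSECOND = 1.0e-15    # s
KCAL = 4184.0            # J

# d(momentum)/d(time) unit in LAMMPS real units: gram * angstrom / femtosecond^2
momentum_rate_unit = GRAM * ANGSTROM / FEMTOSECOND ** 2

# Force unit in LAMMPS real units: kcal / angstrom
force_unit = KCAL / ANGSTROM

unit_conversion = momentum_rate_unit / force_unit
```

The result agrees with the documented factor to the two decimal places given in the text.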
Example input
"method" : {
"type" : "ABF",
"CV_lower_bounds" : [-3.13, -3.13],
"CV_upper_bounds" : [3.13,3.13],
"CV_bins" : [91,91],
"CV_restraint_minimums" : [-5,-5],
"CV_restraint_maximums" : [5,5],
"CV_restraint_spring_constants" : [0,0],
"timestep" : 0.002,
"minimum_count" : 200,
"filename" : "F_out",
"backup_frequency" : 10000,
"unit_conversion" : 1,
"frequency" : 1
}
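Before starting a run, the range rules listed under Options & Parameters can be checked against a method block like the one above. The sketch below is a hypothetical helper and not an SSAGES tool; it only encodes the rules stated in the text (restraints are off for spring constants equal to or less than zero, and otherwise the restraint range should be wider than the CV range by at least one bin size in each direction).

```python
def check_abf_ranges(method):
    """Return a list of problems found in an ABF "method" block (empty if none)."""
    keys = ["CV_lower_bounds", "CV_upper_bounds", "CV_bins",
            "CV_restraint_minimums", "CV_restraint_maximums",
            "CV_restraint_spring_constants"]
    if len({len(method[k]) for k in keys}) != 1:
        return ["per-CV arrays differ in length"]
    problems = []
    rows = zip(*(method[k] for k in keys))
    for i, (lo, hi, bins, rmin, rmax, spring) in enumerate(rows):
        if hi <= lo:
            problems.append(f"CV {i}: upper bound is not above lower bound")
            continue
        if spring <= 0:
            continue  # restraints are disabled for this CV
        bin_size = (hi - lo) / bins
        if rmin > lo - bin_size or rmax < hi + bin_size:
            problems.append(f"CV {i}: restraint range is not one bin wider than the CV range")
    return problems

example = {
    "CV_lower_bounds": [-3.13, -3.13],
    "CV_upper_bounds": [3.13, 3.13],
    "CV_bins": [91, 91],
    "CV_restraint_minimums": [-5, -5],
    "CV_restraint_maximums": [5, 5],
    "CV_restraint_spring_constants": [0, 0],
}
problems = check_abf_ranges(example)
```

For the example input the list is empty, since both spring constants are zero and the restraints are therefore switched off.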
Output
The main output of the method is stored in the file specified in ‘filename’. This file contains the Adaptive Force
vector field, printed out every ‘backup_frequency’ steps and at the end of a simulation. The vectors are defined on
each point of a grid that goes from (CV_lower_bounds) to (CV_upper_bounds) of each
CV in its dimension, with (CV_bins) grid points in each dimension. For example, 2 CVs defined from (-1,1)
and (-1,0) with 3 and 2 bins respectively give a 3x2 grid (6 grid points). The printout has
2*N columns, where N is the number of CVs. The first N columns are the coordinates in CV space; columns N+1 to 2N
are the components of the Adaptive Force vectors. An example for N=2 is:
CV1 Coord    CV2 Coord    d(A)/d(CV1)    d(A)/d(CV2)
-1           -1           -1             1
-1           0            2              1
0            -1           1              2
0            0            2              3
1            -1           2              4
1            0            3              5
Tutorial
Find the following input files in Examples/User/ABF/Example_AlanineDipeptide:
For LAMMPS (must be built with the RIGID package):
• in.ADP_ABF_Example(0-7) (9 files)
• example.input
• ADP_ABF_1walker.json
• ADP_ABF_8walkers.json
1. Put the ABF_ADP_LAMMPS_Example folder in your ssages build folder
2. For a single walker example, do:
mpirun -np 1 ./ssages ADP_ABF_1walker.json
For 8 walkers, do:
mpirun -np 8 ./ssages ADP_ABF_8walkers.json
Multiple walkers initiated from different seeds will explore different regions and will all contribute to the same adaptive
force.
3. After the run is finished, open F_out and copy the last grid that defines the Adaptive Force vector field (all
numbers in four columns after the last line of text)
4. Paste it into any new folder and run ABF_1D_2D_gradient_integrator.py (requires numpy, scipy and matplotlib)
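Step 3 can also be done programmatically. The sketch below keeps only the rows of the last numeric block in a list of lines, assuming (as the step describes) that every grid is preceded by at least one line of text and that data rows contain exactly 2*N whitespace-separated numbers. The function name is hypothetical and the exact layout of the text lines in F_out is not specified here, so treat this as a starting point only.

```python
def last_numeric_block(lines, n_columns):
    """Return the rows (lists of floats) that follow the last line of text."""
    block = []
    for line in lines:
        fields = line.split()
        try:
            row = [float(x) for x in fields]
        except ValueError:
            row = None
        if row is None or len(row) != n_columns:
            if fields:       # a non-empty, non-data line starts a new block
                block = []
            continue         # blank lines are simply skipped
        block.append(row)
    return block

sample = [
    "first printout",
    "-1 -1 0 0",
    "0 0 1 1",
    "second printout",
    "-1 -1 -1 1",
    "",
    "-1 0 2 1",
]
grid = last_numeric_block(sample, n_columns=4)
```

For a real file, pass the result of open("F_out").read().splitlines() and set n_columns to twice the number of CVs.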
For GROMACS:
Optional:
• adp.gro
• topol.top
• nvt.mdp
Required:
• example_adp(0-7).tpr (9 files)
• ADP_ABF_1walker.json
• ADP_ABF_8walkers.json
1. Put the ABF_ADP_Gromacs_Example in your ssages build folder
2. For a single walker example, do:
mpirun -np 1 ./ssages ABF_AlaDP_1walker.json
For 8 walkers, do:
mpirun -np 8 ./ssages ABF_AlaDP_8walkers.json
These will run using the pre-prepared input files in .tpr format. If you wish to prepare input files yourself using
GROMACS tools:
gmx grompp -f nvt.mdp -p topol.top -c adp.gro -o example1.tpr
Be sure to change the seed in .mdp files for random velocity generation, so walkers can explore different places on the
free energy surface.
Developer
Emre Sevgen
Basis Function Sampling
Introduction
The Basis Function enhanced sampling method is a variant of the Continuous Wang-Landau Sampling method developed by Whitmer et al, which biases a PMF through the summation of Kronecker deltas. In this method, the Kronecker
delta is approximated by projection of a locally biased histogram to a truncated set of orthogonal basis functions.
$$\int_{\Xi} f_i(\vec{\xi}) f_j(\vec{\xi}) w(\vec{\xi}) \, d\vec{\xi} = \delta_{ij} c_i$$
By projecting a basis set, the system resolves the same properties as the Kronecker deltas, but in a continuous and
differentiable manner that lends itself well to MD simulations. Currently in SSAGES, Legendre polynomials have
been implemented to work with BFS. These have the weight $w(\xi) = 1$ and are defined on the
interval $[-1, 1]$.
The method applies its bias in sweeps of $N$ through a histogram ($H_i$) that is updated at every microstate or timestep $j$.
This histogram is then modified to an unbiased partition function estimate ($\tilde{H}_i$) by convolution with the current bias
potential ($\Phi_i$).
$$\tilde{H}_i(\xi) = H_i(\xi) e^{\beta \Phi_i}$$
In order to account for sampling history in this partition function estimate, a simple weight function ($W(t_j)$) is
added.
$$Z_i(\xi) = \sum_j W(t_j) \tilde{H}_j(\xi)$$
This final estimate is then projected to the truncated basis set. After this set is evaluated, the coefficients of the basis
set are evaluated. This process is iterated until the surface converges, which is determined by the overall update of the
coefficients.
$$\beta \Phi_{i+1}(\xi) = \sum_j^N \alpha^i_j L_j(\xi)$$
$$\alpha^i_j = \frac{2j + 1}{2} \int_{-1}^{1} \log(Z_i(\xi)) L_j(\xi) \, d\xi$$
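The projection step can be tried out numerically with NumPy's Legendre module. The sketch below is not the SSAGES code: it uses a plain trapezoidal rule on a uniform grid, a stand-in function for log(Z_i), and hypothetical function names, purely to show how the coefficients and the projected surface follow from the two formulas above.

```python
import numpy as np
from numpy.polynomial import legendre

def trapezoid(y, x):
    """Trapezoidal rule for samples y on grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def legendre_coefficients(xi, log_z, order):
    """alpha_j = (2j + 1)/2 * integral of log(Z) * L_j over [-1, 1], for j = 0..order."""
    coeffs = np.zeros(order + 1)
    for j in range(order + 1):
        unit = np.zeros(j + 1)
        unit[j] = 1.0                       # coefficient vector that selects L_j
        l_j = legendre.legval(xi, unit)
        coeffs[j] = 0.5 * (2 * j + 1) * trapezoid(log_z * l_j, xi)
    return coeffs

xi = np.linspace(-1.0, 1.0, 2001)
log_z = xi ** 2                             # stand-in; equals (1/3) L_0 + (2/3) L_2
alpha = legendre_coefficients(xi, log_z, order=4)
beta_phi = legendre.legval(xi, alpha)       # sum over j of alpha_j * L_j(xi)
```

Because the stand-in is itself a low-order polynomial, only the coefficients of L_0 and L_2 are non-zero (up to integration error) and the projection reproduces the input.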
Options & Parameters
These are all the options that SSAGES provides for running Basis Function Sampling. In order to add BFS to the
JSON file, the method should be labeled as “Basis”.
CV coefficients
    The order of the polynomial to be projected for each collective variable. If the length of this array does not match the number of CVs, the first number is used for all of the CVs.
CV restraint spring constants
    The strength of the springs keeping the system in bounds in a non-periodic system.
CV restraint maximums
    The upper bounds of each CV in a non-periodic system.
CV restraint minimums
    The lower bounds of each CV in a non-periodic system.
cycle frequency
    The frequency of updating the projection bias.
frequency
    The frequency of each integration step. This should almost always be set to 1.
weight
    The weight of each visited histogram step. Should be kept at 1.0, but the option is available to make it slightly greater. The system has a higher chance of exploding at higher weight values.
basis filename
    A suffix to name the output file. If not specified, the output will be “basis.out”.
coeff filename
    A suffix to name the coefficient file.
temperature
    Should only be used if the MD engine cannot produce a good temperature value (e.g. LAMMPS with 1 particle).
tolerance
    Convergence criterion. The sum of the squared differences between subsequent updates of the coefficients must be less than this value for the run to count as converged.
convergence exit
    A boolean option that lets the user choose whether the system should exit once convergence is met.
Required to Run BFS
In order to use the method properly a few things must be put in the JSON file. A grid is required to run Basis Function
Sampling. Refer to the Grid section in order to understand options available for the grid implementation. The only
inputs required to run the method are:
• cycle frequency
• frequency
• CV coefficients
Guidelines for running BFS
• It is generally a good idea to use polynomials of order at least 25.
• For higher order polynomials, the error in projection is less, but the number of bins must increase in order to
accurately project the surface.
• A good rule of thumb for these simulations is to do at least one order of magnitude more bins than polynomial
order.
If the system that is to be used requires a non-periodic boundary condition, then it is typically a good idea to place the
bounds approximately 0.1 - 0.2 units outside the grid boundaries.
The convergence exit option is available if the user chooses to continue running past convergence, but a good heuristic
for tolerance is around 1e−6.
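The tolerance test and the bins rule of thumb are simple enough to write down explicitly. The helpers below are hypothetical names, not SSAGES functions; they restate the criterion given under tolerance (sum of squared coefficient changes below the tolerance) and read "one order of magnitude more bins than polynomial order" as a factor of ten, which is an interpretation.

```python
def coefficients_converged(previous, current, tolerance=1e-6):
    """True if the summed squared change of the coefficients is below tolerance."""
    change = sum((c - p) ** 2 for p, c in zip(previous, current))
    return change < tolerance

def enough_bins(n_bins, polynomial_order):
    """Rule of thumb: at least ten times more bins than the polynomial order."""
    return n_bins >= 10 * polynomial_order

still_moving = coefficients_converged([0.30, 0.65], [0.33, 0.66])
settled = coefficients_converged([0.3333, 0.6666], [0.3334, 0.6667])
```

With polynomials of order 25, as suggested in the guidelines, this rule asks for a grid of at least 250 bins in that dimension.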
Tutorial
This tutorial will provide a reference for running BFS in SSAGES. There are multiple examples provided in the Examples/User directory of SSAGES, but this tutorial will cover the Alanine Dipeptide example. In the ADP subdirectory
of the Examples/User section there should be a LAMMPS input file (titled in.BFS_ADP_shake) and two
JSON input files. Both of these files will work for SSAGES, but the one titled BFS_AdP_rst.json makes use of
the restart capability in SSAGES.
After compiling SSAGES with the user’s version of LAMMPS with the make rigid=yes option chosen, the user
can elect to run the example.
(NOTE: if the user did not compile LAMMPS with the rigid option, then the other LAMMPS file can be used. Just change
the input file variable in the JSON file to in.BFS_ADP.) Use the following command to run the example:
mpiexec -np 1 /path/to/SSAGES/build/dir/ssages BFS_AdP_rst.json
This should prompt SSAGES to begin an alanine dipeptide run. If the run is successful, the console will output the
current sweep number on each node. At this point the user can elect to read the output information after each sweep.
If at any point during the run the user elects to stop and later pick up where the simulation left off, simply
execute SSAGES with the newly generated restart file (BFS_AdP_restart.json).
basis.out
The basis.out file contains at least 4 columns: the CV values, the final projected PMF,
the unprojected PMF, and the biased histogram values. The number of CV columns corresponds to the number of CVs
chosen for the simulation. Only the first CV column should be labeled.
The important column for graphing purposes is the projected PMF, which is the basis set projection obtained by taking
the log of the biased histogram. The biased histogram is printed so that it can be read in for restart runs (subject to
change). To plot the PMF, plot the projected PMF column against the CV value column with any simple plotting tool;
the result is the free energy surface of the simulation. The free energy surface will be a crude estimate within the first
few sweeps and will then take a longer period of time to reach the fully converged surface. A reference image of the converged
alanine dipeptide example is provided in the same directory as the LAMMPS and JSON input files.
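As a hedged sketch of how the projected PMF could be pulled out of basis.out for plotting, the helper below assumes whitespace-separated columns in the order described above (CV values, projected PMF, unprojected PMF, biased histogram) and skips any line that does not parse as numbers. The function name and the sample lines are hypothetical, and the actual file layout should be checked before relying on this.

```python
def read_projected_pmf(lines, n_cvs=1):
    """Extract (CV values, projected PMF) pairs from basis.out-style lines."""
    points = []
    for line in lines:
        fields = line.split()
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # skip header or label lines
        if len(values) < n_cvs + 3:
            continue  # skip blank or incomplete lines
        # The projected PMF is assumed to follow directly after the CV columns.
        points.append((tuple(values[:n_cvs]), values[n_cvs]))
    return points


# Hypothetical file contents for a single CV.
sample = [
    "CV projected unprojected histogram",
    "-3.14 0.00 0.10 120",
    "-3.00 1.25 1.30 85",
]
for cv, pmf in read_projected_pmf(sample):
    print(cv[0], pmf)
```

For a real run, the same function can be given an open file handle, for example `read_projected_pmf(open("basis.out"))`, and the resulting pairs passed to any plotting tool.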
coeff.out
This holds all the coefficient values after each bias projection update. This file is entirely used for restart runs.
Developer
Joshua Moller.
Elastic Band
Introduction
There are many methods, several of which are included in SSAGES, to calculate transition pathways between
metastable states. One kind of pathway between states is the minimum energy pathway (MEP), quite simply the
lowest energy pathway a system can take between these states. An MEP has the condition that the force everywhere
along the pathway points only along the path, that is, it has no perpendicular component. By finding the MEP, one also
finds the saddle points of the potential energy surface, as they are by definition the maxima of the MEP. The nudged
elastic band (NEB) method is a popular and efficient method to calculate the MEP between the initial and final state
of a transition 1 2 .
The method involves the evolution of a series of images connected by a spring interaction (hence the “elastic” nature
of the band). The force acting on the images (a combination of the spring force along the band and the true force
acting perpendicular to the band) is minimized to ensure convergence to the MEP. The nudged nature of NEB refers
to a force projection that ensures the spring forces do not interfere with the elastic band converging to the MEP, as well
as that the true force does not alter the distribution of images along the band (that is, it ensures all the images do not
fall into the metastable states). This projection is accomplished by using the parallel portion of the spring force and
the perpendicular portion of the true force. In this way, the spring forces act similarly to reparameterization schemes
common to the string method.
Full mathematical background is available in the references, but a brief overview is given here. The band is discretized
as a series of N+1 images, and the force on each image is given by:
$$F_i = F^{s}_{i,\parallel} - \nabla E(R_i)_{\perp}$$
Where $F_i$ is the total force on the image, $F^{s}_{i,\parallel}$ refers to the parallel component of the spring force on
the ith image, and $\nabla E(R_i)_{\perp}$ is the perpendicular component of the gradient of the energy evaluated at each
image $R_i$. The second term
on the right hand side is the “true force” and is evaluated as:
$$\nabla E(R_i)_{\perp} = \nabla E(R_i) - \left(\nabla E(R_i) \cdot \hat{\tau}_i\right) \hat{\tau}_i$$
The term $\hat{\tau}_i$ represents the normalized local tangent at the ith image, and thus this equation states simply that the
perpendicular component of the gradient is the full gradient minus the parallel portion of the gradient. There are
different schemes available in the literature to evaluate the tangent vector 2 . The “spring force” is calculated as:
$$F^{s}_{i,\parallel} = k \left(|R_{i+1} - R_i| - |R_i - R_{i-1}|\right) \hat{\tau}_i$$
Where 𝑘 is the spring constant, which can be different for each image of the band. One can evolve the images with
these forces according to any number of schemes - a straightforward Verlet integration scheme is used in the SSAGES
implementation, described below.
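The force expressions above can be sketched for a single interior image as follows. This is an illustrative sketch and not the SSAGES code: the function name `neb_force` is hypothetical, and the tangent is taken as the normalized vector between the two neighbouring images, which is only one simple choice (the improved tangent of reference 2 is not reproduced here).

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(a):
    return math.sqrt(_dot(a, a))


def neb_force(prev_image, image, next_image, gradient, k):
    """Total NEB force on one interior image (all arguments in CV space)."""
    # Simple tangent estimate from the two neighbours (an assumption, see above).
    tangent = _sub(next_image, prev_image)
    length = _norm(tangent)
    tangent = [t / length for t in tangent]

    # Perpendicular gradient: remove the component along the tangent.
    parallel = _dot(gradient, tangent)
    grad_perp = [g - parallel * t for g, t in zip(gradient, tangent)]

    # Parallel spring force: k * (|R_{i+1} - R_i| - |R_i - R_{i-1}|) along the tangent.
    stretch = _norm(_sub(next_image, image)) - _norm(_sub(image, prev_image))
    spring = [k * stretch * t for t in tangent]

    return [s - g for s, g in zip(spring, grad_perp)]


# Hypothetical 2D example: neighbours on the x axis, image displaced upward.
force = neb_force([0.0, 0.0], [1.0, 0.5], [3.0, 0.0], gradient=[2.0, 4.0], k=1.0)
print(force)
```

In this example the spring term acts only along the band and the gradient term only across it, which is the separation that the "nudging" is meant to enforce.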
Algorithmically, the NEB method is implemented in SSAGES as follows:
1. An initial band is defined between the two states of interest. This can be defined however one wishes; often it is
simply a linear interpolation through the space of the collective variables. In fact, the ends of the band need not
necessarily be in the basins of interest; the method should allow the ends to naturally fall into nearby metastable
basins.
2. For each image of the band, a molecular system with atomic coordinates that roughly correspond to the collective
variables of that image is constructed. A period of equilibration is performed to ensure that the underlying
systems’ CVs match their respective band images.
1 G. Henkelman, B. P. Uberuaga, and H. Jónsson, A climbing image nudged elastic band method for finding saddle points and minimum energy
paths. J. Chem. Phys. 113, 9901 (2000).
2 G. Henkelman, and H. Jónsson, Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle
points. J. Chem. Phys. 113, 9978 (2000).
3. The gradient is sampled at user-defined intervals over a user-defined period of time; it is the only quantity with
statistical variance that needs to be averaged over.
4. When sufficient sampling of the gradient is done, the band is updated one time-step forward with a simple Verlet
scheme.
Steps two through four are iterated upon, leading to convergence of the method and the MEP.
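Step 1 mentions that the initial band is often a linear interpolation through collective variable space. A minimal sketch of that interpolation is given below; the function name `interpolate_band` and the end-state values are hypothetical and are not taken from the SSAGES example scripts.

```python
def interpolate_band(start, end, n_images):
    """Return n_images evenly spaced points on the straight line from start to end."""
    if n_images < 2:
        raise ValueError("a band needs at least two images")
    band = []
    for i in range(n_images):
        fraction = i / (n_images - 1)
        band.append([s + fraction * (e - s) for s, e in zip(start, end)])
    return band


# Hypothetical end states in a two-CV space, discretized into 5 images.
for image in interpolate_band([0.0, 0.0], [1.0, 2.0], 5):
    print(image)
```

Each returned point would serve as the "centers" entry of one image.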
Options & Parameters
These are all the options that SSAGES provides for running the NEB method. In order to add NEB to the JSON file,
the method should be labeled as “ElasticBand”.
centers
    For each driver, the initial values of each CV should be specified as a list under “centers”. In this way, the
    initial band is defined.
number samples
    The number of times to sample the gradient during the umbrella sampling portion of the method.
time step
    The time step used in the Verlet integration of the force to update the images of the band.
ksprings
    The constant used in performing umbrella sampling for the simulation; it can be specified uniquely for each
    image and for each CV. Note that it differs from kstring.
kstring
    The constant used in calculating the spring force at each image. It can be specified uniquely for each image.
    Note that it differs from ksprings.
frequency
    The frequency of each integration step. This should almost always be set to 1.
equilibration steps
    The number of MD steps over which to perform only umbrella sampling, without invoking the NEB
    method. A sufficiently large number of steps ensures that the underlying molecular systems have CVs close
    to the CVs of their associated image on the band.
max iterations
    The simulation will terminate after the specified number of iterations.
evolution steps
    The number of steps to perform the NEB over; the band is updated after evolution steps times the
    number of samples total MD steps. A new value of the gradient is harvested every time the number of MD steps
    taken is an integer multiple of evolution steps.
Tutorial
This tutorial will walk you step by step through the user example provided with the SSAGES source code that runs the
NEB method on the alanine dipeptide using LAMMPS. First, be sure you have compiled SSAGES with LAMMPS.
Then, navigate to the SSAGES/Examples/User/ElasticBand/ADP subdirectory. Now, take a moment to observe the in.ADP_Test and data.input files in order to familiarize yourself with the system being simulated.
The next two files of interest are the EB_Template.json input file and the EB_Input_Generator.py script.
Both of these files can be modified in your text editor of choice to customize the inputs, but for this tutorial, simply
observe them and leave them be. EB_Template.json contains all the information necessary to fully specify one driver;
EB_Input_Generator.py copies this information a number of times specified within the script (for this tutorial, 12
times) while also linearly interpolating through the start and end states defined in the script and substituting the correct
values into the “centers” portion of the method definition. Execute this script as follows:
python EB_Input_Generator.py
You will produce a file called EB.json. You can also open this file to verify for yourself that the script did what it
was supposed to do. Now, with your JSON input and your SSAGES binary, you have everything you need to perform
a simulation. Simply run:
mpiexec -np 12 ./ssages EB.json
Soon, the simulation will produce a node-X.log file for each driver, where X is the number specifying the driver
(in this case, 0-11 for our 12 drivers). Each one will report the following information, in order: the node number, the
iteration number, and for each CV, the current value of the band CV as well as the current value of the CV calculated
from the molecular system.
Allow your system to run for the specified number of iterations (2000 for this tutorial). The last line of every node file
can be analyzed to view the last positions of each image of the elastic band.
Developer
Ben Sikora.
References
Finite Temperature String
Introduction
Along with Nudged Elastic Band and Swarm of Trajectories, the Finite Temperature String Method (FTS) is a chain-of-states method. As in other chain-of-states methods, multiple copies of a system are simulated, with each copy
(“image”) corresponding to a different state of the system along some proposed transition pathway. In FTS, each
image is associated with a node along a smooth curve through collective variable space, representing a transition
pathway.
The goal of FTS is to evolve the path of this smooth curve or “string” until it approximates a transition pathway
by finding the principal curve, which by definition intersects each of the perpendicular hyperplanes that it passes
through at the expected value of each hyperplane. As such, a principal curve is often referred to as being its own
expectation. Rather than sampling along each hyperplane belonging to each node along the string, we use the Voronoi
approximation introduced by Vanden-Eijnden and Venturoli in 2009 1 . We associate each node along the string with a
corresponding Voronoi cell, consisting of the region in state space where any point is closer to its origin node than any
other node along the string. Each image is free to explore within the bounds of its associated Voronoi cell. To evolve
the string toward its own expectation, the string is evolved toward the running averages in CV space for each image
along the string.
The evolution of the string can be broken down into the following steps:
1. Evolve the individual images with some dynamics scheme, using the location of the initial image as a starting
point. Only keep the new update at each time step if it falls within the Voronoi cell of its associated image; if
the updated position leaves the Voronoi cell, the system is returned back to the state at the previous timestep.
2. Keep track of a running average of locations visited in CV space for each image.
3. Update each node on the string toward the running average while keeping the path smooth; specific equations
can be found in 1 .
4. Enforce parametrization (e.g., interpolate a smooth curve through the new node locations, and redistribute the
nodes to new locations along the smooth curve such that there is equal arc length between any two adjacent
nodes).
1 E. Vanden-Eijnden and M. Venturoli, J. Chem. Phys. 130, 194103 (2009).
5. After images have been moved, their respective Voronoi cells have also changed. Check that each image still
falls within the new Voronoi cell of its associated image. If the image is no longer in the correct Voronoi cell,
the system must be returned to the Voronoi cell.
6. Return to step 1 and repeat until convergence (e.g., until the change in the string falls below some tolerance
criterion, or stop iterating after a certain number of string method iterations).
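Two of the steps above lend themselves to a short sketch: the Voronoi cell test (steps 1 and 5) and the reparametrization to equal arc length (step 4). The code below is illustrative only and is not the SSAGES implementation. The function names are hypothetical, and the reparametrization uses a piecewise-linear path through the nodes rather than a smooth interpolating curve, which is a deliberate simplification.

```python
import math


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def voronoi_index(point, nodes):
    """Index of the string node closest to point, i.e. the Voronoi cell it lies in."""
    return min(range(len(nodes)), key=lambda i: distance(point, nodes[i]))


def reparametrize(nodes):
    """Redistribute nodes at equal arc length along the piecewise-linear path."""
    cumulative = [0.0]
    for a, b in zip(nodes, nodes[1:]):
        cumulative.append(cumulative[-1] + distance(a, b))
    total = cumulative[-1]
    if total == 0.0:
        return [list(node) for node in nodes]

    n = len(nodes)
    new_nodes = [list(nodes[0])]
    segment = 0
    for i in range(1, n - 1):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while cumulative[segment + 1] < target:
            segment += 1
        span = cumulative[segment + 1] - cumulative[segment]
        fraction = (target - cumulative[segment]) / span
        a, b = nodes[segment], nodes[segment + 1]
        new_nodes.append([x + fraction * (y - x) for x, y in zip(a, b)])
    new_nodes.append(list(nodes[-1]))
    return new_nodes


# Hypothetical unevenly spaced string in a two-CV space.
string = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
print(voronoi_index([0.9, 0.2], string))
print(reparametrize(string))
```

An image whose instantaneous CV values give a `voronoi_index` different from its own node index has left its cell and, per step 1, would be returned to its previous state.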
Options & Parameters
The following parameters need to be set under “method” in the JSON input file:
"type" : "String"
"flavor" : "FTS"
The following options are available as FTS inputs:
centers (required) Array containing this image’s coordinates in CV space
ksprings (required) Array of spring constants corresponding to each CV. Used to ensure that each simulation remains
within its own respective Voronoi cell.
block_iterations (required) (int) Number of integration steps to perform before updating the string.
Default value is 2000.
time_step (required) (double) Parameter used for updating the string (∆𝜏 in 1 ).
Default value is 0.1.
kappa (required) (double) Parameter used for smoothing the string (𝜅 in 1 ).
Default value is 0.1.
frequency (required) (int) Frequency to perform integration; should almost always be set to 1.
Default value is 1.
max_iterations (required) (int) Maximum number of string method iterations to perform.
tolerance (required) Array of tolerance values corresponding to each CV. The simulation will stop after the tolerance
criteria have been met for all CVs.
iteration (int) Value of initial string method iterator.
Default value is 0 (corresponding to new FTS run).
Tutorial
Two examples for running FTS can be found in the Examples/User/FTS directory. This tutorial will go through
running FTS on a 2D single particle system, using LAMMPS as the MD engine. The necessary files are found in
Examples/User/FTS/Langevin, which should contain the following:
in.LAMMPS_Meta_Test LAMMPS input file; sets up 1 particle on a 2D surface with two Gaussian wells of
different depths (at (−0.98, −0.98) and at (0.98, 0.98)) and one Gaussian barrier at the origin.
Template_Input.json Template JSON input containing information for one image on the string. We are looking
at two CVs: x and y coordinate. We will use Input_Generator.py to use this template to create a JSON
input file containing information for all string images.
Input_Generator.py Python script for creating FTS JSON input file.
After compiling SSAGES with LAMMPS, we will use Input_Generator.py to create a JSON input file for FTS.
Run this script
python Input_Generator.py
to create a file called FTS.json. A string with 16 images is initialized on the 2D surface, evenly spaced on a straight
line from (−0.7, −0.5) to (0.7, 1.0). If you take a look at FTS.json, you will see that the information in the template
file has been replicated for each of the 16 nodes on the string, but with the value of “centers” changed.
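The replication described above can be sketched as follows. This is not the contents of Input_Generator.py: the function name `build_input`, the abbreviated template, and the assumed JSON layout (one entry per image in a top-level "driver" list, each with its own "method" block) are illustrative assumptions, shown here for only the two end images of the string.

```python
import copy
import json


def build_input(template, centers_per_image):
    """Copy a one-image template once per image and set that image's centers."""
    drivers = []
    for centers in centers_per_image:
        driver = copy.deepcopy(template)  # leave the template itself untouched
        driver["method"]["centers"] = list(centers)
        drivers.append(driver)
    # Assumed layout: one entry per image under a top-level "driver" list.
    return {"driver": drivers}


# Hypothetical, heavily abbreviated template; a real one holds many more keys.
template = {"method": {"type": "String", "flavor": "FTS", "centers": [0.0, 0.0]}}
fts_input = build_input(template, [[-0.7, -0.5], [0.7, 1.0]])
print(json.dumps(fts_input, indent=2))
```

Using a deep copy matters here because the template contains nested dictionaries; a shallow copy would make every image share the same "method" block.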
Once FTS.json has been generated, we can run the example with the following command:
mpirun -np 16 /path/to/SSAGES/build/./ssages FTS.json
As SSAGES runs, a series of output files are generated:
log-MPI_ID-x LAMMPS output for each of the 16 nodes on the string.
node-00xx.log FTS output for each of the 16 nodes on the string. The first column contains the image number (0-15). The second column contains the iteration number. The remaining columns list the location of the image and
the instantaneous value for each of the CVs. For this example we have two CVs (x coordinate and y coordinate),
so the remaining columns are (from left to right): x coordinate of the string node, instantaneous x coordinate of
the particle, y coordinate of the string node, instantaneous y coordinate of the particle.
To visualize the string, we can plot the appropriate values from the last line of each node-00xx.log file. For
example, one can quickly plot the final string using gnuplot with the command
plot "< tail -n 1 node*" u 3:5
The following image shows the initial string in blue, compared with the final string plotted in green:
The two ends of the string have moved to the two energy minima (at (−0.98, −0.98) and (0.98, 0.98)), and the center
of the string has curved away from the energy barrier at the origin.
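As an alternative to gnuplot, the final string can be assembled in Python from the last line of each node log. The sketch below assumes the column layout described above (image number, iteration, then for each CV the string-node value followed by the instantaneous value) with whitespace-separated fields; the function name and the sample lines are hypothetical.

```python
def string_from_last_lines(last_lines, n_cvs=2):
    """Return string-node coordinates, ordered by image number."""
    string = []
    for line in last_lines:
        fields = line.split()
        image = int(fields[0])
        # After image and iteration, columns alternate: node value, instantaneous value.
        node = [float(fields[2 + 2 * cv]) for cv in range(n_cvs)]
        string.append((image, node))
    string.sort()
    return [node for _, node in string]


# Hypothetical final lines from two node logs, deliberately out of order.
last_lines = [
    "1 99 -0.80 -0.77 -0.70 -0.66",
    "0 99 -0.98 -0.95 -0.98 -1.01",
]
print(string_from_last_lines(last_lines))
```

The returned list of points can be passed directly to a plotting tool to draw the string in CV space.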
Developers
Ashley Guo, Ben Sikora, Yamil Colón
References
Forward-Flux
Forward Flux Sampling (FFS) is a specialized method to simulate “rare events” in non-equilibrium and equilibrium
systems with stochastic dynamics. Several review articles in the literature present a comprehensive perspective on the
basics, applications, implementations, and recent advances of FFS. Here, we provide a brief general introduction to
FFS, and describe the Rosenbluth-like variant of the forward flux method, which is implemented in SSAGES. We also
explain various options and variables to set up and run an efficient FFS simulation using SSAGES.
Introduction
Rare events occur infrequently in nature mainly due to significant activation energies necessary to take a system
from an initial state (commonly referred to as “A”) to a final state (or state “B”). The outcomes of rare events are
generally substantial and thereby it is essential to obtain a molecular-level understanding of the mechanisms and
kinetics of these events. Examples of small-scale rare events include conventional nucleation and growth phenomena,
folding/unfolding of large proteins, and non-spontaneous chemical reactions. “Thermal fluctuations” commonly drive
the systems from an initial state to a final state over an energy barrier ∆𝐸. The transition frequency from state A to
state B is proportional to $e^{-\Delta E / k_B T}$, where $k_B T$ is the thermal energy of the system. Accordingly, the time required
for an equilibrated system in state A to reach state B grows exponentially (at a constant temperature) as the energy
barrier ∆𝐸 becomes larger. Eventually, few or no transitions may occur within the typical timescale of
molecular simulations. In the FFS method, several intermediate states or so-called interfaces (𝜆𝑖 ) are placed along a
“reaction coordinate” or an “order parameter” between the initial state A and the final state B (Figure 1). These
intermediate states are chosen such that the energy barriers between adjacent interfaces are readily surmountable using
typical simulations. Using the stored configurations at an interface, several attempts are made to arrive at the next
interface in the forward direction (the order parameter must increase monotonically when going from A to B). This
incremental progress makes it more probable to observe a full transition path from state A to state B. FFS uses a
positive-flux expression to calculate the rate constant. The system dynamics are integrated forward in time and therefore detailed
balance is not required.
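To give a feel for the exponential dependence on the barrier height mentioned above, the following sketch evaluates the Boltzmann factor for a few barrier heights expressed in units of the thermal energy. The function name and the chosen barrier values are hypothetical, and the proportionality constant of the transition frequency is deliberately left out.

```python
import math


def relative_transition_frequency(barrier_in_kT):
    """Boltzmann factor exp(-dE / kB T) for a barrier given in units of kB T."""
    return math.exp(-barrier_in_kT)


# Hypothetical barrier heights, in units of the thermal energy.
for barrier in (1, 5, 10, 20):
    print(barrier, relative_transition_frequency(barrier))
```

Each additional unit of thermal energy in the barrier lowers the factor by a further factor of e. This is why splitting one large barrier into several small steps between interfaces makes each step tractable.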
Fig. 4.1: In the Forward Flux Sampling method, the dimensionality of the system is reduced by choosing one or more
“reaction coordinates” or “order parameters”. Several equally-spaced intermediate states are placed along the order
parameter to link the initial state A and the final state B. Incremental progress of the system is recorded and analyzed
to obtain relevant kinetic and thermodynamic properties.
Several protocols of forward flux method have been adopted in the literature to
1. generate the intermediate configurations,
2. calculate the conditional probability of reaching state B starting from state A, 𝑃 (𝜆𝐵 = 𝜆𝑛 |𝜆𝐴 = 𝜆0 ),
3. compute various thermodynamic properties, and
4. optimize the overall efficiency of the method.
The following are widely used variants of the forward flux sampling method:
• Direct FFS (DFFS)
• Branched Growth FFS (BGFFS)
• Rosenbluth-like FFS (RBFFS)
• Restricted Branched Growth FFS (RBGFFS)
• FFS Least-Squares Estimation (FFS-LSE)
• FF Umbrella Sampling (FF-US)
Rosenbluth-like FFS (RBFFS) has been implemented in the current version of SSAGES. Direct FFS (DFFS) and
Branched Growth FFS (BGFFS) will be included in a future release of SSAGES.
Rate Constant and Initial Flux
The overall rate constant or the frequency of going from state A to state B is computed using the following equation:
𝑘𝐴𝐵 = Φ𝐴,0 · 𝑃 (𝜆𝑁 |𝜆0 )
Here, Φ𝐴,0 is the initial forward flux or the flux at the initial interface, and 𝑃 (𝜆𝑁 |𝜆0 ) is the conditional probability
of the trajectories that initiated from A and reached B before returning to A. In practice, Φ𝐴,0 can be obtained by
simulating a single trajectory in state A for a certain amount of time 𝑡𝐴 , and counting the number of crossings of the
initial interface 𝜆0 . Alternatively, a simulation may be carried out around state A for an unlimited period of time until
𝑁0 checkpoints have been accumulated and stored (this has been implemented in SSAGES):
$$\Phi_{A,0} = \frac{N_0}{t_A}$$
Here, 𝑁0 is the number of instances in which 𝜆0 is crossed, and 𝑡𝐴 is the simulation time that the system was run
around state A. Note that
1. 𝜆0 can be crossed in either forward (𝜆𝑡 < 𝜆0 ) or backward (𝜆𝑡 > 𝜆0 ) directions, but only “forward crossing”
marks a checkpoint (see Figure 2) and
2. 𝑡𝐴 should only include the simulation time around state A and thereby the portion of time spent around state B
must be excluded, if any.
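The counting of forward crossings can be sketched as follows. This is an illustration rather than the SSAGES implementation: the function name and the sample trajectory are hypothetical, a forward crossing is assumed to mean that the order parameter moves from below 𝜆0 to at or above it between two consecutive samples, and the whole trajectory is assumed to stay around state A.

```python
def initial_flux(order_parameter, lambda_0, time_step):
    """Estimate the initial flux as forward crossings of lambda_0 per unit time."""
    if len(order_parameter) < 2:
        raise ValueError("need at least two samples of the order parameter")
    crossings = 0
    for previous, current in zip(order_parameter, order_parameter[1:]):
        # Count only crossings in the forward (increasing) direction.
        if previous < lambda_0 <= current:
            crossings += 1
    total_time = time_step * (len(order_parameter) - 1)
    return crossings / total_time


# Hypothetical order-parameter samples from a trajectory around state A.
trajectory = [0.1, 0.4, 0.6, 0.3, 0.2, 0.7, 0.9, 0.4]
print(initial_flux(trajectory, lambda_0=0.5, time_step=1.0))
```

Backward crossings (0.6 to 0.3 and 0.9 to 0.4 in this example) are ignored, in line with point 1 above.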
In general, the conditional probability is computed using the following expression:
$$P(\lambda_n | \lambda_0) = \prod_{i=0}^{n-1} P(\lambda_{i+1} | \lambda_i) = P(\lambda_1 | \lambda_0) \cdot P(\lambda_2 | \lambda_1) \cdots P(\lambda_n | \lambda_{n-1})$$
𝑃 (𝜆𝑖+1 |𝜆𝑖 ) is computed by initiating a large number of trials from the current interface and recording the number of
successful trials that reach the next interface. The successful trials in which the system reaches the next interface
are stored in memory and used as checkpoints at the next interface. The failed trajectories that go all the way back
to state A are terminated. Different flavors of the forward flux method use their own protocols to select checkpoints to
initiate trials at a given interface, compute final probabilities, create transition paths, and analyze additional statistics.
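The product of per-interface probabilities and the resulting rate constant can be sketched as follows, estimating each 𝑃 (𝜆𝑖+1 |𝜆𝑖 ) as the fraction of successful trials. The function names and the trial counts are hypothetical; this simple estimate does not include the path reweighting used by the Rosenbluth-like variant.

```python
def conditional_probability(successes, trials):
    """Product over interfaces of (successful trials / total trials)."""
    probability = 1.0
    for n_success, n_trials in zip(successes, trials):
        probability *= n_success / n_trials
    return probability


def rate_constant(initial_flux, successes, trials):
    """Rate constant as initial flux times the overall conditional probability."""
    return initial_flux * conditional_probability(successes, trials)


# Hypothetical counts for three interfaces and a hypothetical initial flux.
successes = [30, 20, 10]
trials = [100, 100, 100]
print(conditional_probability(successes, trials))
print(rate_constant(0.5, successes, trials))
```

Because the overall probability is a product, even moderately small per-interface probabilities combine into a very small transition probability, one that would be hard to sample directly.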
Fig. 4.2: A schematic representation of the computation of the initial flux using a single trajectory initiated in state A.
The simulation runs for a certain period of time 𝑡𝐴 and the number of forward crossings is recorded. Alternatively, we
can specify the number of necessary checkpoints 𝑁0 and run a simulation until the desired number of checkpoints is
collected. In this figure, green circles show the checkpoints that can be used to generate transition paths.
Rosenbluth-like Forward Flux Sampling (RBFFS)
The Rosenbluth-like Forward Flux Sampling (RBFFS) method is an adaptation of the Rosenbluth method in polymer sampling
to the simulation of rare events 4 . RBFFS is comparable to Branched Growth Forward Flux (BGFFS) 1 2 but, in
contrast to BGFFS, a single checkpoint is randomly selected at each non-initial interface instead of initiating trials
from all checkpoints at a given interface (Figure 3). In RBFFS, first a checkpoint at 𝜆0 is selected and 𝑘0 trials are
initiated. The successful runs that reach 𝜆1 are stored and the rest that go back to A are terminated. Next, one of the
checkpoints at 𝜆1 is randomly chosen (in contrast to Branched Growth where all checkpoints are involved), and 𝑘1
trials are initiated to 𝜆2 . Last, this procedure is continued for the following interfaces until state B is reached or all
trials fail. This algorithm is then repeated for the remaining checkpoints at 𝜆0 to generate multiple “transition paths”.
Fig. 4.3: Rosenbluth-like Forward Flux Sampling (RBFFS) involves sequential generation of unbranched transition
paths from all available checkpoints at the first interface 𝜆0 . A single checkpoint at the interface 𝜆𝑖>0 is randomly
marked and 𝑘𝑖 trials are initiated from that checkpoint, which may reach the next interface 𝜆𝑖+1 (successful trials)
or may return to state A (failed trials).
4 M. N. Rosenbluth, A. W. Rosenbluth, Monte-Carlo Calculation of the Average Extension of Molecular Chains. J. Chem. Phys. 1955, 23 (2),
356-359.
1 R. J. Allen, C. Valeriani, P. R. ten Wolde, Forward Flux Sampling for Rare Event Simulations. J Phys-Condens Mat 2009, 21 (46).
2 F. A. Escobedo, E. E. Borrero, J. C. Araque, Transition Path Sampling and Forward Flux Sampling. Applications to Biological Systems. J
Phys-Condens Mat 2009, 21 (33).
In Rosenbluth-like forward flux sampling, we choose one checkpoint from each interface independent of the number
of successes. The number of available checkpoints at an interface is not necessarily identical for different transition
paths 𝑝. This implies that more successful transition paths are artificially more depleted than less successful paths.
Therefore, we need to enhance those extra-depleted paths by reweighting them during post-processing. The weight of
path 𝑝 at the interface 𝜆𝑖 is given by:
$$w_{i,p} = \prod_{j=0}^{i-1} \frac{S_{j,p}}{k_j}$$
where 𝑆𝑗,𝑝 is the number of successes at the interface 𝑗 for path 𝑝. The conditional probability is then computed using
the following expression:
$$P(\lambda_n | \lambda_0) = \prod_{i=0}^{n-1} P(\lambda_{i+1} | \lambda_i) = \prod_{i=0}^{n-1} \frac{\sum_p w_{i,p} \, S_{i,p} / k_i}{\sum_p w_{i,p}}$$
Σ here runs over all transition paths in the simulation.
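The reweighting can be sketched as follows, with each path given as its list of success counts per interface. This is an illustrative sketch and not the SSAGES post-processing code; the function names and the sample counts are hypothetical, and a zero total weight at an interface is treated as a zero overall probability.

```python
def path_weight(successes, shots, i):
    """Weight of one path at interface i: product of S_j / k_j over j < i."""
    weight = 1.0
    for j in range(i):
        weight *= successes[j] / shots[j]
    return weight


def rbffs_probability(paths, shots):
    """Weighted estimate of the overall conditional probability."""
    total = 1.0
    for i in range(len(shots)):
        weights = [path_weight(path, shots, i) for path in paths]
        norm = sum(weights)
        if norm == 0.0:
            return 0.0  # no path reached this interface
        weighted = sum(w * path[i] / shots[i] for w, path in zip(weights, paths))
        total *= weighted / norm
    return total


# Hypothetical success counts for two paths over three interfaces, 5 shots each.
paths = [[2, 3, 4], [1, 0, 0]]
shots = [5, 5, 5]
print(rbffs_probability(paths, shots))
```

In this example the second path fails at the second interface, so its weight at the third interface is zero and only the first path contributes there.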
Options & Parameters
To run an RBFFS simulation using SSAGES, an input file in JSON format is required along with a general input file designed for your choice of molecular dynamics engine (MD engine). For your convenience, two files
Template_Input.json and FF_Input_Generator.py are provided to assist you in generating the JSON
file. Here we describe the parameters and options that should be set in the Template_Input.json file in order to
successfully generate an input file and run an RBFFS simulation.
Input and parameters related to “driver”
type
• Type: string
• Default: “LAMMPS”
• Functionality: Defines the preferred MD engine for running the actual simulation. You are encouraged
to read the documentation page of the corresponding MD package to learn about input files and different
options of that package.
num processors
• Type: integer
• Default: 1
• Functionality: Sets the number of processors that each individual driver uses to run the simulation. In
the current version of SSAGES, drivers can only use one processor.
MDSteps
• Type: integer
• Default: 1000000000
• Functionality: Sets the maximum number of MD steps allowed for the FFS simulation on a given walker.
We recommend defining a large number here to ensure that the simulation is completed before reaching
that many steps. SSAGES will exit upon completion of the FFS simulation.
logfile
• Type: string
• Default: “none”
• Functionality: Sets the name of the engine-dependent log file that the MD engine uses to write simulation
information, including timesteps, energies, etc.
Input and parameters related to “method”
type
• Type: string
• Default: “ForwardFlux”
• Functionality: Specifies that “ForwardFlux” module of SSAGES will be activated. Don’t change this if
you plan to run a forward flux sampling simulation.
index_file
• Type: string
• Default: none
• Functionality: Stores interface information in the format: Interface filename origin. The file-naming
scheme is based on the interface the simulation is on and a cumulative hash number. Origin is the name
of the file from which the trajectory was previously fired to reach the current interface position.
library_file
• Type: string
• Default: “library_input.dat”
• Functionality: Sets the name of the file that stores the checkpoints at the initial interface by running a serial
trajectory around state A.
results_file
• Type: string
• Default: “results.dat”
• Functionality: Specifies the name of the file in which the results of the forward flux simulation are stored.
This file can later be helpful for post-processing purposes.
centers
• Type: array
• Default: none
• Functionality: Defines an array of intermediate interfaces that link the initial state A to the final state B.
This array can either be defined in the Template_Input.json file or FF_Input_Generator.py
file. In the latter case, the values of centers are left blank in the Template_Input.json file.
generate_configs
• Type: integer
• Default: 1
• Functionality: Defines the number of checkpoints/configurations that ought to be generated at the first
interface, i.e. 𝜆0 .
shots
• Type: integer
• Default: 1
• Functionality: Sets the number of trials that should be initiated from the randomly selected checkpoints
at an interface (at the initial interface, all checkpoints are used to generate multiple transition paths). In
principle, this can change from interface to interface but in the current implementation of SSAGES, the
number of trials/shots from a checkpoint/node is assumed to be a constant number.
frequency
• Type: integer
• Default: 1
• Functionality: Specifies the frequency (in timesteps of the MD simulation) at which SSAGES recomputes the value
of the “order parameter” and writes the output data.
restart_type
• Type: string
• Default: “new_library”
• Functionality: Defines how an FFS simulation should be restarted. Several options are available:
1. “new_library”: generates a new starting library. If this option is defined, a new FFS simulation is
set up and run.
2. “from_library”: restarts from a library of available configurations defined by library_file and library_point.
3. “from_interface”: restarts the simulation from an interface defined by the current position of the CV
from configurations found in index_contents.
4. “none”: SSAGES restarts the FFS simulation using snapshots of trajectories that are not necessarily
checkpoints/nodes located at a specific interface.
“from_library”, “from_interface” and “none” are typically reserved for restarting from crashes only.
library_point
• Type: integer
• Default: none
• Functionality: Specifies the current library configuration that you are on from the list of configurations
found in the library file defined by library_file.
current_hash
• Type: integer
• Default: 1
• Functionality: Used in the file-naming scheme. Mainly needed for restarts, or if specifying where the
number scheme should start. Default is based on walker_ID*1000000, meaning walker 0 files will be
dump_"interface"_0.dump, dump_"interface"_1.dump, etc.
index_contents
• Type: string
• Default: none
• Functionality: Only used for restarts by SSAGES, includes the same contents as index_file.
successes
• Type: array
• Default: none
• Functionality: Only used for restarts by SSAGES; contains the successes on each interface for each library
configuration explored so far. Contents are exactly those of results_file: walker_id
library_point “list of successes at each interface”. For example, a library consisting of two configurations
and 4 interfaces using 1 walker:
0 0 2 3 4 0
0 1 1 0 0 0
current_shot
• Type: integer
• Default: none
• Functionality: Mainly used for restarts, indicates which shot this walker is on.
Other required input parameters
CVs
• Type: array
• Default: none
• Functionality: Selection of “order parameter” or “reaction coordinate”. The current implementation of
FFS in SSAGES can only take one collective variable. See section XXX for more details.
inputfile
• Type: string
• Default: none
• Functionality: Specifies the name of the engine-specific input file. The user is encouraged to refer to the
documentation page of the corresponding MD package to learn about the various input options as well as the
structure and format of input files suitable for the MD engine of choice.
Tutorial
This tutorial will walk you step by step through the user example provided with the SSAGES source code that runs
the forward flux method on the alanine dipeptide using LAMMPS. First, be sure you have compiled SSAGES with
LAMMPS. Then, navigate to the SSAGES/Examples/User/ForwardFlux/ADP subdirectory. Now, take a
moment to observe the in.ADP_Test and data.input files in order to familiarize yourself with the system
being simulated.
The next two files of interest are the FF_Template.json input file and the FF_Input_Generator.py script.
Both of these files can be modified in your text editor of choice to customize the inputs, but for this tutorial, simply
observe them and leave them be. FF_Template.json contains all the information necessary to fully specify one driver;
FF_Input_Generator.py copies this information a number of times specified within the script (for this tutorial, 12 times)
while also linearly interpolating through the start and end states defined in the script and substituting the correct values
into the “centers” portion of the method definition. Execute this script as follows:
python FF_Input_Generator.py
You will produce a file called FF.json. You can also open this file to verify for yourself that the script did what it
was supposed to do. Now, with your JSON input and your SSAGES binary, you have everything you need to perform
a simulation. Simply run:
mpiexec -np 12 ./ssages FF.json
Allow your system to run for the specified number of iterations (2000 for this tutorial).
Developer
Ben Sikora, Hadi Ramezani-Dakhel & Joshua Lequieu.
References
Generic Metadynamics
Introduction
Todo
Short introduction to generic “vanilla” metadynamics.
Options & Parameters
Todo
Describe options and parameters to control the method.
Tutorial
Todo
Give a tutorial. The tutorial can be based on one of the examples for this method. Describe how to compile the input
files and how to call SSAGES. Describe how to understand and visualize the results.
Developer
Hythem Sidky.
Image Method
Introduction
Surface charging or polarization can strongly affect the nature of interactions between charged dielectric objects,
particularly when sharp dielectric discontinuities are involved. However, no efficient and accurate computational
tools are publicly available, especially for describing polarization effects in many-body systems.
The Image Method, an analytic perturbative approach we recently developed for evaluating the polarization energy of a many-body collection of charged dielectric spheres embedded in a dielectric medium, is
particularly suitable for this purpose 1 .
The polarization-induced interactions between these spheres depend on the ratio of dielectric constants for the spheres
and the medium, and the ratio of the distance between particles and the radii of the particles. We have shown that,
in some cases, polarization completely alters the qualitative behavior, and in other cases, polarization leads to
stable configurations that could not occur in its absence.
The Image Method is included in SSAGES so that users can properly account for polarization corrections in
their systems and couple them with advanced sampling methods to accelerate their simulations.
Options & Parameters
The SSAGES Image Method is implemented so that using it is as easy as running a LAMMPS simulation whose
electrostatics include only pairwise Coulombic interactions. To achieve this, at each time step the SSAGES engine
adds the polarization corrections to the electrostatic forces acting on all objects and then passes the modified
snapshot back to the LAMMPS engine. The JSON file needed by the SSAGES engine should include:
einner The relative dielectric permittivity of the polarizable objects.
ion-type-start Used when the system contains both polarizable and non-polarizable objects, for example when colloids are treated as polarizable and ions as non-polarizable. This parameter
controls where the non-polarizable types start.
atom type radius Radius of each type of object.
Guidelines
Running the Image Method is very similar to running a LAMMPS simulation that includes electrostatic interactions. Referring to the example LAMMPS INPUTFILE and DATAFILE, double check that you have declared the following variables, which
are necessary for the Image Method to compute polarization corrections:
• charges
• dielectric (relative dielectric permittivity of the surrounding continuum)
Method Output
No special output files are generated for the Image Method, since it only provides updated electrostatic forces that
include polarization corrections. Nevertheless, the LAMMPS INPUTFILE examples provide options for dumping trajectories and printing force-distance data so that users can more conveniently visualize how significant the polarization effects are
in some cases.
Tutorial
Todo
Write a tutorial.
1 J. Qin, J. Li, V. Lee, H. Jaeger, J. J. de Pablo, and K. Freed, A theory of interactions between polarizable dielectric spheres, J. Coll. Int. Sci.
469, 237 - 241 (2016)
Developer
Jiyuan Li.
References
Replica Exchange
Introduction
Options & Parameters
Tutorial
Developer
Swarm of Trajectories
Introduction
Like string methods in general, the string method with swarms of trajectories (often abbreviated to “swarm of
trajectories” or even more simply “SoT”) is a method to identify a transition pathway in an arbitrarily high-dimensional
collective variable space between two metastable states of a system. This pathway (the string) is a parametrized curve
discretized into a set of images, each of which is itself a molecular system. The classical string method in collective
variables evolves each image by estimating a mean force and metric tensor at each image with restrained molecular
dynamics simulations. In the SoT method, the string is instead evolved by launching a large number (a swarm) of
unrestrained trajectories from each image and estimating the average drift of the collective variables over the swarm.
The mathematical background of the method can be expressed in a few relatively straightforward equations, with
further detail available in the original work of Benoit Roux and collaborators 1 . First, consider a path 𝑧(𝛼) constructed
between two metastable states, such that 𝛼 = 0 represents the starting state and 𝛼 = 1 is the final state. The “most
probable transition pathway” (MPTP) is defined such that a molecular system started from anywhere on the path will
most probably evolve while staying on the path. It is shown in the original work that a mathematical definition for
such a path is given when the collective variables evolve according to:
\[
z_i(\alpha) = z_i(\alpha') + \sum_j \left( \beta D_{ij}[z(0)] \, F_j[z(0)] + \frac{\partial}{\partial z_j} \left( D_{ij}[z(0)] \right) \right) \delta\tau
\]
where the following notation is used: 𝑧𝑖 represents the collective variables belonging to the string, 𝛼 represents the
parameter identifying that point on the string, 𝛽 represents the inverse temperature, 1/(k_B T), 𝐷𝑖𝑗 represents the diffusion tensor, 𝐹𝑗
represents the mean force, 𝑧 represents the collective variables constructed from the molecular system at a given
moment in time, and 𝛿𝜏 represents the time step of the evolution of the dynamics. The SoT method approximates this
equation using the average drift evaluated from a large number of unbiased trajectories, each of length 𝛿𝜏 , launched
from each image:
\[
\overline{\Delta z_i(\delta\tau)} = \overline{z_i(\delta\tau) - z_i(0)} \equiv \sum_j \left( \beta D_{ij}[z(0)] \, F_j[z(0)] + \frac{\partial}{\partial z_j} \left( D_{ij}[z(0)] \right) \right) \delta\tau
\]
1 Pan, A. C., Sezer, D. & Roux, B. Finding Transition Pathways Using the String Method with Swarms of Trajectories. J. Phys. Chem. B 112,
3432–3440 (2008).
Like all string methods, there is an additional step beyond evolving the collective variables: after one iteration of
evolution, the images along the path must be reparametrized such that they lie (for example) an equal arc length apart.
This step is necessary to prevent all of the images from falling into one metastable basin or the other.
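The reparametrization step can be illustrated with a minimal Python sketch. It assumes the string is treated as a piecewise-linear curve through the images and that equal arc length spacing is wanted; the function name reparametrize is hypothetical, and SSAGES may use a different interpolation scheme internally.

```python
import math


def reparametrize(images):
    """Redistribute images at equal arc length along the piecewise-linear string."""
    n = len(images)
    if n < 2:
        return [list(p) for p in images]
    # Cumulative arc length at each original image.
    arc = [0.0]
    for a, b in zip(images[:-1], images[1:]):
        arc.append(arc[-1] + math.dist(a, b))
    total = arc[-1]
    if total == 0.0:
        return [list(p) for p in images]
    new_images = []
    seg = 0
    for k in range(n):
        target = total * k / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < n - 2 and arc[seg + 1] < target:
            seg += 1
        span = arc[seg + 1] - arc[seg]
        t = 0.0 if span == 0.0 else (target - arc[seg]) / span
        new_images.append(
            [a + t * (b - a) for a, b in zip(images[seg], images[seg + 1])]
        )
    return new_images


# Three images bunched toward the start of a straight path.
print(reparametrize([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]))
```

The end points are left in place and only the interior images move, so the middle image of the example is shifted to the midpoint of the path.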
Algorithmically, the SoT method is implemented as follows:
1. An initial string is defined between the two states of interest. This can be defined however one wishes; often it
is simply a linear interpolation through the space of the collective variables. In fact, the ends of the string need
not necessarily be in the basins of interest; the dynamic nature of the method should allow the ends to naturally
fall into nearby metastable basins.
2. For each image of the string, a molecular system with atomic coordinates that roughly correspond to the collective variables of that image is constructed.
3. A set of equilibrium trajectories is generated from that system by performing restrained sampling around the
image’s collective variables.
4. That set of equilibrium trajectories is used as the starting point of a large number of short unbiased trajectories;
the resulting average displacement of each collective variable is used to update the positions of the images.
5. A reparameterization scheme is enforced to ensure that, for example, the string images are equally distant in
collective variable space.
Steps two through five are iterated upon, leading to convergence of the method and the MPTP.
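Step 4 of the list above, the update from the average drift, might look as follows in Python. This is a sketch under stated assumptions: each swarm trajectory is represented only by its CV values at the start and at the end, and the names mean_drift and evolve_image are hypothetical, not SSAGES functions.

```python
def mean_drift(swarm):
    """Average CV displacement over a swarm of (cv_start, cv_end) pairs."""
    n_traj = len(swarm)
    n_cv = len(swarm[0][0])
    return [
        sum(end[i] - start[i] for start, end in swarm) / n_traj
        for i in range(n_cv)
    ]


def evolve_image(image, swarm):
    """Move one string image by the average drift of its swarm."""
    drift = mean_drift(swarm)
    return [z + dz for z, dz in zip(image, drift)]


# Two short unbiased trajectories launched from the same image (toy numbers).
swarm = [([0.0, 0.0], [0.2, -0.1]), ([0.0, 0.0], [0.4, 0.1])]
print(evolve_image([0.0, 0.0], swarm))
```

In a full iteration this update would be applied to every image and followed by the reparameterization of step 5.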
Options & Parameters
These are all the options that SSAGES provides for running the SoT method. In order to add SoT to the JSON file, the
method should be labeled as “Swarm”.
centers For each driver, the initial values of each CV should be specified as a list under “centers”. In this way, the
initial string is defined.
number of nodes A specification of how many nodes (equivalent terminology to images) the string should be discretized into. This parameter only affects reparameterization; the actual number of string images depends on
how many drivers are specified. For accurate results, these values should match.
spring The spring constant to be used during instances of restrained sampling (the constant is part of a harmonic
restraint potential).
frequency The frequency of each integration step. This should almost always be set to 1.
initial steps For each iteration of the method, this is the number of steps to spend doing restrained sampling and not
harvesting trajectories. This time is important to ensure the underlying molecular system’s CV values are close
to the string CV values.
harvest length After the initial restrained sampling is finished, a configuration is harvested at regular intervals for later use as the starting point of an unrestrained trajectory; harvest length specifies the number of steps between harvests. Harvest length multiplied
by number of trajectories (see below) determines how many additional steps will be taken under restrained
sampling.
number of trajectories The total number of unrestrained trajectories to be included in each swarm.
swarm length The length of each unrestrained trajectory in the swarm. Swarm length multiplied by number of
trajectories specifies how many total steps will be spent doing unrestrained sampling.
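The step counts implied by these options can be tallied with a short Python sketch. The parameter values below are made up for illustration, the function name is hypothetical, and the total assumes that the restrained and unrestrained phases simply run one after the other within an iteration.

```python
def steps_per_iteration(initial_steps, harvest_length,
                        number_of_trajectories, swarm_length):
    """Tally MD steps per string iteration from the SoT options."""
    restrained = initial_steps + harvest_length * number_of_trajectories
    unrestrained = swarm_length * number_of_trajectories
    return {
        "restrained": restrained,
        "unrestrained": unrestrained,
        "total": restrained + unrestrained,
    }


# Illustrative values only; choose them to suit your own system.
budget = steps_per_iteration(initial_steps=10000, harvest_length=10,
                             number_of_trajectories=250, swarm_length=20)
print(budget)
```

Such a tally is a quick way to estimate how many string iterations fit into a given total number of MD steps.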
Tutorial
This tutorial will walk you step by step through the user example provided with the SSAGES source code that runs
the SoT method on the alanine dipeptide using LAMMPS. First, be sure you have compiled SSAGES with LAMMPS.
Then, navigate to the SSAGES/Examples/User/Swarm/ADP subdirectory. Now, take a moment to observe the
in.ADP_Test and data.input files. In general, these should be the same as what you would use for any
other method, but for the SoT method, it is important to define a larger skin distance than one normally would in the
neighbor command in LAMMPS. This is because, under the hood, each unrestrained trajectory in the swarm is started
by manually resetting the positions of each atom in the LAMMPS simulation to the start of a new trajectory. From the
perspective of LAMMPS, this is a huge amount of distance to move in a single time step; this move triggers neighbor
list rebuilding, but LAMMPS considers it a “dangerous build” which threatens to crash the simulation. Thus, we
increase the skin distance, which forces LAMMPS to keep track of more pairs in the neighbor lists, and thus reduces
the number of dangerous builds. Keep this in mind for future runs of the SoT method.
The next two files of interest are the Template_Input.json input file and the Input_Generator.py script.
Both of these files can be modified in your text editor of choice to customize the inputs, but for this tutorial, simply
observe them and leave them be. Template_Input.json contains all the information necessary to fully specify
one driver; Input_Generator.py copies this information a number of times specified within the script (for this
tutorial, 12 times) while also linearly interpolating through the start and end states defined in the script and substituting
the correct values into the “centers” portion of the method definition. Execute this script as follows:
python Input_Generator.py
You will produce a file called Swarm.json. You can also open this file to verify for yourself that the script did
what it was supposed to do. Now, with your JSON input and your SSAGES binary, you have everything you need to
perform a simulation. Simply run:
mpiexec -np 12 ./ssages Swarm.json
Soon, the simulation will produce a node-X.log file for each driver, where X is the number specifying the driver
(in this case, 0-11 for our 12 drivers). Each one will report the following information, in order: the node number, the
iteration number, and for each CV, the current value of the string CV as well as the current value of the CV calculated
from the molecular system.
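To inspect the node files programmatically, a line can be split into the quantities listed above. The sketch below assumes whitespace-separated columns in exactly that order (node, iteration, then a string value and a system value for each CV); the sample line is invented and the function name parse_node_line is hypothetical.

```python
def parse_node_line(line):
    """Return (node, iteration, [(string_cv, system_cv), ...]) for one log line."""
    fields = line.split()
    node, iteration = int(fields[0]), int(fields[1])
    values = [float(tok) for tok in fields[2:]]
    if len(values) % 2 != 0:
        raise ValueError("expected a string value and a system value per CV")
    cvs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    return node, iteration, cvs


# Invented example line for a system with two CVs.
print(parse_node_line("3 17 -1.25 -1.31 0.80 0.77"))
```

Comparing the two numbers in each pair shows how closely the molecular system follows the string during restrained sampling.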
Allow your system to run for the desired number of MD steps, but keep an eye on it - the system should exit once
one driver reaches the maximum number of MD steps, but it is possible that instead one driver will exit and the rest
will get stuck. Check in on your node files and see if they’ve been updated recently - if not, the simulation has likely
finished. Once this is done, you can execute the included plotter.py script in a directory containing the node files
with the command line argument of how many images your string had. The script also accepts an argument to plot a
free energy surface alongside the string, but that goes beyond the scope of this tutorial. Thus, simply execute:
python plotter.py 12 none
And in a moment you should have a graph of your converged string. Thus concludes this tutorial.
Developer
Cody Bezik.
References
Umbrella Sampling
Introduction
Calculations of thermodynamic data and other properties rely on proper sampling of the configurational space. However, the presence of energy barriers can lead to configurations not being sampled properly or sampled at all. Umbrella
sampling is a simulation technique that helps to overcome those barriers and improve sampling by applying a bias
along a coordinate. The bias takes the form of a harmonic potential. Usually, a series of umbrella-sampled simulations
are performed and analyzed together using the weighted histogram analysis method (WHAM).
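As a small illustration of the harmonic bias, the following Python sketch evaluates the bias energy and the resulting force on the CV. It assumes the common form U = 0.5 * k * (cv - center)**2; whether SSAGES includes the factor of one half in its definition of the spring constant is not stated here, so treat the prefactor as an assumption. The function name harmonic_bias is hypothetical.

```python
def harmonic_bias(cv_value, center, kspring):
    """Return (energy, force on the CV) for U = 0.5 * k * (cv - center)**2."""
    delta = cv_value - center
    energy = 0.5 * kspring * delta * delta
    force = -kspring * delta  # negative derivative of U with respect to the CV
    return energy, force


# A CV value of 1.5 restrained to a center at 1.0 with k = 10.
print(harmonic_bias(1.5, 1.0, 10.0))
```

The force always points back toward the center, which is what keeps each simulation sampling its own window along the coordinate.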
Options & Parameters
The following parameters need to be set under “method” in the JSON input file:
"type" : "Umbrella"
The following options are available for Umbrella Sampling:
centers (required) Array of target CV values
ksprings (required) Array of spring constants, one for each CV
Tutorial
This tutorial will go through running Umbrella Sampling on an atomistic model of butane using LAMMPS as the MD
engine. Umbrella sampling will be performed on the torsional CV of the butane C atoms. The files that can be found
in Examples/User/Umbrella are:
Butane_SSAGES.in LAMMPS input file
Butane.data LAMMPS data file describing butane molecule.
Template_Input.json Template JSON input containing information for one Umbrella Sampling simulation.
Umbrella_Input_Generator.py Python script for creating Umbrella.json input file. The total number of
simulations and the ‘centers’ values are controlled in this file.
Once in the directory, the appropriate .json file needs to be generated. A .json file is already in the directory,
Template_Input.json, which contains the CV information and specifies the LAMMPS input files to be used.
Using
python Umbrella_Input_Generator.py
will generate Umbrella.json. Umbrella.json contains the information from Template_Input.json duplicated 12 times
with varying values of ‘centers’. These values correspond to the target values of the torsional CV.
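The duplication performed by the generator script can be pictured with the following Python sketch. It is not the script shipped with SSAGES: the template layout, the range of the torsion, and the helper names linspace and make_inputs are assumptions made for illustration, and only the keys "type", "centers" and "ksprings" are taken from the options above.

```python
import copy
import json


def linspace(start, stop, count):
    """Return `count` evenly spaced values from start to stop inclusive."""
    step = (stop - start) / (count - 1)
    return [start + i * step for i in range(count)]


def make_inputs(template, centers):
    """Return one copy of `template` per center, with the center substituted."""
    drivers = []
    for c in centers:
        driver = copy.deepcopy(template)  # keep the template itself untouched
        driver["method"]["centers"] = [c]
        drivers.append(driver)
    return drivers


# Assumed minimal template; the real Template_Input.json holds more entries.
template = {"method": {"type": "Umbrella", "centers": [0.0], "ksprings": [100.0]}}
centers = linspace(-3.0, 3.0, 12)  # hypothetical range for the torsional CV
drivers = make_inputs(template, centers)
print(json.dumps(drivers[0], indent=2))
```

Using a deep copy for every window ensures that changing one set of centers does not silently alter the others.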
To run SSAGES do:
mpiexec -np 12 /path/to/SSAGES/build/ssages Umbrella.json
This will run 12 different umbrella sampling simulations, one per processor. 12 different output files will be generated,
each containing the iteration, target value of the corresponding ‘center’ CV, and the value of the CV at the iteration
number.
CHAPTER
FIVE
WRITE YOUR OWN METHODS AND CVS
One of the basic design goals of SSAGES is that it should be easily extensible. To this end, it provides intuitive and
simple tools to implement new collective variables (CVs) and new metadynamic methods. This section covers the
basic steps to implement a new CV and a new Method. Let us start first with the implementation of a new CV. The
techniques to implement a new Method are covered below (page 38).
How to write a new CV
Each CV consists of two components: A header file and a schema file. The header file contains the source code for the
calculation of the CV and the schema file describes the properties of the CV in a simple JSON format. Finally, you
will have to make SSAGES aware of the new CV.
The CV header file
Each CV in SSAGES is implemented as a child of the class CollectiveVariable. The header file should be
placed in the directory src/CVs and has to (re)implement the following functions:
void Initialize(const Snapshot&) (optional) This method is called during the pre-simulation phase. It
is typically used to allocate or reserve memory.
void Evaluate(const Snapshot&) Evaluation of the CV based on a simulation snapshot. Together with the
value, this function should also calculate the gradient of the CV. The gradient should be a vector of length n,
where n is the number of atoms in the Snapshot. Each element in the vector is the derivative of the CV with
respect to the corresponding atom’s coordinates. This method is called in the post-integration phase of every
iteration.
double GetValue() const Return the current value of the CV.
double GetPeriodicValue() const Return the current value of the CV, taking periodic boundary conditions into account. An example would be an angular CV which is bound to the region (−𝜋, 𝜋]. In this case,
GetValue() could return any angle, while GetPeriodicValue() should return the angle mapped back
into the region (−𝜋, 𝜋]. If the CV does not use periodic boundaries, this function should return the same value
as GetValue().
const std::vector<Vector3>& GetGradient() const Return the gradient of the CV (see
Evaluate(const Snapshot&) for how the gradient is defined).
const std::array<double, 2>& GetBoundaries() const Return a two-element array containing
the lower and the upper boundary for the CV.
double GetDifference(const double Location) const Return the distance of the current value of
the CV from a specified location, taking periodic boundary conditions into account. If the CV does not use
periodic boundary conditions, the return value should simply be GetValue() - Location.
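The arithmetic behind GetPeriodicValue() and GetDifference() for an angular CV can be sketched in a few lines. The sketch is written in Python only to show the wrapping logic; an actual CV implements these as C++ member functions, and the names periodic_value and periodic_difference are hypothetical.

```python
import math


def periodic_value(angle):
    """Map an angle in radians into the interval (-pi, pi]."""
    wrapped = math.fmod(angle, 2.0 * math.pi)  # now in (-2*pi, 2*pi)
    if wrapped > math.pi:
        wrapped -= 2.0 * math.pi
    elif wrapped <= -math.pi:
        wrapped += 2.0 * math.pi
    return wrapped


def periodic_difference(value, location):
    """Shortest signed distance from location to value on the circle."""
    return periodic_value(value - location)


print(periodic_value(1.5 * math.pi))
print(periodic_difference(3.0, -3.0))
```

Without the wrapping, the two angles 3.0 and -3.0 would appear to be 6 radians apart, although they are separated by only about 0.28 radians across the periodic boundary.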
The CV schema file
Together with the header file that contains the source code of the CV, you will have to provide a schema file to make
the CV accessible to the SSAGES input files. The schema file should be placed in the directory schema/CVs/. It
has to be written in the JSON format and should contain the following items:
type The value of type should be set to object.
varname The name of your new CV.
properties The properties contain the type which is the internal name of the CV and a set of other properties that have
to be supplied to the constructor of the CV.
required A list containing the required properties. Optional parameters to the CV constructor are not listed here.
additionalProperties Optional properties.
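To make the items above concrete, here is a sketch of what such a schema could contain, written as a Python dictionary and printed as JSON. The CV name "MyDistance", the property "atom_ids", and the choice of setting additionalProperties to false (the usual JSON Schema way of rejecting unknown keys) are all assumptions for illustration; consult the existing files in schema/CVs/ for the layout SSAGES actually expects.

```python
import json

# Hypothetical schema for an imaginary CV; every name here is illustrative.
schema = {
    "type": "object",
    "varname": "MyDistanceCV",
    "properties": {
        "type": {"type": "string", "enum": ["MyDistance"]},
        "atom_ids": {
            "type": "array",
            "items": {"type": "integer"},
            "minItems": 2,
            "maxItems": 2,
        },
    },
    "required": ["type", "atom_ids"],
    "additionalProperties": False,
}


def missing_required(schema, cv_input):
    """List the required properties that are absent from a CV input block."""
    return [key for key in schema["required"] if key not in cv_input]


print(json.dumps(schema, indent=2))
print(missing_required(schema, {"type": "MyDistance"}))
```

The small helper only checks for missing keys; a real validation would be carried out against the full schema.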
Integrate the new CV into SSAGES
Once you have provided the header and the schema file, there is one more step to take in order to make SSAGES aware
of the newly included CV.
Note: We are currently working on a method to automate this step. Revisit this section in future releases. Chances
are that you will no longer have to worry about this step.
To include your new CV, you have to edit the file src/CVs/CollectiveVariable.cpp, and
1. #include your CV header file at the top of the file.
2. Add a new else if clause in BuildCV(). The if-test checks for the CV type set as an enum in the list of
properties. Within the if-clause you should parse and validate the JSON schema, read the required properties
and create the CollectiveVariable. A pointer to the newly created object should be stored in the variable named
cv.
How to write a new method
Each method consists of three components: a header file, a cpp file, and a schema file. The header file and cpp file
contain the source code for the method and the schema file describes the properties of the method in a simple JSON
format. Finally, you will have to make SSAGES aware of the new method.
The method header file
Each method in SSAGES is implemented as a child of the class Methods. The header file should be placed in the
directory src/methods and has to (re)implement the following functions:
void PreSimulation(Snapshot* snapshot, const CVList& cvs) Setup done before the method
actually runs. This function will be called before the simulation is started.
void PostIntegration(Snapshot* snapshot, const CVList& cvs) This is where the heart of
your method should go. Using snapshot and cvs, modify the forces, positions, velocities, etc. as appropriate for the new method. This function will be called after each MD integration step.
void PostSimulation(Snapshot* snapshot, const CVList& cvs) This function is called at the end of
the simulation run. Use it to close files your method opened, to write out data that you have been
storing, etc.
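The order in which these three hooks are called can be pictured with a toy driver loop. The sketch below is a Python analogy, not SSAGES code: the real hooks are C++ member functions, and the class RecordingMethod, the function run, and the dictionary standing in for the snapshot are all hypothetical.

```python
class RecordingMethod:
    """Toy stand-in for a method; it only records when each hook is called."""

    def __init__(self):
        self.calls = []

    def pre_simulation(self, snapshot, cvs):
        self.calls.append("pre")

    def post_integration(self, snapshot, cvs):
        # A real method would modify forces, positions or velocities here.
        self.calls.append("post_integration")

    def post_simulation(self, snapshot, cvs):
        self.calls.append("post")


def run(method, n_steps):
    """Call the hooks in the order described above for n_steps MD steps."""
    snapshot, cvs = {"iteration": 0}, []
    method.pre_simulation(snapshot, cvs)
    for step in range(n_steps):
        snapshot["iteration"] = step + 1  # stands in for one MD integration step
        method.post_integration(snapshot, cvs)
    method.post_simulation(snapshot, cvs)
    return method.calls


print(run(RecordingMethod(), 2))
```

The setup and teardown hooks run exactly once, while the integration hook runs once per MD step, which is why any expensive allocation belongs in the setup hook.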
The method schema file
Together with the source code of the method, you will have to provide a schema file to make the method accessible to the
SSAGES input files. The schema file should be placed in the directory schema/methods/. It has to be written in
the JSON format and should contain the following items:
type The value of type should be set to object.
varname The name of your new method.
properties The properties contain the type which is the internal name of the method and a set of other properties that
have to be supplied to the constructor of the method.
required A list containing the required properties. Optional parameters to the method constructor are not listed here.
additionalProperties Optional properties.
Integrate the new method into SSAGES
Once you have provided the header and the schema file, there is one more step to take in order to make SSAGES aware
of the newly included method.
Note: We are currently working on a method to automate this step. Revisit this section in future releases. Chances
are that you will no longer have to worry about this step.
To include your new method, you have to edit the file src/methods/Methods.cpp, and
1. #include your method header file at the top of the file.
2. Add a new else if clause in BuildMethod(). The if-test checks for the method type set as an enum in
the list of properties. Within the if-clause you should parse and validate the JSON schema, read the required
properties and create the method. A pointer to the newly created object should be stored in the variable named
method.
CHAPTER
SIX
CONTRIBUTE TO SSAGES
The SSAGES project is built on an inclusive and welcoming group of physicists, chemists, and chemical engineers
working on complex Molecular Dynamics simulations employing Metadynamic techniques. Metadynamics is an
exciting and fast developing field and similarly this project is designed to facilitate the usage and implementation of a
wide array of Metadynamics methods. And we welcome you heartily to join us and to embark with us on this great
adventure.
There are many ways to contribute to SSAGES and you do not necessarily need programming skills to be part of this
project (even though they surely help). But, if you decide to work on the code base, you will be happy to find that
SSAGES is designed to be easy to use and is just as easy to extend. We put a high priority on maintaining a readable
and clearly structured code base as well as an inclusive community welcoming new ideas and contributions.
Here is a short summary of ideas on how you can become part of SSAGES:
Reporting, Triaging, and Fixing Bugs No software is without errors, inconsistencies, and strange behaviors. Even
with zero programming knowledge, you can help tremendously by reporting bugs or confirming reported bugs.
Read more... (page 41)
Improving the SSAGES documentation SSAGES would like to have detailed yet comprehensible documentation
on what it does and how it does it. This should include concise introductions to the methods, quick to learn
tutorials, complete coverage of the nooks and crannies of each method, and of course helpful pointers in case you
run into errors. And while the documentation is already expansive, improvements on it never go unappreciated.
Read more... (page 41)
Including your Method and CV in SSAGES You have developed a new Metadynamics scheme or a Collective Variable and want to make it available to the community via SSAGES? Great! Read more... (page 45)
Working on the core SSAGES system If you would like to climb into the heart of SSAGES and get your hands dirty,
this task is for you. Read more... (page 45)
Reporting bugs and wishes
Todo
Link to GitHub issue tracker
Improving the Documentation
Great documentation and great code produce great software. -SSAGE advice
Improvements on the documentation are always highly appreciated. The SSAGES documentation is split into two
parts: The User Manual (which you are reading right now), and the API documentation. While the Manual uses the
Sphinx documentation system and contains all information necessary to use the program, the API docs are built on Doxygen
and describe the usage of the underlying classes and functions for everyone willing to extend and improve SSAGES.
Here are a few ideas on how you can help:
• Fix typos: Even though we have thoroughly checked, there are certainly still a few hidden somewhere.
• Check if all internal and external links are working.
• Make sure that the documentation is up to date, i.e. that it reflects the usage of the latest version.
• Add examples: Examples of how to use a method, avoid a common problem, etc. are more helpful than a
hundred pages of dry descriptions.
• Write a tutorial.
Building the documentation
Before you can work on the documentation, you first have to build it. The documentation is part of the SSAGES source
code. It is assumed that you have already downloaded and built the source code as described in the Getting Started
(page 5) section. You will find a collection of rst files comprising the User Manual under doc/source/ where the
file ending rst stands for ReStructured Text. The API documentation on the other hand resides directly in the header
files right next to the classes and functions they describe.
Assuming you have already built SSAGES, building the documentation is as easy as typing make doc in your build
directory. For the build to succeed, check that you have the following programs installed:
• Sphinx (with PyPI via pip install Sphinx for example)
• Doxygen
• dot (in Ubuntu this is part of the graphviz package)
• Sphinx “Read the docs” theme (via pip install sphinx_rtd_theme)
Once you have successfully built the documentation you will find the User Manual under doc/Manual/ and the
API documentation under doc/API-doc/html/ (relative to your build directory - do not confuse it with the doc/
folder in the main directory of the project). To view it in your favorite web browser (using Firefox as an example) just
type
firefox doc/Manual/index.html
for the User Manual or
firefox doc/API-doc/html/index.html
for the API documentation.
How to write documentation
Here are a few pointers on how to write helpful documentation, before we dive into the details of Sphinx and Doxygen
for the User Manual and the API documentation:
• Write documentation “along the way”. Do not code first and write the documentation later.
• Use helpful error messages. These are considered part of the documentation and probably are the part that is
read most frequently.
• Do everything you can to structure the text. Let’s face it: Most people will just skim the documentation. Feel
encouraged to use all techniques that help to spot the relevant information, for example:
– Format your text bold, italic, code, etc.
– Write in short paragraphs, use headers
– Use lists, code blocks, tables, etc.
Note: These Note blocks are extremely helpful for example.
Warning: Warnings work great, too!
See also:
Here you can find more examples for helpful Sphinx markup: http://www.sphinx-doc.org/en/stable/markup/para.html
• Use examples, a lot of them
• In the initial stages: Don’t be a perfectionist. Missing documentation is the worst kind of documentation. “It is
better to have written and coded than to have never written at all.” -SSAGE advice
How to write Sphinx
The Sphinx documentation system uses reStructuredText, a lightweight plain-text markup language comparable to Markdown. Examples
of documentation written with Sphinx include:
• LAMMPS
• HOOMD
• Virtually all of the Python Documentation
The following tutorials are extremely helpful:
• http://www.sphinx-doc.org/en/stable/rest.html
• http://docutils.sourceforge.net/docs/user/rst/quickref.html
• http://openalea.gforge.inria.fr/doc/openalea/doc/_build/html/source/sphinx/rest_syntax.html
One of the great things about Sphinx is that most documentation pages have a “view page source” link where you can take a
look at the Sphinx source code. Thus, the best way to learn Sphinx is to click on this link right now and look at the
source code of this page. But here is a short summary of the most important commands:
• Markup: You can use *italic*, **bold**, and ``code`` for italic, bold and code.
• Headers. Underline your headers with at least three === for titles, --- for subtitles, ^^^ for subsubtitles and
~~~ for paragraphs.
• Bullet lists are indicated by lines beginning with *.
Note: These highlighted blocks can be created with .. note::. The content of this block needs to be indented.
You can also use warning and seealso. Even more can be found here.
How to write Doxygen
Doxygen follows a very different philosophy compared to Sphinx and is geared more towards API documentation,
which is exactly what we use it for in SSAGES. Instead of maintaining the documentation separate from the source code, the
classes and functions are documented in the same place where they are declared: The header files. Doxygen then reads
the source code and automatically builds the documentation. Examples for documentation created with Doxygen:
• Plumed
• Root
The mainpage of the Doxygen documentation is written in a separate header file, in our case doc/mainpage.h. A
good introduction to the Doxygen syntax can be found at
• http://www.stack.nl/~dimitri/doxygen/manual/docblocks.html
The basic rule is that Doxygen comments start with //! or /*! and document the class, namespace or function that
directly follows them. Let’s start with a short example:
//! Function taking the square of a value
/*!
 * \param val Input value
 * \returns Square of the input value
 *
 * This function calculates the square of a given value.
 */
double square(double val)
{
    return val*val;
}
This example documents the function square() which simply calculates the square of a number. The first line,
starting with //!, is the brief description and should not be longer than one line. The second comment block, starting
with /*! is the full description. Here, two special commands are used:
\param This command documents one parameter of the function
\returns This command documents the return value of the function
There are many special Doxygen commands. They all start with a backslash and the most important, apart from the
two mentioned above, are:
\tparam Used to document a template parameter.
\ingroup This class is part of a group, such as Methods or Core. The groups are defined in doc/mainpage.h.
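As a minimal sketch of how \tparam and \ingroup can be combined with \param and \returns, consider the following documented template function. The function clamp_value() is hypothetical and not part of SSAGES; the group name Core is taken from the description above and is assumed to be defined in doc/mainpage.h.

```cpp
//! Limit a value to a closed interval
/*!
 * \tparam T Value type; must support comparison with operator<
 * \param val Value to limit
 * \param lo Lower bound of the interval
 * \param hi Upper bound of the interval
 * \returns val if it lies inside the interval, otherwise the nearest bound
 *
 * Hypothetical helper used only to illustrate the Doxygen commands.
 *
 * \ingroup Core
 */
template <typename T>
T clamp_value(T val, T lo, T hi)
{
    if (val < lo) return lo;   // below the interval: return lower bound
    if (hi < val) return hi;   // above the interval: return upper bound
    return val;                // inside the interval: unchanged
}
```

Note that \tparam documents the template parameter T, while \param documents the ordinary function arguments.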
Also helpful are boxes that highlight a given aspect of the function, such as:
\attention Puts the following text in a raised box. A blank line ends the attention box.
\note Starts a highlighted block. A blank line ends the note block.
\remark Starts a paragraph where remarks may be entered.
\see Paragraph for “See also”.
\deprecated The documented class or function is deprecated and only kept for backwards compatibility.
\todo Leave a ToDo note with this command.
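The box commands listed above could be used as in the following sketch. Both functions, distance_1d() and distance_squared(), are hypothetical examples and do not exist in SSAGES; they only serve to show where the commands go inside a comment block.

```cpp
#include <cmath>

//! Squared distance between two points on a line
/*!
 * \param a First position
 * \param b Second position
 * \returns Squared distance between a and b
 *
 * \deprecated Kept only for backwards compatibility; use distance_1d().
 */
double distance_squared(double a, double b)
{
    return (a - b) * (a - b);
}

//! Distance between two points on a line
/*!
 * \param a First position
 * \param b Second position
 * \returns Absolute distance between a and b
 *
 * \note The result is never negative.
 *
 * \attention No periodic boundary conditions are applied.
 *
 * \see distance_squared()
 * \todo Add a variant that respects periodic boundaries.
 */
double distance_1d(double a, double b)
{
    return std::fabs(a - b);
}
```

The blank comment lines after \note and \attention end the respective boxes, as described above.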
You can also highlight your text:
\em Sets the next word in italics. To emphasize more text, use <em>Highlighted text</em>.
\b Sets the next word in bold. To set more text in bold, use <b>Bold text</b>.
\c Sets the next word in typewriter font. To have more text in typewriter font, use <tt>Typewriter Font</tt>.
\code Starts a code block. The block ends with \endcode.
\li A line starting with \li is an entry in a bullet list.
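A short sketch of the highlighting commands in context is given below. The function larger() is a hypothetical example, not an SSAGES function.

```cpp
//! Return the larger of two integers
/*!
 * \param a First value
 * \param b Second value
 * \returns The larger of \c a and \c b
 *
 * The function \b always returns one of its arguments and
 * \em never a third value. If <em>both values are equal</em>,
 * that value is returned.
 *
 * Typical use:
 * \code
 * int m = larger(2, 7);  // m is 7
 * \endcode
 *
 * Properties:
 * \li The order of the arguments does not matter.
 * \li The arguments are not modified.
 */
int larger(int a, int b)
{
    return (a < b) ? b : a;
}
```

Here \b, \em and \c each affect only the word that follows, whereas the <em> tag spans several words.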
Another big benefit of Doxygen is that you can use a lot of LaTeX syntax. For example:
\f$ Starts and ends an inline math equation, similar to $ in LaTeX.
\f[ and \f] Start and end a display-style LaTeX equation.
\cite <label> Cite a reference. The references are listed in doc/references.bib and follow the BibTeX syntax.
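The math commands could be used as in the following sketch, which documents a Gaussian function with one inline symbol per parameter and one display equation. The function gaussian() and its parameter names are assumptions made for illustration; the \cite command is left out because it requires a label that actually exists in doc/references.bib.

```cpp
#include <cmath>

//! Evaluate a Gaussian function
/*!
 * \param x Position at which the Gaussian is evaluated
 * \param center Center \f$ x_0 \f$ of the Gaussian
 * \param width Width \f$ \sigma \f$ of the Gaussian
 * \param height Height \f$ W \f$ of the Gaussian
 * \returns Value of the Gaussian at x
 *
 * The function evaluates
 * \f[
 *   g(x) = W \exp\left( -\frac{(x - x_0)^2}{2 \sigma^2} \right)
 * \f]
 */
double gaussian(double x, double center, double width, double height)
{
    const double d = x - center;  // displacement from the center
    return height * std::exp(-(d * d) / (2.0 * width * width));
}
```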
Doxygen is very clever in producing automatic links. For example, there exists a class Method in SSAGES. Thus,
Doxygen automatically creates a link to the documentation of this class wherever the word “Method” appears. This
does not, however, work for the plural, “Methods”. Instead, you can write \link Method Methods \endlink. On
the other hand, if you want to prevent Doxygen from creating an autolink, put a % in front of the word.
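The linking rules can be sketched as follows. The class Method below is a simplified, hypothetical stand-in so that the example is self-contained; it is not the actual SSAGES class, and total_iterations() is likewise invented for illustration.

```cpp
//! Hypothetical stand-in for a sampling method
class Method
{
public:
    //! Number of iterations performed so far
    int iterations = 0;
};

//! Sum the iterations of two methods
/*!
 * \param first First Method (linked automatically)
 * \param second Second Method (linked automatically)
 * \returns Total number of iterations of both arguments
 *
 * This function works on any two \link Method Methods \endlink.
 * In this sentence, the word %Method is not turned into a link.
 */
int total_iterations(const Method& first, const Method& second)
{
    return first.iterations + second.iterations;
}
```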
What to document
We are aiming for comprehensive documentation of all the methods available in SSAGES as well as the core features.
Thus, for each method the documentation should include
• An introduction to the method, what it does and how it does it.
• A short tutorial based on one of the working examples. The reader should be able to complete the tutorial in
about 30 minutes and should leave with a sense of accomplishment, e.g. a nice energy profile or a picture of a
folded protein.
• A detailed description of how to use the method, the parameters, constraints, requirements, etc.
Adding your method to SSAGES
See also:
The chapter “Write your own Methods and CVs” gives an introduction to how to develop your own method.
So, you have developed a new Metadynamics method or a new collective variable (CV)? Great! SSAGES is about
collaboration and integrating your new CV or method is a priority. But before we do that, make sure you check the
following boxes:
• Your code needs to compile and run (obviously).
• If you have implemented a new method, this method should have been published in a peer reviewed journal and
the publication should be cited in the documentation of the method (see next point). If you have implemented a
CV, please give a small example of usage. In which case(s) does the new CV come in handy?
• Your method needs to come with the necessary documentation. For others to be able to use your method, you
will have to explain how it works. You can take a look at the section “Improving the Documentation”
for a starter on how to write good documentation.
• Please provide an example system. This could be the folding of an alanine dipeptide molecule, a NaCl system
or just a toy model with a simple energy landscape. As long as the system is small and the method can easily
complete within a few hours, it will be fine.
Once these boxes have been checked, our team of friendly code-reviewers will take a look at your source code and
help you meet the high standard of the SSAGES code.
Working on the core classes
Todo
Describe SSAGES development
CHAPTER
SEVEN
THE SSAGES COOKBOOK
A collection of short solutions to common problems. Just like a FAQ.
CHAPTER
EIGHT
ACKNOWLEDGMENTS
We are grateful to Argonne National Laboratory for initiating this project and for its continued support. Julian Helfferich
acknowledges financial support from the DFG research fellowship program, grant No. HE 7429/1.
Some important core functionality of SSAGES comes from SAPHRON (Statistical Applied PHysics
through Random On-the-fly Numerics): https://github.com/hsidky/SAPHRON
Project Supervisors
• Juan de Pablo
• Jonathan Whitmer
• Juan Hernandez-Ortiz
Project Leads
• Ben Sikora (SSAGES Core)
• Yamil J. Colón (Collective Variables, Methods, and Testing)
• Hythem Sidky (SSAGES Architecture)
• Julian Helfferich (Documentation)
SSAGES Core Development
• Ben Sikora
• Hythem Sidky
• Julian Helfferich (Testing Framework)
Methods
• Hythem Sidky (Umbrella Sampling, Metadynamics, Adaptive Biasing Force Algorithm, and Basis Function
Sampling)
• Ben Sikora (Metadynamics, Forward Flux, Finite Temperature String, and Elastic Band)
• Cody Bezik (Swarm of Trajectories)
• Ashley Guo (Finite Temperature String)
• Jonathan Whitmer (Metadynamics)
• Emre Sevgen (Adaptive Biasing Force Algorithm)
• Joshua Moller (Basis Function Sampling)
• Jiyuan Li (COPSS integration)
Collective Variables
• Hythem Sidky (Angle, Torsional, Particle Coordinate, Particle Position, Particle Separation, Rg)
• Yamil J. Colón (RMSD, Torsional, Rg, Particle Separation)
• Ben Sikora (RMSD, Torsional, Rg, Particle Separation)
Documentation
• Julian Helfferich (Core)
• Yamil J. Colón ()
• Cody Bezik (Swarm of Trajectories and Elastic Band)
• Hadi Ramezani-Dakhel (Forward Flux)
• Emre Sevgen (Adaptive Biasing Force Algorithm)
• Joshua Moller (Basis Function Sampling)
Driver Hooks
• Hythem Sidky (LAMMPS and Gromacs)
• Ben Sikora (LAMMPS)
All contributors in alphabetical order
• Cody Bezik
• Yamil J. Colón
• Ashley Guo
• Julian Helfferich
• Juan Hernandez-Ortiz
• Joshua Lequieu
• Jiyuan Li
• Joshua Moller
• Juan de Pablo
• Hadi Ramezani-Dakhel
• Emre Sevgen
• Hythem Sidky
• Benjamin Sikora
• Jonathan Whitmer
CHAPTER
NINE
COPYRIGHT
SSAGES Copyright 2016 University of Chicago, University of Notre Dame. All Rights Reserved.
CHAPTER
TEN
LICENSE INFORMATION
SSAGES is distributed under the GNU Lesser General Public License (LGPL), either version 3 or (at your option) any
later version. The full terms of the LGPLv3 can be found on the GNU LGPL homepage. You are free to modify
and redistribute this software under the terms of the GNU Lesser General Public License.
The documentation is distributed under the terms of the Creative Commons Attribution-ShareAlike 4.0 license (CC BY-SA 4.0). This means you are free to share
and adapt the documentation as long as you give credit to the original authors and release your derivative work under
the same license. The full license can be found here.
Contributors to the documentation (in alphabetical order):
• Cody Bezik
• Yamil J. Colón
• Grant Garner
• Ashley Guo
• Julian Helfferich
• Joshua Lequieu
• Jiyuan Li
• Joshua Moller
• Hadi Ramezani-Dakhel
• Emre Sevgen
• Hythem Sidky
• Ben Sikora
CHAPTER
ELEVEN
INDICES AND TABLES
• genindex
• modindex
• search