The aim of this tutorial is to introduce the users to the PLUMED syntax. We will go through the writing of simple collective variable and we will use them to analyze existing trajectories.
Once this tutorial is completed students will be able to:
The TARBALL for this project contains the following files:
This tutorial has been tested on a pre-release version of version 2.4. However, it should not take advantage of 2.4-only features, thus should also work with version 2.3.
Also notice that in the .solutions
directory of the tarball you will find correct input files. Please only look at these files after you have tried to solve the problems yourself.
This tutorial asks you to compute a variety of different collective variables using PLUMED for a particular trajectory and to compare the files and graphs that you obtain with the correct ones that are shown online. Compared to some of the other tutorials that are available here this tutorial contains considerably less guidance so in doing this tutorial you will have to learn how to consult the manual. If you would like a more guided introduction to PLUMED it might be better to start with the tutorials Belfast tutorial: Analyzing CVs or MARVEL tutorial: Analyzing CVs. Also notice that, whereas this tutorial was tested using a pre-release version of PLUMED 2.4, it should be completely feasible using PLUMED 2.3.
As we will see later, PLUMED provides a library that can be combined with multiple MD codes. However, in this tutorial we will only use PLUMED to analyze trajectories that have been produced already. Once PLUMED is installed you can run a plumed
executable that can be used for multiple purposes:
> plumed --help
Here we will use the driver tool, that allows you to process an already existing trajectory.
> plumed driver --help
What we will need is:
The syntax of the PLUMED input file is the same that we will use later to run enhanced sampling simulations, so all the things that you will learn now will be useful later when you will run PLUMED coupled to an MD code. In the following we are going to see how to write an input file for PLUMED.
The main goal of PLUMED is to compute collective variables, which are complex descriptors than can be used to analyze a conformational change or a chemical reaction. This can be done either on the fly, that is during molecular dynamics, or a posteriori, using PLUMED as a post-processing tool. In both cases one should create an input file with a specific PLUMED syntax. A sample input file is below:
# this is optional and tell to VIM that this is a PLUMED file # vim: ft=plumed # see comments just below this input file # Compute distance between atoms 1 and 10. # Atoms are ordered as in the trajectory files and their numbering starts from 1. # The distance is called "d" for future reference. d: DISTANCE ATOMS=1,10 # Create a virtual atom in the center between atoms 20 and 30. # The virtual atom only exists within PLUMED and is called "center" for future reference. center: CENTER ATOMS=20,30 # Compute the torsional angle between atoms 1, 10, 20, and center. # Notice that virtual atoms can be used as real atoms here. # The angle is called "phi" for future reference. phi: TORSION ATOMS=1,10,20,center # Compute some function of previously computed variables. # In this case we compute the cosine of angle phi and we call it "d2" d2: MATHEVAL ... ARG=phi FUNC=cos(x) PERIODIC=NO ... # The previous command has been split in multiple lines. # It could have been equivalently written in a single line: # d2: MATHEVAL ARG=phi FUNC=cos(x) PERIODIC=NO # Print d and d2 every 10 step on a file named "COLVAR1". PRINT ARG=d,d2 STRIDE=10 FILE=COLVAR1 # Print phi on another file names "COLVAR2" every 100 steps. PRINT ARG=phi STRIDE=100 FILE=COLVAR2
In the input file above, each line defines a so-called action. An action could either compute a distance, or the center between two or more atoms, or print some value on a file. Each action supports a number of keywords, whose value is specified. Action names are highlighted in green and, clicking on them, you can go to the corresponding page in the manual that contains a detailed description for each keyword. Actions that support the keyword STRIDE
are those that determine how frequently things are to be done. Notice that the default value for STRIDE
is always 1. In the example above, omitting STRIDE
keywords the corresponding COLVAR files would have been written for every frame of the analyzed trajectory. All the other actions in the example above do not support the STRIDE
keyword and are only calculated when requested. That is, d
and d2
will be computed every 10 frames, and phi
every 100 frames. In short, you can think that for every snapshot in the trajectory that you are analyzing PLUMED is going to execute all the listed actions, though some of them are optimized out when STRIDE
is different from 1.
Also notice that PLUMED works using kJ/nm/ps as energy/length/time units. This can be personalized using UNITS, but we will here stay with default values.
Variables should be given a name (in the example above, d
, phi
, and d2
), which is then used to refer to these variables. Instead of a: DISTANCE ATOMS=1,2
you might equivalently use DISTANCE ATOMS=1,2 LABEL=a
. Lists of atoms should be provided as comma separated numbers, with no space. Virtual atoms can be created and assigned a name for later use.
You can find more information on the PLUMED syntax at Getting Started page of the manual. The complete documentation for all the supported collective variables can be found at the Collective Variables page.
To analyze the trajectory provided here, you should:
plumed.dat
) similar to the one above.plumed driver --mf_xtc traj.xtc --plumed plumed.dat
.Here traj.xtc
is the trajectory that you want to analyze. Notice that driver can read multiple file formats using embedded molfile plugins from VMD (that's where the mf
letters come from).
Notice that you can also visualize trajectories with VMD directly. Trajectory traj.xtc
can be visualized with the command vmd ref.pdb traj-whole.xtc
.
In the following we will make practice with computing and printing collective variables.
Analyze the traj-whole.xtc
trajectory and produce a colvar file with the following collective variables.
ref.pdb
file which are the atoms that are part of RNA (search for the first occurrence of a water molecule, residue name SOL
). Remember that you don't need to list all the atoms: instead of ATOMS=1,2,3,4,5
you can write ATOMS=1-5
.ref.pdb
file or learn how to select a special angle reading the MOLINFO documentation.R_0
to 2.5 A (be careful with units!!). Try to be smart in selecting the water oxygen atoms without listing all of them explicitly.Notice that some of the atom selections can be made in a easier manner by using the MOLINFO keyword with a proper reference PDB file. Also read carefully the Groups and Virtual Atoms page before starting. Here you can find a sample plumed.dat
file that you can use as a template. Whenever you see an highlighted FILL string, this is a string that you should replace.
# First load information about the molecule. MOLINFO __FILL__ # Notice that this is special kind of "action" ("setup action") # that is only used during setup. It will not be re-executed at each step. # Define some group that will make the rest of the input more readable # Here are the atoms belonging to RNA. rna: GROUP ATOMS=1-258 # This is the Mg ion. A group with atom is also useful! mg: GROUP ATOMS=6580 # This group should contain all the atoms belonging to water molecules. wat: GROUP ATOMS=__FILL__ # Select water oxygens only: owat: GROUP __FILL__ # Select water hydrogens only: hwat: GROUP __FILL__ # Compute gyration radius: r: GYRATION ATOMS=__FILL__ # Compute the Chi torsional angle: c: TORSION ATOMS=__FILL__ # Compute coordination of RNA with water oxygens co: COORDINATION GROUPA=rna GROUPB=owat R_0=__FILL__ # Compute coordination of RNA with water hydrogens ch: COORDINATION GROUPA=rna GROUPB=hwat __FILL__ # Compute the geometric center of the RNA molecule: ce: CENTER ATOMS=__FILL__ # Compute the distance between the Mg ion and the RNA center: d: DISTANCE ATOMS=__FILL__ # Print the collective variables on COLVAR file # No STRIDE means "print for every step" PRINT ARG=r,c,co,ch,d FILE=COLVAR
Once your plumed.dat
file is complete, you can use it with the following command
> plumed driver --plumed plumed.dat --mf_xtc whole.xtc
Scroll in your terminal to read the PLUMED log. As you can see, PLUMED gives a lot of feedback about the input that he is reading. There's the place where you can check if PLUMED understood correctly your input.
The command above will create a file COLVAR
like this one:
#! FIELDS time r c co ch d #! SET min_c -pi #! SET max_c pi 0.000000 0.788694 -2.963150 207.795793 502.027244 0.595611 1.000000 0.804101 -2.717302 208.021688 499.792595 0.951945 2.000000 0.788769 -2.939333 208.347867 500.552127 1.014850 3.000000 0.790232 -2.940726 211.274315 514.749124 1.249502 4.000000 0.796395 3.050949 212.352810 507.892198 2.270682
Notice that the first line informs you about the content of each column and the second and third lines tell you that variable c
(the \(\chi\) torsion) is defined between \(-\pi\) and \(+\pi\).
In case you obtain different numbers, check your input, you might have made some mistake!
This file can then be shown with gnuplot
gnuplot> p "COLVAR" u 1:2, "" u 1:3
As a final note, look at what happens if you run the exercise twice. The second time, PLUMED will back up the previously produced file so as not to overwrite it. You can also concatenate your files by using the action RESTART at the beginning of your input file.
In this first exercise we only computed simple functions of the atomic coordinates. PLUMED is very flexible and allows you to also combine these functions to create more complicated variables. These variables can be useful when you want to describe a complex conformational change. PLUMED implements a number of functions that can be used to this aim that are described in the page Functions. Look at the following example: In case you have many distances to combine you can also use regular expressions to select them using Notice for many functions you should say to PLUMED if the function is periodic. See Functions for a detailed explanation of how to choose this keyword. You might think that it is easier to combine the variables after you have written them already, using, e.g., an awk or python script. That's fine if you are analyzing a trajectory. However, as we will learn later, computing variables within PLUMED you will be able to add bias potentials on those combinations, influencing their dynamics. Actually, you could implement any arbitrarily complex collective variable using just DISTANCE and MATHEVAL! Anyway, if the CV combinations that you are willing to use can be computed easily with some external program, do it and compare the results with the output of the PLUMED driver. As an optional exercise, create a file with the following quantities: Notice that the serial numbers of the phosphorous atoms can be easily extracted using the following command Here's a template input file to be completed by you. Notice that using the collective variable DISTANCES you might be able to do the same with a significantly simpler input file! If you have time, also try that and compare the result. The resulting To learn more: Combining collective variables
# Distance between atoms 1 and 2:
d1: DISTANCE ATOMS=1,2
# Distance between atoms 1 and 3:
d2: DISTANCE ATOMS=1,3
# Distance between atoms 1 and 4:
d3: DISTANCE ATOMS=1,4
# Compute the sum of the squares of those three distances:
c: COMBINE ARG=d1,d2,d3 POWERS=2,2,2 PERIODIC=NO
# Sort the three distances:
s: SORT ARG=d1,d2,d3
# Notice that SORT creates a compund object with three components:
# s.1: the smallest distance
# s.2: the middle distance
# s.3: the largest distance
p: MATHEVAL ARG=d1,d2,d3 FUNC=x*y*z PERIODIC=NO
# Print the sum of the squares and the largest among the three distances:
PRINT FILE=COLVAR ARG=c,s.3
ARG=(d.)
, see Regular Expressions.
Exercise 1b: Combining collective variables
> grep ATOM ref.pdb | grep " P " | awk '{print $2}'
# First load information about the molecule.
MOLINFO __FILL__
# Define some group that will make the rest of the input more readable
mg: GROUP ATOMS=6580 # a group with one atom is also useful!
# Distances between Mg and phosphorous atoms:
d1: DISTANCE ATOMS=mg,33
d2: DISTANCE __FILL__
__FILL__
d6: DISTANCE __FILL__
# You can use serial numbers, but you might also use MOLINFO strings
# Compute the sum of these distances
c: COMBINE __FILL__
# Compute the distance between Mg and the closest phosphorous atom
s: SORT __FILL__
# Print the requested variables
PRINT FILE=COLVAR __FILL__
COLVAR
file should look like this one:#! FIELDS time c s.1
0.000000 6.655622 0.768704
1.000000 7.264049 0.379416
2.000000 7.876489 0.817820
3.000000 8.230621 0.380191
4.000000 13.708759 2.046935
While running PLUMED can also dump the coordinate of the internally stored atoms using DUMPATOMS. This might seem useless (coordinates are already contained in the original trajectory) but can be used in the following cases:
The last point is perhaps the most surprising one. Some of the PLUMED actions can indeed move the stored atoms to positions better suitable for the calculation of collective variables.
The previous exercise was done on a trajectory where the RNA was already whole. For the next exercise you will use the traj-broken.xtc
file instead, which is a real trajectory produced by GROMACS. Open it with VMD to understand what we mean with broken
> vmd ref.pdb traj-broken.xtc
Select Graphics
, then Representations
, then type nucleic
in the box Selected Atoms
. You will see that your RNA duplex is not whole. This is not a problem during MD because of periodic boundary conditions. However, it is difficult to analyze this trajectory. In addition, some collective variables that you might want to compute could require the molecules to be whole (an example of such variables is RMSD).
You might think that there are alternative programs that can be used to reconstruct the molecules the molecules that are broken by the periodic boundary correctly in your trajectory before analyzing it. However, you should keep in mind that if you need to compute CVs on the fly to add a bias potential on those (as we will to in the next tutorials) you will have to learn how to reconstruct the molecules that are broken by the periodic boundary within PLUMED. If you know alternative tools that can reconstruct the molecules that are broken by the periodic boundary, it is a good idea to also use them and compare the result with PLUMED.
Analyze the provided trajectory traj-broken.xtc
and use the DUMPATOMS action to produce new trajectories in gro
format that contain:
reference.pdb
). See FIT_TO_TEMPLATE, using TYPE=OPTIMAL
. Notice that you should provide to FIT_TO_TEMPLATE a pdb file with only the atoms that you wish to align. Use the ref.pdb
file as a starting point and remove the lines non containing RNA atoms. More details on PDB files in PLUMED can be found here.Here you can find a template input file to be completed by you.
# First load information about the molecule. MOLINFO __FILL__ # Define here the groups that you need. # Same as in the previous exercise. rna: GROUP ATOMS=__FILL__ mg: GROUP ATOMS=__FILL__ wat: GROUP ATOMS=__FILL__ # Make RNA duplex whole. WHOLEMOLECULES __FILL__ # Dump first trajectory in gro format. # Notice that PLUMED understands the format based on the file extension DUMPATOMS ATOMS=rna FILE=rna-whole.gro # Align RNA duplex to a reference structure # This should not be the ref.pdb file but a new file with only RNA atoms. FIT_TO_TEMPLATE REFERENCE=__FILL__ TYPE=OPTIMAL # Notice that before using FIT_TO_TEMPLATE we used WHOLEMOLECULES to make RNA whole # This is necessary otherwise you would align a broken molecule! # Dump the aligned RNA on a separate file DUMPATOMS ATOMS=rna FILE=rna-aligned.gro # Compute the distance between the Mg and the Phosphorous from residue 8 d: DISTANCE ATOMS=mg,__FILL__ ## put the serial number of the correct phosphorous here # here we only dump frames conditioned to the value of d UPDATE_IF ARG=d __FILL__ DUMPATOMS ATOMS=rna,mg FILE=rna-select.gro UPDATE_IF ARG=d __FILL__ # this command is required to close the UPDATE_IF above # compute the center of the RNA molecule center: CENTER ATOMS=rna # Wrap atoms correctly WRAPAROUND ATOMS=mg AROUND=__FILL__ WRAPAROUND ATOMS=wat AROUND=center __FILL__ # anything missing here? # Dump the last trajectory DUMPATOMS ATOMS=rna,wat,mg FILE=rna-wrap.gro
After you have prepared a proper plumed.dat
file, you can use it with the following command
> plumed driver --plumed plumed.dat --mf_xtc broken.xtc
Visualize the resulting trajectories using VMD. Since the gro
files already contain atom names, you do not need to load the pdb
file first. For instance, the first trajectory can be shown with
> vmd rna-whole.gro
TODO: I should perhaps add reference plots
If you just simulate a single solute molecule in water it is easy to understand how to pick the right options for WHOLEMOLECULES. However, if you have multiple molecules it can be rather tricky. In the example above, we used WHOLEMOLECULES on the RNA molecule which is actually a duplex, that is two separated chains. This was correct for the following reasons: In case the two molecules can separate from each other this would be rather problematic. We will now see what happens when using WHOLEMOLECULES on multiple molecules incorrectly. Prepare a PLUMED input file that makes all the water molecules whole. Use the following template Now look at the resulting file with The important take-home message here is that when you want to reconstruct periodic boundary conditions correctly in systems with multiple molecules you should be careful and always verify with DUMPATOMS that the system is doing what you expect. To learn more: Mastering WHOLEMOLECULES
Exercise 2b: Mistakes with WHOLEMOLECULES
# First load information about the molecule.
MOLINFO __FILL__
# Define here the groups that you need
rna: GROUP ATOMS=__FILL__
mg: GROUP ATOMS=__FILL__
wat: GROUP ATOMS=__FILL__
# Make RNA whole
WHOLEMOLECULES ENTITY0=rna
# Now make water whole as if it was a single molecule
WHOLEMOLECULES ENTITY0=wat
# And dump the resulting trajectory
DUMPATOMS ATOMS=rna,wat,mg FILE=wrong.gro
vmd wrong.gro
. Can you understand which is the problem?
In an exercise above we used FIT_TO_TEMPLATE. This action uses as a reference a PDB file which typically contains a subset of atoms (those that are fitted). However, when you apply FIT_TO_TEMPLATE with Check how the periodic box rotates when using FIT_TO_TEMPLATE. Use the following template You should obtains files like the ones reported below. As you can see, the generating vectors of the periodic lattice before fitting are constant. On the other hand, after fitting these vectors change so as to keep RNA correctly aligned to its reference structure. Later on you will learn how to add a bias potential on a give collective variable. In principle, you could also add a RESTRAINT to the To learn more: Mastering FIT_TO_TEMPLATE
TYPE=OPTIMAL
, the whole system is translated and rotated. The whole system here means all atoms plus the vectors defining the periodic box.
Exercise 2c: Mastering FIT_TO_TEMPLATE
# First load information about the molecule.
MOLINFO __FILL__
# Define here the groups that you need
rna: GROUP ATOMS=__FILL__
mg: GROUP ATOMS=__FILL__
wat: GROUP ATOMS=__FILL__
# Make RNA whole
WHOLEMOLECULES ENTITY0=rna
# Here's a compund variable with the box vectors
# computed before aligning RNA
cell_before: CELL
# Now we align RNA
FIT_TO_TEMPLATE __FILL__ TYPE=OPTIMAL
# Here's a compund variable with the box vectors
# computed after aligning RNA
cell_after: CELL
PRINT ARG=cell_before.* FILE=CELL_BEFORE
PRINT ARG=cell_after.* FILE=CELL_AFTER
CELL_BEFORE
should be #! FIELDS time cell_before.ax cell_before.ay cell_before.az cell_before.bx cell_before.by cell_before.bz cell_before.cx cell_before.cy cell_before.cz
0.000000 4.533710 0.000000 0.000000 0.000000 4.533710 0.000000 2.266860 2.266860 3.205821
1.000000 4.533710 0.000000 0.000000 0.000000 4.533710 0.000000 2.266860 2.266860 3.205821
2.000000 4.533710 0.000000 0.000000 0.000000 4.533710 0.000000 2.266860 2.266860 3.205821
3.000000 4.533710 0.000000 0.000000 0.000000 4.533710 0.000000 2.266860 2.266860 3.205821
4.000000 4.533710 0.000000 0.000000 0.000000 4.533710 0.000000 2.266860 2.266860 3.205821
CELL_AFTER
should be #! FIELDS time cell_after.ax cell_after.ay cell_after.az cell_after.bx cell_after.by cell_after.bz cell_after.cx cell_after.cy cell_after.cz
0.000000 4.533710 -0.000059 -0.000008 0.000059 4.533710 -0.000172 2.266895 2.266952 3.205730
1.000000 -0.396226 4.289476 -1.413481 -1.244340 1.260309 4.173460 2.249665 3.307132 2.134590
2.000000 -3.016552 1.123968 -3.192434 -1.055123 -4.375593 -0.543533 -4.309790 -1.356178 0.375612
3.000000 -4.083873 1.923282 -0.421306 0.339577 -0.267554 -4.513051 -3.243502 -2.069026 -2.398628
4.000000 -4.020722 2.094622 -0.029688 -1.060483 -1.979827 3.938298 -1.263169 2.532008 3.542306
cell_after.*
variables of the last example. This would allow you to force your molecule to a specific orientation.
In summary, in this tutorial you should have learned how to use PLUMED to:
All of this was done by just reading an already available trajectory. Notice that there are many alternative tools that could have been used to do the same exercise. Indeed, if you are familiar with other tools, it might be a good idea to also try them and compare the results. The special things of working with PLUMED are the following:
The last point is probably the main reason why PLUMED exists and what distinguishes it from other available software.