This is part of the isdb module.
Calculate the fit of a structure or ensemble of structures with a cryo-EM density map.
This action implements the multi-scale Bayesian approach to cryo-EM data fitting introduced in Ref. [55]. This method allows efficient and accurate structural modeling of cryo-electron microscopy density maps at multiple scales, from coarse-grained to atomistic resolution, by addressing the presence of random and systematic errors in the data, sample heterogeneity, data correlation, and noise correlation.
The experimental density map is fit by a Gaussian Mixture Model (GMM), which is provided as an external file specified by the keyword GMM_FILE. We are currently working on a web server to perform this operation. In the meantime, the user can request a stand-alone version of the GMM code at massimiliano.bonomi_AT_gmail.com.
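For readers unfamiliar with GMMs, the following minimal sketch evaluates a three-dimensional Gaussian mixture at a point. It is an illustration only, not the code used by this action. The function names are hypothetical, and the sketch assumes that each component contributes its weight times a normalized Gaussian, which may differ from the convention used in the file passed to GMM_FILE.

```python
import math


def det3(c):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (c[0][0] * (c[1][1] * c[2][2] - c[1][2] * c[2][1])
            - c[0][1] * (c[1][0] * c[2][2] - c[1][2] * c[2][0])
            + c[0][2] * (c[1][0] * c[2][1] - c[1][1] * c[2][0]))


def inv3(c):
    """Inverse of a 3x3 matrix via the adjugate (cofactor) formula."""
    d = det3(c)
    cof = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            # 2x2 minor obtained by deleting row i and column j
            r = [k for k in range(3) if k != i]
            s = [k for k in range(3) if k != j]
            minor = c[r[0]][s[0]] * c[r[1]][s[1]] - c[r[0]][s[1]] * c[r[1]][s[0]]
            cof[i][j] = (-1) ** (i + j) * minor
    # The inverse is the transposed cofactor matrix divided by the determinant
    return [[cof[j][i] / d for j in range(3)] for i in range(3)]


def gaussian3(x, mean, cov):
    """Normalized 3D Gaussian with the given mean and covariance, evaluated at x."""
    d = [x[k] - mean[k] for k in range(3)]
    inv = inv3(cov)
    q = sum(d[i] * inv[i][j] * d[j] for i in range(3) for j in range(3))
    norm = math.sqrt((2.0 * math.pi) ** 3 * det3(cov))
    return math.exp(-0.5 * q) / norm


def gmm_density(x, components):
    """Sum of weight * Gaussian over (weight, mean, covariance) tuples (assumed convention)."""
    return sum(w * gaussian3(x, m, c) for w, m, c in components)
```

Each entry of `components` is a `(weight, mean, covariance)` tuple, with the covariance given as a full symmetric 3x3 nested list.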
When run in single-replica mode, this action allows atomistic, flexible refinement of an individual structure into a density map. Combined with a multi-replica framework (such as the -multi option in GROMACS), the user can model an ensemble of structures using the Metainference approach [18].
By default this Action calculates the following quantities. These quantities can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the quantity required from the list below.
Quantity | Description |
scoreb | Bayesian score |
In addition, the following quantities can be calculated by employing the keywords listed below:
Quantity | Keyword | Description |
acc | NOISETYPE | MC acceptance for uncertainty |
scale | REGRESSION | scale factor |
accscale | REGRESSION | MC acceptance for scale regression |
enescale | REGRESSION | MC energy for scale regression |
anneal | ANNEAL | annealing factor |
Keyword | Description |
ATOMS | atoms for which we calculate the density map, typically all heavy atoms. For more information on how to specify lists of atoms see Groups and Virtual Atoms |
GMM_FILE | file with the parameters of the GMM components |
NL_CUTOFF | The cutoff in overlap for the neighbor list |
NL_STRIDE | The frequency with which we are updating the neighbor list |
SIGMA_MIN | minimum uncertainty |
RESOLUTION | Cryo-EM map resolution |
NOISETYPE | functional form of the noise (GAUSS, OUTLIERS, MARGINAL) |
NUMERICAL_DERIVATIVES | ( default=off ) calculate the derivatives for these quantities numerically |
NOPBC | ( default=off ) ignore the periodic boundary conditions when calculating distances |
NO_AVER | ( default=off ) don't do ensemble averaging in multi-replica mode |
SIGMA0 | initial value of the uncertainty |
DSIGMA | MC step for uncertainties |
MC_STRIDE | Monte Carlo stride |
ERR_FILE | file with experimental or GMM fit errors |
OV_FILE | file with experimental overlaps |
NORM_DENSITY | integral of the experimental density |
STATUS_FILE | write a file with all the data useful for restart |
WRITE_STRIDE | write the status to a file every N steps; this can be used for restart |
REGRESSION | regression stride |
REG_SCALE_MIN | regression minimum scale |
REG_SCALE_MAX | regression maximum scale |
REG_DSCALE | regression maximum scale MC move |
SCALE | scale factor |
ANNEAL | Length of annealing cycle |
ANNEAL_FACT | Annealing temperature factor |
TEMP | temperature |
PRIOR | exponent of uncertainty prior |
WRITE_OV_STRIDE | write model overlaps every N steps |
WRITE_OV | write a file with model overlaps |
In this example, we perform a single-structure refinement based on an experimental cryo-EM map. The map is fit with a GMM, whose parameters are listed in the file GMM_fit.dat. This file contains one line per GMM component in the following format:
#! FIELDS Id Weight Mean_0 Mean_1 Mean_2 Cov_00 Cov_01 Cov_02 Cov_11 Cov_12 Cov_22 Beta
0 2.9993805e+01 6.54628 10.37820 -0.92988 2.078920e-02 1.216254e-03 5.990827e-04 2.556246e-02 8.411835e-03 2.486254e-02 1
1 2.3468312e+01 6.56095 10.34790 -0.87808 1.879859e-02 6.636049e-03 3.682865e-04 3.194490e-02 1.750524e-03 3.017100e-02 1
...
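As an illustration of this layout, the following minimal sketch reads such lines into Python dictionaries keyed by the names in the `#! FIELDS` header. The function names are hypothetical and not part of PLUMED. The sketch assumes that `Cov_ij` are the independent elements of a symmetric 3x3 covariance matrix and that `Mean_0`, `Mean_1`, `Mean_2` are the three coordinates of the component center.

```python
def parse_gmm_lines(lines):
    """Parse GMM component lines into dicts keyed by the '#! FIELDS' names."""
    fields = None
    components = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#! FIELDS"):
            # Column names follow the '#!' and 'FIELDS' tokens
            fields = line.split()[2:]
            continue
        if line.startswith("#"):
            continue  # any other comment line
        if fields is None:
            raise ValueError("no '#! FIELDS' header found before the data")
        values = line.split()
        if len(values) != len(fields):
            continue  # skip elided or incomplete lines
        rec = {name: float(v) for name, v in zip(fields, values)}
        if "Id" in rec:
            rec["Id"] = int(rec["Id"])
        components.append(rec)
    return components


def mean_vector(rec):
    """Center of one component as a list of three coordinates."""
    return [rec["Mean_0"], rec["Mean_1"], rec["Mean_2"]]


def covariance_matrix(rec):
    """Full symmetric 3x3 covariance built from the six stored elements (assumed layout)."""
    c00, c01, c02 = rec["Cov_00"], rec["Cov_01"], rec["Cov_02"]
    c11, c12, c22 = rec["Cov_11"], rec["Cov_12"], rec["Cov_22"]
    return [[c00, c01, c02],
            [c01, c11, c12],
            [c02, c12, c22]]
```

Because `parse_gmm_lines` accepts any iterable of strings, an open file handle for GMM_fit.dat can be passed to it directly.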
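To accelerate the computation of the Bayesian score, one can use a neighbor list (NL_CUTOFF, NL_STRIDE) and apply the resulting bias only every few steps, as done with STRIDE=2 in the input below.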
All the heavy atoms of the system are used to calculate the density map. This list can conveniently be provided using a GROMACS index file.
The input file looks as follows:
# include pdb info
MOLINFO STRUCTURE=prot.pdb
# all heavy atoms
protein-h: GROUP NDX_FILE=index.ndx NDX_GROUP=Protein-H
# create EMMI score
gmm: EMMI NOPBC SIGMA_MIN=0.01 TEMP=300.0 NL_STRIDE=100 NL_CUTOFF=0.01 GMM_FILE=GMM_fit.dat ATOMS=protein-h
# translate into bias - apply every 2 steps
emr: BIASVALUE ARG=gmm.scoreb STRIDE=2
PRINT ARG=emr.* FILE=COLVAR STRIDE=500 FMT=%20.10f