| This is part of the isdb module |
Calculate the fit of a structure or ensemble of structures with a cryo-EM density map.
This action implements the multi-scale Bayesian approach to cryo-EM data fitting introduced in Ref. [55]. This method allows efficient and accurate structural modeling of cryo-electron microscopy density maps at multiple scales, from coarse-grained to atomistic resolution, by accounting for random and systematic errors in the data, sample heterogeneity, data correlation, and noise correlation.
The experimental density map is fit by a Gaussian Mixture Model (GMM), which is provided as an external file specified by the keyword GMM_FILE. We are currently working on a web server to perform this operation. In the meantime, the user can request a stand-alone version of the GMM code at massimiliano.bonomi_AT_gmail.com.
When run in single-replica mode, this action allows atomistic, flexible refinement of an individual structure into a density map. Combined with a multi-replica framework (such as the -multi option in GROMACS), the user can model an ensemble of structures using the Metainference approach [18].
By default this Action calculates the following quantities. These quantities can be referenced elsewhere in the input by using this Action's label followed by a dot and the name of the quantity required from the list below.
| Quantity | Description |
| scoreb | Bayesian score |
In addition, the following quantities can be calculated by employing the keywords listed below.
| Quantity | Keyword | Description |
| acc | NOISETYPE | MC acceptance for uncertainty |
| scale | REGRESSION | scale factor |
| accscale | REGRESSION | MC acceptance for scale regression |
| enescale | REGRESSION | MC energy for scale regression |
| anneal | ANNEAL | annealing factor |
| Keyword | Description |
| ATOMS | atoms for which we calculate the density map, typically all heavy atoms. For more information on how to specify lists of atoms see Groups and Virtual Atoms |
| GMM_FILE | file with the parameters of the GMM components |
| NL_CUTOFF | The cutoff in overlap for the neighbor list |
| NL_STRIDE | The frequency with which we are updating the neighbor list |
| SIGMA_MIN | minimum uncertainty |
| RESOLUTION | Cryo-EM map resolution |
| NOISETYPE | functional form of the noise (GAUSS, OUTLIERS, MARGINAL) |
| NUMERICAL_DERIVATIVES | ( default=off ) calculate the derivatives for these quantities numerically |
| NOPBC | ( default=off ) ignore the periodic boundary conditions when calculating distances |
| NO_AVER | ( default=off ) don't do ensemble averaging in multi-replica mode |
| SIGMA0 | initial value of the uncertainty |
| DSIGMA | MC step for uncertainties |
| MC_STRIDE | Monte Carlo stride |
| ERR_FILE | file with experimental or GMM fit errors |
| OV_FILE | file with experimental overlaps |
| NORM_DENSITY | integral of the experimental density |
| STATUS_FILE | write a file with all the data useful for restart |
| WRITE_STRIDE | write the status to a file every N steps, this can be used for restart |
| REGRESSION | regression stride |
| REG_SCALE_MIN | regression minimum scale |
| REG_SCALE_MAX | regression maximum scale |
| REG_DSCALE | regression maximum scale MC move |
| SCALE | scale factor |
| ANNEAL | Length of annealing cycle |
| ANNEAL_FACT | Annealing temperature factor |
| TEMP | temperature |
| PRIOR | exponent of uncertainty prior |
| WRITE_OV_STRIDE | write model overlaps every N steps |
| WRITE_OV | write a file with model overlaps |
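The NL_CUTOFF keyword is expressed in terms of an overlap between Gaussians. As a rough illustration only, the sketch below assumes that the overlap of two components is the integral of the product of two normalized 3D Gaussians, which has the closed form of a Gaussian in the distance between the means with the summed covariance. The exact definition used by this action (component weights, resolution-dependent atomic Gaussians) is not specified on this page, so the function names and the value of `nl_cutoff` here are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (cyclic cofactor formula)."""
    d = det3(m)
    if d == 0.0:
        raise ValueError("singular covariance matrix")
    inv = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            cof = (m[(i + 1) % 3][(j + 1) % 3] * m[(i + 2) % 3][(j + 2) % 3]
                   - m[(i + 1) % 3][(j + 2) % 3] * m[(i + 2) % 3][(j + 1) % 3])
            inv[j][i] = cof / d  # adjugate is the transposed cofactor matrix
    return inv


def gaussian_overlap(mean_a, cov_a, mean_b, cov_b):
    """Integral of the product of two normalized 3D Gaussians.

    Assumption: this is the textbook overlap integral, not necessarily
    the exact quantity computed by the EMMI action.
    """
    s = [[cov_a[i][j] + cov_b[i][j] for j in range(3)] for i in range(3)]
    d = [mean_a[i] - mean_b[i] for i in range(3)]
    s_inv = inv3(s)
    quad = sum(d[i] * s_inv[i][j] * d[j] for i in range(3) for j in range(3))
    norm = (2.0 * math.pi) ** 1.5 * math.sqrt(det3(s))
    return math.exp(-0.5 * quad) / norm


# Two isotropic Gaussians with variance 0.5 along each axis.
iso = [[0.5, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]]
same = gaussian_overlap([0.0, 0.0, 0.0], iso, [0.0, 0.0, 0.0], iso)
far = gaussian_overlap([0.0, 0.0, 0.0], iso, [3.0, 0.0, 0.0], iso)

# Hypothetical neighbor-list test: keep only pairs above the cutoff.
nl_cutoff = 0.01
keep_same = same >= nl_cutoff
keep_far = far >= nl_cutoff
```

Under this assumption, pairs of well-separated components fall below the cutoff and can be skipped until the neighbor list is next rebuilt.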
In this example, we perform a single-structure refinement based on an experimental cryo-EM map. The map is fit with a GMM, whose parameters are listed in the file GMM_fit.dat. This file contains one line per GMM component in the following format:
#! FIELDS Id Weight Mean_0 Mean_1 Mean_2 Cov_00 Cov_01 Cov_02 Cov_11 Cov_12 Cov_22 Beta
0 2.9993805e+01 6.54628 10.37820 -0.92988 2.078920e-02 1.216254e-03 5.990827e-04 2.556246e-02 8.411835e-03 2.486254e-02 1
1 2.3468312e+01 6.56095 10.34790 -0.87808 1.879859e-02 6.636049e-03 3.682865e-04 3.194490e-02 1.750524e-03 3.017100e-02 1
...
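To make the layout of this file concrete, here is a minimal Python sketch that reads it into a list of components. It is not part of PLUMED: the function name `parse_gmm` and the dictionary keys are hypothetical, and the only assumptions about the format are the ones visible above (a `#! FIELDS` header followed by whitespace-separated columns, with the six covariance entries forming a symmetric 3x3 matrix).

```python
SAMPLE = """\
#! FIELDS Id Weight Mean_0 Mean_1 Mean_2 Cov_00 Cov_01 Cov_02 Cov_11 Cov_12 Cov_22 Beta
0 2.9993805e+01 6.54628 10.37820 -0.92988 2.078920e-02 1.216254e-03 5.990827e-04 2.556246e-02 8.411835e-03 2.486254e-02 1
1 2.3468312e+01 6.56095 10.34790 -0.87808 1.879859e-02 6.636049e-03 3.682865e-04 3.194490e-02 1.750524e-03 3.017100e-02 1
"""


def parse_gmm(text):
    """Parse GMM components from the text of a file in the format above."""
    fields = None
    components = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#!"):
            tokens = line.split()
            if len(tokens) > 1 and tokens[1] == "FIELDS":
                fields = tokens[2:]  # column names after "#! FIELDS"
            continue
        if line.startswith("#"):
            continue  # ordinary comment line
        if fields is None:
            raise ValueError("data line found before the '#! FIELDS' header")
        values = line.split()
        if len(values) != len(fields):
            raise ValueError("expected %d columns, got %d" % (len(fields), len(values)))
        row = dict(zip(fields, values))
        c = {k: float(row[k]) for k in
             ("Cov_00", "Cov_01", "Cov_02", "Cov_11", "Cov_12", "Cov_22")}
        components.append({
            "id": int(row["Id"]),
            "weight": float(row["Weight"]),
            "mean": [float(row["Mean_%d" % i]) for i in range(3)],
            # Rebuild the full symmetric covariance matrix.
            "cov": [[c["Cov_00"], c["Cov_01"], c["Cov_02"]],
                    [c["Cov_01"], c["Cov_11"], c["Cov_12"]],
                    [c["Cov_02"], c["Cov_12"], c["Cov_22"]]],
            "beta": float(row["Beta"]),
        })
    return components


components = parse_gmm(SAMPLE)
total_weight = sum(comp["weight"] for comp in components)
```

For a real file, the same function can be applied to the text returned by `open("GMM_fit.dat").read()`.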
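To accelerate the computation of the Bayesian score, one can use neighbor lists, controlled by the keywords NL_CUTOFF and NL_STRIDE, and apply the resulting bias only every other step (or less often), as done in the input below with the STRIDE option of BIASVALUE.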
All the heavy atoms of the system are used to calculate the density map. This list can conveniently be provided using a GROMACS index file.
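If no suitable index group is at hand, one can be written by a short script. The sketch below is only an illustration: the helper names are hypothetical, hydrogens are identified with a simple name-based heuristic (atom name starting with "H" after any leading digits), and atoms are numbered from 1 in the order given, which is assumed to match the numbering of the simulated system.

```python
def heavy_atom_indices(atom_names):
    """Return 1-based indices of atoms whose name does not look like a hydrogen.

    Heuristic assumption: a hydrogen name starts with 'H' once leading
    digits are stripped (e.g. 'H', 'HA', '1HB'). Adapt for other systems.
    """
    indices = []
    for number, name in enumerate(atom_names, start=1):
        stripped = name.strip().lstrip("0123456789").upper()
        if not stripped.startswith("H"):
            indices.append(number)
    return indices


def format_ndx_group(name, indices, per_line=15):
    """Format one group in GROMACS index-file style: '[ name ]' then indices."""
    lines = ["[ %s ]" % name]
    for start in range(0, len(indices), per_line):
        chunk = indices[start:start + per_line]
        lines.append(" ".join(str(i) for i in chunk))
    return "\n".join(lines) + "\n"


# Toy example: backbone of one residue with its hydrogens.
names = ["N", "H1", "CA", "HA", "C", "O"]
heavy = heavy_atom_indices(names)
ndx_text = format_ndx_group("Protein-H", heavy)
print(ndx_text, end="")
```

Writing `ndx_text` to a file such as index.ndx yields a group that can be selected by name with NDX_GROUP.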
The input file looks as follows:
# include pdb info
MOLINFO STRUCTURE=prot.pdb
# all heavy atoms
protein-h: GROUP NDX_FILE=index.ndx NDX_GROUP=Protein-H
# create EMMI score
gmm: EMMI NOPBC SIGMA_MIN=0.01 TEMP=300.0 NL_STRIDE=100 NL_CUTOFF=0.01 GMM_FILE=GMM_fit.dat ATOMS=protein-h
# translate into bias - apply every 2 steps
emr: BIASVALUE ARG=gmm.scoreb STRIDE=2
PRINT ARG=emr.* FILE=COLVAR STRIDE=500 FMT=%20.10f