Comparative Modeling with Modeller
The program Modeller from the Sali Lab models the structure of a protein,
given an alignment of its sequence with the sequence of at least one known
related structure. This process is known as comparative modeling or
homology modeling. The sequence to be modeled is the target, and a
structure whose sequence is aligned to the target can be used as a
modeling template.
Chimera provides a graphical interface for running Modeller, either locally
or via a Web service hosted by the UCSF RBVI. Choosing
Structure... Modeller Tools from the Multalign Viewer menu
shows the interface. As a trade-off for simplicity, the interface is limited
to a subset of the calculations that Modeller can perform; for example,
only a single protein chain or subunit can be modeled at a time.
Use of Modeller, whether downloaded or via Web service, requires a
license key. Academic users can register free of charge to receive a
license key. (Commercial entities and government research labs, please see
Modeller licensing.) Modeller users should cite:
Comparative protein modelling by satisfaction of spatial restraints.
Šali A, Blundell TL.
J Mol Biol. 1993 Dec 5;234(3):779-815.
See also: Model Loops, Build Structure, Rotamers, Multalign Viewer,
fetching ModBase files
There are several ways to generate the inputs needed for comparative
modeling. Individual sequences and sequence alignments in Chimera are
shown in Multalign Viewer. The Modeller interface can be started by
choosing Structure... Modeller Tools from the Multalign Viewer menu.
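When choosing among candidate templates in an alignment, percent identity
to the target is a common first criterion. The following Python sketch
shows one way such a value can be computed from two aligned sequences.
It is an illustration only: the function name is hypothetical, and the
choice of denominator (columns in which neither sequence has a gap) is an
assumption that may differ from the calculation Chimera uses.

```python
GAP_CHARS = "-."  # characters treated as gaps (an assumption)


def percent_identity(target, other):
    """Percent identity between two rows of the same alignment.

    Hypothetical helper: counts identical residues over the columns in
    which both sequences have a residue (gap columns are skipped).
    """
    if len(target) != len(other):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(target.upper(), other.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Five gap-free columns, four of them identical
print(percent_identity("ACDE-F", "ACDQGF"))  # 80.0
```

Other conventions divide by the length of the shorter or the longer
sequence instead, which can give noticeably different values when the
sequences differ in length.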
Settings:
- Choose the target (sequence to be modeled)
- the name of the target sequence should be chosen from the pulldown menu
of all sequences in the current alignment
- Choose at least one template
- at least one structure to use as a template should be chosen from the table,
by clicking or dragging with the left mouse button to highlight the
corresponding rows. Ctrl-click toggles the status of a single row.
The choices are actually the sequences in the alignment, and
choosing a sequence indicates using its associated structure(s).
Some of the columns are blank initially, but clicking
Fetch Structures/Annotations
loads the structures into Chimera and fills in the table, where possible.
The table can be sorted by the contents of any column by clicking the
column header:
- Sequence - sequence name;
choosing a sequence indicates using its associated structure as a template
(the table lists all of the sequences in the alignment except the target)
- Structure ID - identifier for the structure associated
with the sequence, if any; usually a 4-character PDB ID, but could be a
SCOP domain ID
(which includes a PDB ID)
- %ID - percent sequence identity of the sequence as compared to
the target, computed from the alignment
- Title - title of structure (from PDB entry)
- Organism - source organism of structure (from PDB entry)
Clicking Fetch Structures/Annotations:
- fetches the structure for any sequence not already associated with a
structure, if that structure can be deduced from the sequence name (as in
the Structure preferences in Multalign Viewer)
- uses the structure IDs to look up additional information
- Advanced Options:
- Run Modeller via web service
- Modeller license key
- a license key is required
to run the program; the Modeller home page includes links
to register for a key and to download the program
- Run Modeller locally
- Location of Modeller executable
- the location of the Modeller executable file; the license key will
already have been entered during local installation
- Modeller script file (optional, overrides dialog)
- use the specified Modeller script to control the calculation; this
will override the settings in the dialog. The script corresponding
to the current dialog settings can be viewed in
IDLE by clicking
Get Current Modeller Script, saved to a file using the IDLE menu,
and edited by hand as desired. For more details on scripting Modeller,
consult the Modeller manual.
- Number of output models [N] (max 1000; default 5)
- Include non-water HETATM residues from template
(off by default)
- whether to include HETATM residues other than water (ligands, ions,
detergent molecules, etc.) from the template in the output models.
This option will propagate all qualifying residues, even from multiple
templates; those not desired in the output should be deleted from the
template(s) beforehand.
- Include water molecules from template
(off by default)
- whether to include water residues from the template in the output models.
This option will propagate all qualifying residues, even from multiple
templates; those not desired in the output should be deleted from the
template(s) beforehand.
- Build models with hydrogens (warning: slow)
(off by default)
- Use fast/approximate mode (produces only one model)
(off by default)
- use fast/approximate mode (~3 times faster) to get a rough idea of
model appearance or to confirm that the alignment is reasonable.
This mode does not randomize the starting structure, so it generates only
a single model, and it performs very little optimization of the
objective function.
- "Loop" refinement (off by default; warning: slow)
- refine all segments of the target that are aligned to gaps in the template;
currently produces only five models, reflecting five refinements of a
single initial model. The scores in the resulting Model List
are for the refined regions only.
- Temporary folder location (optional)
- use the specified location for temporary files; otherwise, a
location will be generated automatically
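For orientation, several of the settings above correspond to standard
options in a Modeller Python script. The following is a minimal sketch
under stated assumptions, not the script that Chimera generates (use
Get Current Modeller Script to see that). The DialogSettings class, its
field names, and both helper functions are hypothetical; the Modeller
calls assume the class names of Modeller 10 or later (Environ, AutoModel,
AllHModel), and loop refinement is omitted.

```python
from dataclasses import dataclass

MAX_MODELS = 1000  # upper limit stated in the dialog


@dataclass
class DialogSettings:
    """Hypothetical container mirroring some of the dialog settings."""
    alignment_file: str          # alignment in a format Modeller reads
    target: str                  # name of the target sequence
    templates: tuple             # names of the template structures
    num_models: int = 5          # dialog default
    include_hetatm: bool = False
    include_water: bool = False
    build_hydrogens: bool = False
    fast_mode: bool = False


def effective_model_count(s):
    """Number of models to request, following the rules described above."""
    if not 1 <= s.num_models <= MAX_MODELS:
        raise ValueError("number of models must be between 1 and %d" % MAX_MODELS)
    # Fast/approximate mode produces only one model.
    return 1 if s.fast_mode else s.num_models


def run_modeller(s):
    """Build comparative models; requires a local Modeller installation."""
    # Imported here so the helpers above work without Modeller installed.
    # Older Modeller releases spell these environ, automodel, allhmodel.
    from modeller import Environ
    from modeller.automodel import AutoModel, AllHModel

    env = Environ()
    env.io.hetatm = s.include_hetatm  # read non-water HETATM residues
    env.io.water = s.include_water    # read water molecules

    # AllHModel builds all-hydrogen models (slower than AutoModel).
    model_class = AllHModel if s.build_hydrogens else AutoModel
    a = model_class(env, alnfile=s.alignment_file,
                    knowns=s.templates, sequence=s.target)
    if s.fast_mode:
        a.very_fast()  # quick, approximate optimization
    a.starting_model = 1
    a.ending_model = effective_model_count(s)
    a.make()
    return a
```

A call such as run_modeller(DialogSettings("aln.ali", "target",
("template",))) would request five models, assuming the alignment file and
the template coordinate file are present where Modeller can find them
(the file and sequence names here are placeholders).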
OK starts the calculation and dismisses the panel,
while Apply starts the calculation without dismissing the panel.
Close dismisses the panel without performing any calculation.
Help brings up this manual page in a browser window.
Running Modeller is a background task. Clicking the information icon
in the Chimera status line will bring up the Task Panel,
in which the job can be canceled if desired.
After the calculation has finished, the comparative models are opened
in Chimera and can be saved in the usual ways. The models are
automatically superimposed onto the template (or the lowest-numbered of
multiple templates) using matchmaker defaults, and the view is focused on
that template. Model-associated information is shown in a Model List,
the same dialog used for comparative models fetched from ModBase.
Running Modeller with identical inputs on different machines may give
different (but equally valid) results, due to small numerical differences
that can lead to finding different local optima of the
modeling objective function.
UCSF Computer Graphics Laboratory / January 2012