Comparative Modeling with Modeller

The program Modeller from the Sali Lab models the structure of a protein, given an alignment of its sequence with the sequence of at least one known related structure. This process is known as comparative modeling or homology modeling. The sequence to be modeled is the target, and a structure whose sequence is aligned to the target can be used as a modeling template.

Chimera provides a graphical interface for running Modeller, either locally or via a Web service hosted by the UCSF RBVI. Choosing Structure... Modeller Tools from the Multalign Viewer menu shows the interface. As a trade-off for simplicity, the interface is limited to a subset of the calculations that Modeller can perform; for example, only a single protein chain or subunit can be modeled at a time.

Use of Modeller, whether downloaded or via Web service, requires a license key. Academic users can register free of charge to receive a license key. (Commercial entities and government research labs, please see Modeller licensing.) Modeller users should cite:

Comparative protein modelling by satisfaction of spatial restraints. Šali A, Blundell TL. J Mol Biol. 1993 Dec 5;234(3):779-815.

See also: Model Loops, Build Structure, Rotamers, Multalign Viewer, fetching ModBase files

There are several ways one could generate the needed inputs for comparative modeling.

Individual sequences and sequence alignments in Chimera are shown in Multalign Viewer. The Modeller interface can be started by choosing Structure... Modeller Tools from the Multalign Viewer menu. Settings:

Choose the target (sequence to be modeled) - choose the name of the target sequence from the pulldown menu of all sequences in the current alignment
Choose at least one template - choose at least one structure to use as a template by clicking or dragging with the left mouse button to highlight the corresponding rows of the table; Ctrl-click toggles the status of a single row. The rows are the sequences in the alignment, and choosing a sequence indicates using its associated structure(s). Some of the columns are blank initially, but clicking Fetch Structures/Annotations loads the structures into Chimera and fills in the table where possible. The table can be sorted by the contents of any column by clicking the column header:
- Sequence - sequence name; choosing a sequence indicates using its associated structure as a template (the table lists all of the sequences in the alignment except the target)
- Structure ID - identifier for the structure associated with the sequence, if any; usually a 4-character PDB ID, but could be a SCOP domain ID (which includes a PDB ID)
- %ID - percent sequence identity of the sequence as compared to the target, computed from the alignment
- Title - title of structure (from PDB entry)
- Organism - source organism of structure (from PDB entry)
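The %ID column is computed from the alignment. The following is a minimal sketch of one such calculation, assuming identity is counted only over columns where both sequences have a residue. The denominator that Chimera actually uses is not stated here, and the function name percent_identity is hypothetical.

```python
def percent_identity(target: str, template: str, gap_chars: str = "-.") -> float:
    """Percent identity between two rows of the same alignment.

    Assumption: only columns where BOTH sequences have a residue
    (no gap character) count toward the denominator.
    """
    if len(target) != len(template):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0
    identical = 0
    for a, b in zip(target.upper(), template.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns with a gap in either sequence
        aligned += 1
        if a == b:
            identical += 1
    return 100.0 * identical / aligned if aligned else 0.0


# Five gap-free columns, four of them identical.
print(percent_identity("ACDE-G", "ACDFHG"))
```

Other conventions (for example, dividing by the length of the shorter sequence) would give different values for the same alignment.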
Clicking Fetch Structures/Annotations:
1. fetches the structure for any sequence not already associated with a structure, if that structure can be deduced from the sequence name (as in the Structure preferences in Multalign Viewer)
2. uses the structure IDs to look up additional information
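As an illustration of deducing a structure from a sequence name, the sketch below looks for a SCOP-style domain ID or a bare 4-character PDB ID inside the name. The patterns and the function name guess_pdb_id are assumptions made for this example; the rules Chimera applies are governed by the Structure preferences in Multalign Viewer and may differ.

```python
import re
from typing import Optional

# Hypothetical patterns, for illustration only.
# SCOP domain ID: 'd' + PDB ID + chain character + domain character.
_SCOP_ID = re.compile(
    r"(?<![A-Za-z0-9])d([1-9][a-z0-9]{3})[a-z0-9_.][a-z0-9_](?![A-Za-z0-9])"
)
# PDB ID: a digit 1-9 followed by three alphanumeric characters,
# not embedded in a longer alphanumeric token.
_PDB_ID = re.compile(r"(?<![A-Za-z0-9])([1-9][A-Za-z0-9]{3})(?![A-Za-z0-9])")


def guess_pdb_id(sequence_name: str) -> Optional[str]:
    """Return an upper-case PDB ID found in the name, or None."""
    for pattern in (_SCOP_ID, _PDB_ID):
        match = pattern.search(sequence_name)
        if match:
            return match.group(1).upper()
    return None


for name in ("1abc_A", "d1abca_", "sp|P12345|KINASE_HUMAN"):
    print(name, guess_pdb_id(name))
```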
Advanced Options:
- Run Modeller via web service
  - Modeller license key - a license key is required to run the program; the Modeller home page includes links to register for a key and to download the program
- Run Modeller locally
  - Location of Modeller executable - the location of the Modeller executable file; no license key is requested here because the key is entered during local installation of Modeller
  - Modeller script file (optional, overrides dialog) - use the specified Modeller script to control the calculation; this will override the settings in the dialog. The script corresponding to the current dialog settings can be viewed in IDLE by clicking Get Current Modeller Script, saved to a file using the IDLE menu, and edited by hand as desired. For more details on scripting Modeller, consult the Modeller manual.
- Number of output models [N] (max 1000) (default 5)
- Include non-water HETATM residues from template (off by default) - whether to include HETATM residues other than water (ligands, ions, detergent molecules, etc.) from the template in the output models. This option will propagate all qualifying residues, even from multiple templates; those not desired in the output should be deleted from the template(s) beforehand.
- Include water molecules from template (off by default) - whether to include water residues from the template in the output models. This option will propagate all qualifying residues, even from multiple templates; those not desired in the output should be deleted from the template(s) beforehand.
- Build models with hydrogens (warning: slow) (off by default)
- Use fast/approximate mode (produces only one model) (off by default) - use fast/approximate mode (~3 times faster) to get a rough idea of model appearance or to confirm that the alignment is reasonable. This mode does not randomize the starting structure, and therefore generates only a single model, and performs very little optimization of the target function.
- "Loop" refinement (off by default; warning: slow) - refine all segments of the target that are aligned to gaps in the template; currently produces only five models, reflecting five refinements of a single initial model. The scores in the resulting Model List are for the refined regions only.
- Temporary folder location (optional) - use the specified location for temporary files; otherwise, a location will be generated automatically
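To show how the options above might map onto a Modeller script, here is a hedged sketch that builds script text from a few settings. It is not the script Chimera generates (use Get Current Modeller Script for that). The helper build_modeller_script, its parameters, and the file and sequence names are hypothetical. The Modeller names written into the script (environ, automodel, allhmodel, loopmodel, io.hetatm, io.water, very_fast, make) follow the Modeller manual for the classic 9.x Python interface. The sketch also assumes that hydrogens and loop refinement are not combined, which is a simplification.

```python
def build_modeller_script(alnfile, target, templates, num_models=5,
                          hetatm=False, water=False, hydrogens=False,
                          fast=False, loop_refine=False):
    """Return the text of a Modeller (Python) script for the given options."""
    if not templates:
        raise ValueError("at least one template is required")
    if not 1 <= num_models <= 1000:
        raise ValueError("number of output models must be between 1 and 1000")
    if hydrogens and loop_refine:
        # Simplification made by this sketch, not a documented limit.
        raise ValueError("this sketch does not combine hydrogens with loop refinement")

    if loop_refine:
        model_class = "loopmodel"
    elif hydrogens:
        model_class = "allhmodel"
    else:
        model_class = "automodel"

    # Fast mode and loop refinement each start from a single initial model.
    n_initial = 1 if (fast or loop_refine) else num_models

    lines = [
        "from modeller import *",
        "from modeller.automodel import *",
        "",
        "env = environ()",
        f"env.io.hetatm = {hetatm}",  # non-water HETATM residues from templates
        f"env.io.water = {water}",    # water molecules from templates
        f"a = {model_class}(env, alnfile={alnfile!r}, "
        f"knowns={tuple(templates)!r}, sequence={target!r})",
        "a.starting_model = 1",
        f"a.ending_model = {n_initial}",
    ]
    if fast:
        lines.append("a.very_fast()")
    if loop_refine:
        # Five refinements of the single initial model.
        lines += ["a.loop.starting_model = 1", "a.loop.ending_model = 5"]
    lines.append("a.make()")
    return "\n".join(lines) + "\n"


print(build_modeller_script("alignment.ali", "target", ["1abc"], hetatm=True))
```

Running the generated script requires a local Modeller installation and an alignment file in a format Modeller reads; the helper itself only produces text and does not call Modeller.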

OK starts the calculation and dismisses the panel, while Apply starts the calculation without dismissing the panel. Close dismisses the panel without performing any calculation. Help brings up this manual page in a browser window.

Running Modeller is a background task. Clicking the information icon in the Chimera status line will bring up the Task Panel, in which the job can be canceled if desired.

After the calculation has finished, the comparative models are opened in Chimera and can be saved in the usual ways. The models are automatically superimposed onto the template (or the lowest-numbered of multiple templates) using matchmaker defaults, and the view is focused on that template. Model-associated information is shown in a Model List, the same dialog used for comparative models fetched from ModBase.

Running Modeller with identical inputs on different machines may give different (but equally valid) results, due to small numerical differences that can lead to finding different local optima of the modeling objective function.
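One generic source of such small numerical differences is that floating-point addition is not associative, so the same terms summed in a different order can differ in the last digits. The snippet below is a generic Python illustration of that effect, not a description of Modeller's internals.

```python
# The same three numbers, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# The two sums are not bit-for-bit equal, although they differ
# only at the level of floating-point rounding error.
print(left_to_right == right_to_left)
print(abs(left_to_right - right_to_left))
```

In an iterative optimization, discrepancies of this size can accumulate and steer the search toward a different local optimum.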

UCSF Computer Graphics Laboratory / January 2012