Command: alphafold

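AlphaFold is an artificial intelligence method for predicting protein structures that has been highly successful in recent tests. The alphafold command retrieves existing predictions from the AlphaFold Database and runs new AlphaFold predictions, as described in the sections below.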

Users should cite:

Highly accurate protein structure prediction with AlphaFold. Jumper J, Evans R, Pritzel A, et al. Nature. 2021 Jul 15. doi: 10.1038/s41586-021-03819-2.

The AlphaFold Database contains predictions for 21 species, including humans. It does not cover all of UniProt. The predicted structures vary in confidence levels and should be interpreted with caution. The AlphaFold system predicts structures for single chains, not complexes; assembling the individual predictions into a complex may give unphysical results where parts of the chains intersect or interact poorly with one another.

The alphafold command is also implemented as the AlphaFold tool. See also: Blast Protein, Modeller Comparative, comparing AlphaFold and experimental structures, and the ChimeraX videos on searching by sequence for AlphaFold predictions (first 5 minutes), fetching an AlphaFold prediction, matching predictions to an assembly, and predicting a new structure.

Getting Predictions from the AlphaFold Database

Usage: alphafold fetch  uniprot-id  [ alignTo  chain-spec [ trim  true | false ]] [ colorConfidence  true | false ] [ ignoreCache  true | false ]
Usage: alphafold match  sequence  [ search  true | false ] [ trim  true | false ] [ colorConfidence  true | false ] [ ignoreCache  true | false ]
Usage: alphafold search  sequence  [ matrix  similarity-matrix ] [ cutoff  evalue ] [ maxSeqs  M ]
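
For scripted use, the fetch and match usage lines above can be assembled from Python. The following is a minimal sketch, not part of ChimeraX: the helper names fetch_command and match_command are hypothetical, the accession P29474 and the chain specifier #1/A are placeholder values, and passing a chain specifier as the sequence argument of alphafold match is an assumption. Leaving an option as None omits it, so that the command default applies.

```python
def _flag(name, value):
    """Format an optional true/false keyword; None means omit it (use the default)."""
    if value is None:
        return []
    return [name, "true" if value else "false"]


def fetch_command(uniprot_id, align_to=None, trim=None,
                  color_confidence=None, ignore_cache=None):
    """Compose an 'alphafold fetch' command line (hypothetical helper)."""
    if trim is not None and align_to is None:
        # In the usage line, trim is nested inside the alignTo option.
        raise ValueError("trim applies only together with alignTo")
    words = ["alphafold", "fetch", uniprot_id]
    if align_to is not None:
        words += ["alignTo", align_to]
        words += _flag("trim", trim)
    words += _flag("colorConfidence", color_confidence)
    words += _flag("ignoreCache", ignore_cache)
    return " ".join(words)


def match_command(sequence, search=None, trim=None,
                  color_confidence=None, ignore_cache=None):
    """Compose an 'alphafold match' command line (hypothetical helper)."""
    words = ["alphafold", "match", sequence]
    words += _flag("search", search)
    words += _flag("trim", trim)
    words += _flag("colorConfidence", color_confidence)
    words += _flag("ignoreCache", ignore_cache)
    return " ".join(words)


# Placeholder accession and chain specifier, for illustration only.
print(fetch_command("P29474"))
print(fetch_command("P29474", align_to="#1/A", trim=False))
print(match_command("#1/A", search=False))

# Inside ChimeraX, where a session object is available, a composed string
# could be executed with:
#   from chimerax.core.commands import run
#   run(session, fetch_command("P29474"))
```

The helper rejects trim without alignTo because the usage line nests the trim option inside alignTo for alphafold fetch.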

Options

alignTo  chain-spec
Superimpose the predicted structure from alphafold fetch onto a single chain in an already-open structure, and make its chain ID the same as that chain's. See also the trim option.
colorConfidence  true | false
Whether to color the predicted structures by the pLDDT confidence measure in the B-factor field (default true), in other words, using:

color bfactor palette alphafold

The Color Key graphical interface or a command can be used to draw a corresponding color key, for example:

key red:low orange: yellow: cornflowerblue: blue:high  [other-key-options]
ignoreCache  true | false
The fetched predictions are stored locally in ~/Downloads/ChimeraX/AlphaFold/, where ~ indicates a user's home directory. If a file specified for opening is not found in this local cache or ignoreCache is set to true, the file will be fetched and cached.
search  true | false
When fetching predictions with alphafold match, whether to search the database for the most similar sequence if the UniProt accession number for a chain is not provided in the experimental structure's input file, or is provided but not found in the AlphaFold Database (true, default). The search uses a BLAT web service hosted by the UCSF RBVI. The closest sequence match for which a prediction is available will be retrieved, as long as the sequence identity is at least 25%. With search false, only the experimental structure's input file will be used as a potential source of UniProt accession numbers. When present, these are given in DBREF records in PDB format and in struct_ref and struct_ref_seq tables in mmCIF.
trim  true | false
Whether to trim a predicted protein structure to the same residue range as the corresponding experimental structure given with the alphafold match command or the alignTo option of alphafold fetch (default true). Using trim false retains the full-length predictions for the UniProt sequences, which could be longer.
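
The caching rule described under ignoreCache above (fetch when the file is missing from the local cache or when ignoreCache is true, then store the result) can be sketched in Python as follows. This illustrates the stated logic only and is not the ChimeraX implementation: the function cached_prediction, the fetch callable, and the file name example.cif are all hypothetical, and the demonstration writes to a temporary directory rather than to ~/Downloads/ChimeraX/AlphaFold/.

```python
import tempfile
from pathlib import Path


def cached_prediction(file_name, cache_dir, fetch, ignore_cache=False):
    """Return the local path of a prediction file, fetching it when needed.

    fetch is a hypothetical callable that takes the file name and returns
    the file contents as bytes (for example, by downloading them).
    """
    cache_dir = Path(cache_dir)
    path = cache_dir / file_name
    # Fetch when the file is not cached, or when the cache is to be ignored.
    if ignore_cache or not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(file_name))  # the fetched file is cached
    return path


# Demonstration with a stand-in fetch function that records each call.
calls = []


def fake_fetch(name):
    calls.append(name)
    return b"placeholder contents for " + name.encode()


with tempfile.TemporaryDirectory() as tmp:
    cached_prediction("example.cif", tmp, fake_fetch)                     # not cached: fetched
    cached_prediction("example.cif", tmp, fake_fetch)                     # cached: no fetch
    cached_prediction("example.cif", tmp, fake_fetch, ignore_cache=True)  # forced refetch

print(len(calls))  # prints 2
```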

Running an AlphaFold Prediction

Usage: alphafold predict  sequence 

The protein sequence to predict can be given as any of the following:

AlphaFold calculations are run using Google Colab. A warning will appear saying that the Colab notebook is from GitHub (that is, not authored by Google), with a button to click to run it anyway. Users need a Google account and must sign into it via a browser. Once that is done, the sign-in may be remembered depending on the user's browser settings; it is not kept in the ChimeraX preferences.

A single prediction run generally takes on the order of an hour or more. The process includes installing various software packages on a virtual machine, searching sequence databases, generating a multiple sequence alignment, predicting atomic coordinates, and energy-minimizing the best structure. The free version of Google Colab allows only limited run time; those who want to run longer or more frequent calculations may wish to sign up for one of the paid Colab plans.

The prediction will be opened automatically and colored by confidence value. If the sequence was specified by structure chain, the prediction will be superimposed on that chain and assigned structure-comparison attributes for further analysis (details...).

Please note the following caveats:


UCSF Resource for Biocomputing, Visualization, and Informatics / October 2021