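AlphaFold is an artificial intelligence method for predicting protein structures that has been highly successful in recent tests. The alphafold command finds and retrieves existing predictions from the AlphaFold Database (alphafold fetch, alphafold match, alphafold search) and runs new predictions using Google Colab (alphafold predict).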
Highly accurate protein structure prediction with AlphaFold. Jumper J, Evans R, Pritzel A, et al. Nature. 2021 Jul 15. doi: 10.1038/s41586-021-03819-2.
The AlphaFold Database contains predictions for 21 species, including humans. It does not cover all of UniProt. The predicted structures vary in confidence levels and should be interpreted with caution. The AlphaFold system predicts structures for single chains, not complexes; assembling the individual predictions into a complex may give unphysical results where parts of the chains intersect or interact poorly with one another.
The alphafold command is also implemented as the AlphaFold tool. See also: Blast Protein, Modeller Comparative, comparing AlphaFold and experimental structures, and ChimeraX videos: searching by sequence for AlphaFold predictions (first 5 minutes), fetching an AlphaFold prediction, matching predictions to an assembly, predicting a new structure
Usage: alphafold fetch uniprot-id [ alignTo chain-spec [ trim true | false ]] [ colorConfidence true | false ] [ ignoreCache true | false ]
Usage: alphafold match sequence [ search true | false ] [ trim true | false ] [ colorConfidence true | false ] [ ignoreCache true | false ]
Usage: alphafold search sequence [ matrix similarity-matrix ] [ cutoff evalue ] [ maxSeqs M ]
alphafold fetch p29474
– OR – (equivalent)
open p29474 from alphafold
alphafold match #1
Alternatively, the sequence can be given as any of the following:
alphafold match #3/B,D trim false
For a specified structure chain, a prediction is obtained for its exact UniProt entry if available, otherwise the single top hit identified by BLAT-searching the AlphaFold Database (details...). For each prediction with a corresponding structure chain from the alphafold match command or the alignTo option of alphafold fetch:
The matrix option indicates which amino acid similarity-matrix to use for scoring the hits (uppercase or lowercase can be used): BLOSUM45, BLOSUM50, BLOSUM62 (default), BLOSUM80, BLOSUM90, PAM30, PAM70, PAM250, or IDENTITY. The cutoff evalue is the maximum or least significant expectation value needed to qualify as a hit (default 1e-3). Results can also be limited with the maxSeqs option (default 100); this is the maximum number of unique sequences to return; more hits than this number may be obtained because multiple structures or other sequence-database entries may have the same sequence.
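When many searches are scripted, the alphafold search command string can be assembled and checked before it is run. The following Python sketch is illustrative and not part of ChimeraX: the helper name build_search_command is hypothetical, and its checks simply mirror the option values and defaults described above.

```python
# Matrix names accepted by "alphafold search", as listed above.
MATRICES = {
    "BLOSUM45", "BLOSUM50", "BLOSUM62", "BLOSUM80", "BLOSUM90",
    "PAM30", "PAM70", "PAM250", "IDENTITY",
}


def build_search_command(sequence, matrix="BLOSUM62", cutoff=1e-3, max_seqs=100):
    """Return an 'alphafold search' command string (hypothetical helper).

    sequence is passed through unchanged; it should be whatever sequence
    specifier the command accepts.
    """
    name = matrix.upper()  # the command accepts uppercase or lowercase
    if name not in MATRICES:
        raise ValueError(f"unknown similarity matrix: {matrix}")
    if cutoff <= 0:
        raise ValueError("cutoff must be a positive E-value")
    if max_seqs < 1:
        raise ValueError("maxSeqs must be at least 1")
    return (
        f"alphafold search {sequence} matrix {name} "
        f"cutoff {cutoff:g} maxSeqs {max_seqs}"
    )
```

Inside ChimeraX, the resulting string could be executed from Python with chimerax.core.commands.run(session, command).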
alignTo chain-spec
Superimpose the predicted structure from alphafold fetch onto a single chain in an already-open structure, and make its chain ID the same as that chain's. See also the trim option.
colorConfidence true | false
Whether to color the predicted structures by the pLDDT confidence measure in the B-factor field (default true):
pLDDT > 90 – high accuracy expected
90 > pLDDT > 70 – expected to be modeled well (a generally good backbone prediction)
70 > pLDDT > 50 – low confidence, treat with caution
pLDDT < 50 – should not be interpreted, may be disordered
...in other words, using:
color bfactor palette alphafold
The Color Key graphical interface or a command can be used to draw a corresponding color key, for example:
key red:low orange: yellow: cornflowerblue: blue:high [other-key-options]
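For analysis outside the graphics window, the same confidence categories can be applied to pLDDT values read from the B-factor field. The Python sketch below is illustrative only: the function names are hypothetical, and the thresholds (90, 70, 50) are the commonly quoted AlphaFold pLDDT bands, treated here as an assumption rather than taken from the ChimeraX implementation.

```python
def plddt_band(plddt):
    """Map a pLDDT value (0-100) to a confidence description.

    Thresholds of 90, 70 and 50 are an assumption (the commonly quoted bands).
    """
    if not 0 <= plddt <= 100:
        raise ValueError("pLDDT must be between 0 and 100")
    if plddt > 90:
        return "high accuracy expected"
    if plddt > 70:
        return "expected to be modeled well"
    if plddt > 50:
        return "low confidence, treat with caution"
    return "should not be interpreted, may be disordered"


def band_fractions(plddts):
    """Return the fraction of residues that fall in each confidence band."""
    counts = {}
    for value in plddts:
        band = plddt_band(value)
        counts[band] = counts.get(band, 0) + 1
    total = len(plddts)
    return {band: n / total for band, n in counts.items()}
```

A summary such as band_fractions(values) gives a quick sense of how much of a prediction is reliable enough to interpret.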
ignoreCache true | false
The fetched predictions are stored locally in ~/Downloads/ChimeraX/AlphaFold/, where ~ indicates a user's home directory. If a file specified for opening is not found in this local cache or ignoreCache is set to true, the file will be fetched and cached.
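The cache behavior described above amounts to "use the local copy unless it is missing or ignoreCache is true". A minimal Python sketch of that logic follows. It is not the ChimeraX implementation: cached_or_fetch and the fetch callback are hypothetical names, and the cache directory is passed in by the caller.

```python
from pathlib import Path


def cached_or_fetch(filename, fetch, cache_dir, ignore_cache=False):
    """Return the path of a cached prediction file, fetching it when needed.

    fetch is a caller-supplied function (hypothetical here) that takes the
    file name and returns the file contents as bytes.
    """
    cache_dir = Path(cache_dir)
    path = cache_dir / filename
    if ignore_cache or not path.exists():
        cache_dir.mkdir(parents=True, exist_ok=True)
        path.write_bytes(fetch(filename))  # download, then store in the cache
    return path
```

With ignore_cache left at False, a second request for the same file name returns the stored copy without calling fetch again.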
search true | false
When fetching predictions with alphafold match, whether to search the database for the most similar sequence if the UniProt accession number for a chain is not provided in the experimental structure's input file, or is provided but not found in the AlphaFold Database (true, default). The search uses a BLAT web service hosted by the UCSF RBVI. The closest sequence match for which a prediction is available will be retrieved, as long as the sequence identity is at least 25%. With search false, only the experimental structure's input file will be used as a potential source of UniProt accession numbers. When present, these are given in DBREF records in PDB format and in struct_ref and struct_ref_seq tables in mmCIF.
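To illustrate where the UniProt accession numbers come from, the Python sketch below reads DBREF records from PDB-format text. It assumes the fixed-width column layout of the PDB DBREF record and handles only the plain DBREF form; the function name uniprot_refs is hypothetical, and this is not how ChimeraX itself parses the file.

```python
def uniprot_refs(pdb_lines):
    """Collect UniProt references from PDB-format DBREF records.

    Returns a list of (chain_id, accession, db_seq_begin, db_seq_end) tuples.
    Column positions assume the fixed-width PDB DBREF layout.
    """
    refs = []
    for line in pdb_lines:
        if not line.startswith("DBREF "):
            continue  # also skips DBREF1/DBREF2, which use a different layout
        if line[26:32].strip() != "UNP":
            continue  # keep only UniProt cross-references
        chain_id = line[12]
        accession = line[33:41].strip()
        db_begin = int(line[55:60])  # first residue number in the UniProt sequence
        db_end = int(line[62:67])    # last residue number in the UniProt sequence
        refs.append((chain_id, accession, db_begin, db_end))
    return refs
```

The residue range returned with each accession is the same information that the trim option uses for predictions identified from the input file.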
trim true | false
Whether to trim a predicted protein structure to the same residue range as the corresponding experimental structure given with the alphafold match command or the alignTo option of alphafold fetch. If true (default):
- Predictions with UniProt identifier determined by alphafold match from the experimental structure's input file will be trimmed to the same residue ranges as used in the experiment. These ranges are given in DBREF records in PDB format and in struct_ref and struct_ref_seq tables in mmCIF.
- Predictions retrieved with alphafold fetch or found by alphafold match searching for similar sequences in the AlphaFold Database will be trimmed to start and end with the first and last aligned positions in the sequence alignment calculated by matchmaker as part of the superposition step.
Using trim false indicates retaining the full-length predictions for the UniProt sequences, which could be longer.
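The two trimming cases can be sketched in Python as follows. This is a simplified illustration on plain lists of residue numbers, with hypothetical function names: the first function corresponds to trimming to a residue range taken from the input file, the second to trimming to the first and last aligned positions of a sequence alignment.

```python
def trim_to_range(residue_numbers, first, last):
    """Keep residue numbers within [first, last], inclusive."""
    return [n for n in residue_numbers if first <= n <= last]


def trim_to_alignment(residue_numbers, aligned_numbers):
    """Trim to the span between the first and last aligned residues.

    aligned_numbers holds the predicted-structure residue numbers that were
    paired with experimental residues in the alignment. Residues inside the
    span are kept even if they were not themselves aligned.
    """
    if not aligned_numbers:
        return []
    return trim_to_range(residue_numbers, min(aligned_numbers), max(aligned_numbers))
```

In both cases only the ends of the prediction are removed, so gaps in the experimental structure's interior remain modeled in the trimmed prediction.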
Usage: alphafold predict sequence
The protein sequence to predict can be given as any of the following:
AlphaFold calculations are run using Google Colab. A warning will appear saying that this Colab notebook is from GitHub (not authored by Google), with a button to click to run anyway. Users will need to have a Google account and to sign into it via a browser. Once that is done, the sign-in may be remembered depending on the user's browser settings; it is not kept in the ChimeraX preferences. A single prediction run generally takes on the order of an hour or more. The process includes installing various software packages on a virtual machine, searching sequence databases, generating a multiple sequence alignment, predicting atomic coordinates, and energy-minimizing the best structure. The free version of Google Colab allows only limited run time. Those who want to run longer and/or more frequent calculations may wish to sign up for one of the paid Colab plans.
The prediction will be opened automatically and colored by confidence value. If the sequence was specified by structure chain, the prediction will be superimposed on that chain and assigned structure-comparison attributes for further analysis (details...).
Please note the following caveats: