Command: sequence

The sequence command can:

show the sequence of a structure chain in the Sequence Viewer
control sequence-structure associations
show, hide, or save alignment headers
calculate pairwise percent identities
calculate a new sequence alignment using a web service

Except for showing a sequence, these actions can also be accessed from the Sequence Viewer context menu.

Show Sequence from Structure

Usage: sequence chain chain-spec [ viewer true | false ]

The command sequence chain shows the sequence of the specified biopolymer chain in the Sequence Viewer. The graphical interface can be suppressed with viewer false, for example, to run a script that uses the sequence data but not its display. Only one structure chain should be specified per command. See also: Molecule Display icon

Independent of structure, sequence alignments and individual sequences can also be opened from files or fetched from UniProt. Other tools or commands may generate new sequence alignments (e.g., Blast Protein results, Matchmaker, sequence realignment).

Sequence-Structure Association

Usage: sequence associate chain-spec [ alignment-ID:sequence-ID ]
Usage: sequence ( dissociate | disassociate ) chain-spec alignment-ID

Sequence-structure association (such as for synchronized selection) occurs automatically. The commands sequence associate and sequence dissociate (same as sequence disassociate) allow more precise control: for example, choosing which structure chains are used for header calculations, or forcing or removing associations regardless of whether the automatic procedure would tolerate the number of mismatches.

The command sequence associate associates one or more structure chains with a sequence. The target sequence for association is specified by alignment ID, as reported in the title bar of the Sequence Viewer window, and the name or index number of the target sequence in the alignment, in the form: alignment-ID:sequence-ID (details...).

Alternatively, the sequence-ID can be omitted to associate each specified structure chain with the best-matching sequence in the alignment. The alignment-ID can be omitted if only one alignment is open, or if the sequence-ID is also omitted; in the latter case, each specified structure chain will be associated with the best-matching sequence in each open alignment. If either or both are omitted, the colon (:) should also be omitted, except in rare cases where it is needed to disambiguate an alignment and a sequence that have the same name.

For sequence dissociate, only the alignment needs to be specified, not an individual sequence, because a structure chain can only be associated with one sequence per alignment.

Sequence Header Controls

Usage: sequence header [ alignment-ID ] header-name ( show | hide | save filename )

The command sequence header shows, hides, or saves a sequence header to a file. (It can also be used to change the sequence Headers preferences, but command details are omitted here because normally the Settings dialog will be used instead.)

The header-name can be consensus, conservation, or rmsd, although there will only be an effect when that header is available (for example, an RMSD header is only available for alignments associated with at least two structures). Headers are saved to a simple text format that lists the alignment positions and values. The filename can be given as a pathname or the word browse to bring up a file browser window for choosing the name and location interactively. If multiple alignments are open but an alignment-ID is not specified, showing/hiding affects all applicable alignments. However, saving only works for a single header at a time, so an alignment-ID must be given when more than one alignment is open.

Calculate Percent Identities

Usage: sequence identity alignment-ID [ denominator  shorter | longer | nongap ]
Usage: sequence identity alignment-ID alignment-ID:sequence-ID [ denominator  shorter | longer | nongap ]
Usage: sequence identity alignment-ID:sequence-ID alignment-ID:sequence-ID [ denominator  shorter | longer | nongap ]

The sequence identity command calculates the pairwise percent identity between sequences of the same length (including gaps, as shown in the Sequence Viewer window). The calculation is always pairwise, but can be performed for all-by-all pairs within a single alignment, or all-by-one, or between two specific sequences. An entire alignment is specified by its ID, shown in the title bar of the Sequence Viewer window, and an individual sequence by the alignment ID plus the sequence's name or index number in the alignment, in the form: alignment-ID:sequence-ID (details...).

Results are listed in the Log. For each pair, the number of columns with identical residues is given as a percentage of the specified denominator:

shorter (default) – the number of residues in the shorter of the two sequences
longer – the number of residues in the longer of the two sequences
nongap – the number of columns where neither sequence has a gap
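
The effect of the three denominator choices can be illustrated with a minimal Python sketch. This is not ChimeraX's implementation: the function name percent_identity, the gap characters ("-" and "."), and the case-insensitive comparison are assumptions made for illustration only. It takes two aligned sequences of equal length (gaps included) and counts a column as identical only when both sequences have the same residue there.

```python
def percent_identity(seq_a, seq_b, denominator="shorter", gap_chars="-."):
    """Percent identity of two aligned sequences (hypothetical helper).

    Assumptions (not taken from ChimeraX): gaps are '-' or '.',
    and residue comparison ignores case.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length, gaps included")

    def is_gap(ch):
        return ch in gap_chars

    # Columns where both sequences have a residue
    paired = [(a, b) for a, b in zip(seq_a, seq_b)
              if not is_gap(a) and not is_gap(b)]
    # Of those, columns where the residues are identical
    matches = sum(1 for a, b in paired if a.upper() == b.upper())

    # Number of residues (non-gap characters) in each sequence
    len_a = sum(1 for ch in seq_a if not is_gap(ch))
    len_b = sum(1 for ch in seq_b if not is_gap(ch))

    if denominator == "shorter":
        denom = min(len_a, len_b)
    elif denominator == "longer":
        denom = max(len_a, len_b)
    elif denominator == "nongap":
        denom = len(paired)
    else:
        raise ValueError("denominator must be shorter, longer, or nongap")

    return 100.0 * matches / denom if denom else 0.0


# Two aligned sequences: 8 and 7 residues, 6 columns without gaps, 5 identical
a = "ACDEFG-HK"
b = "ACD--GWHR"
for choice in ("shorter", "longer", "nongap"):
    print(choice, round(percent_identity(a, b, choice), 1))
```

For this pair, the same five identical columns are divided by 7 (shorter), 8 (longer), or 6 (nongap), so the reported percentage depends noticeably on the denominator when the sequences differ in length or contain many gaps.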

Align Sequences using Clustal Omega or MUSCLE

Usage: sequence align alignment-ID [ replace  true | false ] [ program  clustalOmega | muscle ]
Usage: sequence align chain-spec [ program  clustalOmega | muscle ]
Usage: sequence align sequence1,sequence2[,sequence3...,sequenceN] [ program  clustalOmega | muscle ]

The sequence align command calculates a new alignment of the specified protein sequences using a web service hosted by the UCSF RBVI. By default, the result is opened in a new Sequence Viewer window. When all sequences of an existing alignment are being realigned, replace true (default false) overwrites that alignment instead.

The sequences to align can be specified collectively by:

alignment ID, as shown in the title bar of an existing Sequence Viewer window;
a chain-spec (for atomic-structure protein chains already open in ChimeraX)

...or individually as a comma-separated list (without spaces) of any combination of:

plain text of the entire amino acid sequence pasted directly into the command line
UniProt name or accession number, for example:
sequence align ldlr_rat,ldlr_mouse,ldlr_human
the sequence-spec of a sequence in the Sequence Viewer, in the form: alignment-ID:sequence-ID (details...). Example:
open myfile.msf
sequence align 1,2,3,-1
– OR (if multiple sequence windows are open) –
sequence align myfile.msf:1,myfile.msf:2,myfile.msf:3,myfile.msf:-1

The program can be either of two choices:

clustalOmega (default, same as clustal or omega) – Clustal Omega v1.1.0 with parameters:
- Number of guide-tree/HMM iterations: 1
- Full distance matrix during initial alignment: true
- Full distance matrix during alignment iteration: true
See the README file at the Clustal Omega website for details. Users should cite:

Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG. Mol Syst Biol. 2011 Oct 11;7:539.

muscle – MUSCLE v3.8.31 with parameters:
- Maximum number of iterations: 16
- Maximum time to iterate: unlimited
- Find diagonals (faster execution if sequences are similar, with possible decrease in accuracy): false
See the command-line reference at the MUSCLE website and BMC Bioinformatics 5:113 (2004) for details. Users should cite:

MUSCLE: multiple sequence alignment with high accuracy and high throughput. Edgar RC. Nucleic Acids Res. 2004 Mar 19;32(5):1792-7.

UCSF Resource for Biocomputing, Visualization, and Informatics / October 2023