[Chimera-users] Thanks, Re: mmcif, auth_xx information versus label_xx information
Greg Couch
gregc at cgl.ucsf.edu
Fri Dec 7 09:28:38 PST 2018
So it's not so simple. The auth_.... fields are not required, they just
happen to be in the files that the PDB provides. Other programs cannot
be expected to provide them. The RCSB has software that can validate
mmCIF files against various dictionaries: the MMCIF Dictionary Suite,
https://sw-tools.rcsb.org/apps/MMCIF-DICT-SUITE/index.html. Programs
that generate mmCIF files should verify that the generated files are
complete. And the dictionary used for validation should be documented
in the audit_conform table.
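
As a rough illustration of why the auth_ fields cannot be taken for granted, here is a minimal sketch that reports which auth_ columns are absent from the atom_site table of an mmCIF file. It is not a real mmCIF parser and not a substitute for the dictionary suite above: it only reads the loop_ header lines, and the function names, the list of fields checked, and the sample data are all made up for the example.

```python
def atom_site_columns(cif_text):
    """Return the lower-cased column names of the first atom_site loop.

    Simplified on purpose: only 'loop_' header lines are inspected, and
    the header is assumed to end at the first line that is not a tag.
    """
    columns = []
    in_header = False
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line == "loop_":
            in_header = True
            columns = []
            continue
        if in_header and line.startswith("_"):
            # mmCIF tags are case-insensitive, so normalize to lower case
            columns.append(line.split()[0].lower())
            continue
        if in_header:
            in_header = False
            if columns and columns[0].startswith("_atom_site."):
                return [c.split(".", 1)[1] for c in columns]
    return []


def missing_auth_fields(cif_text):
    """List the auth_ columns (a hypothetical selection) not present in atom_site."""
    present = set(atom_site_columns(cif_text))
    wanted = ["auth_asym_id", "auth_seq_id", "auth_comp_id", "auth_atom_id"]
    return [name for name in wanted if name not in present]


# Made-up example: a file that carries label_ fields but only one auth_ field.
SAMPLE = """\
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
_atom_site.auth_seq_id
ATOM 1 CA ALA A 1 0.0 0.0 0.0 5
"""

if __name__ == "__main__":
    print(missing_auth_fields(SAMPLE))
```

A reader that depends on author numbering would need some fallback (for example, the label_ values) whenever this list is not empty.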
If your data is in the mmCIF format, you are better off using ChimeraX
than Chimera. The reader is orders of magnitude faster. And I am
working on fixing any bugs that come up :-). I have documented the
parts of mmCIF files that are important for ChimeraX. For example, the
Cartesian coordinates are required for atoms. See
https://www.cgl.ucsf.edu/chimerax/docs/devel/bundles/mmcif/src/mmcif.html
-- in particular, read "ChimeraX Fast mmCIF Guidelines" and "Writing
mmCIF Files in ChimeraX".
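
For the label_seq_id question in the quoted message below, the following is a minimal sketch of how the mapping might be rebuilt in ChimeraX Python. It relies on the two facts stated there: the nth residue of a Chain is label_seq_id n, and residues carry an mmcif_chain_id attribute. Everything else is an assumption for illustration: that a structure exposes `chains`, that `chain.residues` holds None for residues without coordinates, that counting starts at 1, and that `chain_id` and `number` give the author chain and author residue number. The function name is hypothetical.

```python
def label_to_author_numbering(structure):
    """Map (label_asym_id, label_seq_id) to (author chain ID, author residue number).

    Assumes (not verified against a particular ChimeraX release):
      - structure.chains is iterable
      - chain.residues is ordered so that position n (1-based) is label_seq_id n,
        with None where a residue has no coordinates
      - residue.mmcif_chain_id is the label_asym_id
      - residue.chain_id and residue.number are the author's chain and numbering
    """
    mapping = {}
    for chain in structure.chains:
        for label_seq_id, residue in enumerate(chain.residues, start=1):
            if residue is None:
                # Part of the sequence, but no atoms in the model
                continue
            key = (residue.mmcif_chain_id, label_seq_id)
            mapping[key] = (residue.chain_id, residue.number)
    return mapping
```

Waters and other non-polymer residues are not part of a chain's sequence, so they would not appear in this mapping, which is consistent with label_seq_id being unset for them.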
-- Greg
On 12/7/2018 12:47 AM, moocow at mindless.com wrote:
> Thanks for the explanation. The water-numbering argument is convincing
> as a reason to use auth_... fields. I think these also correspond
> with the fields in the old pdb format.
> I can see the PDB's logic in using a flexible format with room to put
> in every conceivable piece of information, but one can wonder if they
> really did the world a favour. It is almost guaranteed that this
> program will use auth_... fields and that program will use label_...
> fields.
>
> Yes, Chimera uses the author's numbering because that is what is used
> in the author's publication. It turns out that this is especially
> important for waters since the PDB/RCSB refuses to number waters in
> the label_seq_id field which breaks the one-to-one mapping of
> (label_seq_id, label_asym_id) to a residue.
>
> In Chimera, the mapping with label_seq_id is discarded after it is
> used. In ChimeraX, there is no UI for label_seq_id, but, in Python
> code, the nth residue in a Chain instance is label_seq_id n. And
> residues have a mmcif_chain_id attribute which is the label_asym_id.
> So it is possible to reconstruct the mapping if need be.
>
> HTH,
>
> Greg
>
> On 12/6/18 4:03 AM, moocow at mindless.com wrote:
>
> An mmcif / numbering question...
> When specifying residues, chimera seems to use "auth_seq_id" (the
> authors' numbering), as well as the authors' naming of chains
> (auth_asym_id).
> Is there any way to ask chimera to find residues using
> "label_seq_id" and "label_asym_id" (the numbering the protein data
> bank gives) ?
> Is the label_xxx information discarded when chimera reads an mmcif
> file or is it hidden in an attribute somewhere ?
>
>
> _______________________________________________
> Chimera-users mailing list: Chimera-users at cgl.ucsf.edu
> Manage subscription: http://plato.cgl.ucsf.edu/mailman/listinfo/chimera-users