[chimerax-users] label_seq_id in mmcif files for Chimera

Greg Couch gregc at cgl.ucsf.edu
Wed Jul 8 13:27:12 PDT 2020


On 7/8/2020 12:06 PM, Marcin Wojdyr wrote:
> Hi Tom,
>
> I don't want to continue it for too long, but I need to comment on this:
>
>> That documentation says it is used by the 60 mmCIF tables I copied below -- if you don't include it then none of those tables can refer to a specific residue.
> They can - by using different identifiers, the ones inherited from the
> PDB format. Moreover, if the residue that you refer to is HOH you have
> to use different identifiers, because label_seq_id is null (.) for
> ligands and waters. It cannot be used to refer to a specific water
> molecule.
>
> Marcin

Yes, that is a major failing of the mmCIF format.  If 
atom_site.label_seq_id were non-null for solvent, then every row in the 
atom_site table would be uniquely keyed by its label_* values, like a 
database table -- that would be fantastic for us computer science 
types.  ChimeraX uses atom_site.auth_seq_id to disambiguate the 
solvent.  The PDBe is thinking of adding another column to the atom_site 
table for the same purpose.
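
To make the keying problem concrete, here is a minimal sketch in Python. The rows are made up for illustration (they are not from a real entry), the helper name duplicate_keys is hypothetical, and a single model is assumed. It is not ChimeraX's actual code; it only shows that the label_* columns alone collide for waters, and that adding auth_seq_id to the key removes the collision.

```python
from collections import Counter

# Hypothetical atom_site rows (not from a real entry): two protein atoms
# and two waters that share one label_asym_id.
# Columns: label_asym_id, label_seq_id, label_atom_id, label_alt_id, auth_seq_id
rows = [
    ("A", "1", "CA", ".", "1"),
    ("A", "2", "CA", ".", "2"),
    ("B", ".", "O", ".", "201"),  # water: label_seq_id is null (.)
    ("B", ".", "O", ".", "202"),  # water: same label_* values as the row above
]


def duplicate_keys(rows, use_auth_seq_id):
    """Return the keys that occur more than once among the rows.

    With use_auth_seq_id=False the key is the four label_* columns only;
    with use_auth_seq_id=True the auth_seq_id column is appended to the key.
    """
    counts = Counter(row if use_auth_seq_id else row[:4] for row in rows)
    return [key for key, n in counts.items() if n > 1]


label_only = duplicate_keys(rows, use_auth_seq_id=False)
with_auth = duplicate_keys(rows, use_auth_seq_id=True)

print(label_only)  # the two waters collapse onto one key
print(with_auth)   # no duplicates once auth_seq_id is part of the key
```

If label_seq_id were allowed to be non-null for solvent, the first call would return an empty list too, and no extra column would be needed.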

It appears that it would be fairly easy for the PDB to fix this.  The 
mmCIF dictionary description for atom_site.label_seq_id would have to be 
updated to allow values for solvent.  The current description is:

> This data item is a pointer to _entity_poly_seq.num in the 
> ENTITY_POLY_SEQ category.

That could be amended.  As far as I can tell, CIF validators do not use 
the description when validating CIF files, so the current limitation is 
not a formal requirement, just a convention.  The real obstacle is all of 
the programs the PDB uses, which would have to be updated as well.


