[chimerax-users] label_seq_id in mmcif files for Chimera
Greg Couch
gregc at cgl.ucsf.edu
Wed Jul 8 13:27:12 PDT 2020
On 7/8/2020 12:06 PM, Marcin Wojdyr wrote:
> Hi Tom,
>
> I don't want to continue it for too long, but I need to comment on this:
>
>> That documentation says it is used by the 60 mmCIF tables I copied below -- if you don't include it then none of those tables can refer to a specific residue.
> They can - by using different identifiers, the ones inherited from the
> PDB format. Moreover, if the residue that you refer to is HOH you have
> to use different identifiers, because label_seq_id is null (.) for
> ligands and waters. It cannot be used to refer to a specific water
> molecule.
>
> Marcin
Yes, that is a major failing of the mmCIF format. If
atom_site.label_seq_id were non-null for solvent, then every row in the
atom_site table would be uniquely keyed by its label_* values, like a
database table -- that would be fantastic for us computer science
types. ChimeraX uses atom_site.auth_seq_id to disambiguate the
solvent. The PDBe is thinking of adding another column to the atom_site
table for the same purpose.
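
To make the keying problem concrete, here is a minimal Python sketch. The
rows are made-up values, not taken from a real entry, and the choice of
label_* columns used as the key is an assumption for illustration (a real
file has more columns, such as label_alt_id). The helper name
duplicate_keys is hypothetical. It only shows that two waters sharing a
label_asym_id collide on the label_* values, and that adding auth_seq_id
separates them.

```python
from collections import Counter

# Hypothetical atom_site rows (illustrative values only).
ROWS = [
    {"label_asym_id": "A", "label_comp_id": "GLY", "label_seq_id": "1",
     "label_atom_id": "CA", "auth_seq_id": "1"},
    {"label_asym_id": "A", "label_comp_id": "ALA", "label_seq_id": "2",
     "label_atom_id": "CA", "auth_seq_id": "2"},
    # Two waters: label_seq_id is null ('.') for both.
    {"label_asym_id": "B", "label_comp_id": "HOH", "label_seq_id": ".",
     "label_atom_id": "O", "auth_seq_id": "101"},
    {"label_asym_id": "B", "label_comp_id": "HOH", "label_seq_id": ".",
     "label_atom_id": "O", "auth_seq_id": "102"},
]

# Assumed key columns for this sketch.
LABEL_KEY = ("label_asym_id", "label_comp_id", "label_seq_id", "label_atom_id")


def duplicate_keys(rows, columns):
    """Return the key tuples that occur in more than one row."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [key for key, n in counts.items() if n > 1]


if __name__ == "__main__":
    # The label_* values alone do not distinguish the two waters.
    print(duplicate_keys(ROWS, LABEL_KEY))
    # Adding auth_seq_id to the key removes the collision.
    print(duplicate_keys(ROWS, LABEL_KEY + ("auth_seq_id",)))
```
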
It appears that it would be fairly easy for the PDB to fix this. The
mmCIF dictionary description for atom_site.label_seq_id would have to be
updated to allow values for solvent. The current description is:
> This data item is a pointer to _entity_poly_seq.num in the
> ENTITY_POLY_SEQ category.
That could be amended. As far as I can tell, CIF validators do not use
the description when validating CIF files, so the current limitation is
not a formal requirement, just a convention. The hard part would be
updating all of the programs the PDB uses.