[chimerax-users] label_seq_id in mmcif files for Chimera
Tom Goddard
goddard at sonic.net
Wed Jul 8 10:29:39 PDT 2020
Hi Marcin,
In the mmCIF format atom_site.label_seq_id is required as described in the mmCIF documentation
http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_atom_site.label_seq_id.html
That documentation says it is used by the 60 mmCIF tables I copied below. If you don't include it, none of those tables can refer to a specific residue. A minimal mmCIF file need not contain any of those 60 tables, so in principle it could work to omit label_seq_id or to specify it as ".". But you are really asking for software not to work. It is like leaving out atom names because you don't happen to need them: good luck getting software to work correctly when you omit basic information.
Tom
_atom_site_anisotrop.pdbx_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_atom_site_anisotrop.pdbx_label_seq_id.html>
_geom_angle.atom_site_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_angle.atom_site_label_seq_id_1.html>
_geom_angle.atom_site_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_angle.atom_site_label_seq_id_2.html>
_geom_angle.atom_site_label_seq_id_3 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_angle.atom_site_label_seq_id_3.html>
_geom_bond.atom_site_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_bond.atom_site_label_seq_id_1.html>
_geom_bond.atom_site_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_bond.atom_site_label_seq_id_2.html>
_geom_contact.atom_site_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_contact.atom_site_label_seq_id_1.html>
_geom_contact.atom_site_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_contact.atom_site_label_seq_id_2.html>
_geom_hbond.atom_site_label_seq_id_A <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_hbond.atom_site_label_seq_id_A.html>
_geom_hbond.atom_site_label_seq_id_D <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_hbond.atom_site_label_seq_id_D.html>
_geom_hbond.atom_site_label_seq_id_H <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_hbond.atom_site_label_seq_id_H.html>
_geom_torsion.atom_site_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_torsion.atom_site_label_seq_id_1.html>
_geom_torsion.atom_site_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_torsion.atom_site_label_seq_id_2.html>
_geom_torsion.atom_site_label_seq_id_3 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_torsion.atom_site_label_seq_id_3.html>
_geom_torsion.atom_site_label_seq_id_4 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_geom_torsion.atom_site_label_seq_id_4.html>
_ndb_struct_na_base_pair.i_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_base_pair.i_label_seq_id.html>
_ndb_struct_na_base_pair.j_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_base_pair.j_label_seq_id.html>
_ndb_struct_na_base_pair_step.i_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_base_pair_step.i_label_seq_id_1.html>
_ndb_struct_na_base_pair_step.i_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_base_pair_step.i_label_seq_id_2.html>
_ndb_struct_na_base_pair_step.j_label_seq_id_1 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_base_pair_step.j_label_seq_id_1.html>
_ndb_struct_na_base_pair_step.j_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_ndb_struct_na_base_pair_step.j_label_seq_id_2.html>
_pdbx_atom_site_aniso_tls.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_atom_site_aniso_tls.label_seq_id.html>
_pdbx_domain_range.beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_domain_range.beg_label_seq_id.html>
_pdbx_domain_range.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_domain_range.end_label_seq_id.html>
_pdbx_feature_monomer.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_feature_monomer.label_seq_id.html>
_pdbx_refine_component.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_refine_component.label_seq_id.html>
_pdbx_remediation_atom_site_mapping.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_remediation_atom_site_mapping.label_seq_id.html>
_pdbx_sequence_range.beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_sequence_range.beg_label_seq_id.html>
_pdbx_sequence_range.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_sequence_range.end_label_seq_id.html>
_pdbx_struct_chem_comp_diagnostics.seq_num <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_chem_comp_diagnostics.seq_num.html>
_pdbx_struct_chem_comp_feature.seq_num <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_chem_comp_feature.seq_num.html>
_pdbx_struct_conn_angle.ptnr1_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_conn_angle.ptnr1_label_seq_id.html>
_pdbx_struct_conn_angle.ptnr2_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_conn_angle.ptnr2_label_seq_id.html>
_pdbx_struct_conn_angle.ptnr3_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_conn_angle.ptnr3_label_seq_id.html>
_pdbx_struct_group_component_range.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_group_component_range.end_label_seq_id.html>
_pdbx_struct_group_components.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_group_components.label_seq_id.html>
_pdbx_struct_mod_residue.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_mod_residue.label_seq_id.html>
_pdbx_struct_sheet_hbond.range_1_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_sheet_hbond.range_1_label_seq_id.html>
_pdbx_struct_sheet_hbond.range_2_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_pdbx_struct_sheet_hbond.range_2_label_seq_id.html>
_struct_conf.beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conf.beg_label_seq_id.html>
_struct_conf.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conf.end_label_seq_id.html>
_struct_conn.pdbx_ptnr3_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conn.pdbx_ptnr3_label_seq_id.html>
_struct_conn.ptnr1_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conn.ptnr1_label_seq_id.html>
_struct_conn.ptnr2_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_conn.ptnr2_label_seq_id.html>
_struct_mon_nucl.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_mon_nucl.label_seq_id.html>
_struct_mon_prot.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_mon_prot.label_seq_id.html>
_struct_mon_prot_cis.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_mon_prot_cis.label_seq_id.html>
_struct_mon_prot_cis.pdbx_label_seq_id_2 <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_mon_prot_cis.pdbx_label_seq_id_2.html>
_struct_sheet_hbond.range_1_beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_hbond.range_1_beg_label_seq_id.html>
_struct_sheet_hbond.range_1_end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_hbond.range_1_end_label_seq_id.html>
_struct_sheet_hbond.range_2_beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_hbond.range_2_beg_label_seq_id.html>
_struct_sheet_hbond.range_2_end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_hbond.range_2_end_label_seq_id.html>
_struct_sheet_range.beg_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_range.beg_label_seq_id.html>
_struct_sheet_range.end_label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_sheet_range.end_label_seq_id.html>
_struct_site_gen.label_seq_id <http://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v40.dic/Items/_struct_site_gen.label_seq_id.html>
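
[Illustration] To make the point about omitting label_seq_id or writing it as "." concrete, here is a minimal Python sketch that reports whether an atom_site loop carries the column and how many rows leave it null. It is not how ChimeraX reads mmCIF. The function name label_seq_id_status and the toy EXAMPLE data are hypothetical, and the parser assumes one whitespace-separated row per line with no quoted values, which real mmCIF files do not guarantee.

```python
def label_seq_id_status(cif_text):
    """Return (column_present, n_rows, n_missing) for the _atom_site loop.

    Assumes a simplified layout: tags one per line, then one data row
    per line, values separated by whitespace and never quoted.
    """
    tags, rows = [], []
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line.startswith("_atom_site."):
            tags.append(line.split()[0])
        elif tags and line and not line.startswith(("#", "_", "loop_", "data_")):
            rows.append(line.split())
        elif tags and rows:
            break  # the atom_site loop has ended

    if "_atom_site.label_seq_id" not in tags:
        return False, len(rows), len(rows)
    col = tags.index("_atom_site.label_seq_id")
    # "." and "?" are the two mmCIF null markers
    missing = sum(1 for row in rows if row[col] in (".", "?"))
    return True, len(rows), missing


# Toy data, not a complete or valid deposition-quality atom_site table.
EXAMPLE = """\
data_demo
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.auth_seq_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 1 N ALA A 1 10 0.000 0.000 0.000
ATOM 2 CA ALA A 1 10 1.458 0.000 0.000
ATOM 3 N GLY A . 12 3.000 1.000 0.000
#
"""

print(label_seq_id_status(EXAMPLE))
```

In this toy file the column is present but the third atom has a null label_seq_id, so none of the tables listed above could point at that residue.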
> On Jul 8, 2020, at 4:27 AM, Marcin Wojdyr <wojdyr at gmail.com> wrote:
>
> Hi Greg,
>
>> Adding an atom_site.label_seq_id isn't different from supplying a residue
>> number in PDB file. When there are adjacent residues of the same type,
>> does the PDB reader see a duplicate atom and generate a new residue?
>> merge the residues? generate an error? I haven't tested the PDB reader,
>> but a residue number helps it too.
>
> The sequence number/id from the PDB format tells which atoms are in
> the same residue, but it doesn't imply connectivity between residues,
> because the numbers don't need to be consecutive. In the mmCIF format
> it is stored as _atom_site.auth_seq_id (+pdbx_PDB_ins_code for the
> full id). So this makes conversion from PDB to mmCIF problematic. If
> SEQRES is present I do sequence alignment to determine label_seq_id.
> If SEQRES is missing I could ask the user to supply the full sequence,
> but then the user may think that (since this was not obligatory when
> working with PDB files) moving to mmCIF is a step backward. I could
> infer gaps and increase label_seq_id by 2 if there is a gap, but the
> resulting mmCIF file can be used for any purpose, not only by Chimera.
> The apparent gap may actually be caused by misplaced atoms and it
> could turn out that the gap in numbering is causing later different
> problems. It's not clear to me what's better here - leaving
> label_seq_id null or filling it with the best guesses (which sometimes
> will be wrong).
>
> Marcin
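
[Illustration] The gap heuristic Marcin describes (increase label_seq_id by 2 when a gap is inferred) can be sketched as below. This is a minimal sketch under stated assumptions, not code from any existing converter: the function name guess_label_seq_ids, the residue dictionaries, and the 2.0 angstrom cutoff on the C-to-N distance are all hypothetical, and the message does not say how a gap would actually be detected. As the message warns, a misplaced atom would produce a wrong guess here too.

```python
import math

# Hypothetical cutoff in angstroms; a peptide C-N bond is about 1.33.
MAX_PEPTIDE_BOND = 2.0


def guess_label_seq_ids(residues, max_bond=MAX_PEPTIDE_BOND):
    """Guess label_seq_id for residues given in chain order.

    residues: list of dicts with 'N' and 'C' coordinates (x, y, z).
    A residue gets previous + 1 if its N is bonded to the previous C,
    otherwise previous + 2 to mark an apparent gap of unknown length.
    """
    ids = []
    for i, res in enumerate(residues):
        if i == 0:
            ids.append(1)
            continue
        gap = math.dist(residues[i - 1]["C"], res["N"]) > max_bond
        ids.append(ids[-1] + (2 if gap else 1))
    return ids


# Toy coordinates along the x axis: residues 1 and 2 are bonded,
# residue 3 starts far from the end of residue 2.
chain = [
    {"N": (0.0, 0.0, 0.0), "C": (2.4, 0.0, 0.0)},
    {"N": (3.7, 0.0, 0.0), "C": (6.1, 0.0, 0.0)},
    {"N": (12.0, 0.0, 0.0), "C": (14.4, 0.0, 0.0)},
]
print(guess_label_seq_ids(chain))
```

The increment of 2 only records that something is missing; it cannot say how many residues are absent, which is why the result is a best guess rather than a true label_seq_id.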