[chimerax-users] CIF writer renumbers residues

Tue Jun 23 12:03:02 PDT 2020

To set the sequence:

chain.bulk_set(residues, seq_string)

'residues' must be the same length as seq_string, which means it will contain None wherever structure is missing.  To make the mmCIF writer treat the sequence info as "authoritative", you also have to:

chain.from_seqres = True

(which you could do without the bulk_set() if the sequence info was otherwise already correct).
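
As a rough sketch of how the two calls above might fit together, the helper below pads the modelled residues with None so the list lines up with the sequence string, then calls bulk_set() and sets from_seqres.  The helper names, the first_number parameter, and the assumption that position i of the sequence corresponds to residue number first_number + i (insertion codes ignored) are illustrative and not part of ChimeraX; only bulk_set(), from_seqres, and the residue .number attribute are taken as given.

```python
def align_residues_to_sequence(existing_residues, seq_string, first_number=1):
    """Return a list as long as seq_string, with None where nothing is modelled.

    Assumption (hypothetical convention): sequence position i corresponds to
    residue number first_number + i, and insertion codes are not used.
    """
    by_number = {r.number: r for r in existing_residues}
    last_number = first_number + len(seq_string) - 1
    stray = sorted(n for n in by_number if not first_number <= n <= last_number)
    if stray:
        raise ValueError(f"residue numbers outside the sequence range: {stray}")
    # dict.get() returns None for numbers with no modelled residue (the gaps).
    return [by_number.get(first_number + i) for i in range(len(seq_string))]


def assign_sequence(chain, existing_residues, seq_string, first_number=1):
    """Set the chain's full sequence and mark it as authoritative."""
    residues = align_residues_to_sequence(existing_residues, seq_string, first_number)
    chain.bulk_set(residues, seq_string)   # residues and seq_string have equal length
    chain.from_seqres = True               # lets the mmCIF writer trust the sequence
    return residues
```

Inside ChimeraX this would probably be called as assign_sequence(chain, chain.existing_residues, full_sequence), assuming existing_residues yields only the residues actually present in the model.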

--Eric

> On Jun 23, 2020, at 11:49 AM, Tristan Croll <tic20 at cam.ac.uk> wrote:
> 
> I’m unclear on whether ChimeraX currently has a method to definitively assign a sequence to a chain. If it does, that would be really useful.
> 
>> On 23 Jun 2020, at 19:40, Greg Couch <gregc at cgl.ucsf.edu> wrote:
>> 
>> If your mmCIF input file is incomplete and missing the sequence information, then the mmCIF writer does not write it out either. There are too many ways to get it wrong and mislead the scientist. That said, I can envision adding a "best guess" option to the mmCIF writer someday.
>> 
>> As for the residue renumbering, there are two sets of residue numbers in an mmCIF file: the internal label_seq_id, which is used to link the atom_site table entries with other mmCIF tables, and the auth_seq_id, which is the author-assigned value.  The auth_seq_id written is the same as the one read in.
>> 
>> In your case, you should use the auth_seq_id for matching (assuming it's present; if it isn't, you could add it to your original mmCIF file with the same value as the label_seq_id).  Or, the original mmCIF file needs to supply the sequence information (the entity, entity_poly, and entity_poly_seq tables), so the gaps are known instead of guessed.  Or, perhaps ISOLDE could help you by inserting "unknown" gap residues into the chain to preserve the numbering.
>> 
>>    HTH,
>> 
>>    Greg
>> 
>>> On 6/23/2020 10:47 AM, Daniel Asarnow wrote:
>>> Hi all,
>>> When I save a CIF from ChimeraX (while using ISOLDE), the warning "Not saving entity_poly_seq for non-authoritative sequences" is produced, and all the residues have been sequentially renumbered from 1. When there are missing residues, each segment has to be renumbered manually afterwards. Is there some way to avoid this with PDB or CIF inputs? Is it a bug?
>>> 
>>> Best,
>>> -da
>>> 
>> _______________________________________________
>> ChimeraX-users mailing list
>> ChimeraX-users at cgl.ucsf.edu
>> Manage subscription:
>> https://www.rbvi.ucsf.edu/mailman/listinfo/chimerax-users