[chimerax-users] Ways to fill in missing residues of a protein chain

Elaine Meng meng at cgl.ucsf.edu
Fri Aug 26 15:59:29 PDT 2022


Hi Will,
I guess this complex is not already deposited in the PDB? Entries in the PDB typically have associated UniProt IDs for the protein sequences. For example, entry 3og4 is a complex of human IFNL1 (UniProt Q8IU54, aka IFNL1_HUMAN) with interleukin 28 receptor, alpha (UniProt Q8IU57, aka INLR1_HUMAN), as reported on the RCSB PDB page for that structure:

<https://www.rcsb.org/structure/3OG4>
<https://www.uniprot.org/uniprotkb/Q8IU54/entry>
<https://www.uniprot.org/uniprotkb/Q8IU57/entry>

But I don't know exactly what protein is in your complex. If you can figure out the UniProt ID for the chain you are trying to complete, you can fetch the sequence directly from UniProt into ChimeraX, e.g. with the command:

open uniprot:INLR1_HUMAN

The sequence will associate automatically with any open chains that have the same or nearly the same sequence, e.g. if you already had 3og4 open, it would associate with chain B of that structure. In that example, black outline boxes on the sequence show which residues are missing from the structure, and you can use Modeller to fill them in.
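
As the earlier messages quoted below explain, the Modeller jobs in this thread failed because the structure file did not include the full sequence. A quick way to check a file for that is sketched here in Python, assuming the sequence would be stored in the standard mmCIF categories _entity_poly_seq or _entity_poly.pdbx_seq_one_letter_code. The function names are made up for illustration, and finding a category does not guarantee that the sequence in it is complete.

```python
from pathlib import Path

# mmCIF items that normally carry the full polymer sequence.
# Assumption: these are the items worth looking for; this is a plain
# text scan, not a real mmCIF parser.
SEQUENCE_ITEMS = (
    "_entity_poly_seq.",
    "_entity_poly.pdbx_seq_one_letter_code",
)


def sequence_categories_present(cif_text):
    """Return the sequence-bearing item prefixes found in mmCIF text."""
    lines = [line.strip() for line in cif_text.splitlines()]
    found = []
    for prefix in SEQUENCE_ITEMS:
        if any(line.startswith(prefix) for line in lines):
            found.append(prefix)
    return found


def check_file(path):
    """Hypothetical convenience wrapper: scan an mmCIF file on disk."""
    return sequence_categories_present(Path(path).read_text())
```

If check_file("IFNL1_complex.cif") returned an empty list, that would suggest opening a separate full-length sequence (as described above) before running Modeller.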

Another approach, alluded to in earlier answers, is to get an AlphaFold prediction instead of running Modeller.
<https://rbvi.ucsf.edu/chimerax/docs/user/tools/alphafold.html>

Actually, an AlphaFold model for your chain of interest may already be in the AlphaFold database, in which case you don't even have to predict it, just fetch the precomputed model into ChimeraX:

alphafold fetch INLR1_HUMAN

Or, if you had your complex structure open as #1 and wanted to see whether there is a model of the complete chain B (you don't even need to know its UniProt ID), you could try

alphafold match #1/B
-or- 
alphafold match #1/B trim false

(see documentation for options and what they mean <https://rbvi.ucsf.edu/chimerax/docs/user/commands/alphafold.html#fetch>)

This will take a little longer since it's doing a sequence search rather than just fetching a single entry based on its ID.  But for 3og4 it does find a pre-existing model for INLR1_HUMAN with 93% sequence identity, as reported in the Log.
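
The choice between fetching by ID and searching by sequence can be written as a small Python helper. This is only a sketch: alphafold_command is a made-up name, and the script assumes it is opened inside ChimeraX (where a variable named session is defined) with the complex already open as model #1. Outside ChimeraX it only builds the command text.

```python
def alphafold_command(chain_spec, uniprot_name=None, trim=True):
    """Build one of the ChimeraX commands shown above.

    If the UniProt name is known, fetch that database entry directly;
    otherwise fall back to the slower sequence search on the chain.
    """
    if uniprot_name:
        return f"alphafold fetch {uniprot_name}"
    command = f"alphafold match {chain_spec}"
    if not trim:
        # Same option as in the second "alphafold match" example above
        command += " trim false"
    return command


# "session" only exists when the script runs inside ChimeraX,
# so nothing is executed anywhere else.
if "session" in globals():
    from chimerax.core.commands import run

    run(session, alphafold_command("#1/B", trim=False))
```

Passing uniprot_name="INLR1_HUMAN" instead would produce the fetch command, which skips the sequence search.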

I hope this helps,
Elaine
-----
Elaine C. Meng, Ph.D.                       
UCSF Chimera(X) team
Department of Pharmaceutical Chemistry
University of California, San Francisco

> On Aug 26, 2022, at 2:27 PM, Greg Couch via ChimeraX-users <chimerax-users at cgl.ucsf.edu> wrote:
> 
> The issue of mmCIF files missing sequence information is unfortunately not uncommon.  For example, if you are using Phenix, it used to be (maybe still is) the case that you have to go through the step of preparing the data for submission to the PDB for it to incorporate the sequence information into the mmCIF file.  Having the actual sequence, especially when it is known, helps ChimeraX help you.
> 
>     -- Greg
> 
> On 8/26/2022 1:47 PM, Will Grubbe via ChimeraX-users wrote:
>> Ok. Thank you for the help! I will try to figure it out. 
>> 
>> On Fri, Aug 26, 2022 at 3:45 PM Eric Pettersen <pett at cgl.ucsf.edu> wrote:
>> Hi Will,
>> 	Thanks for the job IDs, that was helpful.  So your jobs are failing but ChimeraX isn't alerting you to that -- which is a bug we need to fix.  The reason for the failure is that in order to fill in those missing residues, ChimeraX needs to know what those residues are, which apparently is not contained in the IFNL1_complex.cif file that you are using.  You either need to edit that file to insert the complete sequence, or open a separate sequence file containing the complete sequence and associate the structure with that sequence.  It should automatically associate due to the high similarity, but if it doesn't you would have to use the "Structure→Associations..." item in the sequence context menu to make the association.  Anyway, once you have the structure associated with the complete sequence, you should be able to successfully model the loop.
>> 
>> --Eric
>> 
>> 	Eric Pettersen
>> 	UCSF Computer Graphics Lab
>> 
>> 
>>> On Aug 26, 2022, at 8:57 AM, Will Grubbe via ChimeraX-users <chimerax-users at cgl.ucsf.edu> wrote:
>>> 
>>> Thank you for the response Elaine. It was a Modeller online job to fill in some missing residues of a protein complex. I kept ChimeraX open the whole time on my computer but can't find anything updated, either on the desktop, in the downloads, or in the results folder I specified. I don't see anything in the log list either - perhaps I am doing something wrong. I do have job IDs if that helps - 
>>> Webservices job id: EZ1QP7XJZLIY0BVI
>>> Webservices job id: 488559BRXJ20I5U4
>>> Webservices job id: L4AM4PK7N293HQ1L
>>> 
>>> On Thu, Aug 25, 2022 at 6:59 PM Elaine Meng <meng at cgl.ucsf.edu> wrote:
>>> Hi Will,
>>> It would help if you would say what kind of job you submitted.  Modeller?  AlphaFold prediction? In either case you would normally keep the same ChimeraX session open and results would automatically open in ChimeraX when they return. If you quit out of ChimeraX while Modeller is still running, currently you can't get the results later.  If it seems to take a really long time, maybe it just failed and you should look in the Log to see if there was a message.  
>>> 
>>> AlphaFold prediction is different since it doesn't run on our website but instead on Google Colab.  If there were any results returned before you quit from ChimeraX, they should be in folders under ~/Downloads/ChimeraX/AlphaFold/ on your machine.  Maybe the Colab timed out before returning any results, however.  Even if you were trying to fill in just a few residues, if you were predicting the whole complex it is potentially a very long job.
>>> 
>>> For AlphaFold prediction, make sure you are using a recent ChimeraX daily build -- in mid-July we improved it to use ColabFold, which is a much faster version.  Also, in the last day or two, a new option to specify the results folder location was added.
>>> <https://rbvi.ucsf.edu/chimerax/docs/user/tools/alphafold.html#predict>
>>> <https://www.rbvi.ucsf.edu/trac/ChimeraX/wiki/ChangeLog>
>>> 
>>> I hope this helps,
>>> Elaine
>>> -----
>>> Elaine C. Meng, Ph.D.                       
>>> UCSF Chimera(X) team
>>> Department of Pharmaceutical Chemistry
>>> University of California, San Francisco
>>> 
>>> > On Aug 25, 2022, at 3:29 PM, Will Grubbe via ChimeraX-users <chimerax-users at cgl.ucsf.edu> wrote:
>>> > 
>>> > Hello, I've submitted a few jobs using ChimeraX and am wondering how to check the status of them. I just need to fill in a few residues in a protein complex with known sequence - where can I find the results when they are done? 
>>> > Thank you for your help,
>>> > Will 
>>> 
>>> _______________________________________________
>>> ChimeraX-users mailing list
>>> ChimeraX-users at cgl.ucsf.edu
>>> Manage subscription:
>>> https://www.rbvi.ucsf.edu/mailman/listinfo/chimerax-users



