[Chimera-users] problem with huge PDB atom numbers

Tue Feb 16 11:29:49 PST 2010

Hi Sam,
	To supplement Elaine's answer, obviously there is no approach to this  
issue that doesn't have its problems, or the PDB would have adopted it  
instead of its current awkward single-entry-split-across-multiple-files/IDs  
approach.  We thought about various options and decided to  
adopt AMBER's approach of bleeding the hundred-thousands digit into  
the sixth column (essentially adding "ATOM 1" through "ATOM 9"  
records).  The problem with resetting serial numbers is that it breaks  
CONECT records.  Of course, the digit-bleed approach does too (CONECT  
records only accommodate 5-digit serial numbers), but at least for a  
fairly common type of large-atom-number system, namely highly solvated  
proteins from MD (where CONECT records are typically only needed in  
the first 99999 atoms), the approach works without any problems.
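
	As a rough illustration of the digit-bleed idea (a sketch only, not  
Chimera's or AMBER's actual code; the function names are hypothetical),  
the following Python writes and reads the first 11 columns of an ATOM  
record.  It assumes the standard layout of record name in columns 1-6  
and serial number in columns 7-11, and it covers only ATOM records,  
since "HETATM" leaves no spare column to bleed into.

```python
MAX_STANDARD_SERIAL = 99999    # largest serial that fits in columns 7-11
MAX_BLEED_SERIAL = 999999      # largest serial once column 6 is borrowed


def format_atom_prefix(serial: int) -> str:
    """Return columns 1-11 of an ATOM record (hypothetical helper).

    Serials up to 99999 use the standard layout.  Larger serials let
    the hundred-thousands digit spill into column 6, which normally
    holds the trailing blank of the "ATOM  " record name.
    """
    if not 1 <= serial <= MAX_BLEED_SERIAL:
        raise ValueError(f"serial {serial} cannot be written this way")
    if serial <= MAX_STANDARD_SERIAL:
        return "ATOM  " + f"{serial:>5d}"
    return "ATOM " + f"{serial:>6d}"


def parse_atom_serial(line: str) -> int:
    """Read the serial from an ATOM line, tolerating the digit bleed.

    Columns 5-11 (string indices 4 to 10) are read together, so a
    digit in column 6 is treated as part of the number rather than
    part of the record name.
    """
    if not line.startswith("ATOM"):
        raise ValueError("not an ATOM record")
    return int(line[4:11])


def fits_in_conect(serial: int) -> bool:
    """CONECT fields are 5 columns wide, so larger serials do not fit."""
    return 1 <= serial <= MAX_STANDARD_SERIAL


if __name__ == "__main__":
    for s in (1, 99999, 100000, 999999):
        prefix = format_atom_prefix(s)
        print(repr(prefix), parse_atom_serial(prefix), fits_in_conect(s))
```

A reader that insists on a blank in column 6 will reject the last two  
prefixes, which is the incompatibility reported in this thread.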
	We're probably going to stick with our wrong approach instead of  
adopting your wrong approach. :-)  Perhaps the forthcoming revised PDB  
format that Michael Zimmermann mentioned will solve all our problems...

--Eric

	Eric Pettersen
	UCSF Computer Graphics Lab

On Feb 16, 2010, at 10:14 AM, Elaine Meng wrote:

> Hi Sam,
> The spillover into column 6 was intentional, but it is  
> understandable that it could cause problems for other programs.  The  
> original PDB format (as you must be painfully aware) was not  
> designed to handle such large structures as we have today, and  
> different programs have taken different liberties with the format to  
> try to accommodate these things.  PDB itself has taken the approach  
> of splitting large structures into multiple entries.  Your approach  
> breaks the rule of unique serial numbers, and Chimera's approach  
> messes with column 6.  Chimera can use the serial numbers as unique  
> identifiers, and if I remember correctly, we started using column 6  
> after noting that another program (maybe it was AMBER trajectories?)  
> expanded large serial numbers there.  Conversely, the programs you  
> are using apparently tolerate duplicate serial numbers but not  
> numbers in column 6.
>
> I'm not sure what the upshot will be, but it is useful to know about  
> these problems.  Thanks for letting us know,
> Elaine
> -----
> Elaine C. Meng, Ph.D.
> UCSF Computer Graphics Lab (Chimera team) and Babbitt Lab
> Department of Pharmaceutical Chemistry
> University of California, San Francisco
>
>
> On Feb 14, 2010, at 1:00 PM, Samuel Coulbourn Flores wrote:
>
>> Hi Guys,
>> I thought I'd point out a problem with Chimera's atom numbering.  I  
>> am working on a ribosome model which has about 50 chains.  The  
>> ribosome pushes the limits of the PDB format;  the five columns  
>> reserved for atom numbers are not sufficient to give each atom a  
>> unique atom number.  This is not typically a problem for me; I just  
>> start the atom numbering at 1 for each chain.  However, Chimera  
>> doesn't do this; it tries to assign sequential, unique numbers to  
>> each atom and ends up spilling over onto column 6, which is in the  
>> record name field.  This wreaks havoc with other programs which I  
>> use to process the PDB files that Chimera puts out.  I am manually  
>> renumbering to deal with the issue, but the developers should think  
>> about a more permanent solution.
>> Sam
>>
>