Opened 11 hours ago

Last modified 11 hours ago

#19910 assigned enhancement

Fetching AlphaFold database entries with non-UniProt ids

Reported by: phil.cruz@…
Owned by: Tom Goddard
Priority: moderate
Milestone:
Component: Structure Prediction
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

Phil Cruz points out that the AlphaFold Database now includes several other sequence databases that do not use UniProt ids. These are described in the AFDB FAQ page:

https://alphafold.ebi.ac.uk/faq

Phil asks if ChimeraX could fetch these.

Hello Tom,

...

A final concern is that as of this month the AlphaFold database has added millions of new entries from external databases that may not have UniProt IDs. I think these will likely be beyond the scope of Quick Submits, but want to hear your thoughts on if there are ways to address this in the context of NIH 3D Quick Submits.

Looking forward to seeing your thoughts.

Phil

Change History (1)

comment:1 by Tom Goddard, 11 hours ago

I see the AlphaFold database added AllTheBacteria, Kinetoplastid, Big Fantastic Virus Database, and Viro3D, which don't use UniProt ids. For instance, the Viro3D id AF_0000000365833946 has the file name AF-0000000365833946-model_v1.cif, which does not contain the "F1" (fragment 1) part that UniProt id entries have. ChimeraX cannot currently fetch those. I should be able to add support for them. First I'd have to figure out the possible ids for the various databases and the corresponding URLs to fetch them (I see https://alphafold.ebi.ac.uk/files/AF-0000000365833946-model_v1.cif works, although it is not documented anywhere I could find). The different non-UniProt-id databases also have different database versions, v1 for my Viro3D example. I'm not sure how ChimeraX is going to be able to keep up to date with the heterogeneous ids and database versions.
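
A rough sketch of the URL mapping described above, not the actual ChimeraX code. It assumes non-UniProt ids always look like "AF_" followed by digits (generalized from the single Viro3D example) and that UniProt entries use an AF-<id>-F1-model_v<version>.cif file name. The function name afdb_model_url and the regular expression are hypothetical, and the database version has to be supplied by the caller because it differs between databases.

```python
import re

AFDB_FILES_URL = "https://alphafold.ebi.ac.uk/files"

# Assumption: non-UniProt ids are "AF_" (or "AF-") followed by digits only.
# This is inferred from one Viro3D example and may not hold for all databases.
_NUMERIC_ID = re.compile(r"^AF[_-](\d+)$")


def afdb_model_url(identifier: str, version: int) -> str:
    """Return a guessed AlphaFold Database mmCIF URL for an id.

    Hypothetical helper: numeric "AF_..." ids get no fragment part,
    anything else is treated as a UniProt accession with fragment "F1".
    """
    ident = identifier.strip()
    match = _NUMERIC_ID.match(ident)
    if match:
        # Non-UniProt entry: file name has no "-F1" fragment part.
        return f"{AFDB_FILES_URL}/AF-{match.group(1)}-model_v{version}.cif"
    # Assumed UniProt pattern, including the "F1" (fragment 1) part.
    return f"{AFDB_FILES_URL}/AF-{ident.upper()}-F1-model_v{version}.cif"


if __name__ == "__main__":
    print(afdb_model_url("AF_0000000365833946", version=1))
```

For the Viro3D example this reproduces the URL that was found to work. Keeping the version as a required argument leaves open where the per-database version would come from, which is the unresolved part.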
