What is DASH?
DASH is a database of structural alignments for all known structurally homologous protein domains and chains in the PDB.
The processing involves (a) clustering sequence-unique proteins from the PDB using CD-HIT at 99% sequence identity;
(b) decomposing the sequence representatives into domains using Protein Domain Parser (Alexandrov N, Shindyalov I; Bioinformatics 2003);
(c) aligning all domains against all domains on Google Cloud using RASH (Standley DM, Toh H, Nakamura H; BMC Bioinformatics 2007);
(d) building composite chain alignments from individual domain alignments.
(Rozewicki et al.; Nucleic Acids Research 2019 [PubMed])
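As a rough illustration of step (a), the sketch below assembles a CD-HIT command line for clustering at 99% sequence identity. The file names `pdb_seqres.fasta` and `pdb_nr99` and the helper `build_cdhit_command` are hypothetical, and the exact options used to build DASH are not stated here; only the standard CD-HIT flags `-i`, `-o`, `-c`, and `-n` are assumed.

```python
import shlex


def build_cdhit_command(input_fasta, output_prefix, identity=0.99):
    """Return the argument list for a CD-HIT run at the given identity threshold.

    Hypothetical helper: the actual DASH invocation is not documented here.
    """
    # CD-HIT recommends word size 5 (-n 5) for thresholds between 0.7 and 1.0.
    if not 0.7 <= identity <= 1.0:
        raise ValueError("word size 5 is only recommended for thresholds of 0.7 to 1.0")
    return [
        "cd-hit",
        "-i", input_fasta,         # input sequences in FASTA format
        "-o", output_prefix,       # output prefix for representatives and cluster file
        "-c", f"{identity:.2f}",   # sequence identity threshold
        "-n", "5",                 # word size matching the threshold range
    ]


if __name__ == "__main__":
    # Placeholder file names, chosen only for illustration.
    cmd = build_cdhit_command("pdb_seqres.fasta", "pdb_nr99")
    print(shlex.join(cmd))
```

The command is built as a list rather than a single string so that it could be passed directly to `subprocess.run` without shell quoting issues.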
Last Update: 2019-06-10
Number of Chains: 454,237
Number of Domains: 171,929
Number of Chain Alignments: 59,004,105
Number of Domain Alignments: 98,363,095
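To give a sense of the scale of the all-against-all domain comparison, the following back-of-the-envelope sketch compares the number of stored domain alignments with the number of possible domain pairs. It assumes, purely for illustration, that each unordered pair of distinct domains counts once; how DASH actually counts alignments, and which alignments it retains, is not specified here.

```python
def unordered_pairs(n):
    """Number of distinct unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Figures taken from the statistics listed above.
NUM_DOMAINS = 171_929
NUM_DOMAIN_ALIGNMENTS = 98_363_095

candidate_pairs = unordered_pairs(NUM_DOMAINS)
stored_fraction = NUM_DOMAIN_ALIGNMENTS / candidate_pairs

print(f"Possible domain pairs:    {candidate_pairs:,}")
print(f"Stored domain alignments: {NUM_DOMAIN_ALIGNMENTS:,}")
print(f"Fraction stored:          {stored_fraction:.2%}")
```

Under that assumption, the stored alignments are well under 1% of all possible pairs, which is consistent with the database holding alignments only for structurally homologous domains.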
Search for Alignments
By PDB ID:
By Single Sequence:
Upcoming Feature Roadmap
- Improved alignment accuracy.
- Better documentation of missing chains/domains.
- Browser-based search by PDB structure file.
- REST endpoint for search by PDB structure file.
- Search by Pfam ID.
- Online MAFFT-DASH jobs with custom parameters.
DASH is designed to work with all modern browsers.
The browsers we have tested and confirmed to work are listed below.
Windows: Chrome, Edge, Firefox, Internet Explorer, and Opera.
macOS: Chrome, Firefox, Opera, and Safari.
Linux: Chrome, Firefox, and Opera.
If you encounter a problem while using DASH in any of the above browsers, please let us know.