
DASH (Database of Aligned Structural Homologs) is a database of structural alignments for all known structurally homologous protein domains and chains in the Protein Data Bank (PDB).

Processing involves four steps: (a) clustering sequence-unique proteins from the PDB using CD-HIT at 99% sequence identity; (b) decomposing the sequence representatives into domains using Protein Domain Parser (Alexandrov N, Shindyalov I; Bioinformatics 2003); (c) aligning all domains against all domains on Google Cloud using RASH (Standley DM, Toh H, Nakamura H; BMC Bioinformatics 2007); and (d) building composite chain alignments from the individual domain alignments. (Rozewicki et al.; Nucleic Acids Research 2019 [PubMed])
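
As a rough illustration of step (d), the sketch below merges several domain-level alignments into a single chain-level alignment. It is not the DASH implementation: the data representation (lists of residue index pairs in chain numbering), the function name merge_domain_alignments, and the first-seen-wins rule for conflicting pairs are all assumptions made for this example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: one domain alignment is a list of
# (query_residue_index, target_residue_index) pairs in chain numbering.
DomainAlignment = List[Tuple[int, int]]


def merge_domain_alignments(
    domain_alignments: List[DomainAlignment],
) -> DomainAlignment:
    """Combine per-domain residue pairings into one chain-level pairing.

    Assumption (not taken from DASH): when two domain alignments pair the
    same query residue or the same target residue, the pair seen first
    is kept and the later one is dropped.
    """
    by_query: Dict[int, int] = {}
    used_targets: Set[int] = set()
    for alignment in domain_alignments:
        for query_idx, target_idx in alignment:
            if query_idx in by_query or target_idx in used_targets:
                continue  # conflicting pair, keep the earlier one
            by_query[query_idx] = target_idx
            used_targets.add(target_idx)
    # Return the pairs ordered by query residue index.
    return sorted(by_query.items())


# Two toy domain alignments that disagree about query residue 3.
domain_a = [(1, 10), (2, 11), (3, 12)]
domain_b = [(3, 40), (50, 60), (51, 61)]
chain_alignment = merge_domain_alignments([domain_a, domain_b])
print(chain_alignment)
# [(1, 10), (2, 11), (3, 12), (50, 60), (51, 61)]
```

The sketch only deduplicates residue pairs; it does not check that target residues appear in sequential order, which a real composite alignment may need to handle.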

