
DASH

Database of Aligned Structural Homologs



Overview

DASH provides a REST-style API with support for custom filtering and ordering. Due to the very large size of the data, DASH operates somewhat differently from other REST services.

This API is undergoing beta testing. All endpoints are functional, but query input parameters, query output format, and error reporting may change slightly before full release. Upon release the API will be versioned, so that each update is published at a new web address and existing tools that rely on an older version of the API keep working.

Usage examples in more programming languages and more complete documentation of the various errors and status codes are coming soon.

Output Format

DASH results are output in plain text as a stream of either JSON (default) or XML objects, one per line of text. This allows the objects to be parsed more efficiently than if the entire list were wrapped in a single object.

Each object returned by the server contains an integer status code (StatusCode) and a status message (StatusMessage). On success StatusCode is -1 and StatusMessage is empty. In the case of an error or other information, StatusCode will be greater than 0 and StatusMessage will contain further information. If the results of the query were limited in some way, the last object will include this information in its StatusCode and StatusMessage.

Example Query Result - Success

{"StatusCode":-1,"StatusMessage":"","MD5":"144d0495e039ed43593146f2f0d586ea","PDBID":"158L_A","Quality":0.4035559892654419,"DepositionDate":"1994-06-20T00:00:00Z","IsMD5Representative":true,"MD5Representative":"158L_A","IsClusterRepresentative":true,"ClusterRepresentative":"158L_A","Sequence":"MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAKSELDKAIGRNTNGVITKDEAEKLFNQDVDAAVRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNALAMLQQKRWDEAAVNLAKSRWYNQTPNRAKRVITTFRTGTWDAYKNL","Length":164}

Example Query Result - Limited

{"StatusCode":1,"StatusMessage":"Query results returned successfully, but limited to 100 results. Use `limited=0` for unlimited results (may be slow!).","MD5":"144d0495e039ed43593146f2f0d586ea","PDBID":"158L_A","Quality":0.4035559892654419,"DepositionDate":"1994-06-20T00:00:00Z","IsMD5Representative":true,"MD5Representative":"158L_A","IsClusterRepresentative":true,"ClusterRepresentative":"158L_A","Sequence":"MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAKSELDKAIGRNTNGVITKDEAEKLFNQDVDAAVRGILRNAKLKPVYDSLDAVRRAALINMVFQMGETGVAGFTNALAMLQQKRWDEAAVNLAKSRWYNQTPNRAKRVITTFRTGTWDAYKNL","Length":164}
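Because every line is a complete JSON object, a client can process the response one line at a time. The following Python 3 sketch illustrates the status-code convention described above. The helper name `parse_dash_lines` and the shortened sample lines are illustrative assumptions, not part of DASH.

```python
import json


def parse_dash_lines(lines):
    """Parse an iterable of JSON lines and return (records, notices).

    notices collects (StatusCode, StatusMessage) for every object whose
    StatusCode is greater than 0, i.e. an error or other information.
    """
    records, notices = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between objects
        obj = json.loads(line)
        if obj["StatusCode"] > 0:
            notices.append((obj["StatusCode"], obj["StatusMessage"]))
        records.append(obj)
    return records, notices


# Hypothetical, shortened response lines used only for this demonstration
sample = [
    '{"StatusCode":-1,"StatusMessage":"","PDBID":"158L_A","Length":164}',
    '{"StatusCode":1,"StatusMessage":"limited to 100 results","PDBID":"158L_A","Length":164}',
]
records, notices = parse_dash_lines(sample)
print(len(records), notices)
```

Any iterable of text lines works as input, so the same function could be fed a response that is read incrementally rather than loaded into memory at once.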

DASH Query Basics

0. The default output format for DASH is JSON, but XML can be specified in the format parameter.
    ./REST0.1/domains?format=XML
    ./REST0.1/chains?format=XML
1. Results can be filtered by specifying a column, an operator (<, =, >, >=, <=), and a value. Note that not all data types support all operators.
    ./REST0.1/domains?filter=length<100
    ./REST0.1/domains?filter=pdbid=5VZ0_A
    ./REST0.1/chains?filter=length<100
    ./REST0.1/chains?filter=pdbid=5VZ0_A
2. The number of results can also be specified using the limit parameter.
    ./REST0.1/domains?filter=length<100&limit=10
    ./REST0.1/chains?filter=length<100&limit=10
3. Smaller tables, such as chain and domain, support unlimited query results by specifying a limit of 0.
    ./REST0.1/domains?filter=length<20&limit=0
    ./REST0.1/chains?filter=length<20&limit=0
4. Results can be sorted using the order parameter. For example, to sort by sequence length:
    ./REST0.1/domains?order=length=asc&limit=1000
    ./REST0.1/chains?order=length=asc&limit=1000
5. Combining these query parameters is very powerful. For example, it is possible to query the 10 longest domains or chains deposited in 2017.
    ./REST0.1/domains?filter=depositiondate>=2017-01-01,depositiondate<=2017-12-31&columns=pdbid,length,depositiondate&order=length=desc&limit=10
    ./REST0.1/chains?filter=depositiondate>=2017-01-01,depositiondate<=2017-12-31&columns=pdbid,length,depositiondate&order=length=desc&limit=10
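The query parameters above can also be assembled programmatically. The following Python 3 sketch is one possible approach, not an official client: the function name `build_query_url` and its arguments are hypothetical, the base URL is the one used by the example program in this document, and leaving `<`, `>`, `=`, and `,` unescaped is an assumption chosen only so that the output matches the URLs shown in the list.

```python
from urllib.parse import urlencode

# Base URL as used by the example program in this document
DASH_REST_URL = "https://sysimm.org/dash/REST0.1"


def build_query_url(endpoint, filters=None, columns=None, order=None,
                    limit=None, fmt=None, base_url=DASH_REST_URL):
    """Combine filter, columns, order, limit, and format into one URL."""
    params = {}
    if filters:
        params["filter"] = ",".join(filters)    # e.g. ["length<100"]
    if columns:
        params["columns"] = ",".join(columns)   # e.g. ["pdbid", "length"]
    if order:
        params["order"] = order                 # e.g. "length=desc"
    if limit is not None:
        params["limit"] = str(limit)            # 0 requests unlimited results
    if fmt:
        params["format"] = fmt                  # e.g. "XML"
    url = "%s/%s" % (base_url, endpoint)
    # Assumption: operators and separators are sent unescaped, as in the examples
    query = urlencode(params, safe="<>=,")
    return url + "?" + query if query else url


url = build_query_url(
    "domains",
    filters=["depositiondate>=2017-01-01", "depositiondate<=2017-12-31"],
    columns=["pdbid", "length", "depositiondate"],
    order="length=desc",
    limit=10,
)
print(url)
```

Values containing other reserved characters are still percent-encoded by `urlencode`, which keeps the resulting URL valid.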

DASH Client for MAFFT

Windows 64-bit   |  MacOS 64-bit   |  Linux 64-bit   |  Source Code (Go)

Python 2 Example Program

#!/usr/bin/env python
####
#NOTE:
#  This script is meant to get people started using the DASH REST API.
#  If implementing DASH as part of an automated process or in publicly-available
#    software it is also important to follow HTTP query best practices in order to
#    account for common network errors.
####
import json
import urllib2

dash_rest_url = "https://sysimm.org/dash/REST0.1"

def query_dash(url):
    print "Querying %s" % (url)
    #Download response from server
    response = urllib2.urlopen(url)
    raw_response = response.read()
    #Split response into lines
    json_object_lines = raw_response.splitlines()
    #Parse JSON objects, skipping any blank lines
    json_objects = [json.loads(line) for line in json_object_lines if line.strip()]
    #Check JSON objects for errors
    for json_object in json_objects:
        if json_object["StatusCode"] > 0:
            code = json_object["StatusCode"]
            message = json_object["StatusMessage"]
            print "Status (%d): %s" % (code, message)
    return json_objects

#Build a query URL to search for 5 DASH representatives for a sequence
sequence = "QVQLQQSGPEDVKPGASVKISCKASGYSLSTSGMGVNWVKQSPGKGLEWLAHIYWDDDKRYNPSLKSRATLTVDKSSSTVYLELRSLTSEDSSVYYCARRGGSSHYYAMDYWGQGTTVTVSS"
search_url = "%s/domain_search_sequence?filter=sequence=%s&limit=5" % (dash_rest_url, sequence)

#Query DASH for representatives
dash_reps = query_dash(search_url)

for dash_rep in dash_reps:
    #Build a query URL to fetch top 5 alignments by score for search results
    alignment_url = "%s/domain_alignments?filter=id1=%s&order=score=desc&limit=5" % (dash_rest_url, dash_rep["ID"])
    #Query DASH for alignments
    alignments = query_dash(alignment_url)
    #Output information from alignments
    for alignment in alignments:
        print "---------"
        print "%s vs. %s - %d" % (alignment["ID1"], alignment["ID2"], alignment["SCORE"])
        print "PRIMS1: %s" % (alignment["PRIMS1"])
        print "SECOS1: %s" % (alignment["SECOS1"])
        print "PRIMS2: %s" % (alignment["PRIMS2"])
        print "SECOS2: %s" % (alignment["SECOS2"])
        print "EQUIV: %s" % (alignment["EQUIVALENCE"])
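The note at the top of the example program recommends accounting for common network errors. The following Python 3 sketch shows one way to do that with a timeout and a simple exponential backoff. The function name `fetch_with_retry`, the retry count, and the delays are illustrative assumptions rather than DASH recommendations, and the `fetch` argument exists only so the logic can be exercised without contacting the server.

```python
import json
import time
import urllib.request


def fetch_with_retry(url, fetch=None, retries=3, backoff=1.0, timeout=30):
    """Fetch url and return the parsed JSON objects, retrying on network errors."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    if fetch is None:
        def fetch(u):
            with urllib.request.urlopen(u, timeout=timeout) as response:
                return response.read().decode("utf-8")
    last_error = None
    for attempt in range(retries):
        try:
            text = fetch(url)
            return [json.loads(line) for line in text.splitlines() if line.strip()]
        except OSError as err:  # URLError, HTTPError, and socket timeouts are OSErrors
            last_error = err
            if attempt < retries - 1:
                time.sleep(backoff * (2 ** attempt))  # wait longer after each failure
    raise last_error


# Demonstration with a simulated fetcher that fails twice, then succeeds
calls = []


def flaky_fetch(u):
    calls.append(u)
    if len(calls) < 3:
        raise OSError("simulated network failure")
    return '{"StatusCode":-1,"StatusMessage":"","ID":"5VZ0_A_01"}\n'


result = fetch_with_retry("https://example.invalid/query", fetch=flaky_fetch, backoff=0)
print(len(calls), result[0]["ID"])
```

A production client would probably distinguish errors worth retrying (timeouts, dropped connections) from those that are not (for example a malformed query), which this sketch does not attempt.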

Data Type Reference

Type | Operators | Description | Example
Boolean | = | True or false. | 0, 1, true, false
Chain Pair | - | One or more pairs of chain IDs on separate lines. | 4X0K_A_2N01_B
Date | <, <=, =, >, >= | Day, month, and year. | 2018-08-27
Domain ID | = | An ID with PDB ID, chain, and domain number. | 5VZ0_A_01
Domain Pair | - | One or more pairs of domain IDs on separate lines. | 4X0K_A_01_2N01_B_01
Decimal | <=, >= | Decimal value. | 3.14
Integer | <, <=, =, >, >= | Integer value. | 3
MD5 | = | A hash of the sequence for fast comparison. | A1B1DBCE8B0C61F80DEEF2F542979EA1
Order | = | Which order to sort results in. | ASC or DESC
PDB ID | = | PDB ID with chain. | 5VZ0_A
Segments | - | Which residues from the chain sequence are part of the domain. | 111-138; 209-344; ...
Sequence | = | Protein sequence. | RHKILHRLLQEGSPS
Text | - | Raw text. | Hello world!
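The identifier types above are built from underscore-separated parts, so they are easy to split on the client side. The following Python 3 sketch is based only on the examples in the table. The function names are hypothetical, and it assumes that no PDB ID, chain ID, or domain number contains an underscore.

```python
def split_domain_id(domain_id):
    """Split a Domain ID such as '5VZ0_A_01' into (pdb_code, chain, domain_number)."""
    parts = domain_id.split("_")
    if len(parts) != 3:
        raise ValueError("expected PDB_CHAIN_DOMAIN, got %r" % domain_id)
    pdb_code, chain, domain = parts
    return pdb_code, chain, int(domain)


def split_chain_pair(pair):
    """Split a Chain Pair such as '4X0K_A_2N01_B' into two PDB IDs with chain."""
    parts = pair.split("_")
    if len(parts) != 4:
        raise ValueError("expected two PDB_CHAIN IDs, got %r" % pair)
    return "_".join(parts[:2]), "_".join(parts[2:])


def split_domain_pair(pair):
    """Split a Domain Pair such as '4X0K_A_01_2N01_B_01' into two Domain IDs."""
    parts = pair.split("_")
    if len(parts) != 6:
        raise ValueError("expected two PDB_CHAIN_DOMAIN IDs, got %r" % pair)
    return "_".join(parts[:3]), "_".join(parts[3:])


print(split_domain_id("5VZ0_A_01"))
print(split_chain_pair("4X0K_A_2N01_B"))
print(split_domain_pair("4X0K_A_01_2N01_B_01"))
```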

Endpoint List & Column Type Reference


GET - ./REST1.0/chains

Field Type Filterable Sortable Required Note
ClusterRepresentative PDB ID * *
DepositionDate Date * *
Description Text *
IsClusterRepresentative Boolean * *
IsMD5Representative Boolean * *
Length Integer * *
MD5 MD5 * *
MD5Representative PDB ID * *
PDBID PDB ID * *
Quality Decimal * *
SecondaryStructure Sequence *
Sequence Sequence * * Only supports exact match.

GET - ./REST1.0/domains

Field Type Filterable Sortable Required Note
ClusterRepresentative PDB ID * *
DepositionDate Date * *
Description Text *
DomainID Domain ID * *
IsClusterRepresentative Boolean * *
IsMD5Representative Boolean * *
Length Integer * *
MD5Representative PDB ID * *
PDBID PDB ID * *
ResidueNumbers Segments
SecondaryStructure Sequence *
Segments Segments
Sequence Sequence * * Only supports exact match.

GET - ./REST1.0/chain_search_sequence

Field Type Filterable Sortable Required Note
Coverage Integer
Description
End Integer
ID PDB ID
Sequence Sequence * * Input only.
Start Integer

GET - ./REST1.0/domain_search_sequence

Field Type Filterable Sortable Required Note
Coverage Integer
Description
End Integer
ID Domain ID
Sequence Sequence * * Input only.
Start Integer

POST - ./REST1.0/chain_search_sequence

Field Type Filterable Sortable Required Note
Coverage Integer
Description
End Integer
ID PDB ID
Sequence Sequence * * Input only. Taken as FASTA in POST body.
Start Integer

POST - ./REST1.0/domain_search_sequence

Field Type Filterable Sortable Required Note
Coverage Integer
Description
End Integer
ID Domain ID
Sequence Sequence * * Input only. Taken as FASTA in POST body.
Start Integer
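For the POST variants of the sequence search, the query sequence is sent as FASTA in the request body. The following Python 3 sketch only constructs such a request and does not send it. The function name, the FASTA record name `query`, and the `Content-Type` header are assumptions. The base URL is passed as a parameter because this document shows both the ./REST0.1 and ./REST1.0 prefixes; the default is the one used by the example program.

```python
import urllib.request

# Base URL as used by the example program in this document
DASH_REST_URL = "https://sysimm.org/dash/REST0.1"


def build_sequence_search_request(sequence, endpoint="domain_search_sequence",
                                  name="query", base_url=DASH_REST_URL):
    """Build (but do not send) a POST request with the sequence as FASTA."""
    fasta = ">%s\n%s\n" % (name, sequence)
    return urllib.request.Request(
        "%s/%s" % (base_url, endpoint),
        data=fasta.encode("ascii"),
        headers={"Content-Type": "text/plain"},  # assumed; not documented here
        method="POST",
    )


request = build_sequence_search_request("RHKILHRLLQEGSPS")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request, timeout=30)
```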

GET - ./REST1.0/chain_alignments_metadata

Field Type Filterable Sortable Required Note
ID1 PDB ID * * Not required if PDBID1 is requested.
ID2 PDB ID *
NALIGN Integer * *
NER Decimal * *
NRES1 Integer
NRES2 Integer
PDBID1 PDB ID * * Not required if ID1 is requested.
PDBID2 PDB ID *
RMSD Decimal * *
SCORE Integer * *
SEQID Decimal * *

GET - ./REST1.0/domain_alignments_metadata

Field Type Filterable Sortable Required Note
ID1 Domain ID * * Not required if PDBID1 is requested.
ID2 Domain ID *
NALIGN Integer * *
NER Decimal * *
NRES1 Integer
NRES2 Integer
PDBID1 PDB ID * * Not required if ID1 is requested.
PDBID2 PDB ID *
RMSD Decimal * *
SCORE Integer * *
SEQID Decimal * *

GET - ./REST1.0/chain_alignments

Field Type Filterable Sortable Required Note
ID1 PDB ID * * Not required if PDBID1 is requested.
ID2 PDB ID *
NALIGN Integer * *
NER Decimal * *
NRES1 Integer
NRES2 Integer
PDBID1 PDB ID * * Not required if ID1 is requested.
PDBID2 PDB ID *
RMSD Decimal * *
SCORE Integer * *
SEQID Decimal * *

GET - ./REST1.0/domain_alignments

Field Type Filterable Sortable Required Note
ID1 Domain ID * * Not required if PDBID1 is requested.
ID2 Domain ID *
NALIGN Integer * *
NER Decimal * *
NRES1 Integer
NRES2 Integer
PDBID1 PDB ID * * Not required if ID1 is requested.
PDBID2 PDB ID *
RMSD Decimal * *
SCORE Integer * *
SEQID Decimal * *

POST - ./REST1.0/chain_alignments

Field Type Filterable Sortable Required Note
ID1 PDB ID
ID2 PDB ID
NALIGN Integer
NER Decimal
NRES1 Integer
NRES2 Integer
PDBID1 PDB ID
PDBID2 PDB ID
Pairs Chain Pair * * Input only. Taken in HTTP POST body.
RMSD Decimal
SCORE Integer
SEQID Decimal

POST - ./REST1.0/domain_alignments

Field Type Filterable Sortable Required Note
ID1 Domain ID
ID2 Domain ID
NALIGN Integer
NER Decimal
NRES1 Integer
NRES2 Integer
PDBID1 PDB ID
PDBID2 PDB ID
Pairs Domain Pair * * Input only. Taken in HTTP POST body.
RMSD Decimal
SCORE Integer
SEQID Decimal
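For the POST variants of the alignment endpoints, the pairs are taken from the HTTP POST body, one pair per line as described in the Data Type Reference. The following Python 3 sketch builds such a request without sending it. The function name, the trailing newline, and the `Content-Type` header are assumptions, and the base URL is a parameter whose default is the one used by the example program.

```python
import urllib.request

# Base URL as used by the example program in this document
DASH_REST_URL = "https://sysimm.org/dash/REST0.1"


def build_alignment_pairs_request(pairs, endpoint="domain_alignments",
                                  base_url=DASH_REST_URL):
    """Build (but do not send) a POST request listing one pair per line."""
    if not pairs:
        raise ValueError("at least one pair is required")
    body = "\n".join(pairs) + "\n"
    return urllib.request.Request(
        "%s/%s" % (base_url, endpoint),
        data=body.encode("ascii"),
        headers={"Content-Type": "text/plain"},  # assumed; not documented here
        method="POST",
    )


# The first pair is the example from the Data Type Reference;
# the second is a made-up pair for illustration only.
pairs_request = build_alignment_pairs_request(
    ["4X0K_A_01_2N01_B_01", "5VZ0_A_01_4X0K_A_01"]
)
print(pairs_request.get_method(), pairs_request.full_url)
# To send it: urllib.request.urlopen(pairs_request, timeout=30)
```

For chain alignments, the same function could be called with `endpoint="chain_alignments"` and chain pairs such as `4X0K_A_2N01_B`.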



Bug Reporting & Feedback


Powered by CD-HIT, CentOS, DSSP, Go, Google Cloud, Molmil, MSAViewer, NCBI BLAST+, and PostgreSQL.
© 2019 Department of Genome Informatics; Research Institute for Microbial Diseases; Osaka University