High-throughput structural modeling for B-cell and T-cell receptors
Added ability to get an e-mail with a link to job results.
Rotated all template data into a common reference frame.
Header lines are not required if only submitting one sequence/sequence pair.
Jobs are limited to 10,000 models.
Header lines for a sequence pair (including blacklists, punctuation, etc.) must be identical so that heavy/beta sequences can be matched with their alpha/light sequences.
Output file names are taken from the text before the first separator in each header.
Separators are |, : and whitespace such as spaces or tabs.
Output file name length is limited to 32 characters.
Output file names must contain only the following characters: a-z, A-Z, 0-9, and the special characters - and _.
Output file names must be unique for each sequence/sequence pair.
Sequences must be amino-acid sequences with no gaps (-) and no ambiguous residues (B, O, U, X, Z).
PDB ID blacklists are extracted between (( and )) in the header line.
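The header and sequence rules above can be checked locally before submitting a job. The following is a minimal sketch, not the server's own parser. It assumes FASTA-style headers with a leading ">", and it assumes that multiple PDB IDs inside (( and )) are separated by commas or whitespace; neither detail is stated above. The function names parse_header and validate_sequence are hypothetical.

```python
import re

SEPARATORS = re.compile(r"[|:\s]")           # |, : and whitespace
NAME_OK = re.compile(r"[A-Za-z0-9_-]{1,32}")  # allowed characters, max 32
BLACKLIST = re.compile(r"\(\((.*?)\)\)")      # text between (( and ))
FORBIDDEN = set("-BOUXZ")                     # gap and ambiguous residues


def parse_header(header):
    """Return (output_file_name, blacklisted_pdb_ids) for one header line."""
    # Assumption: a leading ">" (FASTA style) is not part of the name.
    text = header.strip().lstrip(">").strip()
    # The name is everything before the first separator.
    name = SEPARATORS.split(text, maxsplit=1)[0]
    if not NAME_OK.fullmatch(name):
        raise ValueError(f"invalid output file name: {name!r}")
    match = BLACKLIST.search(text)
    # Assumption: IDs are separated by commas and/or whitespace.
    ids = [t for t in re.split(r"[,\s]+", match.group(1)) if t] if match else []
    return name, ids


def validate_sequence(seq):
    """Return the upper-cased sequence, or raise if it has gaps or B/O/U/X/Z."""
    seq = seq.strip().upper()
    bad = sorted(set(seq) & FORBIDDEN)
    if bad:
        raise ValueError(f"forbidden characters in sequence: {bad}")
    return seq
```

For example, parse_header(">ab001|heavy ((1abc,2xyz))") yields the name "ab001" and the blacklist ["1abc", "2xyz"]. Collecting the returned names in a set is a simple way to confirm that they are unique across a submission.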
Overview of Repertoire Builder Workflow
A) MSAs of templates for each chain (heavy, light) are prepared for CDRs, framework and orientation
B) For CDRs, MSAs are binned by loop length
C) Extension of template MSAs retains original template-template relationships
D) Query-template alignments are ranked by the dot product between a query-template feature vector and an MSA-specific weight vector
E) Top-scoring templates are used to assemble a coherent model
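The ranking in step D can be illustrated with a small sketch. It only shows the dot-product scoring idea; the feature definitions, the weight values and the names score and rank_templates are hypothetical and are not taken from Repertoire Builder.

```python
def score(features, weights):
    """Dot product between a query-template feature vector and a weight vector."""
    if len(features) != len(weights):
        raise ValueError("feature and weight vectors must have the same length")
    return sum(f * w for f, w in zip(features, weights))


def rank_templates(candidates, weights):
    """Sort templates by descending score.

    candidates: dict mapping a template ID to its feature vector.
    weights:    the weight vector of the MSA the templates belong to.
    """
    scored = [(tid, score(vec, weights)) for tid, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical features: (sequence identity, coverage, gap fraction).
msa_weights = [0.5, 1.0, -2.0]
candidates = {
    "tmplA": [0.90, 1.0, 0.0],
    "tmplB": [0.60, 0.8, 0.1],
    "tmplC": [0.95, 0.5, 0.2],
}
ranking = rank_templates(candidates, msa_weights)
best_id = ranking[0][0]  # highest-scoring template for this MSA
```

Because each MSA carries its own weight vector, the same feature vector can score differently in different MSAs, for instance in CDR bins of different loop length.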