About
Stephen Altschul, PhD
Senior Investigator, Computational Biology Branch, NCBI
My research centers on the comparison and analysis of DNA and protein sequences, with a recent focus on multiple alignments and the correlations among positions within protein superfamilies.
Contact Information
Building 38A, 8600 Rockville Pike, MSC 6075, Bethesda, MD 20894
Tel: (301) 435-7803
Research Interests
Correlations in amino acid usage among sequence positions are evident in very large multiple sequence alignments (MSAs). Two distinct hypotheses for how these correlations arise lead to distinct mathematical approaches to their description, recognition and analysis.

The first hypothesis imagines the homologous proteins within a large MSA as sharing a common three-dimensional structure, and attributes correlations to the physical interaction of residues that lie near one another within this structure. This approach, in which correlations are modeled directly using pairwise coupling terms, has been studied extensively for many years. It has had notable recent success with the introduction of Direct Coupling Analysis (DCA), which mitigates the confounding effects of indirect correlations, in which contacts between positions i and j and between positions j and k lead to correlation between the non-contacting positions i and k.

The second hypothesis imagines the homologous proteins within a large MSA as falling into related families and subfamilies, whose divergent but related functions impose different constraints on their constituent members. Under this model, correlations can be completely explained by the hierarchical structure of family and subfamily divergence, without the need to assume correlations between sequence positions within any particular subfamily. The MSA definition and associated statistical model that correspond to this view have been much less widely studied, and have been a principal focus of my recent research.
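
The second hypothesis can be illustrated with a small numerical sketch. It is a toy example only, not the statistical model referred to above: the two subfamilies, their residue frequencies, the equal mixing weights, and the helper names `independent_joint` and `mutual_information` are all assumptions made for illustration. Within each hypothetical subfamily the two alignment columns are independent by construction, yet pooling the subfamilies into one alignment yields a clearly positive mutual information between the columns.

```python
import math

def independent_joint(col_i, col_j):
    """Joint distribution of two columns that are independent of each other."""
    return {(a, b): pa * pb for a, pa in col_i.items() for b, pb in col_j.items()}

def mutual_information(joint):
    """Mutual information in bits of a joint distribution {(a, b): probability}."""
    marg_a, marg_b = {}, {}
    for (a, b), p in joint.items():
        marg_a[a] = marg_a.get(a, 0.0) + p
        marg_b[b] = marg_b.get(b, 0.0) + p
    return sum(
        p * math.log2(p / (marg_a[a] * marg_b[b]))
        for (a, b), p in joint.items()
        if p > 0
    )

# Hypothetical residue frequencies at two columns (i and j) in two subfamilies.
# Subfamily 1 favors L at column i and V at column j; subfamily 2 favors K and E.
sub1_i, sub1_j = {"L": 0.9, "K": 0.1}, {"V": 0.9, "E": 0.1}
sub2_i, sub2_j = {"L": 0.1, "K": 0.9}, {"V": 0.1, "E": 0.9}

# Within each subfamily the columns are independent (no pairwise coupling).
joint1 = independent_joint(sub1_i, sub1_j)
joint2 = independent_joint(sub2_i, sub2_j)

# Pool the subfamilies with assumed equal weights, as in one large MSA.
w1, w2 = 0.5, 0.5
pooled = {pair: w1 * joint1[pair] + w2 * joint2[pair] for pair in joint1}

mi_sub1 = mutual_information(joint1)    # zero up to rounding
mi_sub2 = mutual_information(joint2)    # zero up to rounding
mi_pooled = mutual_information(pooled)  # clearly positive

print(f"subfamily 1: {mi_sub1:.4f} bits")
print(f"subfamily 2: {mi_sub2:.4f} bits")
print(f"pooled MSA:  {mi_pooled:.4f} bits")
```

The pooled correlation here comes entirely from subfamily membership, which is the effect the hierarchical view relies on. A pairwise-coupling model fitted to the pooled columns would instead have to account for it with a coupling term between positions i and j.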