A genetic algorithm approach to detecting lineage-specific variation in selection pressure.


The ratio of nonsynonymous (dN) to synonymous (dS) substitution rates, omega, provides a measure of selection at the protein level. Models have been developed that allow omega to vary among lineages. However, these models require the lineages in which differential selection has acted to be specified a priori. We propose a genetic algorithm approach to assign lineages in a phylogeny to a fixed number of different classes of omega, thus allowing variable selection pressure without a priori specification of particular lineages. This approach can identify models with a better fit than a single-ratio model, and with fits that are better than (in an information theoretic sense) a fully local model, in which all lineages are assumed to evolve under different values of omega, but with far fewer parameters. By averaging over models which explain the data reasonably well, we can assess the robustness of our conclusions to uncertainty in model estimation. Our approach can also be used to compare results from models in which branch classes are specified a priori with a wide range of credible models. We illustrate our methods on primate lysozyme sequences and compare them with previous methods applied to the same data sets.

MIDAS Network Members