Our findings suggest that a combination of factors contributes to duplicate gene persistence following whole genome duplication, but that total expression level and evenness of expression across tissues and through development before duplication are the most important. We speculate that these parameters are useful predictors of duplicate gene longevity after whole genome duplication in other taxa.
Gene duplication is an important biological phenomenon associated with genomic redundancy, degeneration, specialization, innovation, and speciation. After duplication, both copies continue functioning when natural selection favors duplicated protein function or expression, or when mutations make them functionally distinct before one copy is silenced.
Here we quantify the degree to which genetic parameters related to gene expression, molecular evolution, and gene structure in a diploid frog, Silurana tropicalis, influence the odds of functional persistence of orthologous duplicate genes in a closely related tetraploid species, Xenopus laevis. Using public databases and 454 pyrosequencing, we obtained genetic and expression data from S. tropicalis orthologs of 3,387 X. laevis paralogs and 4,746 X. laevis singletons, the most comprehensive dataset for African clawed frogs yet analyzed. Using logistic regression, we demonstrate that the most important predictors of the odds of duplicate gene persistence in the tetraploid species are the total gene expression level and evenness of expression across tissues and development in the diploid species. Slow protein evolution and information density (fewer exons, shorter introns) in the diploid are also positively correlated with duplicate gene persistence in the tetraploid.
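The kind of analysis described above can be illustrated with a minimal sketch, which is not the pipeline used in this study. It fits a logistic regression of duplicate persistence (1 = both copies retained, 0 = singleton) on two predictors and reports odds ratios. Everything in it is an assumption made for illustration: the data are simulated, the "true" coefficients are arbitrary, the function names are hypothetical, and Pielou's (Shannon) index is only one possible way to measure evenness of expression.

```python
import math
import random


def pielou_evenness(expr):
    """Shannon evenness of an expression profile across tissues or stages.

    Returns 0 when all expression is in one tissue and 1 when expression is
    identical in every tissue. Assumed metric, chosen for illustration only.
    """
    total = sum(expr)
    if total <= 0 or len(expr) < 2:
        return 0.0
    props = [x / total for x in expr if x > 0]
    shannon = -sum(p * math.log(p) for p in props)
    return shannon / math.log(len(expr))


def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            z = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = yi - p  # gradient contribution of this observation
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def simulate_and_fit(n_genes=1000, seed=0):
    """Simulate genes whose retention odds rise with expression and evenness."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_genes):
        log_expr = rng.gauss(0.0, 1.0)  # standardized log total expression
        evenness = rng.random()         # with real data: pielou_evenness(profile)
        # Arbitrary illustrative coefficients, not estimates from the study.
        z = -1.0 + 1.0 * log_expr + 2.0 * evenness
        retained = 1 if rng.random() < 1.0 / (1.0 + math.exp(-z)) else 0
        X.append([log_expr, evenness])
        y.append(retained)
    return fit_logistic(X, y)


if __name__ == "__main__":
    beta = simulate_and_fit()
    for name, b in zip(["log expression", "evenness"], beta[1:]):
        # exp(coefficient) is the multiplicative change in the odds of
        # persistence per unit increase in the predictor.
        print(name, "odds ratio:", round(math.exp(b), 2))
```

In this sketch an odds ratio above 1 means the predictor raises the odds that both copies persist. A real analysis would more likely use an established statistics package and would include the other predictors named above (rate of protein evolution, exon number, intron length).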