Phylogenetic comparative analysis is an approach to inferring evolutionary process from a combination of phylogenetic and phenotypic data. The last few years have seen increasingly sophisticated models employed in the evaluation of more and more detailed evolutionary hypotheses, including adaptive hypotheses with multiple selective optima and hypotheses with rate variation within and across lineages. The statistical performance of these sophisticated models has received relatively little systematic attention, however. We conducted an extensive simulation study to quantify the statistical properties of a class of models toward the simpler end of the spectrum that model phenotypic evolution using Ornstein-Uhlenbeck processes. We focused on identifying where, how, and why these methods break down so that users can apply them with greater understanding of their strengths and weaknesses. Our analysis identifies three key determinants of performance: a discriminability ratio, a signal-to-noise ratio, and the number of taxa sampled. Interestingly, we find that model-selection power can be high even in regions that were previously thought to be difficult, such as when tree size is small. On the other hand, we find that model parameters are in many circumstances difficult to estimate accurately, indicating a relative paucity of information in the data relative to these parameters. Nevertheless, we note that accurate model selection is often possible when parameters are only weakly identified. Our results have implications for more sophisticated methods inasmuch as the latter are generalizations of the case we study.