Classification and regression tree methods represent a potentially powerful means of identifying patterns in exposure data that may otherwise be overlooked. Here, regression tree models are developed to identify associations between blood concentrations of benzene and lead and over 300 variables of disparate type (numerical and categorical), often with observations that are missing or below the quantitation limit. Benzene and lead are selected from among all the environmental agents measured in the NHEXAS Region V study because they are ubiquitous, and they serve as paradigms for volatile organic compounds (VOCs) and heavy metals, two classes of environmental agents that have very different properties. Two sets of regression models were developed. In the first set, only environmental and dietary measurements were employed as predictor variables, while in the second set these were supplemented with demographic and time-activity data. In both sets of regression models, the predictor variables were regressed on the blood concentrations of the environmental agents. Jack-knife cross-validation was employed to detect overfitting of the models to the data. Blood concentrations of benzene were found to be associated with: (a) indoor air concentrations of benzene; (b) the duration of time spent indoors with someone who was smoking; and (c) the number of cigarettes smoked by the subject. All these associations suggest that tobacco smoke is a major source of exposure to benzene. Blood concentrations of lead were found to be associated with: (a) house dust concentrations of lead; (b) the duration of time spent working in a closed workshop; and (c) the year in which the subject moved into the residence. An unexpected finding was that the regression trees identified time-activity data as better predictors of the blood concentrations than the measurements in environmental and dietary media.