CIF: Small: Collaborative Research: Rank Aggregation with Heterogeneous Information Sources: Efficient Algorithms and Fundamental Limits


While advances in the ability to collect and store data have made large data sets commonplace, these data sets increasingly consist of information obtained from different sources, with varied data types and properties that impede the ability to extract knowledge and make decisions. This project focuses on inferring the ranking of a set of objects from heterogeneous data sets with arbitrary noise, a problem known as rank aggregation with heterogeneous information sources. The developed algorithms will be made publicly available as open source software tools, and will significantly expand the applicability of rank aggregation to real-world problems such as data fusion, information retrieval, crowd-sourcing, recommendation systems, and social choice and voting. This project will also provide educational and training opportunities, with exposure to sophisticated statistical tools, rigorous theoretical analysis, and the empirical work of extracting knowledge from large heterogeneous data sets.
Building on statistical models of data, this project will develop efficient and scalable rank aggregation algorithms for various settings, along with performance guarantees and fundamental limits, in three complementary research thrusts. First, it will develop rank aggregation algorithms based on flexible latent probabilistic models that exploit side information and allow both ordinal and numerical data types. It will also provide information-theoretic lower bounds on the performance of such algorithms. Second, it will design robust algorithms for latent probabilistic models in which the unknown parameters are a superposition of many structured parameters, and for models in which data can be corrupted by arbitrary noise. Finally, it will study the problem of inferring a ranking through interactive bandit algorithms. This project aims to push the frontier of rank aggregation research, and can potentially advance research in machine learning, nonconvex optimization, and information theory.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

