Statistical theory of Internet exploration.


The general methodology used to construct Internet maps consists in merging all the discovered paths obtained by sending data packets from a set of active computers to a set of destination hosts, obtaining a graphlike representation of the network. This technique, sometimes referred to as Internet tomography, spurs the issue concerning the statistical reliability of such empirical maps. We tackle this problem by modeling the network sampling process on synthetic graphs and by using a mean-field approximation to obtain expressions for the probability of edge and vertex detection in the sampled graph. This allows a general understanding of the origin of possible sampling biases. In particular, we find a direct dependence of the map statistical accuracy upon the topological properties (in particular, the betweenness centrality property) of the underlying network. In this framework, it appears that statistically heterogeneous network topologies are captured better than the homogeneous ones during the mapping process. Finally, the analytical discussion is complemented with a thorough numerical investigation of simulated mapping strategies in network models with varying topological properties.

MIDAS Network Members