In recent years, large networks have been routinely used to represent data from many scientific fields. Statistical analysis of these networks, such as model fitting and hypothesis testing, has received considerable attention. However, most of the methods proposed in the literature are computationally expensive for large networks. In this paper, we propose a subsampling-based method to reduce the computational cost of estimation and two-sample hypothesis testing. The idea is to divide the network into smaller subgraphs with an overlap region, draw inference from each subgraph separately, and finally combine the results. We first develop the subsampling method for random dot product graph (RDPG) models and establish theoretical consistency of the proposed method. We then extend the subsampling method to a more general setup that includes the RDPG model, and provide similar theoretical guarantees. We demonstrate the performance of our methods through simulation experiments and real data analysis.
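The divide-infer-combine idea in the abstract can be illustrated with a toy sketch. The code below splits the nodes of a simulated graph into overlapping index blocks, estimates a simple graph statistic (edge density) on each induced subgraph, and averages the per-block estimates. This is purely illustrative: the paper's estimator targets RDPG latent positions and its combination step is more involved; the function name `subsample_estimate`, the block construction, and the choice of edge density as the target statistic are all assumptions made for this sketch.

```python
import numpy as np

def subsample_estimate(A, n_blocks=4, overlap=10):
    """Toy divide-and-conquer estimate on a graph.

    A: symmetric 0/1 adjacency matrix (numpy array, zero diagonal).
    Splits the node indices into n_blocks contiguous blocks, extends
    each block backwards by `overlap` shared nodes (the overlap region
    is what lets per-block results be aligned in the actual method),
    computes the edge density of each induced subgraph, and averages.
    Illustrative only -- not the paper's RDPG estimator.
    """
    n = A.shape[0]
    blocks = np.array_split(np.arange(n), n_blocks)
    estimates = []
    for b in blocks:
        lo = max(b[0] - overlap, 0)          # shared overlap region
        idx = np.arange(lo, b[-1] + 1)
        sub = A[np.ix_(idx, idx)]
        m = len(idx)
        estimates.append(sub.sum() / (m * (m - 1)))  # edge density
    return float(np.mean(estimates))

# simulate an Erdos-Renyi graph with edge probability 0.3
rng = np.random.default_rng(0)
n, p = 200, 0.3
A = np.triu(rng.binomial(1, p, size=(n, n)), 1)
A = A + A.T
density_hat = subsample_estimate(A)
```

Each subgraph is much smaller than the full graph, so per-block computation is cheap, and the averaged estimate stays close to the full-graph edge density.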
Variational inference provides an appealing alternative to traditional sampling-based approaches for implementing Bayesian inference due to its conceptual simplicity, statistical accuracy, and computational scalability. However, common variational approximation schemes, such as the mean-field (MF) approximation, still require a certain conjugacy structure to facilitate efficient computation, which may add unnecessary restrictions to the viable prior distribution family and impose further constraints on the variational approximation family. In this work, we develop a general computational framework for implementing MF variational inference via Wasserstein gradient flow (WGF), a modern mathematical technique for realizing a gradient flow over the space of probability measures. When specialized to a common class of Bayesian latent variable models, we analyze the algorithmic convergence of an alternating minimization scheme based on a time-discretized WGF for implementing the MF approximation. In particular, the proposed algorithm resembles a distributional version of the classical Expectation–Maximization algorithm, consisting of an E-step that updates the variational distribution of the latent variables and an M-step that conducts steepest descent over the variational distribution of the model parameters. Our theoretical analysis relies on optimal transport theory and subdifferential calculus in the space of probability measures. As an intermediate result of independent interest, we prove the exponential convergence of the time-discretized WGF for minimizing a generic objective functional, provided that it is strictly convex along generalized geodesics. We also provide a new proof, utilizing the fixed-point equation of the time-discretized WGF, of the exponential contraction (towards the true model parameter) of the variational distribution obtained from the MF approximation.
The developed method and theory are applied to two representative Bayesian latent variable models, the Gaussian mixture model and the mixture of regression model. Numerical experiments are also conducted to complement the theoretical findings under these two models.
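The E-step/M-step alternation described in the abstract has a familiar finite-dimensional analogue: classical EM for a Gaussian mixture. The sketch below runs EM on a two-component 1-D mixture; the E-step updates the variational distribution of the latent labels (the responsibilities), while the M-step here performs an exact point-estimate update of the parameters rather than a Wasserstein steepest-descent step over a distribution on parameters, as the paper's distributional version does. The function name, initialization scheme, and two-component 1-D setting are assumptions made for this illustration.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """Classical EM for a two-component 1-D Gaussian mixture.

    Finite-dimensional analogue of the distributional EM in the
    abstract: the E-step updates the latent-label distribution
    (responsibilities); the M-step updates point estimates of the
    parameters instead of a variational distribution over them.
    """
    mu = np.array([x.min(), x.max()])        # well-separated init
    sigma = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: r[i, k] proportional to pi_k * N(x_i; mu_k, sigma_k^2)
        dens = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / sigma
        r = pi * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return mu, sigma, pi

# data from an equal mixture of N(-2, 1) and N(3, 1)
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(3, 1, 300)])
mu, sigma, pi = em_gmm_1d(x)
```

On well-separated data like this, the estimated means recover the true component means; the paper's WGF-based M-step replaces the exact update above with a time-discretized gradient-flow step in the space of probability measures.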