David Sirl
[Statistics and Probability Seminar]
Scaling Gaussian Process Models
Gaussian processes (GPs) are a mainstay of statistical machine learning. GPs provide priors over functions which can be used directly for regression, or as building blocks in more complex models. A straightforward GP implementation has a computational cost that scales cubically with the number of data points, an issue which many in the machine learning community have addressed with approximations that generally scale as O(NM^2), where N is the number of data points and M is a user-defined complexity parameter. For very large N, we wish to apply either data parallelism or stochastic optimization, both of which require the objective function to factorize across data points. Existing approximate GP methods are generally inapplicable, but we formulate an approach which introduces additional variational parameters, giving the objective function the necessary form. We show empirically that it is now possible to fit GPs to hundreds of thousands of data points in the regression setting. When GPs are employed as part of a more complex model, similar techniques apply. I shall present results from recent work where GPs are used in classification, with N in the millions. I shall also present preliminary work on nested GPs, in which GP priors are placed on compound functions f(g(x)); again, inference is scalable to very large data sets.
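To illustrate the scaling the abstract refers to (not the speaker's specific method), here is a minimal sketch in NumPy: exact GP regression, whose Cholesky factorization costs O(N^3), alongside a simple inducing-point (subset-of-regressors) approximation whose dominant cost is O(NM^2) for M inducing inputs. The kernel, noise level, and inducing inputs `Z` are illustrative assumptions.

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential (RBF) kernel matrix between two sets of inputs.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_exact_mean(X, y, Xs, noise=0.1):
    # Exact GP posterior mean: the Cholesky of the N x N matrix K + noise*I
    # is the O(N^3) bottleneck that limits exact GPs to modest N.
    K = rbf(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return rbf(Xs, X) @ alpha

def gp_inducing_mean(X, y, Xs, Z, noise=0.1):
    # Subset-of-regressors approximation with M inducing inputs Z:
    # forming Kmn @ Kmn.T costs O(N M^2), and the linear solve is M x M.
    Kmm = rbf(Z, Z) + 1e-8 * np.eye(len(Z))   # jitter for stability
    Kmn = rbf(Z, X)
    A = noise * Kmm + Kmn @ Kmn.T
    mu_m = np.linalg.solve(A, Kmn @ y)
    return rbf(Xs, Z) @ mu_m
```

With `Z` equal to the full training inputs, the approximate mean recovers the exact one (up to jitter); with M much smaller than N, the approximation trades accuracy for the O(NM^2) cost.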
The University of Nottingham, University Park, Nottingham, NG7 2RD
For all enquiries please visit: www.nottingham.ac.uk/enquire