# Locally induced Gaussian processes for large-scale simulation experiments

@article{Cole2021LocallyIG, title={Locally induced Gaussian processes for large-scale simulation experiments}, author={D. Austin Cole and R. Christianson and Robert B. Gramacy}, journal={Statistics and Computing}, year={2021}, volume={31}, pages={1-21} }

Gaussian processes (GPs) serve as flexible surrogates for complex surfaces, but buckle under the cubic cost of matrix decompositions with big training data sizes. Geospatial and machine learning communities suggest pseudo-inputs, or inducing points, as one strategy to obtain an approximation easing that computational burden. However, we show how placement of inducing points and their multitude can be thwarted by pathologies, especially in large-scale dynamic response surface modeling tasks. As…
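For illustration, the inducing-point idea the abstract refers to can be sketched as a subset-of-regressors predictive mean, which replaces the full GP's O(n³) decomposition with O(nm²) work for m inducing points. This is a hypothetical minimal sketch, not the paper's locally induced method; the RBF kernel, lengthscale, and noise settings are assumptions.

```python
import numpy as np

def rbf(X1, X2, ls=1.0):
    # Squared-exponential (RBF) kernel matrix
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def sor_predict(X, y, Xm, Xs, ls=1.0, noise=1e-2):
    """Subset-of-regressors GP mean at test inputs Xs.

    X: (n,d) training inputs, y: (n,) outputs,
    Xm: (m,d) inducing inputs, Xs: (q,d) test inputs.
    Only m x m systems are solved, never n x n.
    """
    Kmm = rbf(Xm, Xm, ls) + 1e-8 * np.eye(len(Xm))  # jitter for stability
    Knm = rbf(X, Xm, ls)
    Ksm = rbf(Xs, Xm, ls)
    # Predictive mean: Ksm (Kmm + Kmn Knm / noise)^{-1} Kmn y / noise
    A = Kmm + Knm.T @ Knm / noise
    w = np.linalg.solve(A, Knm.T @ y / noise)
    return Ksm @ w

# Toy 1-d demo: 500 noisy sine observations, 20 inducing points
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, (500, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(500)
Xm = np.linspace(0, 10, 20)[:, None]
Xs = np.array([[2.0], [5.0]])
mu = sor_predict(X, y, Xm, Xs, ls=1.0, noise=0.01)
```

The pathologies the paper studies concern where `Xm` is placed and how many inducing points are used; a uniform grid like the one above is the naive baseline such work improves upon.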


#### 4 Citations

Batch-sequential design and heteroskedastic surrogate modeling for delta smelt conservation

- Mathematics, Computer Science
- 2020

A batch sequential design scheme is proposed, generalizing one-at-a-time variance-based active learning for HetGP surrogates, as a means of keeping multi-core cluster nodes fully engaged with expensive runs.

Large-scale local surrogate modeling of stochastic simulation experiments

- Mathematics
- 2021

Gaussian process (GP) regression in large-data contexts, which often arises in surrogate modeling of stochastic simulation experiments, is challenged by cubic runtimes. Coping with input-dependent…

Sensitivity Prewarping for Local Surrogate Modeling

- Mathematics, Computer Science
- ArXiv
- 2021

A framework is proposed for incorporating information from a global sensitivity analysis into the surrogate model as an input rotation and rescaling preprocessing step, and performs an input warping such that the "warped simulator" is equally sensitive to all input directions, freeing local models to focus on local dynamics.

Active Learning for Deep Gaussian Process Surrogates

- Mathematics, Computer Science
- 2020

This work transports a DGP's automatic warping of the input space and full uncertainty quantification, via a novel elliptical slice sampling (ESS) Bayesian posterior inferential scheme, through to active learning (AL) strategies that distribute runs non-uniformly in the input space, something an ordinary (stationary) GP could not do.

#### References

Showing 1–10 of 81 references

laGP: Large-Scale Spatial Modeling via Local Approximate Gaussian Processes in R

- Computer Science
- 2016

This work discusses an implementation of local approximate Gaussian process models, in the laGP package for R, that offers a particular sparse-matrix remedy uniquely positioned to leverage modern parallel computing architectures.

Speeding Up Neighborhood Search in Local Gaussian Process Prediction

- Mathematics, Computer Science
- Technometrics
- 2016

This work studies how predictive variance is reduced as local designs are built up for prediction, and suggests that searching the space radially, that is, continuously along rays emanating from the predictive location of interest, is a far thriftier alternative.

Distance-Distributed Design for Gaussian Process Surrogates

- Computer Science, Mathematics
- Technometrics
- 2021

This work studies the distribution of pairwise distances between design elements, develops a numerical scheme to optimize those distances for a given sample size and dimension, and proposes a family of new schemes by reverse engineering the qualities of the random designs which give the best estimates of GP length scales.

Gaussian predictive process models for large spatial data sets.

- Mathematics, Medicine
- Journal of the Royal Statistical Society. Series B, Statistical methodology
- 2008

This work achieves the flexibility to accommodate non-stationary, non-Gaussian, possibly multivariate, possibly spatiotemporal processes in the context of large data sets, in the form of a computational template encompassing these diverse settings.

Exact Gaussian Processes on a Million Data Points

- Computer Science, Mathematics
- NeurIPS
- 2019

A scalable approach for exact GPs is developed that leverages multi-GPU parallelization and methods like linear conjugate gradients, accessing the kernel matrix only through matrix multiplication, and is generally applicable, without constraints to grid data or specific kernel classes.

Emulating Satellite Drag from Large Simulation Experiments

- Computer Science, Mathematics
- SIAM/ASA J. Uncertain. Quantification
- 2019

This paper shows how extensions to the local approximate Gaussian Process (laGP) method allow accurate full-scale emulation, and demonstrates that the method achieves the desired level of accuracy when trained on seventy thousand core hours of drag simulations for two real-world satellites.

Hierarchical Nearest-Neighbor Gaussian Process Models for Large Geostatistical Datasets

- Computer Science, Medicine
- Journal of the American Statistical Association
- 2016

A class of highly scalable nearest-neighbor Gaussian process (NNGP) models is developed to provide fully model-based inference for large geostatistical datasets, and it is established that the NNGP is a well-defined spatial process providing legitimate finite-dimensional Gaussian densities with sparse precision matrices.

Sparse Gaussian Processes using Pseudo-inputs

- Computer Science
- NIPS
- 2005

It is shown that this new Gaussian process (GP) regression model can match full GP performance with small M, i.e. very sparse solutions, and that it significantly outperforms other approaches in this regime.

Local Gaussian Process Approximation for Large Computer Experiments

- Computer Science, Mathematics
- 2013

A family of local sequential design schemes is derived that dynamically defines the support of a Gaussian process predictor based on a local subset of the data, enabling a global predictor able to take advantage of modern multicore architectures.
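The local-subset idea can be sketched, for illustration, as a plain nearest-neighbor GP predictor. This is a hedged simplification: the cited method selects support points sequentially via a design criterion rather than by raw distance, and the kernel, lengthscale, and noise values below are assumptions.

```python
import numpy as np

def local_gp_mean(X, y, xs, k=30, ls=1.0, noise=1e-2):
    """GP mean at a single location xs, fit only to its k nearest
    training points. Each prediction costs O(k^3), not O(n^3),
    and predictions at different locations are independent,
    hence trivially parallelizable across cores.
    """
    idx = np.argsort(((X - xs) ** 2).sum(1))[:k]   # k nearest neighbors
    Xn, yn = X[idx], y[idx]
    K = np.exp(-0.5 * ((Xn[:, None, :] - Xn[None, :, :]) ** 2).sum(-1) / ls**2)
    ks = np.exp(-0.5 * ((Xn - xs) ** 2).sum(1) / ls**2)
    return ks @ np.linalg.solve(K + noise * np.eye(k), yn)

# Toy demo: 2000 noisy sine observations, predict at x = 3.0
rng = np.random.default_rng(1)
X = rng.uniform(0, 10, (2000, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(2000)
pred = local_gp_mean(X, y, np.array([3.0]), k=50)
```

Because each local fit sees only nearby data, the global predictor assembled from many such fits can adapt to nonstationarity that a single global GP would smooth over.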

Mercer kernels and integrated variance experimental design: connections between Gaussian process regression and polynomial approximation

- Mathematics, Computer Science
- SIAM/ASA J. Uncertain. Quantification
- 2016

This paper introduces algorithms for minimizing a posterior integrated variance (IVAR) design criterion for GP regression, and shows how IVAR-optimal designs, while sacrificing discrete orthogonality of the kernel eigenfunctions, can yield lower approximation error than orthogonalizing point sets.