Arvind K. Saibaba
Education
PhD, Computational and Mathematical Engineering, Stanford University, 2013
Area(s) of Expertise
Inverse Problems, Numerical Linear Algebra, Applications to Medical Imaging and Geosciences.
Publications
- A computational framework for edge-preserving regularization in dynamic inverse problems, ELECTRONIC TRANSACTIONS ON NUMERICAL ANALYSIS (2023)
- Efficient algorithms for Bayesian inverse problems with Whittle-Matérn priors, SIAM JOURNAL ON SCIENTIFIC COMPUTING (2023)
- Hybrid projection methods for solution decomposition in large-scale Bayesian inverse problems, SIAM JOURNAL ON SCIENTIFIC COMPUTING (2023)
- Monte Carlo methods for estimating the diagonal of a real symmetric matrix, SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS (2023)
- Randomized algorithms for rounding in the tensor-train format, SIAM JOURNAL ON SCIENTIFIC COMPUTING (2023)
- Randomized reduced basis methods for parameterized fractional elliptic PDEs, FINITE ELEMENTS IN ANALYSIS AND DESIGN (2023)
- Tensor-based flow reconstruction from optimally located sensor measurements, JOURNAL OF FLUID MECHANICS (2023)
- Computationally efficient methods for large-scale atmospheric inverse modeling, GEOSCIENTIFIC MODEL DEVELOPMENT (2022)
- Efficient randomized tensor-based algorithms for function approximation and low-rank kernel interactions, ADVANCES IN COMPUTATIONAL MATHEMATICS (2022)
- Kryging: geostatistical analysis of large-scale datasets using Krylov subspace methods, STATISTICS AND COMPUTING (2022)
Grants
This proposal aims to develop efficient randomized methods for two classes of inverse problems: dynamic inverse problems, in which reconstructions must be carried out at multiple time instances, and hierarchical Bayesian inverse problems, in which the unknowns must be reconstructed jointly with estimates of the hyperparameters. The proposed methods are expected to dramatically accelerate inverse problems across many disciplines in science and engineering; the algorithms will be validated on a testbed of model problems and illustrated on challenging applications in cosmic microwave background estimation and data assimilation problems in weather prediction.
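To make the hierarchical Bayesian setting concrete, here is a minimal sketch, not the proposal's algorithms: for a small linear Gaussian model, the prior precision is treated as a hyperparameter, selected by maximizing the evidence on a coarse grid, and the reconstruction is the resulting posterior mean. All names, sizes, and parameter values below are illustrative.

```python
# A minimal sketch (not the proposal's methods): hierarchical Bayesian linear
# inversion d = A x + noise with prior x ~ N(0, (1/lam) I) and noise ~ N(0, sig2 I).
# The prior precision lam is a hyperparameter chosen by maximizing the log evidence
# on a grid; the reconstruction is the posterior mean for that choice.
import numpy as np

rng = np.random.default_rng(0)
m, n = 80, 50
A = rng.standard_normal((m, n))
x_true = np.sin(np.linspace(0, 3 * np.pi, n))
sig2 = 0.05
d = A @ x_true + np.sqrt(sig2) * rng.standard_normal(m)

def log_evidence(lam):
    # Marginal covariance of the data: C = sig2*I + (1/lam) A A^T.
    C = sig2 * np.eye(m) + (A @ A.T) / lam
    sign, logdet = np.linalg.slogdet(C)
    return -0.5 * (logdet + d @ np.linalg.solve(C, d))

lams = np.logspace(-3, 3, 61)
lam_hat = lams[np.argmax([log_evidence(l) for l in lams])]

# Posterior mean (MAP estimate) for the selected hyperparameter.
x_map = np.linalg.solve(A.T @ A / sig2 + lam_hat * np.eye(n), A.T @ d / sig2)
print(f"selected lam = {lam_hat:.3g}, relative error = "
      f"{np.linalg.norm(x_map - x_true) / np.linalg.norm(x_true):.3f}")
```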
This project aims to develop scalable randomized algorithms---achieving order-of-magnitude gains in efficiency and provable performance guarantees---for data collection and processing in Bayesian inverse problems. Inverse problems use experimental data to infer parameters governing physical models. In settings where data collection is time-consuming or expensive, optimal experimental design (OED) for inverse problems seeks to determine experimental conditions for data acquisition that minimize uncertainty (or maximize information gain) in parameters, predictions, or other quantities of interest---subject to budgetary constraints. Naively processing large volumes of data (e.g., those generated in DOE user facilities) can make inversion algorithms impractically slow, so OED is also essential in such situations. More generally, OED is relevant to many DOE mission-relevant applications that we will tackle in our proposed work: imaging and tomography; storing and processing massive datasets; and global climate modeling.
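As a small illustration of the OED idea, and not the proposed algorithms, the sketch below greedily selects a budget-limited subset of candidate measurements for a linear Gaussian inverse problem so as to minimize the trace of the posterior covariance (an A-optimality criterion). The operator, noise level, and budget are hypothetical.

```python
# A minimal sketch of A-optimal experimental design for a linear Gaussian inverse
# problem: greedily pick k candidate measurements (rows of A) to minimize the trace
# of the posterior covariance, i.e., the total posterior variance of the parameters.
import numpy as np

rng = np.random.default_rng(1)
n_candidates, n_params, k = 100, 20, 10
A = rng.standard_normal((n_candidates, n_params))   # candidate observation operators
noise_var, prior_var = 0.1, 1.0

chosen = []
prec = np.eye(n_params) / prior_var                 # prior precision
for _ in range(k):
    best, best_trace = None, np.inf
    for j in range(n_candidates):
        if j in chosen:
            continue
        cand_prec = prec + np.outer(A[j], A[j]) / noise_var
        tr = np.trace(np.linalg.inv(cand_prec))     # A-optimality criterion
        if tr < best_trace:
            best, best_trace = j, tr
    chosen.append(best)
    prec = prec + np.outer(A[best], A[best]) / noise_var

print("selected sensors:", sorted(chosen), "posterior trace:", round(best_trace, 3))
```

The dense matrix inverses here are only feasible because the example is tiny; the randomized methods the project targets are aimed at replacing exactly this kind of brute-force computation at scale.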
My work is motivated by the need to visualize regions that are difficult to "see." For example, in biomedical applications, accurate visualization of tissue is needed to diagnose and treat tumors. In the U.S. alone, there are nearly 1.7 million new cancer diagnoses each year. Effective imaging technologies can help doctors make life-saving decisions. Each imaging technique requires solving an inverse problem in order to transform indirect data into detailed image reconstructions of one or more parameters of interest. What makes inverse problems challenging is that each step of the image reconstruction is inherently riddled with uncertainty. For example, uncertainty commonly arises from noisy sensor measurements, experimental constraints on collecting sufficient data, difficulty in modeling the physical processes accurately, and limited computational resources for solving the image reconstruction. However, the field of uncertainty quantification (UQ) in imaging is in its infancy. UQ is challenging because a single inversion for image reconstruction can itself be time-consuming, and generating statistics for UQ often requires thousands of additional inversions beyond the initial one. Current approaches to UQ for inverse problems are inadequate because they either fail to deliver solutions in a reasonable computational time or lack applicability across a broad range of imaging technologies. I propose to develop fast and accurate algorithms for large-scale inverse problems that are applicable to a broad range of imaging technologies and that explicitly integrate UQ with the image reconstruction. The resulting algorithms will be validated on a broad range of applications, with rigorous analyses that show clear trade-offs between computational cost and accuracy. The advances made here are anticipated to help scientists and practitioners make informed decisions across a wide range of applications.
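The following is a minimal, hypothetical sketch of what "integrating UQ with the reconstruction" can look like for a tiny linear imaging problem: alongside the reconstructed image, posterior samples yield pixel-wise uncertainty. The 1D blurring operator and all parameter values are assumptions for illustration; the problems targeted above are far too large for the dense factorizations used here.

```python
# A minimal sketch: Bayesian reconstruction of a 1D "image" from blurred, noisy data,
# with pixel-wise uncertainty obtained from posterior samples (the UQ part).
import numpy as np

rng = np.random.default_rng(2)
n = 64
t = np.arange(n)
# Gaussian blurring forward operator (hypothetical point-spread function).
A = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 2.0) ** 2)
x_true = (np.abs(t - 32) < 10).astype(float)        # piecewise-constant "image"
sig2, lam = 1e-2, 1.0
d = A @ x_true + np.sqrt(sig2) * rng.standard_normal(n)

# Gaussian posterior: precision H, mean x_map, covariance H^{-1}.
H = A.T @ A / sig2 + lam * np.eye(n)
x_map = np.linalg.solve(H, A.T @ d / sig2)
C_post = np.linalg.inv(H)

# Draw posterior samples and report pixel-wise standard deviations.
L = np.linalg.cholesky(C_post)
samples = x_map[:, None] + L @ rng.standard_normal((n, 500))
print("max pixel-wise posterior std:", samples.std(axis=1).max())
```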
Quantifying the effects of input variations on computed outputs is critical in scientific computing; it is generally referred to as sensitivity analysis (SA). SA plays a key role in the verification, the understanding and, central to this proposal, the simplification of models. Indeed, models are often "simplified" by constructing mathematical surrogates whose properties approximate some, but not all, of the properties of the original model, with the goal of enabling computational study. We propose to develop computational methods to (i) identify the "important parts" of a model and (ii) quantify the effects of ignoring the "less important parts." In other words, our work will contribute to dimension reduction and sensitivity analysis. The research is organized around three complementary thrusts in numerical linear algebra, nonlinear solvers, and global sensitivity analysis. The RTG participants will not just "be trained"; they will play an essential role in our research activities. Group dynamics and esprit de corps will be generated through working groups. Every year, we will organize three working groups consisting of undergraduates, graduate trainees, postdocs, and faculty. Each group will be active for one semester and be led by a member of the senior personnel, possibly jointly with one of the postdoctoral fellows. The working groups will concentrate on specific aspects of randomized numerical analysis through hands-on exploratory activities of either a computational or analytical nature. Advanced graduate students will be in charge of introducing the undergraduates to basic concepts through short presentations. The project will support 5 undergraduates per year as well as, over the duration of the award, 9 graduate students and 2 postdocs. The project will have a lasting impact on curriculum and departmental activities. It also has significant outreach components to (i) industry and national labs, (ii) high school students (through the North Carolina School of Science and Mathematics), and (iii) the public, as well as through the development of graduate distance-education courses. Our public outreach efforts will take place in collaboration with the NC State Office of Public Science.
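To illustrate the global sensitivity analysis thrust, here is a minimal sketch, not the project's methods, of estimating first-order Sobol indices for a toy model with a pick-and-freeze Monte Carlo estimator. The model and sample sizes are assumptions chosen so the "important" and "less important" inputs are obvious.

```python
# A minimal sketch of variance-based global sensitivity analysis: first-order Sobol
# indices S_i = Var(E[Y|X_i]) / Var(Y), estimated by a pick-and-freeze covariance.
import numpy as np

rng = np.random.default_rng(3)

def model(x):
    # Toy model: input 2 dominates, input 3 is nearly inert.
    return x[:, 0] + 4.0 * x[:, 1] ** 2 + 0.1 * np.sin(x[:, 2])

N, dim = 100_000, 3
X = rng.uniform(-1, 1, size=(N, dim))
Xp = rng.uniform(-1, 1, size=(N, dim))
fX = model(X)
var_total = fX.var()

for i in range(dim):
    Xi = Xp.copy()
    Xi[:, i] = X[:, i]              # freeze coordinate i, resample the rest
    S_i = np.cov(fX, model(Xi))[0, 1] / var_total
    print(f"first-order Sobol index S_{i + 1} ~ {S_i:.2f}")
```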
Large-scale anomalous emissions of greenhouse gases or air pollutants threaten human health in the vicinity of the emissions, jeopardize state emission targets, and undermine energy security. Two recent high-profile blowouts, involving natural gas and methane respectively, underscore the need for early detection and intervention. New satellites are being launched with the specific purpose of detecting and monitoring greenhouse gas emissions, and recent studies have demonstrated the potential to detect these blowout events from the data collected by these satellites by solving large-scale inverse problems. There are enormous computational challenges because of the massive amounts of satellite data that need to be processed and the fine-scale resolution at which the reconstructions are needed for threat detection. Furthermore, a flexible approach is necessary for incorporating prior information and estimating the hyperparameters that determine the prior. New algorithms are therefore required to handle these computational challenges, and developing them is the focus of this proposal.
Many problems in scientific computing involve data or operators that are inherently multidimensional, yet standard numerical methods treat these problems as two-dimensional objects, i.e., matrices. A single image can define a matrix, but a database of images is more naturally handled as a tensor. Recent research has shown that tensors (a.k.a. multiway arrays) and a number of corresponding decomposition methods can be instrumental in revealing latent correlations residing in high-dimensional spaces. Indeed, tensor decompositions can be provably superior to their matrix-based counterparts (1) in representing certain types of tensor data, for example in facial recognition, and (2) in reorganizing computational steps to yield efficient parallelization. We therefore propose to develop and use our tensor decompositions in the context of several important applications that are traditionally treated through matrix-based approaches. Our research tackles two important questions: How can we identify and exploit tensor structure and algebra in scientific applications? How do we use these revealed structures to develop a powerful computational framework that harvests the benefits of this structure?
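As a small, NumPy-only illustration of treating a stack of images as a tensor rather than a matrix, the sketch below compresses a synthetic third-order array with a truncated higher-order SVD (Tucker/HOSVD) and compares storage against a matrix SVD of the flattened data. This is not the specific decompositions proposed; the data and ranks are assumptions.

```python
# A minimal sketch: truncated HOSVD (Tucker) compression of a synthetic image stack
# (height x width x image index), compared with a matrix SVD of the flattened data.
import numpy as np

rng = np.random.default_rng(4)
n1, n2, n3, r = 40, 40, 30, 5
# Synthetic low-multilinear-rank tensor: core G and orthonormal factor matrices.
G = rng.standard_normal((r, r, r))
U = [np.linalg.qr(rng.standard_normal((n, r)))[0] for n in (n1, n2, n3)]
T = np.einsum('abc,ia,jb,kc->ijk', G, *U)

def unfold(X, mode):
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

# Truncated HOSVD: leading left singular vectors of each unfolding, then a small core.
factors = [np.linalg.svd(unfold(T, m), full_matrices=False)[0][:, :r] for m in range(3)]
core = np.einsum('ijk,ia,jb,kc->abc', T, *factors)
T_hat = np.einsum('abc,ia,jb,kc->ijk', core, *factors)
print("HOSVD relative error:", np.linalg.norm(T_hat - T) / np.linalg.norm(T))

# Matrix view: flatten each image into a column and truncate the SVD at the same rank.
M = T.reshape(n1 * n2, n3)
Um, s, Vt = np.linalg.svd(M, full_matrices=False)
M_hat = Um[:, :r] * s[:r] @ Vt[:r]
print("matrix-SVD relative error:", np.linalg.norm(M_hat - M) / np.linalg.norm(M))

# Both are (near-)exact on this synthetic data, but the tensor format stores far less.
hosvd_storage = core.size + sum(f.size for f in factors)
svd_storage = Um[:, :r].size + s[:r].size + Vt[:r].size
print("stored numbers -- HOSVD:", hosvd_storage, " matrix SVD:", svd_storage)
```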
Overview: Image reconstruction problems occur in a plethora of practical settings. Consider, for example, the need to "see" water, oil, or pollutants under the earth's surface; the desire to visualize cancers within the body; the urgency in detecting explosive materials in crowded venues. Their solution is commonly formulated as a deterministic, regularized inverse problem, yielding a single image. Under this formulation, inversion is cast as an optimization problem whose solution requires tens to hundreds of evaluations of the underlying objective function. For the problems of interest here, where the forward model (the map from the parameters of interest to the data) is defined by a partial differential equation (PDE), each such evaluation itself involves the numerical solution of thousands of PDEs. Of increasing interest is extending these methods to a Bayesian setting, where not only point estimates are available (e.g., the maximum a posteriori solution, or MAP estimate), but where sophisticated sampling techniques can be used for uncertainty quantification (UQ); that is, assessing the reliability of the information in the reconstruction. The computational burden of such processing, however, is many times that of a single reconstruction, putting this analysis out of reach for many applications. This project aims to address the computational barriers to UQ for the large collection of important applications where the images of interest are nearly piecewise constant (NPC) and well described by (relatively) low-order parametric representations that are essentially grid-independent.
Intellectual Merit: The costs of solving large 3D inverse problems with many measurements are formidable. Often, we need real-time solutions to support crucial decisions, with access to, at best, a fast multicore workstation with accelerators. We propose algorithms to reduce solution times by orders of magnitude. Our methods will also support computationally expensive Bayesian inversion and UQ, which ultimately supports better-informed decisions. Four key attributes of this proposal address computational barriers to inverse problems with large data sets and provide an efficient alternative to the usual approach: (1) use of randomization and/or model reduction to evaluate the objective function; (2) use of complementary strategies to reduce the number of posterior samples required for accurate solutions, including randomized MCMC techniques; (3) direct resolution of features living in a low-dimensional feature space, without the intermediate phase of generating the images from which the features could have been deduced; and (4) demonstration of the success of our techniques on several important inverse problems, including medical imaging, luggage screening, and groundwater remediation.
Broader impacts: Our computational innovations will have applications beyond those listed, e.g., in topology optimization and for other imaging modalities. Model reduction and randomization methods can reduce the cost of any objective-function minimization that includes many terms that each require an expensive simulation. Similarly, estimating bilinear and quadratic forms is ubiquitous in scientific computing applications. Moreover, in the context of uncertainty quantification, use of our methods has the potential to provide practitioners/clinicians with more accurate tools to guide decisions.
Several innovations in the use of stochastic solvers and randomized computations have significant potential outside the imaging community. The project will also develop a new graduate course on stochastic techniques for inverse problems, complementing existing graduate courses. The course materials will be modular and available on the web. The project will support existing efforts to recruit female graduate students and provide them with valuable interdisciplinary education.
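To illustrate the kind of randomized computation alluded to above, here is a minimal sketch, not this project's algorithms, of Hutchinson's Monte Carlo estimator, which approximates the trace of a matrix as an average of quadratic forms using only matrix-vector products. The test matrix and probe count are assumptions.

```python
# A minimal sketch of randomized estimation of a trace via quadratic forms z^T A z
# (Hutchinson's estimator), which requires only matrix-vector products with A.
import numpy as np

rng = np.random.default_rng(5)
n = 500
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
A = Q @ np.diag(np.linspace(1, 10, n)) @ Q.T          # SPD test matrix

num_probes = 200
Z = rng.choice([-1.0, 1.0], size=(n, num_probes))     # Rademacher probe vectors
trace_est = np.mean(np.einsum('ij,ij->j', Z, A @ Z))  # average of z_j^T A z_j
print(f"estimate {trace_est:.1f} vs exact {np.trace(A):.1f}")
```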
Honors and Awards
- NSF CAREER Award 2019
- Water Resources Research Editor's Choice Award 2013