see Google Scholar for the most up-to-date information.
recent preprints
-
ClimaX: A foundation model for weather and climate
Tung Nguyen, Johannes Brandstetter, Ashish Kapoor, Jayesh K Gupta, and Aditya Grover
arXiv preprint arXiv:2301.10343, 2023
Spotlight Oral Presentation at the ICLR Workshop on Tackling Climate Change with Machine Learning
Most state-of-the-art approaches for weather and climate modeling are based on physics-informed numerical models of the atmosphere. These approaches aim to model the non-linear dynamics and complex interactions between multiple variables, which are challenging to approximate. Additionally, many such numerical models are computationally intensive, especially when modeling the atmospheric phenomenon at a fine-grained spatial and temporal resolution. Recent data-driven approaches based on machine learning instead aim to directly solve a downstream forecasting or projection task by learning a data-driven functional mapping using deep neural networks. However, these networks are trained using curated and homogeneous climate datasets for specific spatiotemporal tasks, and thus lack the generality of numerical models. We develop and demonstrate ClimaX, a flexible and generalizable deep learning model for weather and climate science that can be trained using heterogeneous datasets spanning different variables, spatio-temporal coverage, and physical groundings. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute while maintaining general utility. ClimaX is pre-trained with a self-supervised learning objective on climate datasets derived from CMIP6. The pre-trained ClimaX can then be fine-tuned to address a breadth of climate and weather tasks, including those that involve atmospheric variables and spatio-temporal scales unseen during pretraining. Compared to existing data-driven baselines, we show that this generality in ClimaX results in superior performance on benchmarks for weather forecasting and climate projections, even when pretrained at lower resolutions and compute budgets.
-
Leaving Reality to Imagination: Robust Classification via Generated Datasets
Hritik Bansal, and Aditya Grover
2023
https://arxiv.org/abs/2302.02503
-
Generative pretraining for black-box optimization
Siddarth Krishnamoorthy, Satvik Mehul Mashkaria, and Aditya Grover
arXiv preprint arXiv:2206.10786, 2022
Oral Presentation at the NeurIPS Workshop on Foundation Models for Decision Making
Many problems in science and engineering involve optimizing an expensive black-box function over a high-dimensional space. For such black-box optimization (BBO) problems, we typically assume a small budget for online function evaluations, but also often have access to a fixed, offline dataset for pretraining. Prior approaches seek to utilize the offline data to approximate the function or its inverse but are not sufficiently accurate far from the data distribution. We propose BONET, a generative framework for pretraining a novel black-box optimizer using offline datasets. In BONET, we train an autoregressive model on fixed-length trajectories derived from an offline dataset. We design a sampling strategy to synthesize trajectories from offline data using a simple heuristic of rolling out monotonic transitions from low-fidelity to high-fidelity samples. Empirically, we instantiate BONET using a causally masked Transformer and evaluate it on Design-Bench, where we rank the best on average, outperforming state-of-the-art baselines.
-
Reliable Conditioning of Behavioral Cloning for Offline Reinforcement Learning
Tung Nguyen, Qinqing Zheng, and Aditya Grover
arXiv preprint arXiv:2210.05158, 2022
Behavioral cloning (BC) provides a straightforward solution to offline RL by mimicking offline trajectories via supervised learning. Recent advances (Chen et al., 2021; Janner et al., 2021; Emmons et al., 2021) have shown that by conditioning on desired future returns, BC can perform competitively to their value-based counterparts, while enjoying much more simplicity and training stability. While promising, we show that these methods can be unreliable, as their performance may degrade significantly when conditioned on high, out-of-distribution (ood) returns. This is crucial in practice, as we often expect the policy to perform better than the offline dataset by conditioning on an ood value. We show that this unreliability arises from both the suboptimality of training data and model architectures. We propose ConserWeightive Behavioral Cloning (CWBC), a simple and effective method for improving the reliability of conditional BC with two key components: trajectory weighting and conservative regularization. Trajectory weighting upweights the high-return trajectories to reduce the train-test gap for BC methods, while conservative regularizer encourages the policy to stay close to the data distribution for ood conditioning. We study CWBC in the context of RvS (Emmons et al., 2021) and Decision Transformers (Chen et al., 2021), and show that CWBC significantly boosts their performance on various benchmarks.
-
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
Qinqing Zheng, Mikael Henaff, Brandon Amos, and Aditya Grover
arXiv preprint arXiv:2210.06518, 2022
Natural agents can effectively learn from multiple data sources that differ in size, quality, and types of measurements. We study this heterogeneity in the context of offline reinforcement learning (RL) by introducing a new, practically motivated semi-supervised setting. Here, an agent has access to two sets of trajectories: labelled trajectories containing state, action, reward triplets at every timestep, along with unlabelled trajectories that contain only state and reward information. For this setting, we develop and study a simple meta-algorithmic pipeline that learns an inverse dynamics model on the labelled data to obtain proxy-labels for the unlabelled data, followed by the use of any offline RL algorithm on the true and proxy-labelled trajectories. Empirically, we find this simple pipeline to be highly successful - on several D4RL benchmarks, certain offline RL algorithms can match the performance of variants trained on a fully labelled dataset even when we label only 10% trajectories from the low return regime. To strengthen our understanding, we perform a large-scale controlled empirical study investigating the interplay of data-centric properties of the labelled and unlabelled datasets, with algorithmic design choices (e.g., choice of inverse dynamics, offline RL algorithm) to identify general trends and best practices for training RL agents on semi-supervised offline datasets.
journal and conference articles
2023
-
Scaling Pareto-Efficient Decision Making via Offline Multi-Objective RL
Baiting Zhu, Meihua Dang, and Aditya Grover
In International Conference on Learning Representations (ICLR), 2023
The goal of multi-objective reinforcement learning (MORL) is to learn policies that simultaneously optimize multiple competing objectives. In practice, an agent’s preferences over the objectives may not be known a priori, and hence, we require policies that can generalize to arbitrary preferences at test time. In this work, we propose a new data-driven setup for offline MORL, where we wish to learn a preference-agnostic policy agent using only a finite dataset of offline demonstrations of other agents and their preferences. The key contributions of this work are two-fold. First, we introduce D4MORL, (D)atasets for MORL that are specifically designed for offline settings. It contains 1.8 million annotated demonstrations obtained by rolling out reference policies that optimize for randomly sampled preferences on 6 MuJoCo environments with 2-3 objectives each. Second, we propose Pareto-Efficient Decision Agents (PEDA), a family of offline MORL algorithms that builds and extends Decision Transformers via a novel preference-and-return-conditioned policy. Empirically, we show that PEDA closely approximates the behavioral policy on the D4MORL benchmark and provides an excellent approximation of the Pareto-front with appropriate conditioning, as measured by the hypervolume and sparsity metrics.
2022
-
Controllable Generative Modeling via Causal Reasoning
Joey Bose, Ricardo Pio Monti, and Aditya Grover
Transactions of Machine Learning Research, 2022
Deep latent variable generative models excel at generating complex, high-dimensional data, often exhibiting impressive generalization beyond the training distribution. However, many such models in use today are black-boxes trained on large unlabelled datasets with statistical objectives and lack an interpretable understanding of the latent space required for controlling the generative process. We propose CAGE, a framework for controllable generation in latent variable models based on causal reasoning. Given a pair of attributes, CAGE infers the implicit cause-effect relationships between these attributes as induced by a deep generative model. This is achieved by defining and estimating a novel notion of unit-level causal effects in the latent space of the generative model. Thereafter, we use the inferred cause-effect relationships to design a novel strategy for controllable generation based on counterfactual sampling. Through a series of large-scale synthetic and human evaluations, we demonstrate that generating counterfactual samples which respect the underlying causal relationships inferred via CAGE leads to subjectively more realistic images.
-
Masked Autoencoding for Scalable and Generalizable Decision Making
Fangchen Liu, Hao Liu, Aditya Grover, and Pieter Abbeel
In Advances in Neural Information Processing Systems (NeurIPS), 2022
We are interested in learning scalable agents for reinforcement learning that can learn from large-scale, diverse sequential data similar to current large vision and language models. To this end, this paper presents masked decision prediction (MaskDP), a simple and scalable self-supervised pretraining method for reinforcement learning (RL) and behavioral cloning (BC). In our MaskDP approach, we apply a masked autoencoder (MAE) to state-action trajectories, wherein we randomly mask state and action tokens and reconstruct the missing data. By doing so, the model is required to infer masked out states and actions and extract information about dynamics. We find that masking different proportions of the input sequence significantly helps with learning a better model that generalizes well to multiple downstream tasks. In our empirical study we find that a MaskDP model gains the capability of zero-shot transfer to new BC tasks, such as single and multiple goal reaching, and it can zero-shot infer skills from a few example transitions. In addition, MaskDP transfers well to offline RL and shows promising scaling behavior w.r.t. model size. It is amenable to data efficient finetuning, achieving competitive results with prior methods based on autoregressive pretraining.
-
CyCLIP: Cyclic Contrastive Language-Image Pretraining
Shashank Goel, Hritik Bansal, Sumit Bhatia, Ryan A Rossi, Vishwa Vinay, and Aditya Grover
In Advances in Neural Information Processing Systems (NeurIPS), 2022
Recent advances in contrastive representation learning over paired image-text data have led to models such as CLIP that achieve state-of-the-art performance for zero-shot classification and distributional robustness. Such models typically require joint reasoning in the image and text representation spaces for downstream inference tasks. Contrary to prior beliefs, we demonstrate that the image and text representations learned via a standard contrastive objective are not interchangeable and can lead to inconsistent downstream predictions. To mitigate this issue, we formalize consistency and propose CyCLIP, a framework for contrastive representation learning that explicitly optimizes for the learned representations to be geometrically consistent in the image and text space. In particular, we show that consistent representations can be learned by explicitly symmetrizing (a) the similarity between the two mismatched image-text pairs (cross-modal consistency); and (b) the similarity between the image-image pair and the text-text pair (in-modal consistency). Empirically, we show that the improved consistency in CyCLIP translates to significant gains over CLIP, with gains ranging from 10%-24% for zero-shot classification accuracy on standard benchmarks (CIFAR-10, CIFAR-100, ImageNet1K) and 10%-27% for robustness to various natural distribution shifts.
-
Online decision transformer
Qinqing Zheng, Amy Zhang, and Aditya Grover
In International Conference on Machine Learning (ICML), 2022
Long Oral Presentation
Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling. However, any practical instantiation of RL also involves an online component, where policies pretrained on passive offline datasets are finetuned via task-specific interactions with the environment. We propose Online Decision Transformers (ODT), an RL algorithm based on sequence modeling that blends offline pretraining with online finetuning in a unified framework. Our framework uses sequence-level entropy regularizers in conjunction with autoregressive modeling objectives for sample-efficient exploration and finetuning. Empirically, we show that ODT is competitive with the state-of-the-art in absolute performance on the D4RL benchmark but shows much more significant gains during the finetuning procedure.
-
Matching normalizing flows and probability paths on manifolds
Heli Ben-Hamu, Samuel Cohen, Joey Bose, Brandon Amos, Aditya Grover, Maximilian Nickel, Ricky Chen, and Yaron Lipman
In International Conference on Machine Learning (ICML), 2022
Continuous Normalizing Flows (CNFs) are a class of generative models that transform a prior distribution to a model distribution by solving an ordinary differential equation (ODE). We propose to train CNFs on manifolds by minimizing probability path divergence (PPD), a novel family of divergences between the probability density path generated by the CNF and a target probability density path. PPD is formulated using a logarithmic mass conservation formula which is a linear first order partial differential equation relating the log target probabilities and the CNF’s defining vector field. PPD has several key benefits over existing methods: it sidesteps the need to solve an ODE per iteration, readily applies to manifold data, scales to high dimensions, and is compatible with a large family of target paths interpolating pure noise and data in finite time. Theoretically, PPD is shown to bound classical probability divergences. Empirically, we show that CNFs learned by minimizing PPD achieve state-of-the-art results in likelihoods and sample quality on existing low-dimensional manifold benchmarks, and is the first example of a generative model to scale to moderately high dimensional manifolds.
-
It Takes Four to Tango: Multiagent Selfplay for Automatic Curriculum Generation
Yuqing Du, Pieter Abbeel, and Aditya Grover
In International Conference on Learning Representations (ICLR), 2022
We are interested in training general-purpose reinforcement learning agents that can solve a wide variety of goals. Training such agents efficiently requires automatic generation of a goal curriculum. This is challenging as it requires (a) exploring goals of increasing difficulty, while ensuring that the agent (b) is exposed to a diverse set of goals in a sample efficient manner and (c) does not catastrophically forget previously solved goals. We propose Curriculum Self Play (CuSP), an automated goal generation framework that seeks to satisfy these desiderata by virtue of a multi-player game with four agents. We extend the asymmetric curricula learning in PAIRED (Dennis et al., 2020) to a symmetrized game that carefully balances cooperation and competition between two off-policy student learners and two regret-maximizing teachers. CuSP additionally introduces entropic goal coverage and accounts for the non-stationary nature of the students, allowing us to automatically induce a curriculum that balances progressive exploration with anti-catastrophic exploitation. We demonstrate that our method succeeds at generating an effective curriculum of goals for a range of control tasks, outperforming other methods at zero-shot test-time generalization to novel out-of-distribution goals.
-
Frame averaging for invariant and equivariant network design
Omri Puny, Matan Atzmon, Heli Ben-Hamu, Edward J Smith, Ishan Misra, Aditya Grover, and Yaron Lipman
In International Conference on Learning Representations (ICLR), 2022
Many machine learning tasks involve learning functions that are known to be invariant or equivariant to certain symmetries of the input data. However, it is often challenging to design neural network architectures that respect these symmetries while being expressive and computationally efficient. For example, Euclidean motion invariant/equivariant graph or point cloud neural networks. We introduce Frame Averaging (FA), a general purpose and systematic framework for adapting known (backbone) architectures to become invariant or equivariant to new symmetry types. Our framework builds on the well known group averaging operator that guarantees invariance or equivariance but is intractable. In contrast, we observe that for many important classes of symmetries, this operator can be replaced with an averaging operator over a small subset of the group elements, called a frame. We show that averaging over a frame guarantees exact invariance or equivariance while often being much simpler to compute than averaging over the entire group. Furthermore, we prove that FA-based models have maximal expressive power in a broad setting and in general preserve the expressive power of their backbone architectures. Using frame averaging, we propose a new class of universal Graph Neural Networks (GNNs), universal Euclidean motion invariant point cloud networks, and Euclidean motion invariant Message Passing (MP) GNNs. We demonstrate the practical effectiveness of FA on several applications including point cloud normal estimation, beyond 2-WL graph separation, and n-body dynamics prediction, achieving state-of-the-art results in all of these benchmarks.
-
Pretrained transformers as universal computation engines
Kevin Lu, Aditya Grover, Pieter Abbeel, and Igor Mordatch
In AAAI Conference on Artificial Intelligence, 2022
We investigate the capability of a transformer pretrained on natural language to generalize to other modalities with minimal finetuning – in particular, without finetuning of the self-attention and feedforward layers of the residual blocks. We consider such a model, which we call a Frozen Pretrained Transformer (FPT), and study finetuning it on a variety of sequence classification tasks spanning numerical computation, vision, and protein fold prediction. In contrast to prior works which investigate finetuning on the same modality as the pretraining dataset, we show that pretraining on natural language can improve performance and compute efficiency on non-language downstream tasks. Additionally, we perform an analysis of the architecture, comparing the performance of a random initialized transformer to a random LSTM. Combining the two insights, we find language-pretrained transformers can obtain strong performance on a variety of non-language tasks.
2021
-
Moser flow: Divergence-based generative modeling on manifolds
Noam Rozen, Aditya Grover, Maximilian Nickel, and Yaron Lipman
Advances in Neural Information Processing Systems (NeurIPS), 2021
Outstanding Paper Award
-
BCD nets: Scalable variational approaches for bayesian causal discovery
Chris Cundy, Aditya Grover, and Stefano Ermon
Advances in Neural Information Processing Systems (NeurIPS), 2021
-
Pirank: Scalable learning to rank via differentiable sorting
Robin Swezey, Aditya Grover, Bruno Charron, and Stefano Ermon
Advances in Neural Information Processing Systems (NeurIPS), 2021
-
Decision transformer: Reinforcement learning via sequence modeling
Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Misha Laskin, Pieter Abbeel, Aravind Srinivas, and Igor Mordatch
Advances in Neural Information Processing Systems (NeurIPS), 2021
-
Bayesian learning for rapid prediction of lithium-ion battery-cycling protocols
Benben Jiang, William E Gent, Fabian Mohr, Supratim Das, Marc D Berliner, Michael Forsuelo, Hongbo Zhao, Peter M Attia, Aditya Grover, Patrick K Herring, and others
Joule, 2021
-
Anytime sampling for autoregressive models via ordered autoencoding
Yilun Xu, Yang Song, Sahaj Garg, Linyuan Gong, Rui Shu, Aditya Grover, and Stefano Ermon
2021
-
Reset-free lifelong learning with skill-space planning
Kevin Lu, Aditya Grover, Pieter Abbeel, and Igor Mordatch
2021
-
Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits
Wenshuo Guo, Kumar Krishna Agrawal, Aditya Grover, Vidya Muthukumar, and Ashwin Pananjady
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2021
2020
-
Closed-loop optimization of extreme fast charging for batteries using machine learning
Peter Attia, Aditya Grover, Norman Jin, Kristen Severson, Bryan Cheong, Jerry Liao, Michael H Chen, Nicholas Perkins, Zi Yang, Patrick Herring, Muratahan Aykol, Stephen Harris, Richard Braatz, Stefano Ermon, and William Chueh
Nature, 2020
-
Fair Generative Modeling via Weak Supervision
Kristy Choi, Aditya Grover, Trisha Singh, Rui Shu, and Stefano Ermon
In International Conference on Machine Learning (ICML), 2020
-
Permutation Invariant Graph Generation via Score-Based Generative Modeling
Chenhao Niu, Yang Song, Jiaming Song, Shengjia Zhao, Aditya Grover, and Stefano Ermon
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2020
-
AlignFlow: Cycle Consistent Learning from Multiple Domains via Normalizing Flows
Aditya Grover, Christopher Chute, Rui Shu, Zhangjie Cao, and Stefano Ermon
In AAAI Conference on Artificial Intelligence, 2020
2019
-
Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting
Aditya Grover, Jiaming Song, Alekh Agarwal, Kenneth Tran, Ashish Kapoor, Eric Horvitz, and Stefano Ermon
In Advances in Neural Information Processing Systems (NeurIPS), 2019
-
Graphite: Iterative generative modeling of graphs
Aditya Grover, Aaron Zweig, and Stefano Ermon
In International Conference on Machine Learning (ICML), 2019
-
Neural Joint Source-Channel Coding
Kristy Choi, Kedar Tatwawadi, Aditya Grover, Tsachy Weissman, and Stefano Ermon
In International Conference on Machine Learning (ICML), 2019
-
Stochastic Optimization of Sorting Networks via Continuous Relaxations
Aditya Grover, Eric Wang, Aaron Zweig, and Stefano Ermon
In International Conference on Learning Representations (ICLR), 2019
-
Uncertainty Autoencoders: Learning Compressed Representations via Variational Information Maximization
Aditya Grover, and Stefano Ermon
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
-
Learning Controllable Fair Representations
Jiaming Song, Pratyusha Kalluri, Aditya Grover, Shengjia Zhao, and Stefano Ermon
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2019
2018
-
Streamlining variational inference for constraint satisfaction problems
Aditya Grover, Tudor Achim, and Stefano Ermon
In Advances in Neural Information Processing Systems (NeurIPS), 2018
-
Learning Policy Representations in Multiagent Systems
Aditya Grover, Maruan Al-Shedivat, Jayesh K Gupta, Yura Burda, and Harrison Edwards
In International Conference on Machine Learning (ICML), 2018
-
Modeling sparse deviations for compressed sensing using generative models
Manik Dhar, Aditya Grover, and Stefano Ermon
In International Conference on Machine Learning (ICML), 2018
-
Best arm identification in multi-armed bandits with delayed feedback
Aditya Grover, Todor Markov, Peter Attia, Norman Jin, Nicholas Perkins, Bryan Cheong, Michael Chen, Zi Yang, Stephen Harris, William Chueh, and Stefano Ermon
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2018
-
Variational Rejection Sampling
Aditya Grover, Ramki Gummadi, Miguel Lazaro-Gredilla, Dale Schuurmans, and Stefano Ermon
In International Conference on Artificial Intelligence and Statistics (AISTATS), 2018
-
Evaluating Generalization in Multiagent Systems using Agent-Interaction Graphs
Aditya Grover, Maruan Al-Shedivat, Jayesh K Gupta, Yuri Burda, and Harrison Edwards
In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2018
-
Boosted generative models
Aditya Grover, and Stefano Ermon
In AAAI Conference on Artificial Intelligence, 2018
-
Flow-GAN: Combining maximum likelihood and adversarial learning in generative models
Aditya Grover, Manik Dhar, and Stefano Ermon
In AAAI Conference on Artificial Intelligence, 2018
2016
-
Variational Bayes on Monte Carlo Steroids
Aditya Grover, and Stefano Ermon
In Advances in Neural Information Processing Systems (NeurIPS), 2016
-
node2vec: Scalable Feature Learning for Networks
Aditya Grover, and Jure Leskovec
In International Conference on Knowledge Discovery and Data Mining (KDD), 2016
-
Contextual Symmetries in Probabilistic Graphical Models
Ankit Anand, Aditya Grover, Mausam, and Parag Singla
In International Joint Conference on Artificial Intelligence (IJCAI), 2016
2015
-
A deep hybrid model for weather forecasting
Aditya Grover, Ashish Kapoor, and Eric Horvitz
In International Conference on Knowledge Discovery and Data Mining (KDD), 2015
-
ASAP-UCT: abstraction of state-action pairs in UCT
Ankit Anand, Aditya Grover, Mausam, and Parag Singla
In International Joint Conference on Artificial Intelligence (IJCAI), 2015
-
A Novel Abstraction Framework for Online Planning
Ankit Anand, Aditya Grover, Mausam, and Parag Singla
In International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2015