By the Numbers
We introduced a novel unsupervised approach for learning disentangled representations of neural activity called SwapVAE. Our approach combines a generative modeling framework with an instance-specific alignment loss that tries to maximize the representational similarity between transformed views of the input (brain state). Through evaluations on both synthetic data and neural recordings from hundreds of neurons in different primate brains, we show that it is possible to build representations that disentangle neural datasets along relevant latent dimensions linked to behavior.
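To make the objective concrete, here is a minimal numpy sketch of a loss with this shape: a reconstruction term from the generative model plus an alignment term that pulls together the latents of two views of the same brain state. The function names and the `beta` weighting are illustrative and simplified relative to SwapVAE's full objective.

```python
import numpy as np

def alignment_loss(z1, z2):
    """Mean squared distance between the latents of two views of the same input."""
    return float(np.mean((z1 - z2) ** 2))

def reconstruction_loss(x, x_hat):
    """Standard MSE reconstruction term of the generative model."""
    return float(np.mean((x - x_hat) ** 2))

def total_loss(x, x_hat, z1, z2, beta=1.0):
    """Generative objective plus view-alignment term (beta weighting is illustrative)."""
    return reconstruction_loss(x, x_hat) + beta * alignment_loss(z1, z2)
```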
Ran Liu, PhD Machine Learning Student
We introduce Habitat 2.0 (H2.0), a simulation platform for training virtual robots in interactive 3D environments and complex physics-enabled scenarios. We make comprehensive contributions to all levels of the embodied AI stack – data, simulation, and benchmark tasks. Specifically, we present: (i) ReplicaCAD: an artist-authored, annotated, reconfigurable 3D dataset of apartments (matching real spaces) with articulated objects (e.g. cabinets and drawers that can open/close); (ii) H2.0: a high-performance physics-enabled 3D simulator with speeds exceeding 25,000 simulation steps per second (850x real-time) on an 8-GPU node, representing 100x speed-ups over prior work; and (iii) Home Assistant Benchmark (HAB): a suite of common tasks for assistive robots (tidy the house, stock groceries, set the table) that test a range of mobile manipulation capabilities. These large-scale engineering contributions allow us to systematically compare deep reinforcement learning (RL) at scale and classical sense-plan-act (SPA) pipelines in long-horizon structured tasks, with an emphasis on generalization to new objects, receptacles, and layouts. We find that (1) flat RL policies struggle on HAB compared to hierarchical ones; (2) a hierarchy with independent skills suffers from ‘hand-off problems’; and (3) SPA pipelines are more brittle than RL policies.
Andrew Szot, PhD Machine Learning Student
We find we do not need simulation and RL to learn navigation. Instead, we present NRNS, which uses simple self-supervised distance learning from passive house-tour videos along with greedy decision-making. NRNS not only beats RL/IL algorithms in simulation, but when it comes to the real world, there is no sim2real required! A step towards a world where skill learning is guided by lots of rich, diverse, real-world data.
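The greedy decision-making step can be sketched independently of how the distance function is learned: given a predictor of distance-to-goal, the agent simply moves toward the candidate viewpoint with the smallest predicted distance. Everything below is a hypothetical stand-in for NRNS's actual components.

```python
import numpy as np

def greedy_next_node(candidates, goal, predict_distance):
    """Pick the candidate whose predicted distance-to-goal is smallest.

    `predict_distance` stands in for the learned self-supervised distance
    function; here it can be any callable (candidate, goal) -> float.
    """
    dists = [predict_distance(c, goal) for c in candidates]
    return int(np.argmin(dists))
```

With a plain Euclidean stand-in for the learned predictor, the agent picks the geometrically nearest candidate; the learned function would instead score visual observations.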
Meera Hahn, PhD Computer Science Student
We present an embodied agent that can navigate through indoor environments by following natural language instructions such as “Walk down the hall, enter the bedroom, and stop by the green chairs.” Our key observation is that instructions often use two different types of visual cues: scene descriptions (e.g., bedroom) and object references (e.g., green chairs). Accordingly, we create a model that explicitly encodes the inductive bias that the world is composed of objects and scenes, and show that this design can substantially improve instruction-guided visual navigation performance.
Arjun Majumdar, PhD Computer Science Student
We present two human-subject studies quantifying the benefits of deploying Explainable AI (xAI) techniques within a human-machine teaming scenario, finding that the benefits of xAI are not universal. We create a rich, interactive human-machine teaming scenario in Minecraft where a human and collaborative robot (i.e., a cobot) must work together to build a house. We show that xAI techniques providing an abstraction of the cobot’s behavior can support situational awareness (SA) and examine how different SA levels induced via a collaborative AI policy abstraction affect ad hoc human-machine teaming performance. Our work presents one of the first analyses looking at the impact of explainable AI in collaborative sequential decision-making settings. Our results demonstrate that researchers must deliberately design and deploy the right xAI techniques in the right scenario by carefully considering human-machine team composition and how the xAI method augments SA.
Rohan Paleja, PhD Robotics Student
A recent, influential thread of research has shown that modern ML models, which are highly overparameterized, tend to perfectly fit even noisy training data. Despite such fitting of noise being traditionally associated with overfitting, these “interpolating” models still achieve state-of-the-art generalization performance in practice. Recent theory has corroborated this counterintuitive observation in the case of linear regression, by showing that very high-dimensional models can fit noise in a relatively harmless manner.
Despite a near-complete explanation for this phenomenon for the case of regression, the classification problem (which underlies all practical success stories in ML) remains relatively poorly understood, partly due to a more complex mode of evaluation (the 0-1 loss) and partly because practical solutions are defined only implicitly and not explicitly (i.e. through the implicit bias of gradient descent, rather than a minimum-norm interpolator).
Our results show two new surprises that arise in overparameterized classification as a consequence of the insights that were previously derived in regression:
– It is possible for a classification task to generalize even when the corresponding regression task doesn’t, and
– Gradient descent run with very different training loss functions can yield very similar solutions, which all interpolate the training data. Thus, “all roads lead to interpolation of training data”.
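The regression-side setup referenced above can be reproduced in a few lines: in the overparameterized regime, the minimum-norm solution fits even flipped labels exactly, yet for classification only the sign of the prediction matters under the 0-1 loss. This is a generic illustration of interpolation, not the paper's specific analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 200                      # overparameterized: many more features than samples
X = rng.standard_normal((n, d))
y = np.sign(X[:, 0])                # a simple underlying labeling rule
y_noisy = y.copy()
y_noisy[:3] *= -1                   # flip a few labels: noisy training data

# Minimum-norm interpolator: w = X^+ y fits every (noisy) label exactly.
w = np.linalg.pinv(X) @ y_noisy

# The model interpolates the noise in regression, yet for classification
# only sign(X @ w) matters, which is evaluated under the 0-1 loss.
train_acc = float(np.mean(np.sign(X @ w) == y_noisy))
```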
Vidya Muthukumar, Assistant Professor, ECE and ISyE
This paper presents a neural network training technique (selective backpropagation through time) that facilitates simultaneous upsampling and denoising of sparsely sampled neural data. Our technique enables researchers to increase the number of neurons recorded and/or the sampling frequency for a given recording bandwidth, and is applicable to both basic science and brain-machine interfaces. We demonstrate that our technique improves decoding performance on real electrophysiological recordings and calcium imaging data.
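The core training trick, computing the loss only at observed samples so the network learns to infer the rest, can be sketched as a masked objective. This is a schematic of the idea, not the paper's implementation.

```python
import numpy as np

def masked_mse(target, prediction, observed_mask):
    """MSE computed only at observed (sampled) entries of a data matrix.

    Unobserved entries never contribute to the loss, so the network is
    free to fill them in, which yields upsampling and denoising.
    """
    diff = (target - prediction) * observed_mask
    return float(np.sum(diff ** 2) / np.sum(observed_mask))
```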
Andrew Sedler, PhD Machine Learning Student
Natural data mined from images or text can be stored in probabilistic databases for querying. Methods have been developed to train models to retrieve the correct answer to a query on these databases. However, these methods do not scale as the number of candidate answers increases. We therefore propose Scallop, a scalable framework for such logic queries that enables model training over larger databases. Combining deep learning with scalable logic programming lets us operate over probabilistic databases and incorporate external information such as knowledge graphs. This enables us to achieve state-of-the-art performance on tasks such as knowledge-dependent visual question answering while learning models in a scalable fashion.
Binghong Chen, PhD Machine Learning Student
We present a novel neural network Maximum Mean Discrepancy (MMD) statistic by identifying a new connection between the neural tangent kernel (NTK) and MMD. This connection enables a computationally and memory-efficient approach to computing the MMD statistic and performing NTK-based two-sample tests, addressing the long-standing challenge of the memory and computational complexity of the MMD statistic, which is essential for online implementations that assimilate new samples. Theoretically, the connection allows us to understand properties of the NTK test statistic, such as the Type-I error and testing power of the two-sample test, by adapting existing theories for kernel MMD. Numerical experiments on synthetic and real-world datasets validate the theory and demonstrate the effectiveness of the proposed NTK-MMD statistic.
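For reference, the classical kernel MMD statistic that the NTK is plugged into can be computed as follows. This is the standard unbiased estimator with a Gaussian kernel, not the paper's NTK construction.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel matrix between rows of a and rows of b."""
    d2 = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
    return np.exp(-d2 / (2 * sigma**2))

def mmd2_unbiased(x, y, sigma=1.0):
    """Unbiased estimate of squared MMD between samples x and y."""
    m, n = len(x), len(y)
    kxx = gaussian_kernel(x, x, sigma)
    kyy = gaussian_kernel(y, y, sigma)
    kxy = gaussian_kernel(x, y, sigma)
    term_x = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))  # drop diagonal for unbiasedness
    term_y = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return float(term_x + term_y - 2 * kxy.mean())
```

Samples drawn from the same distribution give an estimate near zero, while a distributional shift pushes the statistic up, which is the basis of the two-sample test.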
Yao Xie, Associate Professor, ISyE
We introduce a way to figure out how well machine learning models taken from the internet will perform on your data after a small amount of fine-tuning. Doing this for arbitrary models taken from the web is new. We formalize this as a new task and release a benchmark for it, along with a method that works very well on the task. This can help speed up the deployment of machine learning in the wild, since you no longer need to spend time searching for a model that will work well for you.
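The paper's actual method is not reproduced here, but a cheap stand-in conveys the flavor of the task: rank candidate pretrained models by how well a simple linear probe on their frozen features predicts your labels. The function below is a hypothetical illustration under that assumption.

```python
import numpy as np

def linear_probe_score(features, labels, reg=1e-3):
    """Score a pretrained model's frozen features by linear-probe train accuracy.

    A ridge-regression classifier to one-hot targets stands in for cheap
    fine-tuning; higher scores suggest the model will transfer better.
    """
    classes = np.unique(labels)
    Y = (labels[:, None] == classes[None, :]).astype(float)   # one-hot targets
    F = np.hstack([features, np.ones((len(features), 1))])    # bias column
    W = np.linalg.solve(F.T @ F + reg * np.eye(F.shape[1]), F.T @ Y)
    preds = classes[np.argmax(F @ W, axis=1)]
    return float(np.mean(preds == labels))
```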
Daniel Bolya, PhD Machine Learning Student
Many real-world tasks are expressed as programs that must operate on complex data such as images or videos, for example answering questions about an image or giving a robotic agent specifications for navigating its environment. In ProTo, we develop a program-guided transformer to solve these program-guided tasks. Encoding the program guidance yields better performance on such tasks and improves generalizability over existing methods.
Karan Samel, PhD Machine Learning Student
We prove that a simple step size achieves an O(1/t) convergence rate in both primal and dual gap for a simple variant of the Frank-Wolfe algorithm when minimizing generalized self-concordant functions over compact convex domains, using only first-order information. Previous approaches achieved an O(1/t) convergence rate in primal gap using first- and second-order information.
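A minimal sketch of the Frank-Wolfe outer loop, assuming the "simple step size" refers to the standard open-loop rule 2/(t+2), run on a toy quadratic over the probability simplex (quadratics are trivially self-concordant). The paper's variant and analysis are not reproduced here.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, steps):
    """Frank-Wolfe with the simple open-loop step size 2/(t+2).

    The linear minimization oracle (LMO) over the probability simplex
    just returns the vertex with the most negative gradient coordinate.
    """
    x = x0.copy()
    for t in range(steps):
        g = grad(x)
        v = np.zeros_like(x)
        v[np.argmin(g)] = 1.0            # LMO: best simplex vertex
        gamma = 2.0 / (t + 2.0)          # the "simple" step size
        x = (1 - gamma) * x + gamma * v
    return x

# Toy problem: minimize ||x - b||^2 over the simplex (b is interior).
b = np.array([0.2, 0.3, 0.5])
x = frank_wolfe_simplex(lambda x: 2 * (x - b), np.array([1.0, 0.0, 0.0]), 5000)
```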
Alejandro Carderera, PhD Machine Learning Student
Our work establishes the first finite-sample convergence guarantees in the literature for average-reward reinforcement learning algorithms: (i) average-reward TD learning with linear function approximation for policy evaluation and (ii) average-reward tabular Q-learning to find an optimal policy. Average-reward reinforcement learning algorithms are known to be more challenging to study than their discounted-reward counterparts. The key property exploited in the study of discounted-reward problems is the contraction property of the underlying Bellman operator. In the average-reward setting, such a contraction property does not hold under any norm, and the Bellman equation is known to have multiple fixed points. To resolve this difficulty, we construct Lyapunov functions using projection and infimal convolution to analyze the convergence of the equivalence classes generated by these algorithms. Our approach is simple and general, so we expect it to have broader applications in other problems.
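To ground the tabular case, here is a sketch of a common average-reward Q-learning variant (RVI-style, where a reference value is subtracted in place of a discount factor) on a toy two-state MDP. This is a generic illustration of the algorithm family, not the paper's exact scheme or analysis.

```python
import numpy as np

def rvi_q_learning(step, n_states, n_actions, iters=20000, alpha=0.05, seed=0):
    """Tabular average-reward Q-learning (RVI-style sketch).

    The update subtracts a reference value max_a Q(s_ref, a) instead of
    discounting; `step(s, a) -> (reward, next_state)` defines the MDP.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    s = 0
    for _ in range(iters):
        a = int(rng.integers(n_actions))             # explore uniformly
        r, s_next = step(s, a)
        target = r + Q[s_next].max() - Q[0].max()    # reference state s_ref = 0
        Q[s, a] += alpha * (target - Q[s, a])
        s = s_next
    return Q

# Toy 2-state MDP: action a moves to state a and pays reward a,
# so always taking action 1 achieves the optimal average reward of 1.
Q = rvi_q_learning(lambda s, a: (float(a), a), n_states=2, n_actions=2)
```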
Sheng Zhang, PhD Machine Learning Student
This work presents a framework for quantifying the uncertainty of an ill-posed medical imaging problem. We leverage Normalizing Flows in a Bayesian framework to estimate the full posterior distribution of the problem, allowing for straightforward uncertainty quantification. This lets practitioners use our imaging algorithm with more confidence in the robustness of the resulting images.
Rafael Orozco, PhD Computational Science and Engineering Student
Latent variable models (LVMs) have proven to be a powerful approach for extracting structure from neural population activity, but the lack of standardized datasets and evaluation metrics for LVMs of neural data hinders progress in their development. Neural Latents Benchmark ’21 is the first benchmark suite for neural data LVMs, providing curated datasets, codepacks for data preparation and model evaluation, and a public benchmark challenge hosted on EvalAI. We anticipate that our work will lower the barrier to entry for machine learning experts to contribute to the field and enable the computational neuroscience community to identify, compare, and build upon promising approaches.
Felix Pei, BS Electrical Engineering Student
#NATURAL LANGUAGE PROCESSING
This work proposes an organizational framework for the over 60 datasets that have been collected for explaining machine learning tasks in natural language. We also summarize the various ways in which explanation datasets have been collected, present takeaways on collecting high-quality crowdsourced datasets, and discuss how dataset collection methodologies can affect modeling and evaluation. Datasets are an often overlooked part of the machine learning pipeline, but play an integral role in the development of systems that can explain their predictions in natural language. We hope this work inspires a more consistent process of dataset collection and reporting, and can serve as a resource for newcomers to the field of explainable natural language processing.
Sarah Wiegreffe, PhD Computer Science Student
A large-scale graph representation learning database offering over 1.2 million graphs, averaging 15k nodes and 35k edges per graph. With the rapid emergence of graph representation learning, the construction of new large-scale datasets is necessary to distinguish model capabilities and accurately assess the strengths and weaknesses of each technique. By carefully analyzing existing graph databases, we identify 3 critical components important for advancing the field of graph representation learning: (1) large graphs, (2) many graphs, and (3) class diversity. To date, no single graph database offers all of these desired properties. We introduce MalNet, the largest public graph database ever constructed, representing a large-scale ontology of malicious software function call graphs. MalNet contains over 1.2 million graphs, averaging over 15k nodes and 35k edges per graph, across a hierarchy of 47 types and 696 families. Compared to the popular REDDIT-12K database, MalNet offers 105x more graphs, 44x larger graphs on average, and 63x more classes. We provide a detailed analysis of MalNet, discussing its properties and provenance, along with an evaluation of state-of-the-art machine learning and graph neural network techniques. The unprecedented scale and diversity of MalNet offers exciting opportunities to advance the frontiers of graph representation learning, enabling new discoveries and research into imbalanced classification, explainability, and the impact of class hardness. The database is publicly available at www.mal-net.org.