Memory Augmented Recursive Neural Networks

Forough Arabshahi*, Zhichu Lu*, Sameer Singh, Animashree Anandkumar
Conference Papers ArXiv Pre-Print


Recursive neural networks have shown impressive performance for modeling compositional data compared to their recurrent counterparts. Although recursive neural networks are better at capturing long-range dependencies, their generalization performance starts to decay as the test data becomes more compositional and potentially deeper than the training data. In this paper, we present memory-augmented recursive neural networks to address this generalization performance loss on deeper data points. We augment Tree-LSTMs with an external memory, namely neural stacks. We define soft push and pop operations for filling and emptying the memory to ensure that the networks remain end-to-end differentiable. In order to assess the effectiveness of the external memory, we evaluate our model on a neural programming task introduced in the literature called equation verification. Our results indicate that augmenting recursive neural networks with external memory consistently improves the generalization performance on deeper data points compared to the state-of-the-art Tree-LSTM by up to 10%.
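To make the idea of soft push and pop operations concrete, the following is a minimal sketch of one common way to build a differentiable stack: the new stack is a weighted blend of the "pushed", "popped", and "unchanged" versions of the old one. This is an illustration only and is not taken from the paper; the function name `soft_stack_update`, the fixed stack depth, and the scalar action weights `p_push` and `p_pop` are all assumptions.

```python
from typing import List

Vector = List[float]
Stack = List[Vector]  # index 0 is the top of the stack (assumed convention)


def soft_stack_update(stack: Stack, value: Vector,
                      p_push: float, p_pop: float) -> Stack:
    """Blend push, pop, and no-op outcomes by their (soft) probabilities.

    Hypothetical helper: p_push + p_pop must not exceed 1; the remainder
    is treated as the probability of leaving the stack unchanged.
    """
    p_noop = 1.0 - p_push - p_pop
    dim = len(value)
    zero = [0.0] * dim

    # Hard push: new value on top, everything shifts down, bottom is dropped.
    pushed = [value] + stack[:-1]
    # Hard pop: everything shifts up, bottom is padded with zeros.
    popped = stack[1:] + [zero]

    # The soft update is a convex combination of the three hard outcomes,
    # so it stays differentiable with respect to the action weights.
    return [
        [p_push * pushed[i][d] + p_pop * popped[i][d] + p_noop * stack[i][d]
         for d in range(dim)]
        for i in range(len(stack))
    ]


if __name__ == "__main__":
    stack = [[1.0], [2.0], [0.0]]
    print(soft_stack_update(stack, [5.0], p_push=0.5, p_pop=0.5))
```

In a trained model the action weights would typically come from a softmax over a learned controller output rather than being set by hand as in this sketch.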

Look-up and Adapt: A One-shot Semantic Parser

Zhichu Lu*, Forough Arabshahi*, Igor Labutov, Tom Mitchell
Conference Papers Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019


Computing devices have recently become capable of interacting with their end users via natural language. However, they can only operate within a limited "supported" domain of discourse and fail drastically when faced with an out-of-domain utterance, mainly due to the limitations of their semantic parser. In this paper, we propose a semantic parser that generalizes to out-of-domain examples by learning a general strategy for parsing an unseen utterance through adapting the logical forms of seen utterances, instead of learning to generate a logical form from scratch. Our parser maintains a memory consisting of a representative subset of the seen utterances paired with their logical forms. Given an unseen utterance, our parser works by looking up a similar utterance from the memory and adapting its logical form until it fits the unseen utterance. Moreover, we present a data generation strategy for constructing utterance-logical form pairs from different domains. Our results show an improvement of up to 68.8% on one-shot parsing under two different evaluation settings compared to the baselines.

Towards Solving Differential Equations through Neural Programming

Forough Arabshahi, Sameer Singh, Anima Anandkumar
Workshop Papers The ICML Workshop on Neural Abstract Machines & Program Induction v2 (NAMPI), Stockholm, Sweden, 2018


We propose using symbolic data for training neural networks that solve differential equations. This results in a generalizable and scalable neural solver. The main reason is that we jointly learn a large number of functions that cover an entire mathematical domain, and use these trained functions for solving an unseen differential equation. Almost all of the literature focuses on hand-crafting architectures that are tailored for a specific type of differential equation. Moreover, these approaches use numerical evaluations of a differential equation for training, which means that training and tuning need to be redone for each new input differential equation, resulting in a lack of scalability and generalizability.

In this work, we investigate the possibility of using neural programs for solving ordinary differential equations (ODEs) by verifying/rejecting a candidate solution of an ODE. We design a neural programmer that is capable of choosing the correct solution with a high accuracy. Our neural programmer, based on a Tree-LSTM, leverages the compositionality of each input ODE.

Combining Symbolic Expressions and Black-box Function Evaluations in Neural Programs

Forough Arabshahi, Sameer Singh, Anima Anandkumar
Conference Papers The 6th International Conference on Learning Representations (ICLR), 2018


Neural programming involves training neural networks to learn programs, mathematics, or logic from data. Previous works have failed to achieve good generalization performance, especially on problems and programs with high complexity or on large domains. This is because they mostly rely either on black-box function evaluations that do not capture the structure of the program, or on detailed execution traces that are expensive to obtain, and hence the training data has poor coverage of the domain under consideration. We present a novel framework that utilizes black-box function evaluations, in conjunction with symbolic expressions that define relationships between the given functions. We employ Tree-LSTMs to incorporate the structure of the symbolic expression trees. We use tree encoding for numbers present in function evaluation data, based on their decimal representation. We present an evaluation benchmark for this task to demonstrate that our proposed model combines symbolic reasoning and function evaluation in a fruitful manner, obtaining high accuracies in our experiments. Our framework generalizes significantly better to expressions of higher depth and is able to fill partial equations with valid completions.

Combining Symbolic Expressions and Black-box Function Evaluations in Neural Programs

Forough Arabshahi, Sameer Singh, Anima Anandkumar
Workshop Papers NIPS 2017, MLtrain Workshop, Long Beach, California

Spectral Methods for Correlated Topic Models

Forough Arabshahi, Anima Anandkumar
Conference Papers Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), PMLR 54:1439-1447, 2017


In this paper, we propose guaranteed spectral methods for learning a broad range of topic models, which generalize the popular Latent Dirichlet Allocation (LDA). We overcome LDA's inability to incorporate arbitrary topic correlations by assuming that the hidden topic proportions are drawn from a flexible class of Normalized Infinitely Divisible (NID) distributions. NID distributions are generated by normalizing a family of independent Infinitely Divisible (ID) random variables. The Dirichlet distribution is a special case obtained by normalizing a set of Gamma random variables. We prove that this flexible topic model class can be learnt via spectral methods using only moments up to the third order, with (low-order) polynomial sample and computational complexity. The proof is based on a key new technique derived here that allows us to diagonalize the moments of the NID distribution through an efficient procedure that requires evaluating only univariate integrals, despite the fact that we are handling high-dimensional multivariate moments. In order to assess the performance of our proposed Latent NID topic model, we use two real datasets of articles collected from the New York Times and PubMed. Our experiments yield improved perplexity on both datasets compared with the baseline.
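The special case mentioned in the abstract, a Dirichlet draw obtained by normalizing independent Gamma variables, can be illustrated with a short sketch. It covers only that standard construction, not the general NID family or the spectral learning procedure; the function name `sample_dirichlet` and the example parameters are assumptions chosen for illustration.

```python
import random
from typing import List


def sample_dirichlet(alphas: List[float], rng: random.Random) -> List[float]:
    """Draw one Dirichlet(alphas) sample by normalizing Gamma variables.

    Each component is Gamma(alpha_i, scale=1); dividing by their sum
    yields a point on the probability simplex.
    """
    gammas = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    proportions = sample_dirichlet([2.0, 3.0, 5.0], rng)
    print(proportions, sum(proportions))
```

Swapping the Gamma draws for another family of independent, positive, infinitely divisible variables would give a different member of the normalized class, which is the generalization the abstract describes.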

Are You Going to the Party: Depends, Who Else is Coming?:[Learning Hidden Group Dynamics via Conditional Latent Tree Models]

Forough Arabshahi, Furong Huang, Anima Anandkumar, Carter T Butts, Sean M Fitzhugh
Conference Papers Data Mining (ICDM), 2015 IEEE International Conference on (pp. 697-702). IEEE.


Scalable probabilistic modeling and prediction in high-dimensional multivariate time series is a challenging problem, particularly for systems with hidden sources of dependence and/or homogeneity. Examples of such problems include dynamic social networks with co-evolving nodes and edges and dynamic student learning in online courses. Here, we address these problems through the discovery of hierarchical latent groups. We introduce a family of Conditional Latent Tree Models (CLTM), in which tree-structured latent variables incorporate the unknown groups. The latent tree itself is conditioned on observed covariates such as seasonality, historical activity, and node attributes. We propose a statistically efficient framework for learning both the hierarchical tree structure and the parameters of the CLTM. We demonstrate competitive performance on multiple real-world datasets from different domains. These include a dataset on students' attempts at answering questions in a psychology MOOC, Twitter users participating in an emergency management discussion and interacting with one another, and windsurfers interacting on a beach in Southern California. In addition, our modeling framework provides valuable and interpretable information about the hidden group structures and their effect on the evolution of the time series.

Beyond LDA: Spectral Methods for Topic Modeling Based on Exchangeable Partitions

Forough Arabshahi, Roi Weiss, Anima Anandkumar
Workshop Papers NIPS workshop on Bayesian Nonparametrics: The Next Generation, 2015.

A frequency domain MVDR beamformer for UWB microwave breast cancer imaging in dispersive mediums

Forough Arabshahi, Sadaf Monajemi, Hamid Sheikhzadeh, Kaamran Raahemifar, Reza Faraji-Dana
Conference Papers Signal Processing and Information Technology (ISSPIT), 2013 IEEE International Symposium on 2013 Dec 12 (pp. 000362-000367). IEEE


In this paper, a new imaging technique for early-stage ultra wideband (UWB) microwave breast cancer detection is proposed. A circular array of antennas illuminates the breast tissue with UWB pulses, and the backscattered signals are then passed through a beamformer designed and applied in the frequency domain. This design enables the beamformer to compensate for non-integer delays and frequency-dependent dispersion and at the same time increases the accuracy of the beamformer. It is shown that the proposed imaging algorithm reduces the computational cost and memory of the imaging system by decreasing the sampling rate to the Nyquist rate and significantly reducing the number of required matrix inversions. Furthermore, the proposed algorithm significantly improves the quality of the obtained image, and on average the signal-to-clutter ratio of the image is increased by 89.29% compared to other cases.
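For readers unfamiliar with MVDR beamforming, the sketch below computes the textbook MVDR weights for a single frequency bin, w = R^{-1} a / (a^H R^{-1} a), where R is the array covariance matrix and a is the steering vector. It is a generic illustration under stated assumptions and not the paper's imaging algorithm; the helper names `solve_linear` and `mvdr_weights` and the toy 2x2 covariance are hypothetical.

```python
from typing import List

ComplexVector = List[complex]
ComplexMatrix = List[List[complex]]


def solve_linear(matrix: ComplexMatrix, rhs: ComplexVector) -> ComplexVector:
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest pivot magnitude for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    x: ComplexVector = [0j] * n
    for row in range(n - 1, -1, -1):
        acc = aug[row][n] - sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = acc / aug[row][row]
    return x


def mvdr_weights(covariance: ComplexMatrix,
                 steering: ComplexVector) -> ComplexVector:
    """Textbook MVDR weights for one frequency bin (assumes Hermitian R)."""
    # Solving R x = a avoids forming an explicit inverse of R.
    r_inv_a = solve_linear(covariance, steering)
    denom = sum(a.conjugate() * x for a, x in zip(steering, r_inv_a))
    return [x / denom for x in r_inv_a]


if __name__ == "__main__":
    covariance = [[2 + 0j, 0.5j], [-0.5j, 1 + 0j]]  # toy Hermitian matrix
    steering = [1 + 0j, 1j]
    weights = mvdr_weights(covariance, steering)
    # The distortionless constraint w^H a should evaluate to 1.
    print(sum(w.conjugate() * a for w, a in zip(weights, steering)))
```

A frequency-domain design would apply weights of this kind separately in each frequency bin, with a steering vector that reflects the delay and dispersion at that frequency.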