My PhD research applies techniques from statistical machine learning (particularly hierarchical Bayesian models and their nonparametric extensions) to create automated methods for extracting meaningful information from multimedia. I study theoretical methods for efficient inference in graphical models (e.g., MCMC sampling and variational methods) as well as practical ways to apply them to real-world data such as video clips and text documents.
Reliable and Scalable Variational Inference for the Hierarchical Dirichlet Process [PDF] [Supplement] [Code]
in AISTATS 2015
We develop a new objective function for the HDP topic model that allows merge and delete moves to remove ineffective topics during training.
in NIPS 2013
We develop new learning algorithms for scalable variational inference, including new birth and merge moves. Training can start with just one cluster and grow as needed.
in NIPS 2012
We develop new data-driven inference methods that enable unsupervised behavior discovery in hundreds of motion capture sequences.
Nonparametric Metadata-Dependent Relational Model [PDF]
in ICML 2012
We find community structure in social and ecological networks, using metadata such as age or organism type to improve community discovery.
in POCV 2012 (a workshop at CVPR)
Across many videos, we identify short segments showing the same human activity (motion and appearance), without predefining relevant activities or even their number.
Detailed analysis of algorithms for sampling from a truncated normal distribution. Email me if you're interested in getting the code.
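For readers unfamiliar with the problem, here is a minimal sketch of one textbook approach, inverse-CDF sampling, using only the Python standard library. It is an illustration and not necessarily one of the algorithms analyzed in the report; the function name `sample_truncated_normal` and its parameters are my own choices for this example. Inverse-CDF sampling is simple but loses accuracy when the interval lies far out in a tail, which is one reason alternative samplers are worth studying.

```python
import random
from statistics import NormalDist


def sample_truncated_normal(mu, sigma, lower, upper, rng=None):
    """Draw one sample from N(mu, sigma^2) restricted to [lower, upper].

    Illustrative inverse-CDF sketch (hypothetical helper, not the report's code).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not lower < upper:
        raise ValueError("lower must be strictly less than upper")
    rng = rng or random.Random()
    std = NormalDist()  # standard normal, mean 0 and std 1

    # CDF values of the standardized bounds; infinite bounds map to 0 or 1.
    cdf_lo = std.cdf((lower - mu) / sigma)
    cdf_hi = std.cdf((upper - mu) / sigma)
    if cdf_hi - cdf_lo <= 0.0:
        raise ValueError("interval has negligible mass at double precision")

    # Uniform draw on [cdf_lo, cdf_hi), then map back through the inverse CDF.
    u = cdf_lo + (cdf_hi - cdf_lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # inv_cdf requires 0 < u < 1
    x = mu + sigma * std.inv_cdf(u)

    # Guard against tiny round-off pushing x just outside the interval.
    return min(max(x, lower), upper)
```

For example, `sample_truncated_normal(0.0, 1.0, 0.0, float("inf"))` draws from the half-normal distribution. For intervals deep in a tail, a rejection-based sampler is usually a better choice than this sketch.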
Using machine learning, can we take as input the "clicks" and "clacks" produced by a user typing at a keyboard and recover the typed text? I review modern techniques and make several suggestions. I also collected my own dataset of audio recordings of typing (along with the ground-truth text). Please email me if you're interested.