## Core ML

#### Stochastic Cellular Automata for Inference

In this project, we advocate perhaps the most embarrassingly parallel of all computational models, stochastic cellular automata (SCA) is sufficient for inference on large class of latent variable models. In SCA, the atomic unit of computation is the cell, and each cell interacts with a set of neighbouring cells via a local update rule. At each time-step, every cell in the automaton updates simultaneously, thus enabling extreme parallelism. Given this computation model, the challenge is to design update rules that correctly implement the desired inference algorithm in our choice of statistical model. Applied to latent Dirichlet allocation we find that our algorithm is over an order of magnitude faster than the current state-of-the art. A simple C++/MPI implementation on a 4-node cluster samples 570 million tokens per second. We process 3 billion documents and achieve predictive power competitive with collapsed Gibbs sampling and variational inference. We are looking forward to more interesting models such as restricted Boltzmann machines (RBMs) or models combining deep networks and Bayesian models.

#### Fast Alias Sampling Technique and Analysis

The objective of this project is to scale Bayesian Nonparametrics such as topic models, clustering, and hierarchical representations to Big Data by overcoming the computational issues. In this pursuit, we are trying to exploit the fact that clusters, topics, trees, or other structures far away from the current set of observations are almost irrelevant and thus allowing us to ignore them. This lets us obtain a proposal distribution cheaply for cluster membership that does not require all comparisons. Then we will use Metropolis-Hastings (MH) algorithm along with alias techniques for fast sampling. However, a big problem in applying such alias techniques to Bayesian Nonparametrics is that possible number of categories is not pre-determined and keeps changing during the inference procedure. If a category is inserted or deleted we cannot use the stale distribution for MH sampling because now domain of proposal doesn't contain domain of target distribution. So we cannot directly cache alias tables for fast sampling from the posteriors. In order to circumvent this problem we are exploring smart data structure for storing the Alias table, which can allow cheap updates to insert a new possible outcome or delete an existing outcome.

#### Gaussian LDA for Topic Models with Word Embeddings

Continuous space word embeddings learned from large, unstructured corpora have been shown to be surprisingly effective at capturing semantic regularities in language. While such word embeddings have been incorporated to produce state-of-the-art results in numerous supervised natural language processing tasks from the word level to document level; however, they have played a more minor role in unsupervised learning problems. This work shows some of the promise that they hold in this domain. In particular, we replace LDA’s parameterization of “topics” as categorical distributions over opaque word types with multivariate Gaussian distributions on the embedding space. This encourages the model to group words that are a priori known to be semantically related into topics. To perform inference, we introduce a fast collapsed Gibbs sampling algorithm based on Cholesky decompositions of the covariance matrices of the posterior predictive distributions. <> Using vectors learned from a domain general corpus (English Wikipedia), we report results on two document collections (20-newsgroups and NIPS). Qualitatively, Gaussian LDA infers different (but still very sensible) topics relative to standard LDA. Quantitatively, our technique outperforms existing models at dealing with OOV words in held-out documents. More broadly still, running LDA on documents consisting of different modalities than just text is facilitated by using the lingua franca of vector space representations, so we expect numerous interesting applications in this area.

#### Topic Modelling for WWW

The objective of this project is to come up with a better generative model for graphs, wherein both edge and vertex properties are captured. In other words, the topic of a page are function of both content of the page and links to which connects. Also we hope to preserve the power law properties of the graphs, i.e. vertex degree distribution, triangles, etc. Such a model would be useful in detecting if some parts of the graph are spam or ham, or come up with a good model of relevance, or predict how they will extend, or also to identify the content / parameters of a vertex given the rest. We are working towards developing a good theoretical graph generative model which better explains the WWW and an efficient inference algorithm for the said purpose which can handle Amazon public web-crawl which has about 6 billion pages with links and content.

#### Spectral Methods for Hierarchical Dirichlet Process

Nonparametric Bayesian mixture models are often used to solve the problems of topic modelling or clustering. A popular tool is the Hierarchical Dirichlet process (HDP) mixture model. To estimate the parameters, commonly used algorithms are online variational Bayes method or Gibbs sampling as Chinese restaurant franchise. However, these approaches typically make computations prohibitively expensive. In this paper we introduce a computationally and statistically efficient spectral algorithm for inferring parameters in HDPs using tensor decomposition. The lower-order moments of HDP parameters are derived and we provide theoretical guarantees for recovery accuracy. Moreover, we propose a hybrid method which allows us to combine spectral and likelihood approaches. Finally, we show that the resulting algorithm is both faster and outperforms it in terms of the predictive power of the trained model by applying it to two corpora, namely Enron emails and Wikipedia.

## CAD Applications

#### Dirichlet Process Regression

High-speed I/O link is an important component in computer systems, and estimating its bit error rate (BER) is a critical task to guarantee its performance. In this project, we propose an efficient algorithm of multi-population regression via Dirichlet Process (MPR-DP) method to estimate BER of high-speed I/O link with extremely small number of measurements. The key idea is to partition all populations (e.g., different environmental conditions, setup configurations, etc.) into groups. The populations within the same group are similar and their common knowledge can be extracted to improve the accuracy of regression. As will be demonstrated by the silicon measurement data of a high-speed I/O link, MPR-DP reduces the moment estimation error by up to 200% compared to other conventional methods.#### Co-Training for Circuit Modeling Cost Reduction

Efficient performance modeling of today's analog and mixed-signal (AMS) circuits is an important yet challenging task. In this project, we propose a novel performance modeling algorithm that is referred to as Co-Learning Bayesian Model Fusion (CL-BMF). The key idea of CL-BMF is to take advantage of the additional information collected from simulation and/or measurement to reduce the performance modeling cost. Different from the traditional performance modeling approaches which focus on the prior information of model coefficients (i.e. the coefficient side information) only, CL-BMF takes advantage of another new form of prior knowledge: the performance side information. In particular, CL-BMF combines the coefficient side information, the performance side information and a small number of training samples through Bayesian inference based on a graphical model. Two circuit examples designed in a commercial 32nm SOI CMOS process demonstrate that CL-BMF achieves up to 5X speed-up over other state-of-the-art performance modeling techniques without surrendering any accuracy.

#### Moment Estimation with Extremely Small Sample Size

Most of the statistics/machine learning literatures methods, like density estimation or moment estimation come with a caveat: they typically operate in the big data regime, i.e. need sufficiently large amount of data available to get accurate results. However in applications such as circuit validation only a very small sample size data can be observed due to time and money constraints. But most of the modern scientific technology excels at simultaneous execution of parallel investigations, where the data sets have underlying correlations. We aim to exploit this fact and hence find an efficient moment estimation method that could potentially provide significant estimation accuracy improvement under small sample size while automatically handling clustered data or outliers. In order to achieve this we employ an assortment of empirical, hierarchical and non-parametric Bayesian methods with which we have been successful in acquiring some intermediate, yet convincing results.

#### Markov Decision Process for Self-Healing Circuits

A major consequence of the drive toward ever smaller transistor gate length is an increase in process variations that degrade circuit performance. Consequently, the systems are made adaptive in nature. Tuning knobs are added to compensate for process variation and thus introduce “self-healing” properties in the circuit. Proper selection of tuning knobs is required to meet system specifications. Global optimization with all the tuning knobs is prohibitively expensive to implement at circuit level, but various tuning algorithms available to tune individual knobs. Now the question arises about the order in which to run them. This project tries to addresses this issue by noticing that the situation obeys Markov perfectly and thus develops a Markov Decision Process (MDP) based knob-tuning procedure. The proposed approach is validated on a simple self-healing RF receiver circuit.

## Course Projects

#### Large Scale Structure Learning of Conditional Gaussian Graphical Models

Gaussian graphical models (GGMs) are commonly used to represent relationships between objects in a complex system. One of the fundamental questions is how to estimate the structure of the graphical model that represents these relationships? Answer to this question is of particular interest in modern financial markets where firms have become increasingly linked to each other through a complex and usually opaque network of relationships. From a economic point of view, if investors take into account these relationships, prices of both linked firms should adjust when news about one of the linked firms is released into the market (as long as those news represent a change in the fundamental value of one of the linked firms). To understand prices behavior then it may be important to uncover and understand the nature of such (possibly time-varying) relationships. So we used Gaussian Graphical Model in order to uncover customer-supplier relationships in the US business network by using the companies’ monthly returns as input.

The standard method to estimate structure in a GGM is to look at the inverse of the covariance matrix, known as the precision matrix, encodes the structure of the relationships through the pattern on non-zero elements, that is, two objects are connected if the corresponding element in the precision matrix is not equal to zero. Therefore, given observations on the objects in the complex system, one estimates a sparse precision matrix in order to uncover relationships between objects. This can be done using a neighborhood selection procedure or graphical lasso. Unfortunately, GGM are not flexible enough to capture relationships that change over time and depend on environmental conditions.

In this project, we developed a novel method for learning the fixed structure of a graphical model, but allowing the parameters of the model to change. This was done by modifying the graphical lasso procedure, by treating each element of the precision matrix to be smooth functions of time. Model complexity control was done by assuming that each element of the precision matrix belongs to a reproducing kernel Hilbert space (RKHS) and used an appropriate RKHS norm as a the penalty which simultaneously encourages sparsity and smoothness. Then in order to effectively solve the resulting optimization procedures, we developed a special ADMM solver that could scale to big problem instances and implemented it on a Hadoop cluster. We tested the algorithm on synthetic data and achieved precision of around 0.5 for case when number of dimensions is larger than number of observations, which is good. Unfortunately, the algorithm was able to only cluster different firms belonging to same industries, and failed to uncover the customer-supplier linkages. We believe the input dataset of monthly returns did not contain sufficient information and we need to include more information about each particular firm to be able to capture such type of relationships.

#### Mining Interesting Local Structure from Tensors

Tensors are ubiquitous in real-world recommender systems. In this project, we focus on finding interesting local structure in massive real-world tensors beyond tensor decompositions. We break down this problem into three sub-tasks. In particular, in a tensor of subject-relation-object triplets derived from the Yago or NELL dataset, we consider the question of finding the most interesting, importantand semantically coherent facts about a subject or a relation.

For the first task we suggest CF-ICF (a variant of TF-IDF) to answer this question. Also a big question, is what do we precisely mean by interesting fact? Unlike popularity or importance which have been well studied, the interestingness is much more complicated and not studied yet much to the best of our knowledge. To elaborate, spotting obscure facts about an obscure entity is not very interesting, e.g. \Riemann zeta function obeys analytic continuation". However, to humans it is very interesting in spotting an obscure fact about a popular entity. For example, everybody would be interested to know that \Michael Jordan (a famous basketball player) is hydrophobic". Thus, specificity in a fact is strong indicator of surprise factor present in it. So we wish to mine for such facts from the tensor.

For the second task, we are inspired by the similarity between PageRank algorithm and value iteration for Markov Decision Process. The PageRank provides an importance score for all vertices in a normal graph. Unfortunately, the PageRank algorithm only works for simple graphs/matrices and its extension for mult-graphs/tensors is non-trivial. But, if we observe a big similarity between the Bellman equation for MDP and the PageRank equation and thus look at Bellman equation as a generalization of PageRank for multi-graph with score given by the value vector rather than the eigenvector. Also we design a special reward function for the task.

Another question that we consider is how to discover and characterize dense and semantically coherent local blocks in the tensor. To answer this question, we propose Bayesian Tensor Blockmodels as a method that detects and characterizes dense tensor blocks. We perform a literature survey of important related work in the area, and evaluate our solutions on the tensors constructed from the Yago and NELL datasets.

#### Local List-Decoding of Reed-Muller Codes over Binary Field

In this report we study the paper titled, "List-Decoding Reed-Muller Codes over Small Fields" by Parikshit Gopalan, Adam R. Klivans and David Zuckerman. The Reed-Muller codes are multivariate generalization of Reed-Solomon codes, which use polynomials in multiple variables to encode messages through functional encoding. Next, any list decoding problem for a code has two facets: Firstly, what is the maximal radius up to which any ball of that radius contains only a constant number of codewords. Secondly, how to nd these codewords in an ecient manner.

The paper by GKZ presents the first local list-decoding algorithm for the *r*-th order Reed-Muller code RM(*r*,*m*) over 𝔽_{2} for *r* ≥ 2. Given an oracle for a received word *R* : 𝔽_{2}^{m} ⟶ 𝔽_{2}, their randomized local list-decoding algorithm produces a list containing all degree *r* polynomials within relative distance (2^{-r}-*ϵ*) from *R* for any *ϵ* > 0 in time poly(*m ^{r}*,

*ϵ*). Since RM(

^{r}*r*,

*m*) has relative distance 2

^{-r}, their algorithm beats the Johnson bound for

*r*≥ 2. Moreover, as the list size could be exponential in

*m*at radius 2

^{-r}, their bound is optimal in the local setting.

## Undergraduate Projects

#### A Fourier Mellin based Communication System

*Advisor: Prof A K Gogoi, Indian Institute of Technology Guwahati, India*

A novel communication scheme for wideband linear time varying channels is suggested. Such channels are encountered whenever ratio of message signal bandwidth to the carrier frequency is significant and there is fast relative motion between transmitter, receiver and/or scatters present in the environment – like an underwater acoustic communication scenario. This relative motion leads to simultaneous time and frequency spread, in addition to the usual difficulties faced in typical narrowband communication like multipath and inter-symbol interference, making the system design much more convoluted. Motivated from the mammalian auditory system, the central idea behind the proposed communication scheme is a combination of Mellin and Fourier transforms which is able to compensate the concurrent Doppler and delay spread introduced by the channels. Also, the proposed method does not require large memory yet being computationally and spectrally attractive. Mathematical development of the proposed communication system is presented along with simulations signifying the success of the same.

#### Low Cost Intravenous Infusion System Controller

The aim is to design a cost-effective control system for conventional gravity-controlled infusion systems that guarantees a safe and accurate infusion process. The control procedure has been modeled by a Petri-Net and a digital controller has been designed. Since being developed for application in clinical settings, its functionality has to be verified rigorously using formal verification approaches suitable. The system will also have an online testing block inbuilt. Additionally test pattern is generated for successful testing of the fabricated chip. The controller was realized on a FPGA for development purposes.

#### Distributed Power Metering and Appliance Classification

The ability to monitor the power consumptions of electric appliances and switch their mains connection is a major enabler for smart environments, e.g. a detailed analysis of the individual contributors to a household’s energy bill or the detection of a user’s presence or location. However, existing commercial metering devices typically lack the functionality to monitor consumers at device-level granularity, and rarely provide the function to control the mains connection to the attached appliance. These shortcomings of existing solutions is addressed by presenting SmartMeter.KOM, our wireless sensor node capable of capturing current waveforms and in turn identifying and actuating appliances individually. The platform is based on low-power hardware and incorporates a reprogrammable microcontroller which also allows developers to easily deploy new algorithms. The versatility of SmartMeter.KOM is demonstrated by presenting prototypical implementations of smart applications, like non-intrusive distributive load monitoring based on pattern recognition techniques and identifying further research directions.

#### A Powering Scheme for Biomedical Implants

The aim is to design a contactless power transfer system with miniaturized receiving unit for charging the implanted rechargeable battery. For any biomedical implant, an important criterion is that the device should not be heated by more than 2 K above the body temperature; else it might harm surrounding living tissues. Thus a novel inductive power link with a simple secondary pickup and controller on primary side – with an attempt to dissipate most of the losses at the primary side i.e. outside the body–is built. With the proposed design, the control law boils down just to multiplication of voltages across two capacitors in the primary side, which has advantage of easy implementation. The effectiveness of the proposed power scheme has been demonstrated by both simulation and experiments.

#### Contactless Power Pickup using Saturale Core Reactor

A novel contactless secondary power pickup is designed for inductively coupled contactless power transfer systems. The proposed technique eliminates the need of tedious fine-tuning process associated with traditional fixed power pickup tuning methods and eases the component selection process, while still achieving full-range operation for power flow regulation and maintaining constant output voltage – even at a high quality factor. Moreover, it can overcome the online circuit parameter variations and automatically achieve the maximum power transfer capacity as required. In order to meet this dynamic load demands, a saturable core reactor is controlled with a simple algorithm to act as a variable inductor and change the tuning condition of the power pickup accordingly. The usefulness of the proposed power pickup in general wireless power transfer has been established by both simulations and experimental results.