1139 Words | Nov 1, 2016 | 5 Pages

4 Online learning: Stochastic Approximation
Estimating the mixing density of a mixture distribution remains an interesting problem in the statistics literature. Stochastic approximation (SA) provides a fast recursive way to numerically maximize a function that is observed with measurement error. With a suitably chosen weight (step-size) sequence, the SA algorithm converges to the true solution. It can be adapted to estimate the components of the mixing distribution of a mixture recursively, which gives the predictive recursion method. The convergence argument relies on a martingale construction and on the convergence of related series, and it depends heavily on the independence of the data. The convergence guarantees may not hold if dependence is present. We have proposed a novel martingale decomposition to address the case of dependent data.
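The recursive update described above can be sketched in a few lines. This is a minimal illustration of the standard predictive recursion update for independent data, not the dependent-data method proposed here. The grid discretization, the normal kernel, the weight sequence w_i = (i + 1)^(-gamma) with gamma = 0.67, and the simulated two-point mixture are all assumptions chosen for illustration.

```python
import numpy as np

def normal_kernel(xi, theta, sigma=1.0):
    """Assumed sampling density p(x | theta): normal with known sigma."""
    z = (xi - theta) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def predictive_recursion(x, grid, kernel, gamma=0.67):
    """One pass of predictive recursion over the observations x.

    grid   : equally spaced points covering the support of the mixing density
    kernel : function (x_i, grid) -> p(x_i | theta) evaluated on the grid
    gamma  : decay rate of the weights w_i = (i + 1) ** (-gamma) (assumed)
    """
    d = grid[1] - grid[0]                    # grid spacing for Riemann sums
    f = np.ones_like(grid, dtype=float)      # uniform initial guess
    f /= f.sum() * d                         # normalize to integrate to 1
    for i, xi in enumerate(x, start=1):
        w = (i + 1.0) ** (-gamma)            # decreasing step size
        post = kernel(xi, grid) * f          # unnormalized one-step posterior
        post /= post.sum() * d               # normalize on the grid
        f = (1.0 - w) * f + w * post         # convex-combination update
    return f

# Simulated data: mixing distribution puts mass 1/2 at -2 and 1/2 at +2.
rng = np.random.default_rng(0)
theta = np.where(rng.random(2000) < 0.5, -2.0, 2.0)
x = theta + rng.normal(size=2000)

grid = np.linspace(-6.0, 6.0, 241)
f_hat = predictive_recursion(x, grid, normal_kernel)
```

Each step mixes the current estimate with the posterior implied by a single new observation, so the data are processed once and in order. The estimate therefore depends on the ordering of the observations, which is one reason the independence assumption matters for the convergence argument.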
5 Measurement error model: small area estimation
We proposed [4] a novel shrinkage-type estimator and derived the optimum value of the shrinkage parameter. The asymptotic value of the shrinkage coefficient depends on the Wasserstein metric between the standardized distributions of the observed variable and the variable of interest. In the process, we also established the necessary and sufficient conditions for a recent conjecture about the shrinkage coefficient to hold. The biggest advantage of the proposed approach is that it is completely distribution-free. This makes the estimators extremely robust, and I also showed that the estimator continues to
