Biography

**A Generalized Online Convex Optimization Framework for Stochastic Nonparametric Learning**
In this paper, we propose a framework for optimizing the Fisher information (FIS) in a data-driven setting with a novel stochastic function estimation strategy. A key to performing this strategy is to take a local approximation (localization factor), which estimates the loss function of the data in terms of a stochastic Fisher information. We show the effectiveness of our framework using two simulated instances where the Fisher information is not available and the data is sampled from a distribution with no information about its distribution. In addition, we propose a simple alternative to our framework in which the information from the Bayesian distribution is available only for the training and regression instances, hence the training criterion is not specified. Finally, we provide a simple algorithm that allows us to evaluate the Fisher information of a data in terms of the stochastic Fisher information.

**Adaptive Reinforcement Learning for Maintaining Reliable Knowledge in Reinforcement Learning**
Conventional reinforcement learning systems are learning based on an iterative strategy. In this case, the goal is to maximize a relative value of the expected reward. Here, the goal is to make each action have a similar, yet distinct reward value in terms of the reward of the action. Based on a previous state of state process, the goal is to estimate a joint probability distribution on the value of the reward of each action. An application of this state process approach in robotics is to improve the performance of robot control. We propose a novel method that learns to predict the reward value of actions with only a small number of predictions for the reward valued by the robot. This approach uses a set of conditional probability distributions to predict the reward value of the action. We show that the reward value of actions can be used to model the behavior of the robot using a novel representation of the reward concept called the joint probability distribution.

**Deep Learning for Predicting Future Performance**
One of the challenges in machine learning is to perform well when its performance depends on the underlying data. In this paper, we propose and study a new class of neural network models, a model without bias. We propose a novel Deep Learning Learning (DL) method to automatically learn a model without bias. Our method performs well on the standard MNIST dataset (5-digit error rate) using a weighted Euclidean distance and a non-gradient method (from the Euclidean distance), while outperforming the conventional DL method using the same dataset. We evaluate our DL method on a classification task using MNIST and a multi-label classification task using Deep Learning (DL) from the MNIST dataset using a supervised learning technique.