Broad Research Goals and Agenda
My research has focused on various aspects of learning theory, both artificial and biological. My projects range from work of a more theoretical nature, in statistical machine learning, functional analysis, and optimization theory, to more experimental, application-oriented work on real-time online learning in high-dimensional movement systems such as robots. A third line of interest applies these theoretical insights to biologically relevant topics: sensorimotor control, visuo-motor learning, and sparse neural coding. The goals of my current research are bi-directional: 1) to develop an analytical understanding of the capabilities of learning systems, leading towards new algorithms and efficient solutions for machine learning problems, with possible inspiration from biology, and 2) to pursue statistical modeling of biological information processing, equipped with a deeper understanding of the computational capabilities and limitations of a particular learning architecture.
Specific Projects & Topics
One of the primary foci of my research agenda has been to understand the analytical and statistical properties of learning systems. The ability to generalize learned results to novel situations is a key requirement for any learning system, and the framework of functional analysis is a good starting point for formalizing such concepts.
(1) Reproducing Kernel Hilbert Space (RKHS) Based Learning Methods We have developed a way of formalizing learning as an inverse optimization problem, employing techniques from functional analysis and Reproducing Kernel Hilbert Spaces. By moving the analysis from the space of training samples to the approximating function space, we have devised a novel method of directly optimizing the generalization error. In contrast to techniques that optimize training error and then regularize to prevent overfitting, this method is theoretically sound and provides better control over the implicit assumptions one makes in solving such optimization problems. The formalization has enabled us to devise efficient methods of exact incremental learning with guaranteed optimal generalization for a particular solution space. This research was a forerunner of the now popular large-margin and kernel methods in the machine learning community. [For related publications, check here] An offshoot of this framework is that it has allowed us to approach the ill-understood problem of active learning from an analytical perspective. [For related publications, check here]
Some example kernels
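As a concrete illustration of the kernels alluded to above, here is a minimal Python sketch (my own illustration, not code from the referenced publications) of two standard RKHS kernels and the Gram matrix whose positive semi-definiteness is what licenses the function-space view of learning:

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # Gaussian (RBF) kernel: k(x, y) = exp(-||x - y||^2 / (2 sigma^2))
    return np.exp(-np.sum((np.asarray(x) - np.asarray(y)) ** 2) / (2.0 * sigma ** 2))

def poly_kernel(x, y, degree=3, c=1.0):
    # inhomogeneous polynomial kernel: k(x, y) = (<x, y> + c)^degree
    return (np.dot(x, y) + c) ** degree

def gram_matrix(kernel, X):
    # Gram matrix K[i, j] = k(x_i, x_j); for a valid RKHS kernel it is
    # symmetric positive semi-definite
    n = len(X)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

X = np.random.RandomState(0).randn(5, 2)
K = gram_matrix(rbf_kernel, X)
```

Any function built as a weighted sum of such kernels centered on data points lives in the corresponding RKHS, which is the solution space referred to in the text.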
Another area of my research centers on the ability to learn incrementally, in particular, to perform online function approximation in real time from many sensory channels. Because real-time learning with high-dimensional systems imposes constraints different from those that typical machine learning algorithms address, I have engaged in the development of learning mechanisms specifically targeted at real-time learning.
(2) Efficient Real-Time Incremental Learning using Nonparametric Methods Adopting the theoretical framework of nonparametric statistics, we have developed a novel statistical tool -- Locally Weighted Projection Regression (LWPR) -- for incremental learning in high-dimensional systems. LWPR performs nonlinear function approximation in high-dimensional spaces in the presence of redundant and irrelevant input dimensions. Dimensionality-reduction techniques that employ very few projection directions, in spite of large input dimensionality, enable the algorithm to scale well to high-dimensional problems. At its core, it uses locally linear models spanned by a small number of univariate regressions along selected directions in input space; a locally weighted variant of Partial Least Squares (PLS) performs the dimensionality reduction. This nonparametric local learning system: i) learns rapidly with second-order methods based on incremental training, ii) uses statistically sound stochastic cross-validation to learn, iii) adjusts its weighting kernels based on local information only, iv) has a computational complexity that is linear in the number of inputs, and v) can deal with a large number of -- possibly redundant -- inputs, as shown in evaluations with up to 50-dimensional data sets. To our knowledge, this is the first truly incremental, spatially localized learning method to combine all these properties. [For related publications, check here]
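A minimal batch sketch of the locally weighted regression idea at the heart of LWPR: Gaussian receptive fields, each holding a local linear model, blended at a query point. This illustration deliberately omits LWPR's defining features (incremental updates, PLS projections, and receptive-field adaptation); all names and parameter values are my own:

```python
import numpy as np

def predict_lwr(x_query, X, y, centers, D=200.0, ridge=1e-6):
    # Batch sketch of locally weighted regression (NOT the full incremental
    # LWPR algorithm): one local linear model per Gaussian receptive field,
    # fit by weighted least squares and blended at the query point.
    preds, wq = [], []
    for c in centers:
        w = np.exp(-0.5 * D * np.sum((X - c) ** 2, axis=1))   # training weights
        wq.append(np.exp(-0.5 * D * np.sum((x_query - c) ** 2)))
        Xb = np.hstack([X - c, np.ones((len(X), 1))])          # local coords + bias
        A = Xb.T @ (w[:, None] * Xb) + ridge * np.eye(Xb.shape[1])
        beta = np.linalg.solve(A, Xb.T @ (w * y))              # weighted least squares
        preds.append(np.append(x_query - c, 1.0) @ beta)
    wq = np.array(wq)
    return wq @ np.array(preds) / wq.sum()                     # blended prediction

rng = np.random.RandomState(0)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X[:, 0])
centers = np.linspace(-1, 1, 9)[:, None]
yhat = predict_lwr(np.array([0.5]), X, y, centers)
```

The distance-metric parameter `D` controls the size of each receptive field; in LWPR proper this metric is itself adapted online from local data, which is what the stochastic cross-validation in point ii) refers to.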
(3) Online Statistical Learning for High Dimensional Movement Systems Implementations on several robotic platforms at USC, the ATR laboratories, and the RIKEN Brain Science Institute in Japan, including a 30-DOF humanoid robot, have demonstrated the potential of this approach: it has become feasible for the first time to learn dynamics models for such high-dimensional systems incrementally in real time. This scalability has enabled us to apply our learning framework to socially relevant projects on learning control for human augmentation (wearable robots), rehabilitation, and fully autonomous learning systems for human-machine interaction. I believe that this research will have a significant impact on many other forms of real-time learning tasks, including online planning, process control, and adaptive guidance systems for unmanned vehicles. Some video clips of the online learning (with LWPR) in action:
[For related publications, check here]
7DOF SARCOS Dexterous Arm
30DOF Humanoid Robot (DB)
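The flavor of second-order incremental learning used in these real-time experiments can be sketched with recursive least squares, which updates a linear model one sample at a time, as an online dynamics model is updated inside a control loop. This is an illustrative stand-in, not LWPR itself; all names and values are my own:

```python
import numpy as np

class RecursiveLeastSquares:
    # Sketch of second-order incremental learning: recursive least squares
    # refines a linear model per sample, with a forgetting factor so the
    # model can track a slowly drifting system.
    def __init__(self, dim, lam=0.999):
        self.beta = np.zeros(dim)        # model parameters
        self.P = np.eye(dim) * 1e3       # inverse-covariance-like matrix
        self.lam = lam                   # forgetting factor

    def update(self, x, y):
        Px = self.P @ x
        k = Px / (self.lam + x @ Px)     # gain vector
        err = y - self.beta @ x          # prediction error before the update
        self.beta += k * err
        self.P = (self.P - np.outer(k, Px)) / self.lam
        return err

rng = np.random.RandomState(0)
true_w = np.array([2.0, -1.0, 0.5])      # unknown "dynamics" to be learned
model = RecursiveLeastSquares(3)
for _ in range(500):
    x = rng.randn(3)
    model.update(x, true_w @ x + 0.01 * rng.randn())
```

LWPR runs a comparable per-sample update inside each receptive field, which is why no learning rate has to be tuned by hand.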
(4) Visuomotor Learning and Multimodal Attention The work on learning in high-dimensional sensory space and motor control has led to a natural extension into multimodal sensor fusion and interaction. I am currently heading a project on multimodal interaction using prototypic robotic vision-head hardware at the RIKEN Brain Science Institute, in collaboration with researchers at USC (see below for details). The aim of this work is to treat the sensory and motor pathways as a strongly coupled system and to reproduce various oculomotor behaviors -- VOR, OKR, smooth pursuit, and other sensory-driven (audio, gyroscopic, etc.) responses -- on robotic hardware using biologically plausible computations. One direct goal of this research is to understand how multimodal sensation guides the generation of coordinated action, and how action in turn guides the perception of the environment. More broadly, I hope this research will shed light on general principles of information processing in sensor-rich environments. Visual-flow-based attention involves three main modules: (i) sensory processing of the input modalities, based on neocortical interaction dynamics and saliency maps, to determine attention locations of interest; (ii) a motor plan for executing the overt attention shift; and (iii) a module that maintains the coordinate updates. [For related publications, check here]
7DOF DB Vision Head with 4 cameras
Cameras - Peripheral and Foveal Vision
MAVERic is a versatile robotic vision head developed for oculomotor research at RIKEN. It has 7 DOF and is controlled using a real-time operating system (VxWorks). MAVERic is equipped with multiple sensory modalities, including position sensing (7 DOFs), load sensing (3 DOFs), stereo microphones, foveal and peripheral vision in each eye, and a 6-axis gyroscope, in addition to laser range finders in each eye.
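A toy sketch of the saliency-map stage of the attention pipeline described above: center-surround contrast computed as the difference of a fine and a coarse blur, with a winner-take-all readout of the attention target. This is my own simplified illustration, not the system's actual implementation:

```python
import numpy as np

def box_blur(img, r):
    # naive box blur with clamped borders (fine for a toy example)
    h, w = img.shape
    out = np.empty_like(img, dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = img[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1].mean()
    return out

def saliency(img):
    # center-surround contrast: fine-scale response minus coarse-scale
    # response, half-wave rectified -- a crude stand-in for the saliency
    # maps mentioned in the text
    return np.maximum(box_blur(img, 1) - box_blur(img, 4), 0.0)

img = np.zeros((32, 32))
img[10:14, 20:24] = 1.0                           # one bright patch in a dark scene
s = saliency(img)
attend = np.unravel_index(np.argmax(s), s.shape)  # winner-take-all attention target
```

In the full system, maps like this one from several modalities would be fused before module (ii) converts the winning location into an overt gaze shift.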
(5) Sparse Function Approximation and Coding To understand information processing and computation in biological systems, I strongly believe that in addition to elucidating the pathways and connections at various scales, it is essential to examine the underlying computational principles. I have worked (to a limited extent) on the principles of sparse representation for the decomposition of natural images and neural codes, as well as on methods of minimizing the effect of noise variance in learning systems with unreliable, stochastic components -- not unlike our own neural system. [For related publications, check here]
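One common formalization of sparse decomposition, in the spirit of this line of work though not necessarily the specific method used, is the L1-penalized coding problem solved here by iterative soft thresholding (ISTA); the dictionary and parameters are purely illustrative:

```python
import numpy as np

def ista(D, x, lam=0.05, steps=200):
    # ISTA solves min_a 0.5 * ||x - D a||^2 + lam * ||a||_1, producing a
    # sparse coefficient vector a: most entries are driven exactly to zero
    # by the soft-threshold step.
    L = np.linalg.norm(D, 2) ** 2              # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(steps):
        g = D.T @ (D @ a - x)                  # gradient of the quadratic term
        z = a - g / L
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

rng = np.random.RandomState(0)
D = rng.randn(20, 50)
D /= np.linalg.norm(D, axis=0)                 # normalized, overcomplete dictionary
a_true = np.zeros(50)
a_true[[3, 17]] = [1.0, -0.8]                  # sparse ground-truth code
a = ista(D, D @ a_true)
```

The recovered code `a` concentrates its mass on the few active dictionary elements, which is the property that makes sparse codes attractive as models of neural representation.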
(6) Bayesian Nonparametric Learning One of my recent thrusts has been parameter-free learning -- in other words, eliminating the learning-rate parameters of gradient-descent approaches and the various initialization issues of the maximum-likelihood framework, while maintaining the advantages of local nonparametric techniques. Bayesian nonparametric learning is a step towards these goals: placing hyperparameters on the adaptive distance metric allows the optimum locality to be determined automatically. This work will provide a much-needed bridge between theoretically sound Bayesian learning methods and highly adaptive, efficient nonparametric learning techniques. [For related publications, check here]
Automatic Adjustment of the Kernel Distance Metric
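To illustrate the idea of letting the data determine the locality automatically, here is a sketch that selects a kernel length-scale (a one-parameter distance metric) by maximizing a Gaussian-process log marginal likelihood rather than by hand-tuning. This is an illustrative analogue of the principle, not the Bayesian nonparametric algorithm of the project; the function names, noise level, and candidate scales are my own:

```python
import numpy as np

def gp_log_evidence(X, y, length_scale, noise=0.1):
    # Log marginal likelihood of a zero-mean GP with an RBF kernel.
    # Maximizing it over length_scale trades data fit (y' K^-1 y) against
    # model complexity (log det K), so the "right" locality is chosen
    # without a hand-set learning rate or kernel width.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-0.5 * d2 / length_scale ** 2) + noise ** 2 * np.eye(len(X))
    sign, logdet = np.linalg.slogdet(K)
    alpha = np.linalg.solve(K, y)
    return -0.5 * (y @ alpha + logdet + len(X) * np.log(2 * np.pi))

rng = np.random.RandomState(0)
X = rng.uniform(-2, 2, (40, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.randn(40)
scales = [0.01, 0.1, 0.5, 1.0, 5.0]
best = max(scales, key=lambda s: gp_log_evidence(X, y, s))
```

Both too-narrow and too-wide kernels receive a lower evidence score, which is the mechanism by which hyperparameters on the distance metric can pick out the optimum locality automatically.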
(7) Miscellaneous In addition to the topics mentioned above, I am also interested (to varying degrees) in the additional topics listed below. See also the interesting video clips from the associated research.