Research Interests

Machine Learning (Supervised and Semi-supervised) for Speech Processing

At a very broad level, most speech-based information recognition systems (speech recognition, speaker recognition, language recognition, emotion recognition, etc.) operate in a similar manner: a suitable machine learning system learns classification rules from domain-specific labelled data (e.g., speaker labels for speaker recognition, emotion labels for emotion recognition). These systems are generally developed and operated independently and do not use joint models across tasks, even though the human brain makes extensive use of joint models. A significant hindrance to developing systems that incorporate information from multiple domains is the lack of suitably labelled data: while a number of large databases exist for each individual task, almost none carry labels for multiple targets. Semi-supervised learning algorithms aim to learn structure in the data space from unlabelled data and exploit it, together with a small amount of labelled data, to infer better classification rules than would be possible with the labelled data alone.

Specific Topics of Interest

  • Speech-based emotion recognition

  • Detecting depression from speech

  • Speaker verification

  • Speech-based cognitive load estimation

  • Language identification