Research on data-driven statistical methods for speech recognition started in the group in the late eighties. It was continued and expanded in the nineties to cover both speech and speaker recognition, and it grew strongly in the first decade of the new century, when the group started a research line on machine translation and the speech synthesis work shifted to statistical machine learning techniques. In recent years, most research areas in the group have incorporated deep learning with neural networks, so deep neural networks (DNNs) have become a backbone of our current and future research activities, as has happened in many of the main research groups in our area.
The goals in speech recognition are the development of new architectures based on end-to-end deep learning techniques as an alternative to traditional HMM-based speech recognition systems, the design of new joint training procedures for the acoustic and language models, and the development of more powerful language models based on recursive DNNs.
In speech recognition, the following lines of research have been undertaken:
One of the ultimate goals of this research is to improve the performance of large-vocabulary automatic speech recognition systems in order to obtain high-quality multilingual speech-to-speech translation systems. A specific application would be the translation of parliamentary speeches.