Environment for building text-to-speech synthesis for Icelandic

Text-to-speech (TTS) for Icelandic is a 24-month project conducted at Reykjavík University, funded by The Icelandic Language Technology Fund. The goal is to make Icelandic text-to-speech software openly available to software developers, academics and the wider community. We will set up an environment for building a statistical parametric speech synthesis system for Icelandic and release an evaluated baseline system for future development.
Funding: The Icelandic Language Technology Fund
Article, data, code: https://github.com/cadia-lvl/SLT2018, “Bootstrapping a Text Normalization System for an Inflected Language. Numbers as a Test Case”
Contributors: Anna Björk Nikulásdóttir, Atli Þór Sigurgeirsson, Alexander Danielsson Moses
Timeline: September 2017 – March 2018
Voice patterns in air-traffic control

The aim of this project is to build a system that monitors the voice patterns of air traffic controllers and assesses cognitive workload in real time. The project is a collaboration between CADIA, the Icelandic air traffic control provider Isavia, Tern Systems and Icelandair.
Contributor: Eydís Huld Magnúsdóttir
Cognitive workload monitoring using voice

The focus of this project is to discover the relationship between cognitive workload, task performance, voice patterns and physiological signals. Cognitive workload experiments are set up to obtain voice and physiological data recordings. Speech signal processing and pattern recognition algorithms are used to discover how cognitive workload affects the voice. This project is a collaboration between The Speech Pattern Recognition Group and The Cognitive Signals Group.
Contributor: Eydís Huld Magnúsdóttir
ASR for parliamentary speeches

All parliamentary speeches have to be stored in text format and kept accessible to the public.
Currently, considerable time is spent on transcribing the speeches. Automatic speech recognition can substantially reduce the manual labor needed to transcribe each speech.
In this project we are further developing the automatic speech recognizer that has been used in the transcription process for the Icelandic parliament since 2018.
Funding: Althingi
Articles: Althingi ASR System (poster), Building an ASR corpus using Althingi’s Parliamentary Speeches, Frá stálþræði til gervigreindar (news article in Icelandic)
Data: Malfong.is – data
Code: github – kaldi/egs/althingi/s5, lirfa
Contributors: Inga Rún Helgadóttir, Judy Fong
Timeline: May 2016 – August 2019
Free and Open Speech Recognition for Icelandic

Free and Open Speech Recognition for Icelandic is a 12-month project conducted at Reykjavík University, funded by The Icelandic Language Technology Fund. The project builds a freely accessible environment in which Icelandic speech recognition systems can be developed and customised.
Funding: The Icelandic Language Technology Fund and Menntamálaráðuneytið
Code: ICE-ASR code repository
Website: Free Online Automatic Speech Recognizer – Tal
Contributors: Anna Björk Nikulásdóttir, Róbert Kjaran Ragnarsson
Objective Assessment of Voice Quality

The human voice can assume various qualities, such as breathy voice, modal voice, and pressed voice. The assessment of voice quality has been used clinically for the diagnosis and management of voice disorders. However, professional rating of voice quality can be labor-intensive and subjective.
This project aims to automate this rating procedure. The primary focus is on the use of signal processing algorithms to extract suitable features from different speech-related signals.
Our current research uses an inverse-filtering algorithm to obtain glottal flow estimates from speech signals. Features extracted from these estimates are used to train and test a voice quality predictor.
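To make the idea of inverse filtering concrete, here is a minimal sketch, not the project's own algorithm. It assumes the simplest textbook variant: linear prediction (autocorrelation method) models the vocal tract, the speech frame is passed through the resulting inverse filter, and a leaky integrator stands in for lip-radiation cancellation. The function names, the default order of 12 and the leak factor 0.99 are illustrative assumptions. Practical methods are considerably more elaborate.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation values r[0..max_lag] of the frame x."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns [1, a1, ..., ap] for A(z)."""
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        return a  # silent frame: nothing to model
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a


def lpc_coefficients(frame, order):
    """Estimate the all-pole vocal-tract model of a frame (no windowing here)."""
    return levinson_durbin(autocorrelation(frame, order), order)


def inverse_filter(x, a):
    """Apply the FIR filter A(z) to x, removing the modelled resonances."""
    return [
        sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
        for n in range(len(x))
    ]


def leaky_integrate(x, leak=0.99):
    """Leaky integration; an assumed, simplified lip-radiation cancellation."""
    out, prev = [], 0.0
    for v in x:
        prev = v + leak * prev
        out.append(prev)
    return out


def estimate_glottal_flow(frame, order=12, leak=0.99):
    """Rough glottal flow estimate: LPC inverse filtering, then integration."""
    a = lpc_coefficients(frame, order)
    return leaky_integrate(inverse_filter(frame, a), leak)
```

Calling `estimate_glottal_flow(frame)` on a list of samples from a voiced frame returns a waveform of the same length, from which features could then be computed.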
The secondary focus is on the use of machine learning schemes to construct assistant systems for medical practice. Our current research in this area uses Gaussian mixture models and deep neural networks for speech modality classification.
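As a toy illustration of Gaussian-mixture classification, the sketch below fits one mixture per voice modality with EM and assigns a new observation to the class whose mixture gives it the highest likelihood. It is not the project's system: it assumes a single hypothetical scalar feature per utterance, whereas a real classifier would use multi-dimensional features. The function names, the component count and the variance floor are all illustrative choices.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm(data, n_components=2, n_iter=50, min_var=1e-3):
    """Fit a 1-D Gaussian mixture with EM; returns a list of (weight, mean, var)."""
    ordered = sorted(data)
    n = len(data)
    # Deterministic initialisation: means at evenly spaced quantiles.
    means = [ordered[int((i + 0.5) * n / n_components)] for i in range(n_components)]
    centre = sum(data) / n
    var0 = max(sum((x - centre) ** 2 for x in data) / n, min_var)
    variances = [var0] * n_components
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            p = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means and variances.
        for k in range(n_components):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(spread, min_var)
    return list(zip(weights, means, variances))


def log_likelihood(x, gmm):
    """Log-likelihood of one observation under a fitted mixture."""
    return math.log(sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm) + 1e-300)


def classify(x, models):
    """Return the label whose mixture assigns x the highest likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))
```

Here `models` is a dictionary mapping a label such as "breathy", "modal" or "pressed" to the mixture returned by `fit_gmm` on that class's training values, and `classify(value, models)` returns the best-matching label.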

Funding: This work was supported in part by The Icelandic Centre for Research (RANNIS) under the project Model-Based Speech Production Analysis and Voice Quality Assessment under Grant 152705-051.
Code: Source code for evaluation of inverse-filtering algorithms
Dataset: Data set of synthesized sustained vowels
Dataset: Data set of synthesized continuous speech
Contributors: Dr. Yu-Ren Chien, Michal Borský
Timeline: November 2015 – November 2018