Former Projects

Named Entity Recognition – dataset and baselines

Named entities in Icelandic

Named entity recognition (NER) is the task of finding and classifying the named entities (names of people, places, organizations, events, etc.) that appear in text. It is a common preprocessing step for downstream tasks such as question answering and machine translation. The aim of this research project is to create the first labelled corpus for Icelandic NER and to use machine learning methods to train a named entity recognizer for Icelandic. This involves labelling all named entities in a text corpus of 1 million tokens (MIM-GOLD) with the following categories: Person, Location, Organization, Miscellaneous, Money, Percent, Time, and Date. Using this new data, different machine learning methods (both traditional and deep learning methods) will be tested, and the best performing models selected and combined into a new named entity recognizer for Icelandic. The project is carried out in collaboration with Nasdaq Iceland.
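To illustrate what a labelled NER corpus looks like in practice, the sketch below groups token-level tags into entity spans. It assumes a BIO tagging scheme (B- marks the first token of an entity, I- a continuation, O a token outside any entity); the project description does not say which encoding MIM-GOLD uses, and the example sentence and the function name bio_to_spans are made up for illustration. Only the category names come from the description above.

```python
from typing import List, Tuple

# Category names from the project description. The BIO encoding and the
# example sentence below are assumptions made for illustration only.
CATEGORIES = ["Person", "Location", "Organization", "Miscellaneous",
              "Money", "Percent", "Time", "Date"]


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (entity text, category) pairs."""
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_cat = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity starts; close the previous one if there is one.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_cat))
            current_tokens, current_cat = [token], tag[2:]
        elif tag.startswith("I-") and current_cat == tag[2:]:
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open entity:
            # close the open entity and treat this token as outside.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_cat))
            current_tokens, current_cat = [], None
    if current_tokens:
        spans.append((" ".join(current_tokens), current_cat))
    return spans


# Hypothetical example sentence (English used for readability).
tokens = ["Anna", "Jónsdóttir", "visited", "Akureyri", "on", "17", "June", "."]
tags = ["B-Person", "I-Person", "O", "B-Location", "O", "B-Date", "I-Date", "O"]

for text, category in bio_to_spans(tokens, tags):
    print(f"{category}: {text}")
```

A recognizer trained on such data would predict one tag per token, and a grouping step like this one turns the predictions back into entities.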

Funding: Strategic Research and Development Programme for Language Technology (Markáætlun í tungu og tækni), Nasdaq Iceland

Contributors: Ásmundur Alma Guðjónsson, Svanhvít Lilja Ingólfsdóttir, Hrafn Loftsson

Timeline: May 2019 – May 2020.


Eyra – speech data acquisition

Eyra is a free and open source project designed to provide tools to gather speech data for languages.

Speech data acquisition is particularly important for under-resourced languages. Data gathering is the most labour-intensive part of developing speech technologies such as automatic speech recognizers and synthesizers.

A screenshot from the Eyra software recording screen where the prompts are read

Eyra aims to make this task cheaper and better by providing a free platform that handles the data acquisition process. Eyra analyzes incoming data and can give feedback on its quality, with the aim of improving the recordings that are collected.

Designed with flexibility in mind, Eyra is a web app compatible with most browsers, and it can run offline by using a laptop as a server. It is open source, so you can contribute, use only parts of it, or modify it to suit your needs (e.g. if you want to use pictures instead of prompts).

Currently, Eyra is being used to collect children’s speech data in collaboration with the University of Akureyri.

Funding: Google

Code: https://github.com/Eyra-is/Eyra
Article: SLTU 2016, Building ASR Corpora Using Eyra

Contributors: Matthías Pétursson, Róbert Kjaran, Simon Klüpfel, Judy Fong, Stefán Jónsson

Timeline: Autumn 2016 – Ongoing



Natural Language Understanding model for Airline Reservation System

This is a two-semester, 60 ECTS Master's project. The goal is to create a dialog system that can understand users' requests, such as finding a flight or booking it.

One of the core components of a Spoken Dialog System (SDS) is the natural language understanding (NLU) model.
The main functionality of the NLU is to create structured data from the users’ requests, which might, for example, be asking for flight information, airline information, etc.
This information is extracted using an intent classification and slot-filling model that has been trained on a dataset composed of user requests.
The ATIS dataset is widely used as a standard benchmark for English NLU.
This project is split up into the following tasks:

  • Create a Text Annotation Tool.
  • Create an Icelandic translated version of the ATIS dataset (ICE-ATIS), with the use of the Text Annotation Tool.
  • Run ICE-ATIS through various existing models to see how the results compare with those of other NLU models trained on the ATIS dataset.
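
The structured data that an NLU component produces from a user request can be sketched as an intent plus a set of filled slots. The example below is a minimal illustration and not the project's actual model: the Frame class and the build_frame function are hypothetical names, the intent and slot labels follow the ATIS naming style but are assumed here, and the tags are written by hand rather than predicted.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Frame:
    """Structured result for one request: one intent and named slot values."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


def build_frame(tokens: List[str], slot_tags: List[str], intent: str) -> Frame:
    """Collect BIO-style slot tags into a frame (hypothetical helper)."""
    frame = Frame(intent=intent)
    for token, tag in zip(tokens, slot_tags):
        if tag == "O":
            continue  # token carries no slot information
        prefix, name = tag.split("-", 1)
        if prefix == "B" or name not in frame.slots:
            # Start of a slot value; a repeated slot name overwrites the old value.
            frame.slots[name] = token
        else:
            # I- tag: extend the value that is already being built.
            frame.slots[name] += " " + token
    return frame


# Hand-written example in the ATIS style; in a real system both the
# intent and the slot tags would be predicted by the trained model.
tokens = "show me flights from new york to denver".split()
slot_tags = ["O", "O", "O", "O",
             "B-fromloc.city_name", "I-fromloc.city_name",
             "O", "B-toloc.city_name"]

frame = build_frame(tokens, slot_tags, intent="atis_flight")
print(frame.intent)
print(frame.slots)
```

A dialog system can then act on the frame, for example by querying a flight database with the origin and destination cities.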

Funding: Icelandair

Contributors: Egill Anton Hlöðversson, Jón Guðnason

Timeline: Autumn 2019 – Spring 2020

Text Annotation Tool: https://github.com/egillanton/flask-text-annotation-tool
Live Server: http://egillanton.pythonanywhere.com/ (temporary)
ICE-ATIS: https://github.com/egillanton/ice-atis


Broddi: Voice-controlled Information Delivery

The goal of the project is to design a system that enables voice-driven delivery of web content, such as content from news sites, blogs or radio programs. The idea is to have an environment specifically designed for audio interaction and not tied to the visual layout of a web page. The user chooses content with voice commands, and the content is then presented as audio, e.g. recorded speech, generated speech, a radio episode or a podcast. A pure audio interface is useful in situations where hands and eyes are busy, such as when driving, cooking or running, as well as for people with disabilities.

Funding: Tækniþróunarsjóður and menntamálaráðuneytið

iOS App: Broddi

Contributors: Kristján Rúnarsson, Róbert Kjaran, Stefán Jónsson

Timeline: Autumn 2017 – Winter 2020

News Article: Raddstýrður fréttalesari (Icelandic)



Environment for building text-to-speech synthesis for Icelandic

Text-to-speech (TTS) for Icelandic is a 24-month project conducted at Reykjavík University, funded by The Icelandic Language Technology Fund. The goal is to make Icelandic text-to-speech software available and open to the software development community, academics and the wider community. We will set up an environment for building a statistical parametric speech synthesis system for Icelandic and release an evaluated baseline system for future development.

Funding: The Icelandic Language Technology Fund

Article, data, code: https://github.com/cadia-lvl/SLT2018, “Bootstrapping a Text Normalization System for an Inflected Language. Numbers as a Test Case”

Contributors: Anna Björk Nikulásdóttir, Atli Þór Sigurgeirsson, Alexander Danielsson Moses

Timeline: September 2017 – March 2018


Voice patterns in air-traffic control

Air traffic controller in Reykjavik Airport Tower.

The aim of this project is to build a system that monitors the voice patterns of air traffic controllers and assesses cognitive workload in real time. The project is a collaboration between CADIA, the Icelandic air traffic control provider Isavia, Tern Systems and Icelandair.

Contributor: Eydís Huld Magnúsdóttir


Cognitive workload monitoring using voice

Recording people performing the Stroop task.

The focus of this project is to discover the relationship between cognitive workload, task performance, voice patterns and physiological signals. Cognitive workload experiments are set up to obtain voice and physiological recordings. Speech signal processing and pattern recognition algorithms are used to discover how cognitive workload affects the voice. This is a collaborative project between The Speech Pattern Recognition Group and The Cognitive Signals Group.

Contributor: Eydís Huld Magnúsdóttir


ASR for parliamentary speeches

All parliamentary speeches have to be stored in a text format and kept accessible to the public.
As it is, considerable time is spent on transcribing the speeches. Automatic speech recognition can considerably reduce the manual labor needed to transcribe each speech.
In this project we are further developing the automatic speech recognizer that has been used in the transcription process for the Icelandic parliament since 2018.

Funding: Althingi

Articles: Althingi ASR System (poster), Building an ASR corpus using Althingi’s Parliamentary Speeches, Frá stálþræði til gervigreindar (news article in Icelandic)

Data: Malfong.is – data

Code: github – kaldi/egs/althingi/s5, lirfa

Contributors: Inga Rún Helgadóttir, Judy Fong

Timeline: May 2016 – August 2019


Free and Open Speech Recognition for Icelandic

Free and Open Speech Recognition for Icelandic is a 12-month project conducted at Reykjavík University, funded by The Icelandic Language Technology Fund. The project builds a freely accessible environment where Icelandic speech recognition systems can be developed and customised.

Funding: The Icelandic Language Technology Fund and Menntamálaráðuneytið

Code: ICE-ASR code repository
Website: Free Online Automatic Speech Recognizer – Tal

Contributors: Anna Björk Nikulásdóttir, Róbert Kjaran Ragnarsson


Objective Assessment of Voice Quality

The human voice can assume various qualities, such as breathy voice, modal voice, and pressed voice. The assessment of voice quality has been used clinically for the diagnosis and management of voice disorders. However, professional rating of voice quality can be labor-intensive and subjective.

This project aims to automate this rating procedure. The primary focus is on the use of signal processing algorithms to extract suitable features from different speech-related signals.

Our current research uses an inverse filtering algorithm to obtain glottal flow estimates from speech signals, from which features can be extracted and used in the training and test procedures for a voice quality predictor.
The secondary focus is on the use of machine learning schemes to construct assistant systems for medical practice. Our current research in this area uses Gaussian mixture models and deep neural networks for speech modality classification.

Funding: This work was supported in part by The Icelandic Centre for Research (RANNIS) under the project Model-Based Speech Production Analysis and Voice Quality Assessment under Grant 152705-051.

Code: Source code for evaluation of inverse-filtering algorithms
Dataset: Data set of synthesized sustained vowels
Dataset: Data set of synthesized continuous speech

Contributors: Dr. Yu-Ren Chien, Michal Borský

Timeline: November 2015 – November 2018.