First milestone in the Language Technology for Icelandic project

The LVL team celebrating the first milestone in the Language Technology for Icelandic project. Ólafur Helgi Jónsson, Sunneva Þorsteinsdóttir and Steinþór Steingrímsson are missing from the picture.

Last week we celebrated achieving the first milestone in the Language Technology for Icelandic project with a cake!

After a lot of hard work over the past few months, we achieved the first milestone in Automatic Speech Recognition (ASR), Text-to-Speech (TTS) and Machine Translation (MT).

In ASR, the focus has mostly been on data creation and gathering. So far, 55,000 utterances donated by adults have been collected via the crowd-sourcing platform samromur.is (based on Common Voice), with plans to reach 100,000 utterances for the next milestone. The process is being extended to include younger voices in collaboration with schools and authorities. Today we started working with Öldutúnsskóli in Hafnarfjörður. The goal is to reach 80,000 young-voice utterances for the next milestone. Additionally, data has been gathered from RÚV (audio, video and subtitles) and CreditInfo (transcriptions). Along with data gathering, the team is also developing tools to post-process Icelandic ASR text for better readability.

In TTS, we successfully created a voice recording client (LOBE) and three reading scripts in order to collect high-quality speech and corresponding text data. The reading scripts were created from Risamálheild and seek to maximize diphone coverage. So far 20 hours have been collected from two speakers, male and female. The aim is to finish collecting 20 hours from each speaker early this year. From the collected data, two TTS prototypes have been created in Ossian, which extends the Merlin back-end. The current prototypes are quite naive, but we have integrated a grapheme-to-phoneme model for Icelandic into them.
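To illustrate what "maximizing diphone coverage" can mean in practice, here is a minimal sketch of a greedy selection: repeatedly pick the sentence that adds the most diphones not yet covered. This is not necessarily how the LVL reading scripts were built. The function names, the toy sentences and the placeholder phoneme symbols are all assumptions made for illustration, and a real pipeline would get its phoneme sequences from a grapheme-to-phoneme model.

```python
from typing import Dict, List, Set, Tuple

Diphone = Tuple[str, str]


def diphones(phonemes: List[str]) -> Set[Diphone]:
    """Return the set of adjacent phoneme pairs in one utterance."""
    return set(zip(phonemes, phonemes[1:]))


def greedy_script(candidates: Dict[str, List[str]], max_sentences: int) -> List[str]:
    """Greedily pick sentences that each add the most unseen diphones.

    `candidates` maps a sentence to its phoneme sequence (hypothetical input format).
    """
    covered: Set[Diphone] = set()
    remaining = {text: diphones(ph) for text, ph in candidates.items()}
    script: List[str] = []
    while remaining and len(script) < max_sentences:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - covered))
        if not remaining[best] - covered:
            break  # no remaining sentence adds a new diphone
        covered |= remaining.pop(best)
        script.append(best)
    return script


# Toy data: sentence labels and phoneme symbols are placeholders, not real Icelandic.
toy_candidates = {
    "s1": ["a", "b", "c", "d"],
    "s2": ["a", "b", "c"],
    "s3": ["d", "e"],
}
selected = greedy_script(toy_candidates, max_sentences=3)
```

Greedy selection is a common heuristic for coverage problems like this one. It does not guarantee the smallest possible script, but it is simple and works on large candidate pools.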

In MT, we successfully created a phrase-based statistical machine translation system using the open-source tool Moses. Our collaborators at Miðeind created neural machine translation systems based on BiLSTMs and Transformers. The models were trained on the newly available English-Icelandic parallel corpus, ParIce. The systems were then evaluated w.r.t. training time, throughput and BLEU score. The code and systems are freely available but are still under development for milestone two. In milestone two we will continue to develop the systems further and adjust them to specific needs of the Icelandic language.
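For readers unfamiliar with BLEU, the sketch below shows the idea behind the score: clipped n-gram precision against a reference translation, combined with a brevity penalty. It is a simplified, unsmoothed, single-reference, sentence-level version written for illustration only. It is not the evaluation code used in the project, and published scores are normally computed at corpus level with an established tool. The function names and the example sentences are assumptions.

```python
import math
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(hypothesis: List[str], reference: List[str], max_n: int = 4) -> float:
    """Single-reference sentence BLEU without smoothing, in the range 0.0 to 1.0."""
    if not hypothesis:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp = ngrams(hypothesis, n)
        ref = ngrams(reference, n)
        # Clip each hypothesis n-gram count by its count in the reference.
        matches = sum(min(count, ref[gram]) for gram, count in hyp.items())
        total = sum(hyp.values())
        if matches == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives a zero score
        log_precisions.append(math.log(matches / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hypothesis) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / len(hypothesis))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


# Made-up example sentences, tokenized by whitespace.
reference = "the cat sat on the mat".split()
perfect = bleu(reference, reference)
truncated = bleu(reference[:5], reference)
```

A hypothesis identical to the reference scores 1.0, while a correct but truncated hypothesis is pulled down only by the brevity penalty.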

2019 Rannis Grants

Rannis (The Strategic Research and Development Programme for Language Technology) has awarded Hrafn two grants this year. Congratulations! The first project, Automatic Text Summarization (ATS) for Icelandic, will be worked on by a post-doctoral researcher and an Icelandic linguist in collaboration with mbl.is, Morgunblaðið’s news website. The second one is Named Entity Recognition (NER) for Icelandic. Svanhvít Lilja Ingólfsdóttir and Ásmundur Guðjónsson, two students from the Language Technology (Máltækni) master's program, will work on the NER project in collaboration with the Icelandic Stock Exchange. Welcome to LVL!

Anna Björk has also been awarded a grant, for her company, Grammatek ehf., in cooperation with the city of Akranes. Congratulations and we wish you all the best with your new endeavor!

More information regarding the ATS post-doctoral research position can be found at https://lvl.ru.is/jobs.

Eydís has successfully defended her PhD thesis!

Eydís and the LVL members at her celebration.

We are pleased to announce that Eydís successfully defended her PhD thesis, “Cognitive workload classification with psychophysiological signals for monitoring in safety critical situations”, on the 18th of January. Over the past few years, Eydís has worked on a dissertation studying the effect that an increased cognitive workload has on acoustic and cardiovascular signals. She collected data from over 100 participants in a simulated environment, which she analyzed quantitatively and qualitatively. The key contribution of her thesis is in using a signal processing approach and showing that an involuntary response of the cardiovascular system can very accurately reflect one’s mental effort during a task. The thesis is a result of her cooperation with Isavia and their effort to improve the management of people working in an air traffic control environment.

Congratulations Eydís!

Language Technology Seminar this Saturday

The cooperation between LVL and other leading Icelandic organizations is increasing. Tomorrow Reykjavik University and Societas Scientiarum Islandica (Vísindafélag Íslendinga) are holding a seminar and panel discussion on the current progress and the future of implementing language technologies for Icelandic.

It will be held in room M105 at Reykjavik University. Hrafn Loftsson, of LVL, will moderate the seminar, which starts at 13:30. It will consist of talks from a professor at the University of Iceland, the chairman of Almannaromur, Jón Guðnason of LVL, and the director of Miðeind ehf. A panel discussion will follow.

We welcome everyone to attend the lively Saturday afternoon discussion!

Researchers’ Night

This Friday is Researchers’ Night (Vísindavaka Rannís 2018). It is an all-ages event on the 28th of September, 2018, from 16:30 – 22:00 at Laugardalshöllin, Reykjavik.

We will be there with Reykjavik University demonstrating what speech technology can do: evaluating collected speech data (Eyra), testing the accuracy of an automatic speech recognizer (ASR) – https://tal.ru.is, listening to a text-to-speech synthesizer, and telling your phone to read the news to you. Come try out the state of the art in Icelandic speech technology, and tell us what you think!

Researcher by Nick Youngson CC BY-SA 3.0 ImageCreator10

Student Projects Available

For students of Reykjavik University and summer exchange students, we now have a list of available student projects. They are listed at https://lvl.ru.is/student-projects/ and can also be reached from the menu of the LVL website under Student Projects. They range from straightforward to difficult and are suitable for undergraduate final projects, master's students, and PhD students. If you want to work on one, please contact the people listed in the contact column, and they can give you more details to get you started. We look forward to hearing from you!

Using language technology to assist the hard of hearing

The Nordic association of the hard of hearing (Nordiska Hörselskadades Samarbetskommitté, NHS) held a seminar at Hotel Selfoss last week. On Friday, Anna gave a talk there on how language technology might help people who are hard of hearing to communicate and access information in a predominantly hearing world. Automatic transcription of live communication and automatic captioning of video material already work for English and some other languages, and the Nordic participants of the seminar were eager to see this technology advance in their languages. At LVL, we are working on open ASR systems, making the development of technology like this possible for Icelandic.

The rest of the slides can be viewed by selecting the first slide below.
