This week, LVL sat down with Mycroft to discuss collaborating to bring more speech technology to Iceland. We discussed using Mozilla’s Common Voice to create another open source Icelandic speech dataset, and possibly an Icelandic voice assistant. The Mozilla project requires just 5,000 phrases, which anyone can contribute, even you!
Neural network models grow more complex every year, and it takes a lot to keep computational hardware fast enough to train them efficiently. Earlier this year, we applied to the RANNIS Infrastructure Fund for funding to expand our current HPC cluster. We are happy that our proposal, “Deep Learning Infrastructure for Speech and Language Technology,” was selected for funding. Only three grants were awarded to Reykjavik University, and ours was one of them. This money will allow us to buy a fully equipped SuperMicro 4028GR-TR2 server with NVIDIA 1080Ti GPUs. We hope to sign the grant contract within several weeks and then order the machine. Next comes the process of assembling it and integrating it into our current cluster. Several group members can’t wait to have more computing power available to them.
This Thursday, Anna will represent the LVL group at The Future of the Icelandic Language in New York. The event is being hosted as part of the “100 ára fullveldi Íslands” (translation: 100 years of sovereignty in Iceland) celebration. Anna will be a panelist in the “The Icelandic Language and Technology” discussion, along with leading international figures in the language and technology community.
Photo from the event:
This year, Jón will represent our LVL group at the LREC 2018 conference, happening this week in gorgeous Miyazaki, Japan. Jón will present the paper “Open ASR for Icelandic: Resources and a Baseline System.” As co-author Anna says, “The paper describes the language resources used in the project Open ASR for Icelandic: the Málrómur speech corpus, the Leipzig Corpora Collection and the Icelandic pronunciation dictionary, and their processing for the utilization in the training of the ASR system. Furthermore, we experiment with different content of the acoustic training corpus to examine the impact of carefully selected speech data on the WER of the ASR system.” To learn more, read their paper and visit ASR_Resources_LREC2018_A0_portrait_final for the details that they weren’t able to fit in.
We wish Jón a good conference and lots of fun on a different island!
For those who weren’t able to view the poster in person, we have a PDF of it below:
On the 27th of April, we will present the web portal for our project, Free and Open Speech Recognition for Icelandic. The ASR will be the opening topic of the Language Technology Seminar, followed by other language technology talks and presentations. The seminar takes place at Reykjavík University, room M101, on Friday, April 27th, 2018, at 12:00 PM.
The automatic speech recognizer was previously demoed at UTMessan and University Day, but after the seminar it will be publicly available at https://tal.ru.is/ for newer devices. To get a better idea of how our Open Icelandic ASR works, watch the following Icelandic news segment:
But if you don’t know any Icelandic and just want to use the API, then follow us for updates.
LVL grows even larger with Associate Professor Hrafn Loftsson joining our lab. His past work includes IceNLP and the Almannaromur Icelandic speech corpus. Natural Language Processing is his primary focus.
Without further ado, LVL would like to welcome Hrafn, and we hope that he’ll lend his expertise to the rest of the LVL projects. If you would like to read more about him, you can check out his bio or his Reykjavik University page.
On an unrelated note, we have also added another project to our projects page: TTS for Icelandic.
Last fall, Michal was interviewed by a broadcast journalist for the BBC Arabic segment BBC 4 Tech. The interview covered some early research he did for the “Cognitive workload monitoring using voice” project. The BBC 4 Tech segments have been uploaded to YouTube with voice-overs in Arabic.
Thanks to the BBC team, you’ll get the distinct pleasure of seeing one of our own speaking Arabic, albeit dubbed, if you press play: