INTERSPEECH 2015, one of the most prestigious international conferences about Speech and Language Processing, was held in Dresden, Germany, on September 6-10, 2015.
OCTAVE was part of the conference thanks to the contributions and presence of several members of the Consortium: Eurecom, Fondazione Ugo Bordoni, Aalborg University, University of Eastern Finland, and ValidSoft.
Among the various topics concerning speech and language processing, a dedicated session on Automatic Speaker Verification spoofing and countermeasures (ASVspoof) was held, highlighting the interest of the research community in this area, which is specifically addressed within Work Package 3 of the OCTAVE project.
The ASVspoof initiative was a follow-up to the first special session on Spoofing and Countermeasures for Automatic Speaker Verification (ASV), held during the 2013 edition of INTERSPEECH in Lyon, France. ASVspoof 2015 incorporated a standard challenge designed to support, for the first time, independent assessments of vulnerabilities to spoofing and of countermeasure performance. The initiative provides a level playing field that eases the comparison of different spoofing countermeasures on a common dataset, with standard protocols and metrics. While preventing as much as possible the inappropriate use of prior knowledge, the challenge also aims to stimulate the development of generalized countermeasures with the potential to detect varying and unforeseen spoofing attacks.
Given a speech utterance, each spoofing detection algorithm produces a score that is compared with a preset threshold to decide whether the utterance is genuine speech or a spoofing attack. The algorithm can make two kinds of error: genuine speech can be mistaken for a spoofing attack, or a spoofing attack can be mistaken for genuine speech. The probability of the first error is called the probability of missed detection, that of the second the probability of false alarm. Both probabilities are functions of the threshold, which can be adjusted, but they trade off against each other: the higher the probability of missed detection, the lower the probability of false alarm, and vice versa. In a practical scenario, one could tune the algorithm to have a low probability of false alarm at the cost of a high probability of missed detection: the authentication system becomes safer, but also more inconvenient for genuine users, who are rejected more often. The challenge adopts an objective metric to evaluate the effectiveness of a generic spoofing detection algorithm, the Equal Error Rate (EER): the threshold is adjusted until the probabilities of false alarm and missed detection are equal, and that common probability is the EER. The lower the EER, the better the spoofing detection algorithm.
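As an illustration of the metric described above, here is a minimal Python sketch of how an EER could be estimated from detection scores by sweeping the threshold. The function name, the toy scores, and the convention that higher scores mean "more likely genuine" are illustrative assumptions, not the official ASVspoof scoring tool.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Estimate the EER by sweeping the threshold over all observed scores.

    Assumed convention (illustrative): higher score = more likely genuine.
    """
    thresholds = sorted(set(genuine_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # Missed detection: genuine utterance scored below the threshold
        # and therefore rejected as a spoofing attack.
        p_miss = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False alarm: spoofed utterance scored at or above the threshold
        # and therefore accepted as genuine speech.
        p_fa = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # Keep the operating point where the two error rates are closest.
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer

# Toy example scores (made up for illustration).
genuine = [2.1, 1.8, 2.5, 0.9, 2.2]
spoof = [0.5, 1.0, 0.4, 1.9, 0.7]
print(equal_error_rate(genuine, spoof))  # prints 0.2
```

The sweep makes the trade-off explicit: raising the threshold lowers the false-alarm rate and raises the miss rate, and the EER is read off at the crossing point of the two curves.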
As many as 16 teams from different countries took part in the challenge and submitted their results. The best systems achieved promising EERs (below 1.5%), but the main conclusion of the challenge is the need for a generalized countermeasure, since each algorithm performs differently depending on the type of spoofing attack. In future editions, text-dependent authentication methods will also be considered.
This is really in line with current ideas in the OCTAVE project!
For further information, you may find it useful to check the following papers:
Zhizheng Wu, Tomi Kinnunen, Nicholas Evans, Junichi Yamagishi, Cemal Hanilçi, Md. Sahidullah, Aleksandr Sizov, “ASVspoof 2015: The First Automatic Speaker Verification Spoofing and Countermeasures Challenge”, Proceedings of INTERSPEECH 2015, 6-10 September 2015, Dresden, Germany;
Md. Sahidullah, Tomi Kinnunen, Cemal Hanilçi, “A Comparison of Features for Synthetic Speech Detection”, Proceedings of INTERSPEECH 2015, 6-10 September 2015, Dresden, Germany;
Cemal Hanilçi, Tomi Kinnunen, Md. Sahidullah, Aleksandr Sizov, “Classifiers for Synthetic Speech Detection: A Comparison”, Proceedings of INTERSPEECH 2015, 6-10 September 2015, Dresden, Germany.