Interspeech 2016 was organized around the theme “Understanding Speech Processing in Humans and Machines” and was held at the Hyatt Regency San Francisco hotel in San Francisco, California. The OCTAVE project was well represented: four OCTAVE partners (AAU, APL, EUR and UEF) presented the following seven studies, all related to the central themes of OCTAVE.

  • T. Kinnunen, A. Sholokhov, E. Khoury, D. Thomsen, M. Sahidullah and Z.-H. Tan, “HAPPY Team Entry to NIST OpenSAD Challenge: A Fusion of Short-Term Unsupervised and Segment i-Vector Based Speech Activity Detectors”, Proc. Interspeech 2016.
  • T. Kinnunen, M. Sahidullah, I. Kukanov, H. Delgado, M. Todisco, A. Sarkar, N. Thomsen, V. Hautamäki, N. Evans, Z.-H. Tan, “Utterance Verification for Text-Dependent Speaker Recognition: a Comparative Assessment Using the RedDots Corpus”, Proc. Interspeech 2016.
  • M. Sahidullah, H. Delgado, M. Todisco, H. Yu, T. Kinnunen, N. Evans and Z.-H. Tan, “Integrated Spoofing Countermeasures and Automatic Speaker Verification: an Evaluation on ASVspoof 2015”, Proc. Interspeech 2016.
  • M. Sahidullah, R. González Hautamäki, D.A.L. Thomsen, T. Kinnunen, Z.-H. Tan, V. Hautamäki, R. Parts, M. Pitkänen, “Robust Speaker Recognition with Combined Use of Acoustic and Throat Microphone Speech”, Proc. Interspeech 2016.
  • A. Sarkar and Z.-H. Tan, “Text Dependent Speaker Verification Using Unsupervised HMM-UBM and Temporal GMM-UBM”, Proc. Interspeech 2016.
  • N. Thomsen, D. Thomsen, Z.-H. Tan, B. Lindberg, S.H. Jensen, “Speaker-dependent Dictionary-based Speech Enhancement for Text-Dependent Speaker Verification”, Proc. Interspeech 2016.
  • M. Todisco, H. Delgado, N. Evans, “Articulation rate filtering of CQCC features for automatic speaker verification”, Proc. Interspeech 2016.

Five sessions from Interspeech are worth highlighting. On Friday 9th September, the Special Session “The RedDots Challenge: Towards Characterizing Speakers from Short Utterances” presented the latest findings on the crowd-sourced, text-dependent RedDots corpus, which is also used by the OCTAVE partners. The session was well attended, and two of the OCTAVE studies listed above were presented there. Also on Friday, another Special Session, “The Speakers in the Wild (SITW) Speaker Recognition Challenge”, presented results from another relevant recent speaker verification challenge.

The regular session entitled “Robust Speaker Recognition and Anti-Spoofing” on Saturday 10th September was chaired by two OCTAVE project members, Nicholas Evans (EUR) and Tomi Kinnunen (UEF). The session featured work on the latest developments in countermeasures and robustness for speaker verification, including two presentations by OCTAVE project partners. A parallel Special Session, the “Voice Conversion Challenge”, was also relevant to work conducted within the OCTAVE project: voice conversion is a threat to the reliability of speaker verification, and one for which the OCTAVE consortium is developing spoofing countermeasures.

In addition, a Special Event entitled “Speaker Comparison for Forensic and Investigative Applications II” presented some of the current and ongoing challenges in forensic speaker comparison. Although forensics is not directly relevant to the OCTAVE project, which focuses on user authentication, the event highlighted technical and legal challenges that are also of interest to the project.

One of the OCTAVE project members, Tomi Kinnunen, together with his co-authors, received at the Interspeech closing ceremony a best paper award for articles published in the journal Speech Communication between 2013 and 2015. The award was given for a study carried out jointly by UEF, CRIM (Canada) and INRS-EMT (Canada). The publication (Md. J. Alam, T. Kinnunen, P. Kenny, P. Ouellet, D. O’Shaughnessy, “Multitaper MFCC and PLP Features for Speaker Verification Using i-Vectors”, Speech Communication, 55(2): 237–251, February 2013) deals with robust, low-variance extraction of the mel-frequency cepstral coefficient (MFCC) features used in automatic speaker verification systems.

In summary, both the OCTAVE project and the topic of speaker verification in general were highly visible at Interspeech.