How well does a voice biometric authentication service work? How easy is it to use? How reliable is it? What performance can I expect when integrating this service into my application for my customers? None of these questions can be answered by an open evaluation campaign, such as those organized by international standardization bodies and scientific associations. Moreover, here we address the evaluation of a pre-commercial system, which is not available to third parties as ‘open’ software but as a service or, in some cases, as closed software. This document reports on the operational approach defined and adopted by OCTAVE to address all of the questions above.

To do this, we have used both the classical approach of validating the OCTAVE platform by means of standard evaluation campaigns, and a novel approach based on an ad hoc client application, named TEM (The Evaluation Machine), which massively tests the voice biometric authentication service provided by OCTAVE. Massive testing is performed by simulating, in every respect, a very large class of users working under specific environmental conditions. In addition, we have examined the other client applications developed by the Project for direct use by human users, to assess how smooth the process of calling the OCTAVE Trusted Biometric Authentication Service (TBAS) is.

We had to develop an ad hoc solution because we found nothing on the open market that satisfied our specific needs. We therefore designed and developed TEM, an application based on a new paradigm that leverages the “corpus of corpora” database developed in the first year of the Project. This database collects speech/audio signals and provides structured, open access to them; the signals are uniquely described and managed through their metadata, whereas the speech/audio data themselves can reside anywhere on the Web. TEM evaluates a biometric voice authentication service from a description of the tests to be performed: its core task is to execute experiments by impersonating a class of users, and to store the results in another database for further analysis and evaluation. Signals from various datasets have been used to enroll thousands of virtual users with the TBAS and, thereafter, to have those virtual users access the TBAS under various conditions.

As a result of this approach, millions of speech sample comparisons have been executed on the TBAS, exploring: different modalities (fixed passphrase, text independent, hybrid); different speech qualities (high quality, telephone quality, encoded speech quality); different environmental conditions (noiseless, and noisy with various noise types at several signal-to-noise ratio levels); and different languages. This exercise has allowed us to evaluate objectively not only genuine speaker verification, but also the spoofing countermeasure modules included in the voice biometric authentication service.
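
To illustrate how such a set of test conditions can be enumerated, the following is a minimal sketch, not the actual TEM test description format. It builds the combinations of modality, speech quality, noise condition and language as a Cartesian product. All names and values (the noise type, the SNR levels, the language codes, the function build_test_grid) are hypothetical placeholders chosen for illustration only.

```python
from itertools import product

# Illustrative placeholders only; these are NOT the OCTAVE test plan.
MODALITIES = ["fixed_passphrase", "text_independent", "hybrid"]
QUALITIES = ["high_quality", "telephone", "encoded"]
# (noise type, SNR in dB); None means no added noise. Hypothetical values.
NOISE_CONDITIONS = [("noiseless", None), ("babble", 20), ("babble", 10)]
LANGUAGES = ["lang_a", "lang_b"]  # placeholder language codes


def build_test_grid(modalities, qualities, noise_conditions, languages):
    """Return one dict per combination of test conditions."""
    grid = []
    for modality, quality, (noise, snr_db), language in product(
        modalities, qualities, noise_conditions, languages
    ):
        grid.append(
            {
                "modality": modality,
                "quality": quality,
                "noise": noise,
                "snr_db": snr_db,
                "language": language,
            }
        )
    return grid


grid = build_test_grid(MODALITIES, QUALITIES, NOISE_CONDITIONS, LANGUAGES)
print(len(grid))  # 3 * 3 * 3 * 2 = 54 combinations
```

Each entry of such a grid could then drive one batch of enrollment and access trials; how trials are actually scheduled and executed by TEM is not described here.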

Detailed assessment procedures, applied to each dataset, have first demonstrated that the service is easy to use, reliable and affordable, since the TBAS has proved very resilient and highly scalable with the size of the user population. In addition, we have measured performance figures that in most cases exceed the top-level reference figures. In the rare cases where the system does not outperform others, its performance still ranks in the high range.

Source: WP 7 Test and Verification

Dissemination level: Confidential

To learn more about this document, you may submit a request via the ‘Contact’ section of this site. We reserve the right to decide how much we can disclose.