This deliverable focuses on the use of fusion to attain better performance than single-mode speaker verification engines in Automatic Speaker Verification (ASV) systems. The performance of an ASV system based on a single-mode engine has been shown to vary with the operational mode. The three operational modes (fixed-phrase, text-prompted text-dependent, and text-prompted text-independent) pose increasing difficulty, in that order, for machine-learning algorithms. However, hybrid engines that fuse the results of the three modes are more likely to perform substantially better, because the hybrid imposes stricter screening criteria. Hybrid engines bring other benefits as well, notably in terms of spoofing robustness. This deliverable improves the evaluation procedures and evaluates the best-performing fusion approach. Standard approaches such as GMM-UBM (Gaussian Mixture Models with a Universal Background Model) and HMM-UBM (Hidden Markov Models with a Universal Background Model), together with more recent approaches such as i-vector and HMM-UBM using DNN (Deep Neural Network) features, and a hybrid thereof, were used as single-mode verification engines and fused at the score level to determine the best performance. The results indicate that fusion of HMM-UBM-based approaches performs best for the fixed-phrase and text-dependent modes, while the i-vector approach performs best for the text-independent mode. For spoofing robustness, the results were similar to the speaker verification results. Fusion performed better than any single-mode approach. Three implementations of the fusion model were investigated: Multi-Layer Perceptron (MLP), Support Vector Regression (SVR) and Linear Regression (LR). None performed significantly better than the others.
We conclude that further improvements might be found by fusing results from different standard approaches to the single-mode speaker verification engine, rather than fusing results from just one standard approach.
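
As an illustration of the score-level fusion described in this summary, the following minimal Python sketch fits a Linear Regression (LR) fusion model on the scores of three single-mode engines. The trial scores, labels, function names, least-squares solver and the 0.5 decision threshold are assumptions made for illustration only. They are not taken from the deliverable, whose actual fusion implementation may differ.

```python
# Illustrative score-level fusion with linear regression (pure Python).
# All data, names and the 0.5 threshold are hypothetical, not from the deliverable.


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no zero pivot after row swaps).
    """
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear_fusion(scores, labels):
    """Least-squares weights (bias first) mapping per-mode scores to a 0/1 label."""
    rows = [[1.0] + list(s) for s in scores]  # prepend a bias column
    k = len(rows[0])
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, labels)) for i in range(k)]
    return solve(xtx, xty)


def fuse(weights, trial_scores):
    """Fused score = bias + weighted sum of the single-mode scores."""
    return weights[0] + sum(w * s for w, s in zip(weights[1:], trial_scores))


# Hypothetical development trials. Each tuple holds one score per mode:
# (fixed-phrase, text-dependent, text-independent).
train_scores = [
    (2.1, 1.8, 0.9), (1.7, 2.2, 1.4), (2.5, 1.1, 0.3), (1.2, 1.9, 1.6),      # target
    (-1.5, -0.8, 0.2), (-0.9, -1.7, -0.4), (-2.0, -1.2, -1.1), (-0.6, -2.1, 0.5),  # impostor
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights = fit_linear_fusion(train_scores, train_labels)

# Score a new (hypothetical) trial and apply an illustrative 0.5 threshold.
new_trial = (1.9, 1.5, 0.7)
fused = fuse(weights, new_trial)
print("accept" if fused >= 0.5 else "reject", round(fused, 3))
```

In practice the fusion weights would be trained on a held-out development set and the threshold chosen for a target operating point. An MLP or SVR fusion model could be substituted for `fit_linear_fusion` without changing the surrounding logic.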

Source: WP 4 Hybrid Voice Biometrics

Dissemination level: Confidential

To learn more about this document, you may submit a request through the ‘Contact’ section of this site. We reserve the right to decide how much we can disclose.