This deliverable, D23, details hybrid architectures for fusing the scores of three different modes of speaker verification. Its main related deliverables are D4, on the suitability of existing voice biometrics in the OCTAVE application scenarios, and D11, on single-mode voice biometric engines; their outcomes have been used as input to D23. Speaker verification engines are widely used in biometric authentication applications, but their major drawback is their sensitivity to spoofing attacks. Since the use of different modes of operation is feasible in real-world authentication applications, this report presents a preliminary fusion methodology for improving overall speaker verification performance. Specifically, we fuse three modes of operation at score level, namely unique pass-phrase, text-dependent-prompted and text-independent, using both linear and nonlinear regression algorithms. We also investigate a knowledge-based (rule-based) method grounded in biometrics and security expertise, a data-driven method based on machine-learning fusion models, and a combination of the two. The experimental results indicate that the hybrid fusion architecture, which combines knowledge-based and data-driven fusion, improves speaker verification performance. The proposed architectures could also be useful against spoofing attacks, but since RSR2015 contains no spoofed speech, this capability could not be tested.
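To make the fusion idea concrete, the following is a minimal sketch of score-level fusion of three verification modes. It is not the D23 implementation: the logistic-regression fuser stands in for the linear/nonlinear regression fusion models, the score floor in `rule_check` is an assumed illustrative rule (not a rule from the deliverable), and all scores are synthetic rather than RSR2015 outputs.

```python
# Sketch: score-level fusion of three speaker-verification modes
# (unique pass-phrase, text-dependent-prompted, text-independent).
# All data below is synthetic; thresholds and rules are assumptions
# for illustration only.
import math
import random


def train_fusion(scores, labels, lr=0.1, epochs=500):
    """Data-driven fuser: fit one weight per mode plus a bias by
    stochastic gradient descent on the logistic loss."""
    n_modes = len(scores[0])
    w = [0.0] * n_modes
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(scores, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the logistic loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def fused_score(w, b, x):
    """Fused log-odds that the trial is a target (same speaker)."""
    return b + sum(wi * xi for wi, xi in zip(w, x))


def rule_check(x, floor=-0.5):
    """Knowledge-based rule (illustrative assumption): every mode
    must individually exceed a score floor before acceptance."""
    return all(xi > floor for xi in x)


def hybrid_decision(w, b, x, floor=-0.5, tau=0.0):
    """Hybrid architecture: accept only if the knowledge-based rule
    passes AND the data-driven fused score clears the threshold."""
    return rule_check(x, floor) and fused_score(w, b, x) > tau


if __name__ == "__main__":
    random.seed(0)
    # Synthetic per-mode scores: target trials score higher on average.
    targets = [[random.gauss(1.0, 0.5) for _ in range(3)] for _ in range(200)]
    nontargets = [[random.gauss(-1.0, 0.5) for _ in range(3)] for _ in range(200)]
    X = targets + nontargets
    y = [1] * 200 + [0] * 200
    w, b = train_fusion(X, y)
    print(hybrid_decision(w, b, [1.0, 1.0, 1.0]))    # clear target trial
    print(hybrid_decision(w, b, [-1.0, -1.0, -1.0])) # clear non-target trial
```

The hybrid decision illustrates the combination described above: the rule can veto a trial even when the regression fuser would accept it, which is the kind of safeguard that could matter under spoofing.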

Source: WP 4 Hybrid Voice Biometrics

Dissemination level: Confidential
