Speech Technology Center Group, a Sber ecosystem company, has delivered superb performance in a voice biometrics (voice recognition algorithms) evaluation organized by the US National Institute of Standards and Technology (NIST).

Dmitry Dyrmovsky, CEO, STC Group:

"High-quality recognition of a person by voice improves corporate and government services, simplifying our lives. High-end speech technology helps create the best dialogue assistants, which streamline the work of contact centers and sales and service offices. Speech analytics helps to draw conclusions about customer satisfaction and the quality of a dialogue, which leads to continuous improvement of the user experience. More broadly, the identification of people by voice is in demand in biometric systems nationwide.

"NIST SRE21 is the fifth evaluation of 2021 in which STC solutions have received a high score from a competent international jury. Recognition of STC in international competitions is not only a personal victory, but also a significant event for the entire industry. We are pleased to bring solutions to voice recognition problems, which the strongest teams from around the world are working on, to the next level, credibly representing our key competencies in the global market."

STC technology performed remarkably in NIST SRE21 (Speaker Recognition Evaluation). The objectives of the evaluation were as follows:

  • speaker detection using audio over conversational telephone speech (CTS) and audio from video (AfV). Voice recognition was used to address this task.
  • speaker detection using audio and video over conversational telephone speech (CTS), audio from video (AfV), and video alone. Voice and face recognition were used to address this task.

What was special about SRE21 was that it offered two algorithm training options: fixed (using only the voice data provided by the organizers) and open (using any data). The challenging part was that the data were recorded both over the telephone (regular telephone conversations) and through microphones (recordings from video cameras), while the people in the recordings spoke different languages: English, Chinese, Arabic and more.

The STC research team was one of the first to successfully combine transformer-type neural network architectures, which are popular in computer vision and natural language processing, with wav2vec, which is used in speech recognition, to address the task of recognizing a person. This approach resulted in few errors in verifying a person by voice.

The STC Group team also participated in another competition, the NIST CTS Speaker Recognition Challenge, an ongoing competition whose intermediate results are unveiled periodically. In this competition, the STC Group team also delivered great results. The basic task in the CTS Challenge is speaker detection over the phone, where speakers may use different languages (English, French, Arabic) and different smartphone models. Thirty-three teams from leading universities and commercial companies take part in the challenge.

Participants in NIST events are the top research teams from the world's leading universities and teams from commercial companies from China, the USA, Japan, Italy, France, Spain, Israel, Singapore, and the Czech Republic.

The STC Group of companies (part of the Sber ecosystem) is a global developer of products and solutions based on intelligent speech technologies, machine learning, and computer vision, with 30 years of experience. It is a technology expert in speech technology and in face and voice biometrics. The STC Group focuses on creating AI solutions for the B2B and B2G segments: more than 5,000 AI projects have been implemented around the world, including on a national scale in Mexico, Ecuador, and the Middle East. In Russia, STC solutions are used in the largest banks, telecom companies, the fuel and energy complex, and the public sector, and to implement the Safe & Smart city concept. Voice forgery detection and speech recognition technologies developed by the STC Group hold a leading position in the world ratings of NIST, VOiCES, and CHiME.

Disclaimer

Sberbank of Russia published this content on 27 January 2022 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 27 January 2022 09:28:01 UTC.