Recognizing Voices With AI

October 26, 2021

Voice-based digital assistants are on the rise. These systems process a stream of audio data and extract information from it. Such an audio stream often contains multiple voices. For example, think of a telephone conference held in a meeting room where several people speak into a single microphone.

While software that converts speech into text is available today, many applications benefit from another piece of information: an answer to the question of who spoke when. Xelera Technologies provides an AI module for speech processing systems that distinguishes voices and splits the multi-voice audio stream into separate streams, one for each speaker in the conversation.
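To make the idea of "who spoke when" concrete, the following is a minimal sketch of how a list of speaker segments could be used to split a single audio stream into one stream per speaker. It does not use the Xelera module: the segment format (start time, end time, speaker label), the function name split_by_speaker and the placeholder samples are assumptions made purely for illustration.

```python
from collections import defaultdict

# Hypothetical diarization result: (start_sec, end_sec, speaker_label).
# The real module's output format is not described in this article.
segments = [
    (0.0, 1.0, "speaker_0"),
    (1.0, 2.5, "speaker_1"),
    (2.5, 3.0, "speaker_0"),
]


def split_by_speaker(samples, sample_rate, segments):
    """Group the samples of a mono stream by speaker label."""
    streams = defaultdict(list)
    for start, end, speaker in segments:
        # Convert segment boundaries from seconds to sample indices.
        first = int(round(start * sample_rate))
        last = int(round(end * sample_rate))
        streams[speaker].extend(samples[first:last])
    return dict(streams)


sample_rate = 4            # tiny rate so the example is easy to follow
samples = list(range(12))  # 3 seconds of placeholder "audio" samples
streams = split_by_speaker(samples, sample_rate, segments)
```

Here each speaker ends up with the concatenation of the samples from their own segments, which is one simple way to represent the separate streams.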

The Speaker Diarization module distinguishes the voices of unknown speakers; pre-training on known speakers is not required. It also performs speaker identification, because it can remember the voice profiles it has identified. Downstream, the Speaker Recognition module (also referred to as the Speaker Diarization module) can be combined with speech-to-text and natural language processing frameworks to assign a speaker label to the recognized text.
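As a rough illustration of the downstream combination with speech-to-text, the sketch below attaches a speaker label to each recognized word by checking which speaker segment contains the midpoint of the word. The data formats, the function name label_words and the midpoint rule are assumptions for this example only and are not taken from the Xelera module or from any particular speech-to-text framework.

```python
def label_words(words, segments):
    """Attach to each word the speaker whose segment contains its midpoint."""
    labeled = []
    for text, start, end in words:
        midpoint = (start + end) / 2
        speaker = "unknown"  # fallback when no segment covers the word
        for seg_start, seg_end, seg_speaker in segments:
            if seg_start <= midpoint < seg_end:
                speaker = seg_speaker
                break
        labeled.append((speaker, text))
    return labeled


# Hypothetical diarization segments: (start_sec, end_sec, speaker_label).
segments = [(0.0, 1.2, "speaker_0"), (1.2, 2.4, "speaker_1")]

# Hypothetical speech-to-text output: (word, start_sec, end_sec).
words = [("hello", 0.1, 0.5), ("everyone", 0.6, 1.1), ("hi", 1.3, 1.5)]

labeled = label_words(words, segments)
for speaker, text in labeled:
    print(f"{speaker}: {text}")
```

Using the midpoint keeps the rule simple when a word straddles a segment boundary; a production system might instead choose the segment with the largest overlap.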

Application developers can connect to the module via a Python API, a REST API, or a C++ API. The module is available for on-premises deployments and as a cloud service. If you are interested, request a live demo by sending an email to

