Google Speech Recognition

Price per Channel

$50.00

With the Google Speech Recognition (GSR) plugin for UniMRCP Server, IVR platforms can use the Google Cloud Speech API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.
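For readers unfamiliar with MRCP, the following is a minimal sketch of how an MRCPv2 RECOGNIZE request is framed on the wire. It is not taken from the plugin or from UniMRCP itself: the function name, the channel identifier, and the grammar URI are hypothetical placeholders, and a real IVR platform would normally rely on its MRCP client library rather than build messages by hand.

```python
CRLF = "\r\n"


def build_recognize_request(channel_id: str, request_id: int, grammar_uri: str) -> bytes:
    """Frame an MRCPv2 RECOGNIZE request whose body is a URI list (illustrative only)."""
    body = (grammar_uri + CRLF).encode("utf-8")
    headers = (
        f"Channel-Identifier: {channel_id}{CRLF}"
        f"Content-Type: text/uri-list{CRLF}"
        f"Content-Length: {len(body)}{CRLF}"
        f"{CRLF}"
    ).encode("utf-8")

    # The start line carries the length of the whole message, including the
    # start line itself, so recompute until the value stops changing.
    length = 0
    while True:
        start_line = f"MRCP/2.0 {length} RECOGNIZE {request_id}{CRLF}".encode("utf-8")
        total = len(start_line) + len(headers) + len(body)
        if total == length:
            return start_line + headers + body
        length = total


if __name__ == "__main__":
    # Both arguments below are placeholder values, not values defined by this page.
    message = build_recognize_request(
        channel_id="32AECB23433801@speechrecog",
        request_id=1,
        grammar_uri="builtin:speech/transcribe",
    )
    print(message.decode("utf-8"))
```

The message length in the start line is self-referential, which is why the sketch iterates instead of computing it in one pass.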

The Google Cloud Speech API performs speech-to-text conversion powered by machine learning and provides the following main features.

Automatic Speech Recognition

Automatic Speech Recognition (ASR) based on deep-learning neural networks, suited to applications such as voice search or speech transcription.

Global Vocabulary

Recognizes over 110 languages and variants with an extensive vocabulary.

Streaming Recognition

Returns recognition results while the user is still speaking.

Word Hints

Speech recognition can be customized to a specific context by providing a set of words and phrases that are likely to be spoken. This is especially useful for adding custom words and names to the vocabulary, and in voice-control use cases.

Noise Robustness

Handles noisy audio from many environments without requiring additional noise cancellation.

Inappropriate Content Filtering

Filters inappropriate content in text results for some languages.
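As a rough illustration of how Word Hints and content filtering look at the level of the Google Cloud Speech REST API, the sketch below assembles a recognition configuration as plain JSON. The field names follow the public REST `RecognitionConfig` resource. The helper name, the phrases, and the telephony audio settings are assumptions chosen for the example, and the way the GSR plugin itself exposes these options is not shown here.

```python
import json


def build_recognition_config(language_code, phrases, filter_profanity=False):
    """Return a RecognitionConfig-style dict (hypothetical helper for illustration)."""
    return {
        # Assumed telephony audio format: 8 kHz mu-law.
        "encoding": "MULAW",
        "sampleRateHertz": 8000,
        "languageCode": language_code,
        # Word hints: phrases that are likely to be spoken in this context.
        "speechContexts": [{"phrases": list(phrases)}],
        # Inappropriate content filtering (available for some languages only).
        "profanityFilter": filter_profanity,
    }


config = build_recognition_config(
    "en-US",
    ["UniMRCP", "check my balance"],  # example phrases, not from this page
    filter_profanity=True,
)
print(json.dumps({"config": config}, indent=2))
```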

Google Dialogflow

Price per Channel

$60.00

With the Google Dialogflow (GDF) plugin for UniMRCP Server, IVR platforms can use the Google Dialogflow API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The Google Dialogflow API lets you create conversational applications that are capable of natural interactions with users and are powered internally by the Google Cloud Speech API.

Natural Conversational Engine

Dialogflow is an end-to-end development suite for building conversational applications that are capable of natural and rich interactions with users. It is powered by machine learning to recognize the intent and context of what a user says, allowing your conversational interface to provide highly efficient and accurate responses.

Multi-Language Support

Dialogflow supports more than 20 languages, so you can build a single multilingual agent.

Powered by Google’s Machine Learning

Natural language understanding recognizes a user’s intent and extracts pre-built entities such as time, date, and numbers. You can train your agent to identify custom entity types by providing a small dataset of examples. You can also use any of the 30+ pre-built agents as templates.

Powered by Google Cloud Speech

You can expand your conversational interface to recognize voice interactions with a single request sent from the client application. Powered by Google Cloud Speech, recognition is implemented in real-time streaming mode.

Kaldi Speech Recognition

Price per Channel

$50.00

With the Kaldi Speech Recognition plugin for UniMRCP Server, IVR platforms can use the Kaldi Speech Recognition Toolkit via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The Kaldi plugin connects to the Kaldi GStreamer Server, which needs to be installed separately. This integration is primarily intended for teams experienced with Kaldi who build their own speech recognition systems, particularly those based on Deep Neural Networks (DNNs). The plugin allows both easy integration and reuse of existing infrastructure.

PocketSphinx Speech Recognition

Price per Channel

$50.00

With the PocketSphinx Speech Recognition plugin for UniMRCP Server, IVR platforms can use the PocketSphinx speech recognition engine via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

PocketSphinx is a lightweight, open-source, speaker-independent continuous speech recognition engine.

Julius Speech Recognition

Price per Channel

$50.00

With the Julius Speech Recognition plugin for UniMRCP Server, IVR platforms can use the Julius speech recognition engine via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

Julius is a high-performance, small-footprint large vocabulary continuous speech recognition (LVCSR) decoder. Based on word N-grams and context-dependent HMMs, it can perform real-time decoding on a range of computers and devices, from microcomputers to cloud servers. The algorithm is based on a 2-pass tree-trellis search, which incorporates major decoding techniques such as tree-organized lexicon, 1-best / word-pair context approximation, rank/score pruning, N-gram factoring, cross-word context dependency handling, enveloped beam search, Gaussian pruning, and Gaussian selection.

Watson Speech Recognition

Price per Channel

$50.00

With the Watson Speech Recognition (SR) plugin for UniMRCP Server, IVR platforms can use the IBM Watson Speech to Text API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The IBM Watson Speech to Text API performs speech transcription powered by machine learning and supports the following main features.

Powerful Real-time Speech Recognition

Automatically transcribe audio in real time. Rapidly identify and transcribe what is being discussed, even from lower-quality audio, across a variety of audio formats and programming interfaces.

Highly Accurate Speech Engine

Customize your model to improve accuracy for the language and content you care most about, such as product names, sensitive subjects, or names of individuals. Recognize different speakers in your audio. Spot specified keywords in real time with high accuracy and confidence.

Built to Support Various Use Cases

Transcribe audio for use cases ranging from real-time transcription of microphone audio to analysis of thousands of call-center recordings to provide meaningful analytics.

Languages

The speech recognition API currently supports 7 languages.

Lex Speech Recognition

Price per Channel

$60.00

With the Amazon Web Services (AWS) Lex plugin for UniMRCP Server, IVR platforms can use the AWS Lex API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

Lex is an AWS service for building conversational interfaces for applications using voice and text. This is the same conversational engine that powers Amazon Alexa.

Natural Language Understanding

Lex provides the deep functionality and flexibility of natural language understanding (NLU) and automatic speech recognition (ASR) so you can build highly engaging user experiences with lifelike, conversational interactions, and create new categories of products.

Simplicity

Lex guides you through using the console to create your own chatbot in minutes. You supply just a few example phrases, and Lex builds a complete natural language model through which the bot can interact using voice and text to ask questions, get answers, and complete sophisticated tasks.

Deep Learning Technologies

Powered by the same technology as Alexa, Lex provides ASR and NLU technologies to create a Speech Language Understanding (SLU) system. Through SLU, Lex takes natural language speech and text input, understands the intent behind the input, and fulfills the user intent by invoking the appropriate business function.

Seamless Deployment and Scaling

With Lex, you can build, test, and deploy your chatbots directly from the Lex console. Lex enables you to easily publish your voice or text chatbots. Lex scales automatically so you don’t need to worry about provisioning hardware and managing infrastructure to power your bot experience.

Yandex Speech Recognition

Price per Channel

$50.00

With the Yandex Speech Recognition (SR) plugin for UniMRCP Server, IVR platforms can use the Yandex Cloud Speech to Text API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The Yandex Speech to Text API performs speech transcription powered by machine learning and supports the following main features.

Real-time Speech Recognition

Automatically transcribe audio in real time using gRPC streaming.

Fault-free Service Infrastructure

The service infrastructure is designed with high loads in mind to ensure that the system is available and fault-free.

Numerous Models

Numerous recognition models such as maps, dates, names and numbers are supported.

Languages

The speech recognition API currently supports two languages.

GoVivace Speech Recognition

Price per Channel

$50.00

With the GoVivace Speech Recognition plugin for UniMRCP Server, IVR platforms can use the GoVivace Speech APIs via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The GoVivace plugin connects to the GoVivace Server, which must either be installed separately or be used as a hosted service.

Azure Speech Recognition

Price per Channel

$50.00

With the Azure Speech Recognition (SR) plugin for UniMRCP Server, IVR platforms can use the Microsoft Azure Speech API via the industry-standard Media Resource Control Protocol (MRCP), versions 1 and 2.

The Microsoft Azure Speech API performs speech-to-text conversion powered by machine learning and supports the following main features.

Advanced Speech Recognition Technologies

Advanced speech recognition technologies from Microsoft that are used by Cortana, Office Dictation, Office Translator, and other Microsoft products.

Real-time Continuous Recognition

The speech recognition API enables users to transcribe audio into text in real time and can return intermediate results for the words that have been recognized so far. The speech service also supports end-of-speech detection.

Customized Language and Acoustic Models

For scenarios that require customized language and acoustic models, Custom Speech Service allows you to create speech models that are tailored to your application and your users.

Languages

The speech recognition API supports many spoken languages in multiple dialects.
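The per-channel prices listed on this page can be combined into a rough budget estimate. The sketch below assumes that cost scales linearly with the number of channels, with no volume discounts or additional fees. That pricing model is an assumption, since the page does not state one, and the function name and the sample order are hypothetical.

```python
from decimal import Decimal

# Per-channel prices in US dollars, as listed on this page.
PRICE_PER_CHANNEL = {
    "Google Speech Recognition": Decimal("50.00"),
    "Google Dialogflow": Decimal("60.00"),
    "Kaldi Speech Recognition": Decimal("50.00"),
    "PocketSphinx Speech Recognition": Decimal("50.00"),
    "Julius Speech Recognition": Decimal("50.00"),
    "Watson Speech Recognition": Decimal("50.00"),
    "Lex Speech Recognition": Decimal("60.00"),
    "Yandex Speech Recognition": Decimal("50.00"),
    "GoVivace Speech Recognition": Decimal("50.00"),
    "Azure Speech Recognition": Decimal("50.00"),
}


def estimate_cost(channels_by_plugin: dict[str, int]) -> Decimal:
    """Sum price x channels per plugin (assumes simple linear pricing)."""
    total = Decimal("0.00")
    for plugin, channels in channels_by_plugin.items():
        if channels < 0:
            raise ValueError(f"channel count for {plugin!r} must not be negative")
        total += PRICE_PER_CHANNEL[plugin] * channels
    return total


if __name__ == "__main__":
    # Hypothetical order: 10 Google Speech Recognition channels and 4 Dialogflow channels.
    order = {"Google Speech Recognition": 10, "Google Dialogflow": 4}
    print(estimate_cost(order))  # 10 * 50.00 + 4 * 60.00 = 740.00
```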
