Quickstart Guide for Streaming

We provide sample Speech Recognition clients in the following languages: C++ and Python. Find everything on our GitHub page.

This guide walks through a simple integration, step by step, using C++. The same approach works in any language that has a gRPC library: take the gRPC proto file provided by Verbio and generate the code needed to run speech recognition. To do this you will need:

  • Speech Center proto file
  • Account in the Speech Center dashboard. Please contact our sales team through the contact form in the main Verbio website to have an account created.
  • Speech Center endpoint: us.speechcenter.verbio.com.

The steps are very similar to the ones described in the official gRPC guide. For a simple C++ integration they are as follows:

Step by step

Install Conan dependency manager


conan --version

If the version is < 1.3 or >= 2.*, check the documentation at https://docs.conan.io/1/installation.html on how to install Conan with virtualenvs, and install a compatible version with a command like:

pip install conan==1.54.0

Install conan dependencies

It is recommended to create a /build subdirectory in the project root and run all the following commands from there.

The dependencies are already listed in the conanfile.txt file in this repository.
The grpc and protobuf packages are needed to generate, from the .proto specification, all the code that the main program uses to connect to the gRPC server in the cloud.

To install them, run from /build:

conan install ..

This will also set up the necessary files for CMake build inside the /build folder.

Generate the gRPC code for C++

There is already a step in the CMake configuration of this project that will generate the C++ code for gRPC. You do not have to worry about it.


From /build:

cmake ..

This generates all the configuration files needed and creates, from the .proto file, the C++ code that allows your program to communicate with the Speech Center platform.


You can use cmake to compile the code:

cmake --build . --target all 

Once the project is compiled you can check that everything went as expected by executing the unit tests:


Run the client

The cli_client uses the generated C++ code to connect to the Speech Center cloud and process your speech file.


./cli_client -a audiofile.wav -l en-US -t my.token -T generic -s 16000 -H us.speechcenter.verbio.com -V V1

This produces output along these lines:


[2022-12-15 11:43:39.022] [info] [RecognitionClient.cpp:88] Channel is ready. State 2
[2022-12-15 11:43:39.022] [info] [RecognitionClient.cpp:89] Channel configuration: {}
[2022-12-15 11:43:39.022] [info] [RecognitionClient.cpp:98] Stream CREATED. State 2
[2022-12-15 11:43:39.024] [info] [RecognitionClient.cpp:103] WRITE: STARTING...
[2022-12-15 11:43:39.024] [info] [RecognitionClient.cpp:105] Sending config:

Use the --help option to see more options.
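When the client is driven from a script, it can help to assemble the command line programmatically. The following is a minimal Python sketch that only reproduces the flags of the example invocation above; the function name build_cli_command and its default values are illustrative assumptions and are not part of the Verbio client.

```python
from typing import List


def build_cli_command(audio_path: str, token_path: str,
                      language: str = "en-US", topic: str = "generic",
                      sample_rate: int = 16000,
                      host: str = "us.speechcenter.verbio.com",
                      asr_version: str = "V1") -> List[str]:
    """Return the argument list for the example cli_client invocation.

    Hypothetical helper: it only mirrors the flags shown in the example
    command (-a, -l, -t, -T, -s, -H, -V) and does not validate them.
    """
    return [
        "./cli_client",
        "-a", audio_path,
        "-l", language,
        "-t", token_path,
        "-T", topic,
        "-s", str(sample_rate),
        "-H", host,
        "-V", asr_version,
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected before it is executed.
    print(" ".join(build_cli_command("audiofile.wav", "my.token")))
```

The returned list can be passed to subprocess.run(command, check=True) from the directory that contains the compiled cli_client binary.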

Speech Center integration information

Speech Center important information


Speech Center features a dashboard at https://dashboard.speechcenter.verbio.com.
The dashboard serves two main purposes:

  1. Allow users to follow historical usage data of the service, separated by project or as a whole.
  2. Allow users to retrieve the client_id and client_secret credentials needed to integrate with Speech Center. See the authentication section to learn how they are used.

Customer Credentials

All speech requests sent to Speech Center must have a valid authorization token in order for the system to successfully process the request. These tokens have a validity period of 1 hour and can be generated on demand using your customer credentials.

Your customer credentials can be retrieved through the Speech Center Dashboard by logging in with the customer account supplied to you by Verbio.

Authentication flow

To acquire a valid token submit an HTTP POST request to the authentication service at https://auth.speechcenter.verbio.com:444.

Token expiration management

As part of the JWT specification, we fully support the token expiration claim, so generated tokens are valid only for a finite period, from 1 hour up to 1 day. The calling client is responsible for managing this expiration time and, ideally, should refresh the token a couple of minutes before it expires so that a streaming session never fails because of an expired token.

In order to refresh the token, the token refresh endpoint can be called with the same client_id and client_secret, and it will respond with a new JWT token with a renewed expiration time.
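The refresh policy described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the expiration time is taken to be a Unix timestamp in seconds, and the name needs_refresh and the 120-second margin are hypothetical choices standing in for "a couple of minutes".

```python
import time
from typing import Optional

# Assumed safety margin: refresh two minutes before the token expires.
REFRESH_MARGIN_SECONDS = 120


def needs_refresh(expiration_time: int, now: Optional[float] = None,
                  margin: int = REFRESH_MARGIN_SECONDS) -> bool:
    """Return True once the token is within `margin` seconds of expiring.

    `expiration_time` is assumed to be a Unix timestamp in seconds.
    `now` defaults to the current time and can be injected for testing.
    """
    if now is None:
        now = time.time()
    return now >= expiration_time - margin
```

A client would call needs_refresh before opening each streaming session and request a new token whenever it returns True.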

Authentication API

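Method: POST

URL: https://auth.speechcenter.verbio.com:444/api/v1/token

Request body:

{
 "client_id": "YOUR_CLIENT_ID",
 "client_secret": "YOUR_CLIENT_SECRET"
}
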
Response body:

{
 "access_token": "new_access_token",
 "expiration_time": 1678453328
}

* The expiration_time field contains the expiration time of the JWT token, so the client does not need to decode the token to know when it expires.

Status codes:

HTTP 200: OK
HTTP 400: KO - Provided request format was invalid.
HTTP 401: KO - There was a problem with the provided credentials.
HTTP 5XX: KO - There was a problem with the server.

Testing calls using curl

Example request

curl --header "Content-Type: application/json"   --request POST   --data '{"client_id":"YOUR_CLIENT_ID","client_secret":"YOUR_CLIENT_SECRET"}'   'https://auth.speechcenter.verbio.com:444/api/v1/token'

Example response

{
"access_token": "EXAMPLE_ACCESS_TOKEN",
"expiration_time": 1678453328
}
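The same call can be made from code. Below is a minimal Python sketch using only the standard library. It assumes the response is a JSON object with the two fields shown in the example response; the function names are illustrative and do not belong to any Verbio library.

```python
import json
import urllib.request
from typing import Tuple

# Endpoint taken from the curl example above.
TOKEN_URL = "https://auth.speechcenter.verbio.com:444/api/v1/token"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Build the POST request that carries the credentials as JSON."""
    body = json.dumps({"client_id": client_id, "client_secret": client_secret})
    return urllib.request.Request(
        TOKEN_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_token_response(body: str) -> Tuple[str, int]:
    """Extract (access_token, expiration_time) from the response body.

    Assumes the body is a JSON object with those two fields.
    """
    payload = json.loads(body)
    return payload["access_token"], int(payload["expiration_time"])


def fetch_token(client_id: str, client_secret: str,
                timeout: float = 10.0) -> Tuple[str, int]:
    """Request a token; urlopen raises HTTPError on 4XX and 5XX statuses."""
    request = build_token_request(client_id, client_secret)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_token_response(response.read().decode("utf-8"))
```

Calling fetch_token("YOUR_CLIENT_ID", "YOUR_CLIENT_SECRET") performs the network request. The two helper functions are kept separate so that request building and response parsing can be checked without contacting the service.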

Client flags

Audio file


-a, --audio file

This argument is required. It is the path to a .wav audio file, sampled at 8 kHz or 16 kHz and encoded as PCM16, to use for the recognition.
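Before sending a file, these audio requirements can be checked locally. The sketch below uses Python's standard wave module; the function name check_wav is hypothetical, and the check covers only the sampling rate and the 16-bit sample width stated above.

```python
import wave

SUPPORTED_SAMPLE_RATES = (8000, 16000)


def check_wav(path: str) -> int:
    """Return the sample rate if the file meets the stated requirements.

    Raises ValueError when the sample width is not 16 bits or the sampling
    rate is not 8 kHz or 16 kHz. wave.open itself raises wave.Error for
    files it cannot read as PCM WAV.
    """
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        sample_width = wav.getsampwidth()  # in bytes per sample
    if sample_width != 2:
        raise ValueError(f"expected 16-bit samples, got {8 * sample_width}-bit")
    if sample_rate not in SUPPORTED_SAMPLE_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate} Hz")
    return sample_rate
```

The returned value can be passed directly as the sample rate argument of the client so that the flag always matches the file.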


Topic

-T, --topic arg

Topic to use for the recognition when a grammar is not provided. Must be one of GENERIC, BANKING, TELCO or INSURANCE (default: GENERIC).



Language

-l, --language arg

Language to use for the recognition: en-US, en-GB, pt-BR, es, es-419, tr, ja, fr, fr-CA, de, it (default: en-US).

Sample rate

-s, --sample-rate arg

Sampling rate for the audio recognition: 8000, 16000. (default: 8000).


Token

-t, --token arg

Path to the authentication token file. This file must contain a valid token for Speech Center to work.

The token argument is required in both of the following situations:

  • The client authenticates using only the token file. The file provided in this argument must contain a valid Speech Center JWT token for the transcription service to work.
  • The client authenticates with client credentials passed through the --client-id and --client-secret program arguments. A token file must still be supplied, even if it is blank. The client checks whether the file contains a valid token and, if it does not, automatically refreshes the token and writes the new one to the file. In this case the client-id and client-secret arguments are also required.
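The token file check described above can be approximated in Python as follows. This is a sketch under assumptions, not the logic of the actual client: it reads the standard JWT exp claim from the token payload without verifying the signature, and the function names are hypothetical.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> Optional[int]:
    """Return the `exp` claim of a JWT, or None if it cannot be decoded.

    The signature is NOT verified; this only inspects the payload.
    """
    parts = token.strip().split(".")
    if len(parts) != 3:
        return None
    payload_b64 = parts[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    except ValueError:  # covers invalid base64, bad UTF-8 and bad JSON
        return None
    exp = claims.get("exp") if isinstance(claims, dict) else None
    return exp if isinstance(exp, int) else None


def token_file_is_usable(path: str, now: Optional[float] = None) -> bool:
    """Return True if the file holds a decodable, non-expired token."""
    with open(path, "r", encoding="utf-8") as handle:
        exp = jwt_expiry(handle.read())
    if now is None:
        now = time.time()
    return exp is not None and now < exp
```

A blank token file yields False, which is the case where a refresh with the client credentials would be needed.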

Client id and secret

--client-id arg      Client id for token refresh (default: "")
--client-secret arg  Client secret for token refresh (default: "")

The client-id and client-secret arguments are required for automatic token refresh. Each value must be written inline, without quotes.


Host

-H, --host arg

URL of the host where the request will be sent. Main endpoints will be expanded as the product is deployed in different regions. Please use us.speechcenter.verbio.com as the host.


Diarization

-d, --diarization

This option enables diarization.

This option is oriented towards batch transcription only and its use for streaming and call automation is not recommended.


Formatting

-f, --formatting

This option enables formatting of the speech transcription.

ASR version

-A, --asr-version arg

This selects the ASR version that Speech Center will use for transcriptions.

Please follow Verbio’s sales department recommendation on which version to use.


Label

-L, --label arg

This option takes a one-word argument so that the speech transcription request is billed as part of a particular customer project. The argument is the one-word name of the project under which the request is classified.

  • The project name must consist of only one word.
  • The argument must be identical each time for the same project. If there is a typo, another project will be created.
  • There is no limit on the number of projects that can be created.
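Because a mistyped or multi-word label silently ends up under a different project, it can be worth checking the value before launching the client. The Python sketch below assumes that "one word" means a non-empty string without whitespace; that interpretation and the function name are assumptions, not a documented rule.

```python
def is_valid_label(label: str) -> bool:
    """Return True for a non-empty label that contains no whitespace.

    Assumption: "one word" is interpreted as "no whitespace characters".
    """
    return bool(label) and not any(ch.isspace() for ch in label)
```

Keeping the project label in a single shared constant or configuration value also avoids the typo problem described in the list above.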