
Contextual Voice – Creating an Edge for Voice Business

Mobile Network Operators (MNOs) have in recent years been grappling with the reality of their voice business. With OTTs taking more than their fair share of the voice market, and with margins from voice dwindling as data becomes the preferred denominator of MNOs’ mobile plans, voice continues its uphill journey to remain a profitable business segment in the long run.

The arrival of VoLTE and Wi-Fi Calling this year is hence a much-awaited development for MNOs. So is the progress taking place in Rich Communication Services (RCS) and Unified Communications (UC), which promises voice a more esteemed position in the telecommunications business. With more than 16 VoLTE rollouts and Wi-Fi Calling services debuted in the past year, as well as a handful of MNO RCS and UC service introductions during the period, MNOs’ voice business seems well poised to make a comeback.

What is Contextual Voice?

Amidst all these new developments in the voice business, there is another promising new technology, Contextual Voice, that may hold the key to one of the biggest shifts in MNOs’ voice business, redefining its future value proposition and its monetization potential. Contextual Voice, still in its conceptual stage, is a voice communication technology envisaged to transmit both what is spoken by the parties in a voice call and a wide range of other information, carried to the human brain over wider frequency bands, that helps build the ‘context’ for the call.

Hearing More Than What’s Spoken

The idea of ‘hearing more than what’s spoken’ is an intriguing one for those coming across Contextual Voice for the first time, as most people do not think that anything is exchanged during a voice call other than the words spoken by both parties. What they do not realize is that a call is an audio transmission over a frequency band (measured in Hz) that includes noises the listener does not consciously ‘hear’, but that are nevertheless transmitted over the call, captured by human ears and deciphered by the brain.

Two people communicating on a long-distance voice call are hence able to transmit more than just the words they speak. Where technology can accommodate wider frequency bands on each other’s devices, both parties can sense a lot more about each other: moods, emotions, the physical state of the person, things and persons in the background, conflicts, a sense of urgency and so on.

The Fast Mode this week spoke to Radhakant Das, Executive Vice President of BEDIGITAL, the digital services arm of Indonesia’s Bakrie Telecom, on Contextual Voice as an upcoming concept in the voice communications market, one that goes beyond the current focus on HD, IP-based voice and rich communications to a new level of voice interaction that moves closer to face-to-face communication. A strong proponent of Contextual Voice, Radhakant defines it as a technology that leverages a large number of real-time inputs and multiple algorithms linking time, space, emotional, physical and political attributes to present contextual information to the parties making and receiving a voice call.

Radhakant Das, EVP of BEDIGITAL, Bakrie Telecom

According to Radhakant, technology that captures a wider frequency band (i.e. beyond the 300 Hz to 3.4 kHz used for traditional voice quality, and beyond the 50 Hz to 7 kHz or higher used for HD voice calls) is able to provide the brain with information that we cannot hear but can nevertheless decode, enabling humans to receive a wider set of information about the person and the environment on the other end of the line. Contextual Voice aims to transmit in a much expanded band, going below 50 Hz and above 14 kHz (defined as ‘Super-wideband’ or Wideband HD), enabling humans to experience the ‘aura’ of the person on the other side of the call: the caller’s deep breaths, quiet coughing, body movements, gestures and emotional state of being. All this information will be transmitted in real time along with the audible sounds when Contextual Voice is enabled on a voice call.
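The band edges quoted above can be laid out side by side in a short Python sketch. It is only an illustration: the class and function names are hypothetical, the third band is an assumed 50 Hz to 14 kHz range built from the edges cited above, and the sampling figure is simply the Nyquist minimum of twice the highest frequency, not a statement about any particular codec.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VoiceBand:
    """An audio band described by its lower and upper edge in Hz."""
    name: str
    low_hz: int
    high_hz: int

    def bandwidth_hz(self) -> int:
        # Width of the band: upper edge minus lower edge.
        return self.high_hz - self.low_hz

    def min_sample_rate_hz(self) -> int:
        # Nyquist criterion: sample at no less than twice the highest frequency.
        return 2 * self.high_hz


# Band edges as quoted in the article; the third entry is an assumption.
BANDS = [
    VoiceBand("traditional voice", 300, 3_400),
    VoiceBand("HD voice", 50, 7_000),
    VoiceBand("super-wideband (assumed 50 Hz to 14 kHz)", 50, 14_000),
]


def bands_carrying(frequency_hz: float) -> List[str]:
    """Return the names of the bands wide enough to carry a given frequency."""
    return [b.name for b in BANDS if b.low_hz <= frequency_hz <= b.high_hz]


if __name__ == "__main__":
    for band in BANDS:
        print(band.name, band.bandwidth_hz(), band.min_sample_rate_hz())
```

For instance, a 100 Hz component falls outside the traditional voice band but inside the two wider ones, while anything below 50 Hz is carried by none of the three, which is the gap Contextual Voice is described as targeting.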

With the relentless quest among telecom technology companies to deliver an ever better communication experience for users, Contextual Voice promises the emergence of a breakthrough service that will not only reinstate voice to its past glory, but will elevate it into a service that rivals the likes of augmented reality, delivering a richer and almost ‘surreal’ experience for users. Super-wideband is already in use in the music industry, added Radhakant, where listeners are enjoying sensory experiences that go beyond the music and lyrics they consciously hear, via the ‘silent’ sounds delivered on frequencies that the brain can actually ‘hear’ and decipher.

OTT To Take Lead, Again

As with any technology still in its nascent stage, Radhakant expects the eventual rise of Contextual Voice to hinge on a number of key factors: continued and more concerted research and development to firm up the use cases for the technology, a push from OTT service providers, and parallel enhancements in consumer devices that will enable retail consumers to start experiencing Contextual Voice.

Radhakant shared a number of use cases. In one, a phone call is accompanied by a short message stating the purpose of the call. In another, a smartwatch such as the Apple Watch supplies information such as the caller’s heartbeat and blood pressure data, which can be used to indicate the mood or emotional state of a person during a call.

In both the above cases, voice is accompanied by contextual information that is transmitted via text, images or icons, and the Contextual Voice in these cases is not entirely audio-based. These use cases derive contextual information from either manually entered data or information gathered automatically from connected nodes/sensors. Nevertheless, both use cases are still part of Contextual Voice in that they are able to deliver contextual information about both communicating parties to each other.
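To make the two use cases concrete, here is a minimal Python sketch of the kind of non-audio context that could travel alongside a call. Everything in it is an assumption made for illustration: the `CallContext` name, its fields and the JSON encoding are hypothetical and do not describe any existing product or standard.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CallContext:
    """Hypothetical contextual payload attached to a voice call."""
    caller_id: str
    purpose: Optional[str] = None          # manually entered by the caller
    heart_rate_bpm: Optional[int] = None   # gathered from a wearable sensor
    systolic_mmhg: Optional[int] = None    # gathered from a wearable sensor
    diastolic_mmhg: Optional[int] = None   # gathered from a wearable sensor

    def to_payload(self) -> str:
        # Serialize only the fields that were actually supplied.
        fields = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(fields, sort_keys=True)


# Use case 1: context entered manually before the call is placed.
manual = CallContext(caller_id="alice", purpose="Question about tomorrow's meeting")

# Use case 2: context gathered automatically from a connected sensor.
sensed = CallContext(
    caller_id="bob", heart_rate_bpm=72, systolic_mmhg=118, diastolic_mmhg=76
)

if __name__ == "__main__":
    print(manual.to_payload())
    print(sensed.to_payload())
```

Keeping the manually entered and sensor-derived fields in one structure mirrors the point made above: both are contextual information about the caller, whichever way it was collected.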

A very special application of the second use case is telemedicine, where health and fitness information is subtly communicated via a call, allowing medical practitioners to learn much more about a patient than what the patient voluntarily shares during remote diagnosis sessions, added Radhakant.

Expanding on these use cases, Radhakant said that, by leveraging big data and APIs, providers of Contextual Voice services can automatically build the situational context for a call in any environment by extracting and combining relevant information from a wide range of presently available sources such as Twitter’s trending topics, local news channels, location information, CCTV cameras and other sources of real-time data.
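One way to picture this kind of aggregation is the Python sketch below. It is a rough illustration under stated assumptions: the source functions are stubs returning made-up values rather than calls to any real API, and `build_situational_context` is a hypothetical helper, not part of any service described here.

```python
from typing import Any, Callable, Dict

# A context source takes a location and returns whatever it knows about it.
ContextSource = Callable[[str], Dict[str, Any]]


def trending_topics_stub(location: str) -> Dict[str, Any]:
    # Stand-in for a social media trends feed (made-up data).
    return {"topics": ["local football final"]}


def local_news_stub(location: str) -> Dict[str, Any]:
    # Stand-in for a local news channel feed (made-up data).
    return {"headline": "Heavy rain expected this evening"}


def unreachable_source_stub(location: str) -> Dict[str, Any]:
    # Stand-in for a real-time source that fails to respond.
    raise TimeoutError("source did not respond")


def build_situational_context(
    location: str, sources: Dict[str, ContextSource]
) -> Dict[str, Any]:
    """Combine the output of every reachable source into one context record."""
    context: Dict[str, Any] = {"location": location, "sources": {}, "unavailable": []}
    for name, fetch in sources.items():
        try:
            context["sources"][name] = fetch(location)
        except Exception:
            # A failing source should not block the call; record it and move on.
            context["unavailable"].append(name)
    return context


if __name__ == "__main__":
    print(build_situational_context(
        "Jakarta",
        {
            "trends": trending_topics_stub,
            "news": local_news_stub,
            "cctv": unreachable_source_stub,
        },
    ))
```

Treating each source as an interchangeable function keeps the sketch open to any of the feeds listed above, and tolerating failures reflects the real-time nature of the data, where some sources will inevitably be unavailable when a call is set up.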

The Take Off – Traditional, VoLTE or OTT, Where First?

Deployment-wise, Radhakant said that VoLTE stands out as the best platform for Contextual Voice to take off, although OTTs have the competitive advantage of being able to roll out the feature first on their voice applications, owing to their more agile operating models and the entrepreneurial drive that often accompanies new OTT start-ups. In the longer run, however, Contextual Voice will follow the same path taken by the likes of RCS, where proven business models will provide the push for MNOs to incorporate the technology as part of their service offering.

In the meantime, Contextual Voice has been slowly building momentum, having been featured at MWC in 2013 and with the likes of Vodafone, Telenor and TIM Italy running a few early initiatives in the area. As a technology, it brings the promise of an experience that telecommunications has never seen before, and as a service it is expected to push the voice business into a whole new growth phase. The onus is now on potential suitors to refine the concept, build the relevant applications and put the idea into practice. For the rest of us in the telecommunications community, it’s great to know that something big is brewing in the voice pipeline.

Author

Executive Editor and Telecoms Strategist at The Fast Mode | 5G | IoT/M2M | Telecom Strategy | Mobile Service Innovations 

Tara Neal heads the strategy & editorial unit at The Fast Mode, focusing on latest technologies such as gigabit broadband, 5G, cloud-native networking, edge computing, virtualization, software-defined networking and network automation as well as broader telco segments such as IoT/M2M, CX, OTT services and network security. Tara holds a First Class Honours in BSc Accounting and Finance from The London School of Economics, UK and is a CFA charterholder from the CFA Institute, United States. Tara has over 22 years of experience in technology and business strategy, and has earlier served as project director for technology and economic development projects in various management consulting firms.

Follow Tara Neal on Twitter @taraneal11, LinkedIn @taraneal11, Facebook or email her at tara.neal@thefastmode.com.
