Press or Say (Preview)

Press or Say lets flow builders create a voice flow in which a caller can provide input either by speaking or by pressing keys on their phone's dial pad (DTMF). Whichever input they choose is then streamed directly to the IBM Watson speech recognition service. We’re still fine-tuning Press or Say and gathering feedback, so it has been launched in preview.


  • Lower latency: This action streams directly to speech recognition services, such as IBM Watson Assistant, for much faster speech recognition in the voice flow. This is an improvement over previous solutions, which chained multiple actions together and relied on API calls to the speech recognition service.
  • Convert spoken numbers to digits: The backend improvements also allow spoken numbers to be converted to digits (e.g., the spoken “two four seven three” becomes “2473”).
  • Interrupt: Because this action streams to the speech recognition service directly, it responds quickly to speech, so a caller who wishes to interrupt while the audio prompt is playing can do so.
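
The spoken-number conversion described above can be illustrated with a minimal Python sketch. This is not the platform's or the speech service's actual implementation; the function name `spoken_to_digits` and the word table are hypothetical and cover only single digit words.

```python
# Illustrative only: a hypothetical helper, not the platform's implementation.
WORD_TO_DIGIT = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7",
    "eight": "8", "nine": "9",
}


def spoken_to_digits(transcript: str) -> str:
    """Map each spoken digit word to its digit; raise on any other word."""
    digits = []
    for word in transcript.lower().split():
        token = word.strip(".,!?")  # drop trailing punctuation from the transcript
        if token not in WORD_TO_DIGIT:
            raise ValueError(f"not a digit word: {word!r}")
        digits.append(WORD_TO_DIGIT[token])
    return "".join(digits)


print(spoken_to_digits("Two four seven three"))  # prints 2473
```

A real recognizer would also need to handle phrases such as "twenty four" or "double seven", which this sketch deliberately leaves out.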


This action resembles Collect DTMF in most respects and is configured through the following inputs:

  • Termination Key: The key a caller can press on their dial pad to signify that they've completed their input.
  • Timeout (Seconds): Specify the number of seconds the caller has to enter their input before the flow moves on.
  • Store Data in Atmosphere Insights Data Records: When checked, DTMF inputs will be stored for analytics purposes in Atmosphere® Insights. Note: Due to the potential for sensitive or private data to be shared, we do not store any DTMF inputs without this box being checked.
  • Allow Interrupt: Enable callers to interrupt audio prompts at any time.
  • Language: Select the appropriate language for input.
  • Input Type: Select the appropriate variety of input.
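
To show how the Termination Key and Timeout (Seconds) settings might interact for dial pad input, here is a rough Python simulation. Everything in it is an assumption made for illustration: `PressOrSayConfig` and `collect_dtmf` are hypothetical names, and the timeout is treated as a single window measured from the start of the prompt, which may not match how the action actually measures it.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class PressOrSayConfig:
    """Hypothetical container for two of the settings; not a platform type."""
    termination_key: str = "#"
    timeout_seconds: float = 5.0


def collect_dtmf(events: Iterable[Tuple[float, str]],
                 config: PressOrSayConfig) -> str:
    """Collect keys until the termination key arrives or the timeout passes.

    `events` holds (seconds_since_prompt, key) pairs in arrival order.
    """
    digits = []
    for elapsed, key in events:
        if elapsed > config.timeout_seconds:
            break  # assumed behavior: time is up, so the flow moves on
        if key == config.termination_key:
            break  # caller signalled that their input is complete
        digits.append(key)
    return "".join(digits)


config = PressOrSayConfig(termination_key="#", timeout_seconds=5.0)
print(collect_dtmf([(0.5, "1"), (1.0, "2"), (1.5, "#"), (2.0, "3")], config))  # prints 12
```

In this sketch the termination key itself is not included in the collected input, which is also an assumption.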


Press or Say creates a variable that contains the caller’s input (either DTMF digits or transcription of their speech). A Switch action can be used to route the flow based on the contents of the variable.
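
The routing idea can be sketched in Python as follows. This only mimics what a Switch action might do with the variable; the function `route`, the accepted values, and the branch names are all hypothetical and are not part of the product.

```python
def route(caller_input: str) -> str:
    """Pick a branch name from the stored input, similar to Switch cases."""
    value = caller_input.strip().lower()
    if value in {"1", "sales"}:
        return "sales_queue"      # caller pressed 1 or said "sales"
    if value in {"2", "support"}:
        return "support_queue"    # caller pressed 2 or said "support"
    return "default_branch"       # anything else falls through


print(route("1"))        # prints sales_queue
print(route("Support"))  # prints support_queue
```

Because the variable may hold either DTMF digits or a speech transcription, each case in this sketch accepts both a digit and a spoken keyword.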