Press or Say

Press or Say enables you to create a voice flow in which a caller can provide a response through speech or DTMF input using the dial pad on their phone.

Using the Press or Say action, a caller can press a digit or speak words to make their selection (e.g., "say or key in your account number..."). In contrast, the Collect DTMF action only supports customers pressing a single digit (e.g., "press one to...") while the Menu Tree action supports customers pressing or saying a single digit to make their selection (e.g., "press one or say one to...").

Note: Press or Say cannot collect a mix of DTMF and spoken input in a single response. For example, if a caller providing their zip code says "12345" and then presses the # key, the caller's spoken input will not be captured correctly and the flow will follow the On Failure path.

Benefits

Lower latency: This action streams directly to speech recognition services, such as Watson Assistant, for much faster speech recognition in a voice flow. This is an improvement from previous solutions that involved chaining multiple actions together and relied on API calls to the speech recognition service.
Convert spoken numbers to digits: The backend improvements also allow for spoken numbers to be translated to digits (for example, spoken “Two four seven three” becomes 2473).
Interrupt: Because this action streams to the speech recognition service directly, it is built to respond quickly to speech, which means if the user wants to interrupt while the audio prompt is playing, they can do so.
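
To illustrate the spoken-number conversion described above, here is a minimal Python sketch. It is not the platform's implementation (that conversion happens on the backend). The function name and word table are hypothetical, and the sketch assumes the transcript contains only single-digit words.

```python
# Illustrative only: the real conversion is performed by the backend service.
# DIGIT_WORDS and spoken_digits_to_string are hypothetical names.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def spoken_digits_to_string(transcript: str) -> str:
    """Map each spoken digit word to its digit and join the results."""
    digits = []
    for word in transcript.lower().split():
        if word not in DIGIT_WORDS:
            # Assumption: anything other than a digit word is rejected.
            raise ValueError(f"not a digit word: {word!r}")
        digits.append(DIGIT_WORDS[word])
    return "".join(digits)


print(spoken_digits_to_string("Two four seven three"))
```

The result is kept as a string rather than an integer so that leading zeros (common in zip codes and account numbers) are preserved.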

Inputs

This action looks like Collect DTMF in most ways and allows for some of the same configuration items. All of the inputs associated with the Press or Say action are listed below, but some inputs are dependent on other fields and might not initially appear (e.g., the Sensitivity Level drop-down list only appears if you check the Allow Interrupt check box first).

Next Gen Voice

Use this toggle to enable Text-to-Speech (TTS) streaming, improve latency within your flow, and access greater customization for audio messages within your flow.

When the Next Gen Voice toggle is enabled, the following inputs appear (scroll down to view the fields that appear when the toggle is off):

Input Name	Description
Advanced Configuration	Use this toggle to create a fully customizable TTS or ASR app with raw XML data. We recommend turning on the Advanced Configuration toggle for the most comprehensive experience. When the toggle is enabled, you'll only see the App drop-down list and mrcp_model text box. When the toggle is "off", the following fields appear.
Termination Key	When the toggle is "on", a caller can press the default termination key (#) on their dial pad to signify they've completed their input.
Store Data in Insights Data Records	When checked, DTMF inputs are stored for analytics purposes in Insights. Note: Due to the potential for sensitive or private data to be shared, we do not store any DTMF inputs without this box being checked.
Set Custom Variable	Use this check box to create a custom variable that can be used downstream in your flow.
Custom Variable	Enter your custom variable in the text box. When your variable appears in the Available Variables section for actions downstream in your flow, it will be in the format "${CustomVariable}". Note: This field only appears when the Set Custom Variable check box is selected. You cannot include spaces or $ in your variable.
Allow Interrupt	Enable callers to interrupt audio prompts at any time.
Vendor	Select the appropriate provider from the drop-down list. Depending on which option you select, new fields will appear and vary. For example, if you select "Google-GDF", the Language field defaults to "English (US)" and the Project ID field appears. Each field is specific to the provider's requirements.

For more information regarding Deepgram specific requirements, check out Endpointing and Utterance End. For more information regarding Google-GDF, check out this page.
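
The Custom Variable naming rule in the table above (no spaces and no $) can be checked before you type a name into the text box. The following Python sketch is one way to do that. Both helper names are hypothetical, and rejecting an empty name is an assumption that the source does not state.

```python
def is_valid_custom_variable(name: str) -> bool:
    """Return True if the name has no spaces and no '$'.

    Assumption: an empty name is also treated as invalid.
    """
    return bool(name) and " " not in name and "$" not in name


def as_reference(name: str) -> str:
    """Format a valid name the way it appears under Available Variables."""
    if not is_valid_custom_variable(name):
        raise ValueError(f"invalid custom variable name: {name!r}")
    return "${" + name + "}"


print(as_reference("ZipCode"))
```

A name such as "Zip Code" or "$Zip" fails the check, while "ZipCode" is formatted as "${ZipCode}".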

Non-Next Gen Voice

When the Next Gen Voice toggle is off, the following input fields appear:

Input Name	Description
Enable Voice Input	Use this toggle to enable spoken input. When the toggle is "off", callers can only provide input by pressing the appropriate input via their dial pad. Note: When the Enable Voice Input toggle is "off", the Allow Interrupt check box will automatically be selected.
Termination Key	When the toggle is "on", a caller can press the default termination key (#) on their dial pad to signify they've completed their input. Note: When the Termination Key toggle is "off", the Validation field and Enhanced Error Handling check box will not be available.
Timeout (Seconds)	The number of seconds a caller has between key entries to continue their input. For example, if you expect callers to enter their zip code, you might allow 2 seconds between each of the 5 digits, but if you expect callers to enter their credit card number, you might allow 5 seconds so they have more time to check their card between each of the 16 digits. If a key is not entered within the timeout period, your flow will move on without the full expected caller input (e.g., if you expect callers to enter their zip code and this field is set to 2 seconds, a caller who enters “8022” and then pauses for more than 2 seconds causes the flow to move on without the last zip code digit). Note: This field applies only to caller input via dial pad entry, not to spoken caller input.
Store Data in Insights Data Records	When checked, DTMF inputs are stored for analytics purposes in Insights. Note: Due to the potential for sensitive or private data to be shared, we do not store any DTMF inputs without this box being checked.
Set Custom Variable	Use this check box to create a custom variable that can be used downstream in your flow.
Custom Variable	Enter your custom variable in the text box. When your variable appears in the Available Variables section for actions downstream in your flow, it will be in the format "${CustomVariable}". Note: This field only appears when the Set Custom Variable check box is selected. You cannot include spaces or $ in your variable.
Allow Interrupt	Enable callers to interrupt audio prompts at any time.
Sensitivity Level	Select Low, Medium, or High depending on your caller's environment and background sounds. Medium is the recommended setting to account for neutral backgrounds. Note: This field only appears when the Allow Interrupt check box is selected.
Language	Select the appropriate language for input.
Input Type	Select the appropriate variety of input. Note: This field only appears after a selection is made from the Language drop-down list. Learn more about each available Input Type option here.
Validation	Select the appropriate option from the drop-down list to verify your digit sequence: None: The default option. The caller's digit sequence will not go through any validation. Minimum and Maximum # of digits: The Min and Max fields appear. The caller's digit sequence must fall within the expected input length. If the digit sequence is expected to be an exact length (e.g., only 5-digit values are expected), enter that number in both the Min and Max fields. Minimum and Maximum Value: The Min and Max fields appear. The caller's digit sequence must fall within the expected value range. The minimum value accepted in the Min field is "0" and the maximum value accepted in the Max field is "99999999999999999999". Depending on which Validation option you select, the caller's digit sequence will go through the appropriate check behind the scenes to verify if it is acceptable (success path) or unacceptable (failure path). Note: This field only appears if the "Digit Sequence" option is selected from the Input Type drop-down list. The Min and Max fields can contain digits or variables and are inclusive (e.g., if your Min is 4 and your Max is 8, 4-digit and 8-digit responses will be successful).
Max Attempts	Enter the appropriate number of retries a caller has to provide a response (1-10). Note: This field only appears when the Enhanced Error Handling toggle is "on".
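
The three Validation options described in the table above can be sketched in Python as follows. This is an approximation of the documented behavior, not the platform's actual check. The function name and the mode labels ("none", "length", "value") are hypothetical, and the sketch assumes the input has already been reduced to a string of digits.

```python
from typing import Optional


def validate_digits(digits: str, mode: str = "none",
                    minimum: Optional[int] = None,
                    maximum: Optional[int] = None) -> bool:
    """Return True for the success path, False for the failure path."""
    if mode == "none":
        # Default option: no validation is applied.
        return True
    if mode not in ("length", "value"):
        raise ValueError(f"unknown validation mode: {mode!r}")
    if minimum is None or maximum is None:
        raise ValueError("Min and Max are required for this mode")
    if not (digits.isascii() and digits.isdigit()):
        # Assumption: a non-digit sequence fails validation.
        return False
    # "length" compares the number of digits; "value" compares the number itself.
    measured = len(digits) if mode == "length" else int(digits)
    # Min and Max are inclusive, as the Validation row states.
    return minimum <= measured <= maximum


print(validate_digits("1234", "length", 4, 8))
print(validate_digits("123", "length", 4, 8))
```

With Min set to 4 and Max set to 8 in length mode, 4-digit and 8-digit responses pass while 3-digit and 9-digit responses fail. Setting both fields to 5 accepts only 5-digit values.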

Configure Audio

Click Configure Audio to build your audio options.

Audio Text (TTS): Use your default Text-to-Speech settings or select your preferred vendor and voice. Then type your message into the text box and click the blue “+” sign.
Audio Library: Add new or drag and drop existing audio files from the library.

For more information, see Configure Audio Settings.

Variables

Press or Say creates two variables that contain the caller’s response (either DTMF digits or transcription of their speech) and can be used downstream in your flow:

$DTMFv2_#.input: A translation of the caller's response to what your menu is looking for.
$DTMFv2_#.input_raw: The caller's exact response. You might prefer to use this variable if your caller's response is something more complicated (like complex product names) where a translation would yield worse results.

Note: # is a placeholder for the action ID associated with the Press or Say action. The action ID depends on the order in which you add actions to your flow (e.g., if Press or Say was the ninth action you added to your flow, the variables would be $DTMFv2_9.input and $DTMFv2_9.input_raw).
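
As a quick illustration of the naming pattern in the note above, this Python sketch builds both variable names from an action ID. The helper name is hypothetical; the platform generates these names for you.

```python
from typing import Tuple


def press_or_say_variables(action_id: int) -> Tuple[str, str]:
    """Return the (input, input_raw) variable names for a given action ID."""
    prefix = f"$DTMFv2_{action_id}"
    return f"{prefix}.input", f"{prefix}.input_raw"


print(press_or_say_variables(9))
```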

Press or Say caller responses are saved as text. If you would like to use an audio file response downstream, we recommend using the Record Response action. A Switch action downstream in your flow can be used to route the flow based on the contents of the variable.

Check out How to Use Variables in SmartFlows for more information about using variables in your flow.

Action

Each Press or Say action includes three exit ports corresponding to the following outcomes:

On Success: The course of the flow if the caller's input is successfully collected (port 1)
On Timeout: The course of the flow if the caller doesn't provide input before the timeout period elapses (port 2)
On Failure: The course of the flow if the caller's input is not collected (port 3)

An action should be connected to each of the exit ports, and the port order cannot be rearranged.
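
The fixed port order above can be modeled as a small enumeration. The Python sketch below is only a mental model of how an outcome maps to a port. The class, the function, and its two parameters are hypothetical, and the real decision is made by the platform.

```python
from enum import IntEnum
from typing import Optional


class ExitPort(IntEnum):
    """Exit ports in their fixed order."""
    ON_SUCCESS = 1
    ON_TIMEOUT = 2
    ON_FAILURE = 3


def choose_port(caller_input: Optional[str], collected_ok: bool) -> ExitPort:
    """Pick an exit port for a response.

    Assumption: None stands for "no input before the timeout elapsed".
    """
    if caller_input is None:
        return ExitPort.ON_TIMEOUT
    return ExitPort.ON_SUCCESS if collected_ok else ExitPort.ON_FAILURE


print(choose_port("80221", True).name)
```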