Voice Direct™ DATA SHEET
From the Interactive Speech™ Line of Products
FEATURE OVERVIEW
Voice Direct performs high-quality speaker-dependent
speech recognition. The chip uses a neural network
recognizer to recognize discrete words or short phrases.
The chip performs three basic functions:
Train - Users train the chip to identify a specific word
by saying that word twice. The two resulting patterns
are then averaged and stored as a template.
Recognize - The user speaks a word and the chip
compares the new pattern with the previously trained
templates to identify which word was spoken. The chip
then outputs the result of its analysis.
Erase - Users can delete previously trained words from
the set of recognition templates.
In each of these functions, Voice Direct features integrated
speech prompting, providing a complete interactive user
interface.
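The three functions above can be illustrated with a minimal sketch. It assumes each spoken word has already been reduced to a fixed-length list of numbers (a hypothetical feature pattern), and it uses plain Euclidean distance as a stand-in for the comparison step. The chip itself uses a neural network recognizer whose internals are not described here, so the class name, method names, and distance measure below are illustrative assumptions only.

```python
import math


class TemplateStore:
    """Toy stand-in for the template memory (hypothetical, not the chip's API)."""

    def __init__(self, capacity=15):
        self.capacity = capacity  # 15 matches the stand-alone word limit
        self.templates = {}

    def train(self, word, first, second):
        """Average two patterns of the same word and store the result."""
        if len(first) != len(second):
            raise ValueError("both patterns must have the same length")
        if word not in self.templates and len(self.templates) >= self.capacity:
            raise RuntimeError("template memory is full")
        self.templates[word] = [(a + b) / 2 for a, b in zip(first, second)]

    def recognize(self, pattern):
        """Return the word whose template is closest, or None if none are stored."""
        if not self.templates:
            return None
        # Assumption: Euclidean distance replaces the chip's neural network.
        return min(self.templates,
                   key=lambda w: math.dist(self.templates[w], pattern))

    def erase(self, word):
        """Delete a previously trained word; unknown words are ignored."""
        self.templates.pop(word, None)


store = TemplateStore()
store.train("lights", [0.0, 1.0], [0.2, 0.8])
store.train("door", [1.0, 0.0], [0.8, 0.2])
print(store.recognize([0.2, 1.0]))  # closest stored template wins
```

Training overwrites an existing template for the same word, which is one possible design choice; the source does not say how retraining is handled.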
EXTERNAL HOST MODE
Voice Direct’s external host operating mode provides a
complete speaker-dependent recognition system that can
easily be controlled by an External Host processor (Host).
The Host communicates with Voice Direct using a 3-wire
serial bus. This high-level control interface allows the
Host to control the flow of operations and to initiate all of
the chip's functions, including training, recognition, and synthesis.
In external host mode, Voice Direct recognizes up to 60
words. These words can be divided into smaller recognition
sets (up to 8; see the Feature Summary), which improves
accuracy and application flexibility.
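As an illustration of dividing the vocabulary into recognition sets, the sketch below groups words by a set index on the Host side. The limits of 60 words and 8 sets come from this data sheet; the function name, the dictionary layout, and the example words are hypothetical, and the source does not specify how words may be distributed among the sets.

```python
MAX_WORDS = 60  # external host mode word limit, from this data sheet
MAX_SETS = 8    # maximum number of recognition sets, from the Feature Summary


def build_recognition_sets(assignments):
    """Group words by set index; `assignments` maps word -> set index (0..7)."""
    if len(assignments) > MAX_WORDS:
        raise ValueError("too many words for external host mode")
    sets = {}
    for word, index in assignments.items():
        if not 0 <= index < MAX_SETS:
            raise ValueError(f"set index out of range: {index}")
        sets.setdefault(index, []).append(word)
    return sets


# Hypothetical two-level menu: set 0 picks a device, set 1 picks an action.
sets = build_recognition_sets({"lights": 0, "door": 0, "on": 1, "off": 1})
print(sets[1])  # ['on', 'off']
```

Restricting recognition to the active set means each spoken word is compared against fewer candidates, which is the accuracy benefit described above.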
STAND ALONE MODE
Voice Direct’s stand alone operating mode is designed to
provide a complete recognition system using only the chip,
external template storage memory, and a few passive
electronic components. All operations, including training,
recognition, and erase, can be controlled by configuring
chip input pins. Output pins provide status information to
external devices. In stand alone mode, Voice Direct can
recognize one set of 15 words.
SPEECH PROMPTS
Voice Direct includes a standard English vocabulary of
over 100 phrases to guide the user through its functions.
This standard word list can be replaced with a customized
word list for English or foreign languages via an external
ROM chip.
RECOGNITION THRESHOLD
Voice Direct supports multiple acceptance threshold levels
(five in external host mode, three in stand alone mode)
during the recognition process. The acceptance level
determines how closely the spoken word must match a
pre-trained template in order to pass. The user adjusts the
level depending on the complexity of the recognition set.
More complex recognition sets should have a higher
acceptance level, while simpler sets can use a lower
threshold level.
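The effect of the acceptance level can be sketched as a simple pass/fail test on a match score. The score scale (0 to 1) and the per-level minimum scores below are hypothetical values chosen for illustration; the source does not publish the chip's actual scoring or thresholds.

```python
# Hypothetical mapping from acceptance level to minimum match score.
# These numbers are illustrative assumptions, not values from the chip.
MIN_SCORE_BY_LEVEL = {1: 0.50, 2: 0.60, 3: 0.70, 4: 0.80, 5: 0.90}


def passes(match_score, level):
    """Return True if the best match is close enough for the chosen level."""
    return match_score >= MIN_SCORE_BY_LEVEL[level]


score = 0.75  # hypothetical score of the best-matching template
print(passes(score, 2), passes(score, 5))  # True False
```

The same utterance is accepted at a low level and rejected at a high one, which is why a higher level suits sets containing similar-sounding words.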
INPUT AUDIO AMPLIFIER AND FILTER
Voice Direct requires an external pre-amplifier to
condition the input signal. When used with an
inexpensive omni-directional electret microphone, the
input audio amplifier and filter must provide
approximately 58 dB of low-noise mid-band gain, 2-bit
AGC controllability, and a first-order bandpass response
with 3 dB points at roughly 700 Hz and 3300 Hz.
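For a sense of scale, the figures above can be converted into more familiar quantities. The sketch assumes the 58 dB figure is a voltage gain (20·log10 convention) and reads "2-bit AGC" as four selectable gain settings; both are assumptions, since the source does not state them explicitly.

```python
import math

GAIN_DB = 58.0      # mid-band gain, from this data sheet
F_LOW_HZ = 700.0    # lower 3 dB point
F_HIGH_HZ = 3300.0  # upper 3 dB point
AGC_BITS = 2        # AGC control width

# Assumption: the gain is a voltage gain, so dB = 20 * log10(Vout / Vin).
voltage_gain = 10 ** (GAIN_DB / 20)

# Geometric center and width of the passband.
center_hz = math.sqrt(F_LOW_HZ * F_HIGH_HZ)
bandwidth_hz = F_HIGH_HZ - F_LOW_HZ

# Assumption: a 2-bit control word selects one of 2**2 gain settings.
agc_settings = 2 ** AGC_BITS

print(f"voltage gain ~ {voltage_gain:.0f}x")
print(f"passband center ~ {center_hz:.0f} Hz, width {bandwidth_hz:.0f} Hz")
print(f"AGC settings: {agc_settings}")
```

Under these assumptions the gain is roughly 794x, so a microphone signal of about 1 mV would reach roughly 0.8 V at the amplifier output.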
AUDIO OUTPUT
Voice Direct can directly drive a 32-Ohm speaker from
the SP0 pin, providing approximately 0.15W of audio
power.
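The drive levels implied by these figures follow from P = V²/R. The sketch below assumes a purely resistive 32-Ohm load and a sine-wave signal, which is a simplification of a real speaker.

```python
import math

POWER_W = 0.15   # approximate audio power, from this data sheet
LOAD_OHM = 32.0  # speaker impedance, treated here as a pure resistance

v_rms = math.sqrt(POWER_W * LOAD_OHM)  # from P = V^2 / R
i_rms = math.sqrt(POWER_W / LOAD_OHM)  # from P = I^2 * R
v_peak = v_rms * math.sqrt(2)          # assumes a sine-wave signal

print(f"{v_rms:.2f} V rms, {i_rms * 1000:.0f} mA rms, {v_peak:.2f} V peak")
```

This works out to about 2.19 V rms and 68 mA rms across the load.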
MEMORY INTERFACES
Voice Direct requires 8K bytes of dedicated external Serial
EEPROM memory for template storage. Each time a new
word is trained, Voice Direct automatically writes the
template to the memory device. During recognition, Voice
Direct reads the templates from the memory device and
compares them with spoken words or phrases. Voice
Direct communicates with the memory device through an
I2C 2-wire serial interface.
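A rough storage budget per template can be estimated from the figures above. The sketch assumes "8K bytes" means 8192 bytes and that the whole device is available for the 60 templates of external host mode. The actual memory layout is not given in the source, so the result is only an upper bound.

```python
EEPROM_BYTES = 8 * 1024  # assumption: "8K bytes" means 8192 bytes
MAX_WORDS = 60           # external host mode word limit, from this data sheet

# Upper bound: assumes no bytes are reserved for anything but templates.
max_bytes_per_template = EEPROM_BYTES // MAX_WORDS
leftover_bytes = EEPROM_BYTES - max_bytes_per_template * MAX_WORDS

print(max_bytes_per_template, leftover_bytes)  # 136 32
```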
TSSP MODULE
The Voice Direct solution is also available as a complete
module. The module is a single 2” x 2” PCB that includes
all external components (e.g., preamplifier, memory)
required by Voice Direct, except microphone and speaker.
This module is ideal for prototype development or small
production runs.
Feature Summary of Voice Direct™

Mode            Maximum Number of   Multiple Recognition   Acceptance   Custom      Foreign Language
                Recognition Words   Sets Supported         Threshold    Synthesis   Synthesis
                                                           Levels
External Host   60                  Yes (up to 8)          5            Yes         Yes
Stand-Alone     15                  No                     3            No          Yes