
Key Features

IN1 (Currently Amended) A computer-implemented system of automatic speech
recognition comprising:

IN2 at least one acoustic signal receiving unit to obtain audio data including human
speech;

IN3 at least one processor communicatively connected to the acoustic signal receiving
unit;

IN4 at least one memory communicatively coupled to the at least one processor; and

IN5 a WFST decoder operated by the at least one processor and to:

IN6 generate a static vocabulary weighted finite state transducer (WFST) having nodes
connected by arcs to propagate at least one token through the static vocabulary
WFST and at least one dynamic vocabulary trigger marker at at least one of the
arcs;

IN7 propagate a token through at least one dynamic vocabulary WFST upon the at least
one token reaching the trigger marker;

IN8 propagate a token through at least one grammar WFST having at least one
dynamic vocabulary class marker that indicates a type of dynamic vocabulary and
is associated with the dynamic vocabulary of at least one of the dynamic vocabulary
WFSTs with a propagating token;

IN9 provide a hypothetical word or phrase based at least in part on the obtained human
speech and depending, at least in part, on the WFSTs and comprising terms in the
static vocabulary, dynamic vocabulary, or both vocabularies; and

IN10 determine user intent based at least in part on output from the decoder based at
least in part on the hypothetical word or phrase; and

IN11 initiate a response or action based at least in part on the determined user intent, the
initiated response or action being implemented via speech output from a speaker
component, via visual output from display component, and/or via other action from
one or more end devices.
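
Illustrative sketch (not part of the claim language): the token propagation recited in IN6 and IN7 can be pictured with a toy decoder. Everything named here (Arc, Wfst, paths, decode, STATIC, DYNAMIC, the "$CONTACT" marker) is a hypothetical construction for illustration, not the claimed implementation. The sketch assumes word-level input symbols instead of acoustic frames, enumerates paths exhaustively instead of using a beam search, and leaves out the grammar WFST of IN8 and the intent and response steps of IN10 and IN11.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    ilabel: str    # input symbol, or a trigger marker such as "$CONTACT"
    olabel: str    # output word emitted when the arc is taken
    weight: float  # cost of the arc (lower is better)
    dest: int      # destination state


@dataclass
class Wfst:
    start: int
    finals: set
    arcs: dict  # state -> list of Arc


def paths(wfst, state, symbols, pos, dynamic):
    """Yield (end_pos, cost, outputs) for each path from `state` to a final state."""
    if state in wfst.finals:
        yield pos, 0.0, ()
    for arc in wfst.arcs.get(state, []):
        if arc.ilabel.startswith("$"):
            # Trigger marker reached: send the token through the dynamic
            # vocabulary WFST registered for this marker, then resume in
            # the static WFST at the destination of the trigger arc.
            sub = dynamic[arc.ilabel]
            for mid, c1, out1 in paths(sub, sub.start, symbols, pos, dynamic):
                if mid == pos:
                    continue  # assumption: the sub-WFST must consume input
                for end, c2, out2 in paths(wfst, arc.dest, symbols, mid, dynamic):
                    yield end, arc.weight + c1 + c2, out1 + out2
        elif pos < len(symbols) and arc.ilabel == symbols[pos]:
            for end, c, out in paths(wfst, arc.dest, symbols, pos + 1, dynamic):
                yield end, arc.weight + c, (arc.olabel,) + out


def decode(static, dynamic, symbols):
    """Return (cost, words) of the cheapest path consuming all symbols, or None."""
    complete = [(c, out)
                for end, c, out in paths(static, static.start, symbols, 0, dynamic)
                if end == len(symbols)]
    if not complete:
        return None
    cost, out = min(complete)
    return cost, list(out)


# Static vocabulary: "call home" or "call <contact>".
STATIC = Wfst(start=0, finals={2}, arcs={
    0: [Arc("call", "call", 0.5, 1)],
    1: [Arc("home", "home", 1.0, 2),
        Arc("$CONTACT", "", 0.0, 2)],  # dynamic vocabulary trigger marker
})

# Dynamic vocabulary: contact names that can change at run time.
DYNAMIC = {"$CONTACT": Wfst(start=0, finals={1}, arcs={
    0: [Arc("anna", "anna", 0.25, 1), Arc("bob", "bob", 0.25, 2)],
    2: [Arc("smith", "smith", 0.25, 1)],
})}

print(decode(STATIC, DYNAMIC, ["call", "bob", "smith"]))
```

In this sketch, adding a name only touches the small WFST stored under DYNAMIC["$CONTACT"]; STATIC is left as it was, which is the practical motivation for keeping the two vocabularies in separate transducers.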