A start of an input speech signal is detected during presentation of an output audio signal, and an input start time, relative to the output audio signal, is determined. The input start time is then provided for use in responding to the input speech signal. In another embodiment, the output audio signal has a corresponding identification. When the input speech signal is detected during presentation of the output audio signal, the identification of the output audio signal is provided for use in responding to the input speech signal. Information signals comprising data and/or control signals are provided in response to at least the contextual information provided, i.e., the input start time and/or the identification of the output audio signal. In this manner, the present invention accurately establishes a context of an input speech signal relative to an output audio signal regardless of the delay characteristics of the underlying communication system.
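
The determination of contextual information described above can be illustrated with a minimal sketch. It assumes that the device presenting the output audio signal also detects the start of the input speech signal and reads both events from the same local clock, so that the resulting offset does not depend on any delay in the communication system. The names `OutputAudio`, `InputContext`, `establish_context`, and `respond`, as well as the form of the returned control string, are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutputAudio:
    """An output audio signal being presented (hypothetical type)."""
    identification: str          # identification corresponding to the output audio
    playback_started_at: float   # local clock time, in seconds, when presentation began
    duration: float              # length of the output audio, in seconds


@dataclass
class InputContext:
    """Contextual information provided for responding to the input speech."""
    input_start_time: Optional[float]     # seconds, relative to the output audio
    output_identification: Optional[str]  # identification of the output audio


def establish_context(speech_detected_at: float,
                      output: Optional[OutputAudio]) -> InputContext:
    """Derive the context of an input speech signal detected at a local clock time."""
    if output is None:
        # No output audio is being presented, so there is no context to report.
        return InputContext(None, None)
    offset = speech_detected_at - output.playback_started_at
    if offset < 0 or offset > output.duration:
        # The speech did not start during presentation of the output audio.
        return InputContext(None, None)
    # Both values are measured locally, so network delay does not affect them.
    return InputContext(offset, output.identification)


def respond(context: InputContext) -> str:
    """Hypothetical responder that selects a control signal from the context."""
    if context.input_start_time is None:
        return "handle_as_standalone_input"
    return (f"handle_barge_in:{context.output_identification}"
            f"@{context.input_start_time:.2f}s")
```

For example, if an output audio signal identified as `"prompt-7"` begins presentation at local time 100.0 s and the start of input speech is detected at 102.5 s, `establish_context` yields an input start time of 2.5 s together with the identification `"prompt-7"`, and either or both may then be provided to whatever component generates the responsive information signals.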