An Introduction to Conversational Interface Technologies
Over the past few decades, conversational interface technologies have advanced considerably, and in many situations they have become the preferred method of interaction. These systems generally combine speech recognition, natural language processing, and speech synthesis. In a typical system, a user's speech is transcribed to text, which is then processed by a natural language understanding engine. Once the meaning has been extracted, the system produces a response, which is often communicated back to the user through speech synthesis. Each of these technologies has limitations on both the human and the machine side: variety in human speech on the one hand, and limited processing power and memory on the other. Solutions include changing microphone position and type, offloading major processing, user-specific recognition training, limiting the vocabulary, and location-specific design. We will review an introductory chapter on speech technologies, including applications of recognition and synthesis in HCI. For each area, we will discuss current capabilities, design decisions, limitations, and examples of successful applications.
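
The typical flow described above (speech to text, text to meaning, meaning to response, response to speech) can be illustrated with a minimal sketch. Every function and class name here is hypothetical, and each stage is a stand-in stub rather than a real recognizer, understanding engine, or synthesizer; in particular, the "audio" is assumed to be plain UTF-8 text so that the example runs without any speech library.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Hypothetical container for the meaning extracted from an utterance."""
    name: str
    slots: dict = field(default_factory=dict)


def recognize_speech(audio: bytes) -> str:
    # Stand-in for speech recognition: we assume the "audio" is UTF-8 text.
    return audio.decode("utf-8").strip().lower()


def understand(text: str) -> Intent:
    # Stand-in for natural language understanding with a tiny fixed vocabulary.
    words = text.split()
    if "weather" in words:
        # Take the word following "in" as the city, if there is one.
        city = words[words.index("in") + 1] if "in" in words[:-1] else None
        return Intent("get_weather", {"city": city})
    return Intent("unknown")


def generate_response(intent: Intent) -> str:
    # Produce a text response from the extracted meaning.
    if intent.name == "get_weather":
        city = intent.slots.get("city") or "your area"
        return f"Looking up the weather for {city}."
    return "Sorry, I did not understand that."


def synthesize_speech(text: str) -> bytes:
    # Stand-in for speech synthesis: return the text as bytes.
    return text.encode("utf-8")


def handle_turn(audio: bytes) -> bytes:
    """Run one user turn through all four stages in order."""
    text = recognize_speech(audio)
    intent = understand(text)
    reply = generate_response(intent)
    return synthesize_speech(reply)


if __name__ == "__main__":
    print(handle_turn(b"What is the weather in Boston"))
```

Keeping the stages as separate functions mirrors the separation in the text: any one stub could be replaced by a real component, or moved to a remote server, without changing the others.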