Speech Interface Design for Social Robots Assisting People with Developmental Disabilities

We are designing an effective speech interface for assistive robots that help people with developmental disabilities. Developmental disabilities are permanent and often severe conditions that affect many people. As the 2012 Canadian Survey on Disability (CSD) revealed, “160,500 (0.6% of Canadian adults) were identified as having a developmental disability” (Bizier, Fawcett, and Gilbert 4). People with developmental disabilities often require more assistance than others to learn, understand, or express information, because developmental disabilities can severely affect language and social skills (DSOntario). We believe social robots, often considered a form of assistive robot, can improve the quality of life of clients with developmental disabilities. Because these clients talk and behave differently from others, and many of them have cognitive difficulties, social robots designed to communicate with them must have a good speech interface to achieve a good quality of communication. A speech interface is an effective way to present conversational information and the status of the current conversation. This study is a preliminary investigation and analysis of how caregivers and their clients with developmental disabilities communicate. Through this process, we analyze the language patterns of both sides. The results of this report will guide our research and design at the next stage.

What is a developmental disability?

Developmental disabilities are defined as “a set of abilities and characteristics that vary from the norm in the limitations they impose on independent participation and acceptance in society. The condition of developmental disabilities is developmental in the sense that delays, disorders, or impairments exist within traditionally conceived developmental domains” (Odom et al.).

Introduction

We are studying the language patterns of both people with developmental disabilities and their caregivers in order to design a speech interface for assistive social robots that can improve the quality of life of clients with developmental disabilities. The two primary research problems we want to investigate are how people with developmental disabilities talk and behave differently from others, and how we can design a good speech interface for assistive social robots that helps them achieve a better quality of life. The 2012 Canadian Survey on Disability reported that 90% of adults with a developmental disability needed assistance with some kind of everyday activity, and 72.7% of them reported some degree of unmet need for at least one of these activities (Bizier, Fawcett, and Gilbert 10). This research will therefore be significant for designing a supportive companion robot that can improve the quality of life of people who face daily challenges because of their developmental disabilities.

Conceptual Framework

Figure 1 shows the mind map of the initial research proposal. The overall project studies the social and affective impacts of social robots from two aspects: physical and non-verbal interactions. I would like to investigate how the physical form of a robot can influence the relationship between robots and patients with disabilities. To complete this research, I need to develop an approach that improves communication between social robots and users with developmental disabilities. Therefore, this research spans three primary areas: robotics, social interaction, and interface design (Figure 2).


Figure 1. Research Concept Map


Figure 2. Three Central Research Areas

This project is essentially a study of human-robot interaction. Thus, robotics is not the primary focus but rather the foundation of the research. To understand the connection between clients and robots, I need to study social interaction and design interactive verbal features. The interface, which can be seen as part of the interaction, is a very significant factor in this research; this is why I highlight it as a separate category. A careful investigation is needed of the speech patterns of ordinary people, of clients with developmental disabilities, and of their caregivers.

As described in the concept map, the following factors need to be considered and investigated: the functions of social robots, the purposes and objectives of the robots, the technologies needed for our designed robot, and possible problems and challenges. Robotics covers the questions about the technologies needed for this research, because this part is mostly about technical details; engineering and computer science knowledge will cover most of the work in robotics. The area of social interaction explores the connection between clients and robots, so the purposes and functions of the social robots have to be clarified there. The interface addresses the communication problems. There is an overlap between social interaction and interface, but the interface should be treated as an individual category because of its significance: it is one of the main focuses of this project.

The application of social robots to elderly care was discussed by Broekens et al. (2009). This is a significant study because there is an increasing need for technologies that can improve the quality of life of the elderly. Social robots are useful in eldercare because they can serve as an interface between the elderly and digital technology, and they can offer companionship. Yet the effectiveness of this application had not been investigated much in the past. The authors compare several different social robots to investigate the effects of assistive social robots on the health of the elderly. The results of this comprehensive review indicate that assistive social robots have many functions and effects, including improving health by decreasing stress, decreasing loneliness, increasing communication activity with others, and improving the sense of happiness. A survey in this study suggests that most elderly participants report liking the robots. This research supports the validity of using social robots for people with developmental disabilities. It is also important to compare a number of other research projects before starting, so that I can see the current progress of this research area.

Since our focus at this stage is on language patterns and the speech interface, we need to design an efficient way for humans and robots to communicate. Sugiura et al. (2015) present a new method of speech synthesis for robots based on hidden Markov models (HMMs). It focuses on natural and friendly synthesized speech for robots, because speech communication has been a very popular research area in human-robot interaction conferences and competitions. The authors compared traditional text-to-speech (TTS) systems to non-monologue speech synthesis systems and found that TTS systems tend to sound unnatural and unfriendly, partly because they are designed for text reading rather than conversation. This result is relevant to my research because TTS is the easiest and most commonly used method of speech generation in various devices, including robots. Sugiura et al. also point out that monotonous intonation tends to prevent novice users from recognizing that the robot is actually asking a question. Their team built a new non-monologue speech synthesis system, and the results show that its performance nearly approached the theoretical upper limit. They also demonstrated the effectiveness of applying non-monologue speech synthesis to robots. Furthermore, their system is cloud-based, which makes it easy to collect a speech synthesis corpus for robots and to maintain the system in the future.
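The intonation problem that Sugiura et al. describe can be made concrete with prosody markup. The sketch below is a simplified illustration, not their HMM-based system: it builds an SSML string that raises the pitch at the end of a question, a form of markup that many cloud TTS services accept. The helper functions and the specific pitch and rate values are assumptions for illustration only.

```python
# Minimal sketch: marking a question with rising intonation via SSML,
# in contrast to a flat, monologue-style TTS utterance.
# A real system would pass the resulting SSML string to a TTS service
# that supports the <prosody> element; here we simply print it.

def plain_utterance(text: str) -> str:
    """Monologue-style TTS: the text is read with default, flat prosody."""
    return f"<speak>{text}</speak>"

def question_utterance(text: str) -> str:
    """Dialogue-style TTS: raise pitch on the final word so the listener
    can hear that the robot is asking, not stating."""
    words = text.rstrip("?").split()
    head = " ".join(words[:-1])
    last = words[-1]
    return (
        "<speak>"
        f"{head} <prosody pitch='+15%' rate='95%'>{last}?</prosody>"
        "</speak>"
    )

if __name__ == "__main__":
    print(plain_utterance("Do you want to go outside?"))
    print(question_utterance("Do you want to go outside?"))
```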

Our project is not simply about designing a speech interface; it is an exploration of how people perceive language in verbal form. Cha et al. (2014) noted that there seems to be a gap between a robot’s true capability and its perceived capability. Their research focuses on speech patterns. After conducting an online study of the effects of speech on perceived capability, they found that physical failures of a social robot negatively influence participants’ evaluation only of the robot’s physical capability, while speech can positively affect participants’ ratings of that physical capability. The significance of speech for social robots is therefore clear. Knowing how robots are perceived can help designers anticipate users’ needs and behaviors. This paper is an initial study of how robot behaviors influence users’ perception of robots’ capabilities, and the results show the impact of speech in this process, as speech has many levels and functions.

Proposed Research

The research goal at this stage is to understand the language patterns of people with different degrees of developmental disability. The eventual aim is to design an efficient speech interface for assistive robots so that people with developmental disabilities can get assistance from, or simply interact with, the robots. To achieve these goals, we need to collect enough data, through interviews and literature reviews, before starting the design process.

Methods

A focus group was used as the research method in this study. It is an effective method for collecting data and gaining an overview of the interviewed group. The interview was held at a residential branch of the Developmental Disabilities Association (DDA), where we could reach residents with developmental disabilities and their caregivers. Due to ethical restrictions, I did not talk directly to the residents. A focus group “needs to be large enough to generate rich discussion but not so large that some participants are left out” (Eliot & Associates). Our open-ended interview included five interviewees in total: three caregivers, one office manager, and an external supervisor and counselor for this project. The interview was semi-structured: a general outline of interview questions was used, but other questions were generated spontaneously during the interview and were recorded as well. The interview lasted about 50 minutes. During the interview, I took notes and recorded the key points I was getting from the group. Video and audio recording gave us a high degree of reliability, validity, and legal defensibility; the recordings were helpful for identifying who was speaking and for improving the accuracy of the analysis by replaying sessions. The audio files were transcribed and coded as textual data in addition to the notes taken during the interview.

The focus group method has proven effective for research like ours. Wu et al. (2012) conducted three focus group sessions while designing an assistive robot for the elderly. The idea behind these focus groups was to provide engineers with suggestions for designing the robot. Their results showed that the focus groups helped designers consider the social context of the robots and the elderly, because the groups allowed participants to share and discuss their ideas about assistive robots. The focus group method used in our research project should therefore be informative and helpful as well.

Validity

The data we collected from the interview are very informative and will guide our design. The counselors and employees working at the DDA home provided me with insightful information and perspectives on applying assistive robots to help residents with developmental disabilities. The results of the data analysis are invaluable for our future speech interface design. For example, I learned that the challenges of speech interface design for assistive or companion robots come mainly from two factors: accuracy and the interaction between human and robot. Unlike other mature interface technologies, speech interfaces have some unique features that require designers to carefully investigate and think through the special relationship between users and the robot terminal.

Accuracy is not a technical challenge that concerns only the computer scientists who build the language processing engine. Even if our robot employed state-of-the-art speech recognition with an 8% word error rate (Novet), it would still be difficult for the robot to understand people with developmental disabilities. Current speech recognition technologies are generally built for users who speak without an accent and in a regular tone, in an ideal, quiet environment. For people with developmental disabilities, it is hard to predict their tempers; as a caregiver at the local DDA home told us, some residents may start yelling at her for no apparent reason. Because of the nature of speech interfaces, the state of the robot tends to be opaque to users, so users cannot easily tell whether the robot has understood their words. Meanwhile, unlike a GUI, where there are buttons for commands, error recovery and correction are difficult for users as well (Rudnicky). It is important to make it easy for users to undo or correct a previous speech input. Since we are designing a speech interface for people with cognitive challenges, the interface needs to be simple and efficient.
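To make the error-recovery point concrete, the sketch below shows one simple confirm-and-correct loop a robot could use: the robot repeats back what it heard and lets the user reject it with a single word. The `speak()` and `listen()` functions are placeholders for a real synthesizer and recognizer, and the exact wording and attempt limit are assumptions, not part of our final design.

```python
# Minimal sketch of a confirm-and-correct loop for a speech interface.
# speak() and listen() stand in for a real synthesizer/recognizer and are
# stubbed with print/input so the sketch runs on its own.
from typing import Optional

def speak(text: str) -> None:
    print(f"ROBOT: {text}")

def listen(prompt: str) -> str:
    return input(f"{prompt} ").strip().lower()

def get_confirmed_request(max_attempts: int = 3) -> Optional[str]:
    """Ask for a request, echo it back, and accept it only after a simple
    yes; anything else lets the user restate the request."""
    for _ in range(max_attempts):
        heard = listen("USER (request):")
        speak(f"I heard: '{heard}'. Is that right? Please say yes or no.")
        answer = listen("USER (yes/no):")
        if answer.startswith("y"):
            return heard
        speak("Okay, let's try again.")
    speak("I'm having trouble understanding. I will ask a caregiver to help.")
    return None

if __name__ == "__main__":
    request = get_confirmed_request()
    if request:
        speak(f"Okay, I will help with: {request}")
```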

Human-robot interaction is the primary topic that we at SIAT need to research. Learning the language patterns of both the residents and their caregivers at the DDA home was very helpful for understanding the lives and communication patterns of people with developmental disabilities. We learned that caregivers prefer to offer a few options when they ask residents questions, instead of posing a broad question that forces residents to think and choose; this way, residents only need to say “Yes” or “No”, or simply pick one of the given options. We also found that people with developmental disabilities may change their minds quickly. For example, a resident may talk with you peacefully, then five minutes later go to his room, lock the door, and yell, “Get out!” or “Leave me alone!”, and ten minutes after that he may be fine again. A social robot therefore needs to be designed to handle all of these situations.
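The caregivers’ option-based questioning and the need to back off when a resident withdraws could translate into a very small dialogue policy. The sketch below again assumes hypothetical `speak()` and `listen()` helpers and a fixed option list and withdrawal-phrase list; it is a simplified illustration of the caregivers’ strategy, not our final interface.

```python
# Minimal sketch of option-based prompting with a withdrawal check,
# modeled on the caregivers' strategy of offering a few fixed choices.
# speak() and listen() are stand-ins for real speech output and recognition.
from typing import Optional

WITHDRAWAL_PHRASES = ("leave me alone", "get out", "go away", "stop")

def speak(text: str) -> None:
    print(f"ROBOT: {text}")

def listen(prompt: str) -> str:
    return input(f"{prompt} ").strip().lower()

def ask_choice(question: str, options: list, max_attempts: int = 3) -> Optional[str]:
    """Offer a short list of options instead of an open-ended question.
    Return the chosen option, or None if the resident withdraws or no
    option is recognized within a few attempts."""
    for _ in range(max_attempts):
        speak(f"{question} You can say: {', '.join(options)}.")
        reply = listen("USER:")
        if any(phrase in reply for phrase in WITHDRAWAL_PHRASES):
            speak("Okay, I will leave you alone. I can come back later.")
            return None
        for option in options:
            if option.lower() in reply:
                return option
        speak("Sorry, I did not catch that.")
    speak("That's okay, we can decide later.")
    return None

if __name__ == "__main__":
    choice = ask_choice("What would you like to do now?",
                        ["watch TV", "go for a walk", "listen to music"])
    if choice:
        speak(f"Great, let's {choice}.")
```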

The interview yielded other useful information like this as well, and we will make good use of these data in our next-stage research.

Contributions

Currently, there are not many robots specifically designed for people with developmental disabilities, so our robot can potentially have a great impact on their daily lives. Research on language interfaces has not made a big leap in the past few years, and this speech interface design will contribute to that research area. Our results from studying the language patterns of people with developmental disabilities may also help other researchers or psychiatrists who are trying to help them.

Summary

We have successfully completed a focus group session and collected valuable information for further analysis. We learned about the caregivers’ daily work and the residents’ everyday needs. Their way of communicating is instructive for our design: the language they use is often very simple and concise, and we will need to take points like this into consideration in our design. The first stage of this research, which consisted of learning about the background and collecting data, is finished. We may need to hold another focus group session in the future to gather more details and new perspectives. Overall, this study has been fruitful, and we will continue our research on this basis.



References

  1. Eliot & Associates. Guidelines for Conducting a Focus Group. N.p., 2016. Web.
  2. Bizier, Christine, Gail Fawcett, and Sabrina Gilbert. “Canadian Survey on Disability, 2012: Developmental Disabilities among Canadians Aged 15 Years and Older.” 89-654 (2015). Web.
  3. Breazeal, Cynthia, Atsuo Takanishi, and Tetsunori Kobayashi. “Social Robots That Interact with People.” Springer Handbook of Robotics (2008): 1349–1369. Web.
  4. Broekens, Joost, Marcel Heerink, and Henk Rosendal. “Assistive Social Robots in Elderly Care: A Review.” Gerontechnology 8.2 (2009): 94–103. Web.
  5. Cha, Elizabeth et al. “Effects of Speech on Perceived Capability.” Proceedings of the 2014 ACM/IEEE international conference on Human-robot interaction - HRI ’14 (2014): 134–135. Web.
  6. DSOntario. “What Is A Developmental Disability? - Developmental Services Ontario.” Developmental Services Ontario 14 Dec. 2016. Web.
  7. Gockley, Rachel et al. “Designing Robots for Long-Term Social Interaction.” 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS (2005): 2199–2204. Web.
  8. Heerink, Marcel et al. “Measuring the Influence of Social Abilities on Acceptance of an Interface Robot and a Screen Agent by Elderly Users.” BCS-HCI ’09 Proceedings of the 23rd British HCI Group Annual Conference on People and Computers: Celebrating People and Technology January (2009): 430–439. Web.
  9. Kröse, Ben J. A. et al. “Lino, the User-Interface Robot.” Ambient Intelligence. Springer Berlin Heidelberg, 2003. 264–274. Web.
  10. Leite, Iolanda, Carlos Martinho, and Ana Paiva. “Social Robots for Long-Term Interaction: A Survey.” International Journal of Social Robotics 5.2 (2013): 291–308. Web.
  11. Novet, Jordan. “Google Says Its Speech Recognition Technology Now Has Only An 8% Word Error Rate”. VentureBeat. N.p., 2015. Web. 1 Dec. 2016.
  12. Odom, Samuel et al. Handbook of Developmental Disabilities. New York: The Guilford Press, 2007. Web.
  13. Rudnicky, Alexander. “Speech Interface Guidelines”. Speech at CMU. N.p., 1996. Web. 1 Dec. 2016.
  14. Sharkey, Amanda. “Robots and Human Dignity: A Consideration of the Effects of Robot Care on the Dignity of Older People.” Ethics and Information Technology 16.1 (2014): 63–75. Web.
  15. Sugiura, Komei et al. “A Cloud Robotics Approach towards Dialogue-Oriented Robot Speech.” Advanced Robotics 29.7 (2015): 449–456. Web.
  16. “Voice User Interface”. En.wikipedia.org. N.p., 2016. Web. 14 Dec. 2016.
  17. Wang, Ning et al. “Recent Advances in Nonlinear Speech Processing.” Smart Innovation, Systems and Technologies 48 (2016): 275–283. Web.
  18. Wu, Ya Huei, Christine Fassert, and Anne Sophie Rigaud. “Designing Robots for the Elderly: Appearance Issue and beyond.” Archives of Gerontology and Geriatrics 54.1 (2012): 121–126. Web.
  19. Zhao, Shanyang. “Humanoid Social Robots as a Medium of Communication.” New Media & Society 8.3 (2006): 401–419. Web.