


A Model of Proximity Control for Information-Presenting Robots


Fumitaka Yamaoka, Takayuki Kanda, Member, IEEE, Hiroshi Ishiguro, Member, IEEE, and Norihiro Hagita, Senior Member, IEEE
Abstract—In this paper, we report a model that allows a robot to appropriately control its position as it presents information to a user. This capability is indispensable, since in the future, many robots will function in daily situations such as shopkeepers presenting products to customers or museum guides presenting information to visitors. Psychology research suggests that people adjust their positions to establish a joint view toward a target object. Similarly, when a robot presents an object, it should stand at an appropriate position that considers the positions of both the listener and the object to optimize the listener's field of view and establish a joint view. We observed human–human interaction situations, where people presented objects, and developed a model for an information-presenting robot to appropriately adjust its position. Our model consists of four constraints to establish O-space: 1) proximity to listener; 2) proximity to object; 3) listener's field of view; and 4) presenter's field of view. We also experimentally evaluate the effectiveness of our model.

Index Terms—Human–robot interaction, joint attention, proximity.

I. INTRODUCTION

Humanoid robots can communicate with humans. Their human-like bodies enable humans to intuitively understand their gestures, which prompt humans to unconsciously behave as if communicating with other humans. If a humanoid robot effectively uses its body, people can communicate naturally with it. This can allow humanoid robots to perform communicative tasks in society, such as presenting exhibitions or products. Much research in psychology has shown the importance of human positioning during conversation. For example, Hall discovered that interpersonal distance changes to match the intimacy of the communication [2]. Kendon studied spatial arrangement during people's conversations and found that O-space was established [3]. He coined the term O-space because the area to which two persons jointly pay attention is shaped like a circle. After O-space is established, they can look at the object together. He found that people formulate O-space in various ways, which are categorized into three types of spatial formations called F-formations: vis-à-vis, L-shape, and side-by-side.
Manuscript received December 19, 2008; revised June 24, 2009 and October 25, 2009. First published December 28, 2009; current version published February 9, 2010. This paper was recommended for publication by Associate Editor B.-J. Yi and Editor J.-P. Laumond upon evaluation of the reviewers' comments. This work was supported by the Grants-in-Aid for Scientific Research (A), Grants-in-Aid for Scientific Research by the Japan Society for the Promotion of Science (KAKENHI), under Grant 21680022. This paper was presented in part at the 2008 Association for Computing Machinery/IEEE Annual Conference on Human-Robot Interaction, Amsterdam, The Netherlands. F. Yamaoka was with the Intelligent Robotics and Communication Laboratories, Advanced Telecommunications Research Institute International, 619-0288 Kyoto, Japan. He is now with the Mobility Laboratory, Nissan Research Center, Nissan Motor Co. Ltd., Kanagawa 243-0123, Japan (e-mail: f-yamaoka@mail.nissan.co.jp). T. Kanda and N. Hagita are with the Intelligent Robotics and Communication Laboratories, Advanced Telecommunications Research Institute International, 619-0288 Kyoto, Japan (e-mail: kanda@atr.jp; hagita@atr.jp). H. Ishiguro is with the Department of Adaptive Machine Systems, School of Engineering, Osaka University, Osaka 565-0871, Japan. He is also with the Intelligent Robotics and Communication Laboratories, Advanced Telecommunications Research Institute International, 619-0288 Kyoto, Japan (e-mail: ishiguro@ams.eng.osaka-u.ac.jp). Digital Object Identifier 10.1109/TRO.2009.2035747



We believe that communication robots should also establish spatial relationships such as O-space during interaction with people. Several researchers have studied robots that control their position relative to humans. For example, Nakauchi and Simmons developed a robot that stands in line with humans using a model of personal space learned from people standing in a line [4]. Tasaki et al. utilized Hall's proximity theory to determine an optimal combination of sensors and robot behavior based on the current distance from the interacting person [5]. Sisbot et al. proposed a human-aware motion planner for a robot that explicitly takes into account its human partners by reasoning about their accessibility, their field of view, and their preferences in terms of relative human–robot placement and motions [6]. Walters et al. studied the human–robot distance at which people feel comfortable in interaction [7]. Dautenhahn et al. investigated how a robot should best approach a seated person and found that most subjects disliked a frontal approach and preferred to be approached from either side [8]. Gockley et al. investigated the best approaches for a person-following robot and found that direction-following behavior is more natural and human-like than path following [9]. Pacchierotti et al. investigated passing strategies for a robot in a narrow corridor [10] and showed that entering people's sphere of intimacy increases anxiety. Koay et al. investigated participant preferences for a robot's approach distance with respect to its approach direction and appearance. Their results show that participant preferences change over time as participants become accustomed to the robot [11]. All of these studies revealed how a robot should control its position in relation to a person. A few researchers have studied the spatial relationship when humans talk with robots about objects. Kuzuoka et al. studied a guide robot and suggested that an information-presenting robot must direct its body orientation toward both people and exhibits [12]. Huettenrauch et al. found that humans form Kendon's F-formation arrangements when indicating an object to a robot [13]. Although their observations revealed that humans formulate O-space toward a robot, they did not study how a robot should formulate its own spatial relationships.

This paper reports a communication robot that stands at an appropriate position when presenting an object to a person. This ability is important for information-presenting robots to work in daily situations such as a shop or a museum. We established a model for a robot to appropriately control its position when it presents information to a user, based on observations of human–human interaction in providing information about objects. We implemented this model in the humanoid robot Robovie [14] using a motion-capturing system and then verified its effectiveness through an experiment.

II. MODELING OF INTERACTION

A. O-Space Constraints

We believe that a robot that presents objects needs the capability to formulate O-space. We explored the necessary constraints for the robot to make O-space based on observations of human–human interaction. Several possible constraints exist concerning the positions of a listener and an attentional object. For example, a presenter needs to determine the distance between the listener and the object. The presenter should also be aware of both his/her own and the listener's fields of view because he/she needs to look at both the listener and the object while providing information and listening.
We decided to directly establish O-space rather than referring to F-formation, since the scenario where a robot presents an object is rather restricted by the spatial arrangements of objects such as walls, shelves, and desks. To summarize, there are four possible constraints, which are as follows.

Fig. 1. Five analysis items for spatial relationship among a listener, a presenter, and an object.

1) C1: Should the presenter stand close to the listener?
2) C2: Should the presenter stand close to the target object?
3) C3: Should the presenter and the object be simultaneously in the listener's field of view?
4) C4: Should the presenter simultaneously look at both the listener and the object?

In the next section, we describe our process of observing human–human interaction situations and evaluate the validity of these constraints, as well as how we should parameterize and prioritize them to develop a robot that presents information.

B. Assumptions and Simplifications

We need to simplify the actual phenomena in human–human interaction because too many variables complicate the model. Therefore, we focus on finding the essential constraints that are needed for spatial arrangement. First, we limit our observations to the positions and body orientations when a person presents information about an object to another person. After having established the model, we believe that there are still many possible directions of enhancement, such as including three or more persons, two or more target objects, and more obstacles.

C. Observation of Human Behavior

We observed the spatial relationship between a presenter, a listener, and an object.

1) Method of Observation: Nine paid undergraduate students (four men and five women, average age 21 years) participated in this experiment. We created several situations in which participants presented an object to a listener. The experimenter played the role of the listener. Participants were asked to move from a preset start position to a suitable position to perform their presentation. The direction of the listener's body faced the target object in all trials. The details of the situations are described later. We used a motion-capturing system to accurately measure the position data. We analyzed the following five factors for the positional relationship between the presenter and the others (see Fig. 1):
1) distance to a listener: distance between the center of a presenter and the center of a listener;
2) distance to an object: distance between the center of a presenter and the center of an object;
3) listener's field of view: angle between a vector from a listener to an object and a vector from a listener to a presenter;
4) presenter's field of view: angle between a vector from a presenter to an object and a vector from a presenter to a listener;


Fig. 2. Setting 1: A listener stands near an object.

Fig. 3. Setting 2: A listener stands too far from an object.

5) presenter's body orientation: angle between the frontal vector of a presenter (the direction the presenter's body is facing) and a vector from the presenter to the listener.

2) Setting 1 (When a Listener Stands Near an Object): We tested whether a presenter simultaneously satisfies constraints C1 and C2 in the simplest setting, i.e., when a listener stands near the target object. We set three standing positions of a listener toward the target object (see Fig. 2). The three standing positions are right front, front, and left front of the computer. Since the presenter's initial position is 1700 mm from the object, he/she needs to approach the listener from behind. We analyzed a total of nine presenter positions after the presenters approached and stopped at a certain point to perform their tasks. The analysis results indicate that the presenters stood near the listener at an average distance of 1185.44 mm (standard deviation (Std.) 158.24 mm). The presenters stood close to the object as well: the average distance between presenters and objects was 1037.84 mm (Std. 160.83 mm). These results show that the presenters' positions simultaneously satisfied constraints C1 and C2. Based on this finding, we established a simple model for the presenter's position:

$$\mathrm{Value}(P_x) = \mathrm{Listener}(P_x) + \mathrm{Object}(P_x)$$

$$\mathrm{Listener}(P_x) = \begin{cases} 1, & 1100\ \mathrm{mm} < \mathrm{dist}(P_x, \mathrm{Listener}) < 1300\ \mathrm{mm} \\ 0, & \text{otherwise} \end{cases}$$

$$\mathrm{Object}(P_x) = \begin{cases} 1, & 1000\ \mathrm{mm} < \mathrm{dist}(P_x, \mathrm{Object}) < 1200\ \mathrm{mm} \\ 0, & \text{otherwise.} \end{cases} \qquad (1)$$

This formulation shows that the areas around both a listener and an object have high value as optimal standing positions for an information-presenting robot. Here, Px is a possible position for a presenter, which represents any position in the area where the robot can stand. For simplicity, we define the area where a presenter is close enough to a listener or an object as a circular band around the listener or object, and we use only an average distance ±100 mm as a parameter. Listener(Px) is a function that indicates whether presenter position Px is close enough to the listener. Based on the average distance to a listener, we set 1200 mm as the optimal interpersonal distance and model the area within the optimal interpersonal distance ±100 mm as having a value of 1. Object(Px) is a function that indicates whether presenter position Px is close enough to an object. When a listener stands near an object, the presenter maintains a distance to it. Based on this average distance to an object, we set 1100 mm as the optimal distance to an object and then model the area within that optimal distance ±100 mm as having a value of 1. There are some cases when a presenter cannot stand near both a listener and an object because the listener is standing far from the object. Accordingly, we defined Value(Px) not as the product but as the sum of Listener(Px) and Object(Px). The position Px with maximum value should be chosen as the optimal standing position.

3) Setting 2 (When a Listener Stands Too Far From an Object): However, formula (1) was too simple. Multiple different areas with maximum value exist when a listener is far from the object. We do not know whether the presenter should be more concerned about being close to the listener or to the object. Both cases exist. People, i.e., both presenters and listeners, sometimes talk about an object from a distant place. In a store setting, to increase the attention toward an object, a shopkeeper sometimes addresses himself/herself to it, even though the listener is far from the object. Perhaps a presenter's final position changes depending on his/her initial position. To observe this, we set the five initial presenter positions shown in Fig. 3(a)–(e): (a) close to the listener; (b) relatively close to the listener; (c) between the listener and the object; (d) relatively close to the object; and (e) close to the object. The presenter was asked to move to a position to provide information about a computer to a listener. The direction of the listener's body faced the target object in all trials. Observation revealed that the initial presenter positions affected the final positions. When the initial presenter position was close or relatively close to a listener, the presenter provided information about the object close to the listener. When the initial presenter position was close or relatively close to an object, the presenter talked about the object near it, i.e., a presenter followed (1) while minimizing his/her own effort or movement. Thus, we chose the simplest way to adopt this finding. We modified (1) to decrease with the distance between Px and the presenter's current position:

$$\mathrm{Value}(P_x) = \frac{\mathrm{Listener}(P_x) + \mathrm{Object}(P_x)}{\mathrm{Dist}} \qquad (2)$$

$$\mathrm{Dist} = \mathrm{dist}(P_x, P_{\mathrm{current}}).$$

Here, dist(A, B) is a function that calculates the distance between positions A and B, and Pcurrent is the current position of the presenter.
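As an illustration, the simple model in (1) and its movement-penalized form (2) can be written as a small scoring routine. The sketch below is our own reading of these formulas, not code from the original system; the helper names and the guard that avoids dividing by zero when Px coincides with the current position are assumptions.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2-D positions given in millimeters."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def listener_term(px, listener):
    # Eq. (1): 1 if Px lies in the 1100-1300 mm band around the listener, else 0
    return 1.0 if 1100.0 < dist(px, listener) < 1300.0 else 0.0

def object_term(px, obj):
    # Eq. (1): 1 if Px lies in the 1000-1200 mm band around the object, else 0
    return 1.0 if 1000.0 < dist(px, obj) < 1200.0 else 0.0

def value_eq2(px, listener, obj, p_current):
    """Eq. (2): the Setting-1 score divided by the movement distance from the current position."""
    movement = max(dist(px, p_current), 1.0)  # guard against division by zero (our addition)
    return (listener_term(px, listener) + object_term(px, obj)) / movement
```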

4) Setting 3 (When a Listener's or Presenter's Field of View Is Blocked): We observed whether the presenter's position satisfies constraints C3 and C4 by intentionally arranging the initial positions to block either the listener's or the presenter's field of view.


Note that, technically, the listener may not even look at the presenter (e.g., the presenter stands behind the listener), or the presenter may not simultaneously look at both the listener and the object (e.g., standing side-by-side) while discussing the object. Such scenes sometimes happen in reality. Our question is whether a presenter tries to avoid these situations, even when constraints C1 and C2 are satisfied. We first set the presenter's initial position where he/she hides the object from the listener's field of view [see Fig. 4(a)]. Alternately, we also set the initial position of the listener where he/she hides the object from the presenter's field of view [see Fig. 4(b)]. For both situations presented in Fig. 4(a) and (b), the presenter moves to a place that satisfies constraints C3 and C4.

Fig. 4. Setting 3: A listener's or presenter's field of view is blocked. (a) Presenter hides the object from the listener's view. (b) Listener hides the object from the presenter's view.

Fig. 5. Constraints of the listener's and presenter's fields of view. (a) Listener's field of view. (b) Presenter's field of view.

Based on these findings, we modified (2) as follows:

$$\mathrm{Value}(P_x) = \frac{(\mathrm{Listener}(P_x) + \mathrm{Object}(P_x))\, \mathrm{L\_View}(P_x)\, \mathrm{P\_View}(P_x)}{\mathrm{Dist}}$$

$$\mathrm{L\_View}(P_x) = \begin{cases} 1, & \text{listener's field of view} < 90^\circ \\ 0, & \text{otherwise} \end{cases}$$

$$\mathrm{P\_View}(P_x) = \begin{cases} 1, & \text{presenter's field of view} < 150^\circ \\ 0, & \text{otherwise.} \end{cases} \qquad (3)$$

Here, L_View(Px) and P_View(Px) are functions that indicate whether the position of the presenter satisfies constraint C3 or C4, respectively. We defined (3) as the product of (2) with L_View(Px) and P_View(Px) because the presenter must always satisfy constraints C3 and C4. We analyzed the averages of the maximum listener's field of view in all trials from settings 1 to 3. Based on the results (average (Ave.) 88.3°, Std. 11.8°), we set 90° as the limit of the listener's field of view. When the listener's angle exceeds this limit, the listener cannot simultaneously look at both the presenter and the object [see Fig. 5(a)]. Moreover, we analyzed the averages of the maximum presenter's field of view throughout all trials from settings 1 to 3. Based on the results (Ave. 148.8°, Std. 17.6°), we set 150° as the limit of the presenter's field of view. However, over 90°, the presenter cannot look at both the listener and the object while facing either the listener or the object. Throughout all trials in settings 1 to 3, we analyzed the value of the presenter's field of view divided by the presenter's body orientation. The average of this value was 0.57, and the standard deviation was 0.18. This result shows that the presenter half-faces the listener and the object. We defined the presenter's body orientation as follows:

$$\text{Presenter's body orientation} = \text{Presenter's field of view} \times 0.5.$$

Based on this, the presenter can see both the listener and the object. When the presenter's field of view exceeds the limit, the presenter cannot simultaneously look at both the listener and the object [see Fig. 5(b)].

D. Position Model of a Presenter

To summarize the observation analysis, we established the following model for the robot to appropriately control its position:

$$\mathrm{Value}(P_x) = \frac{(\mathrm{Listener}(P_x) + \mathrm{Object}(P_x))\, \mathrm{L\_View}(P_x)\, \mathrm{P\_View}(P_x)}{\mathrm{Dist}}$$

$$\mathrm{Listener}(P_x) = \begin{cases} 1, & 1100\ \mathrm{mm} < \mathrm{dist}(P_x, \mathrm{Listener}) < 1300\ \mathrm{mm} \\ 0, & \text{otherwise} \end{cases}$$

$$\mathrm{Object}(P_x) = \begin{cases} 1, & 1000\ \mathrm{mm} < \mathrm{dist}(P_x, \mathrm{Object}) < 1200\ \mathrm{mm} \\ 0, & \text{otherwise} \end{cases}$$

$$\mathrm{L\_View}(P_x) = \begin{cases} 1, & \text{listener's field of view} < 90^\circ \\ 0, & \text{otherwise} \end{cases}$$

$$\mathrm{P\_View}(P_x) = \begin{cases} 1, & \text{presenter's field of view} < 150^\circ \\ 0, & \text{otherwise.} \end{cases}$$
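Read as pseudocode, the model above amounts to a scoring function over candidate presenter positions. The following Python sketch is one possible rendering under the assumption that positions are 2-D points in millimeters; the angle helper and the division-by-zero guard are our own additions, not part of the original implementation.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def angle_deg(origin, p1, p2):
    """Angle at `origin` (degrees) between the vectors origin->p1 and origin->p2."""
    v1 = (p1[0] - origin[0], p1[1] - origin[1])
    v2 = (p2[0] - origin[0], p2[1] - origin[1])
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def value(px, listener, obj, p_current):
    """Value(Px) of the position model, combining constraints C1-C4."""
    listener_term = 1.0 if 1100.0 < dist(px, listener) < 1300.0 else 0.0   # C1: proximity to listener
    object_term = 1.0 if 1000.0 < dist(px, obj) < 1200.0 else 0.0          # C2: proximity to object
    l_view = 1.0 if angle_deg(listener, obj, px) < 90.0 else 0.0           # C3: listener's field of view
    p_view = 1.0 if angle_deg(px, obj, listener) < 150.0 else 0.0          # C4: presenter's field of view
    movement = max(dist(px, p_current), 1.0)  # guard against Px == current position (our addition)
    return (listener_term + object_term) * l_view * p_view / movement

def body_orientation_deg(px, listener, obj):
    """The paper's rule: body orientation = presenter's field of view x 0.5
    (i.e., the presenter half-faces the listener and the object)."""
    return 0.5 * angle_deg(px, obj, listener)
```

Evaluating this function over a grid of candidate positions and taking the argmax yields the selection rule stated next.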

The position Px with maximum value must be chosen as the optimal standing position. Moreover, when the robot is in the optimal standing position, its direction is set by the following formula:

$$\text{Presenter's body orientation} = \text{Presenter's field of view} \times 0.5.$$

Consequently, when the presenter follows our model, it can control the spatial relationship as shown in Fig. 6.

III. IMPLEMENTATION

We implemented our proposed model on a humanoid robot. In this paper, we are interested in the robot's capability for spatial arrangement. Since robot speech recognition is often unstable, we adopted a Wizard of Oz (WoZ) method and used a motion-capturing system to stably recognize people's positions and body orientations.


Fig. 6. Example of positioning based on the proposed model.

Fig. 7. System configuration.

Fig. 8. Position controller. (a) Example of searching for the robot's optimal position. (b) Example of searching for a path from the robot's current position to the optimal position.

A. System Configuration

Fig. 7 shows the system configuration, which consists of a humanoid robot, a motion-capturing system, and a robot controller (software). We used a humanoid robot named Robovie, which is characterized by its human-like body expressions [14]. Its body has eyes, a head, and arms that generate the complex body movements required for communication. In addition, it has two wheels for movement. Markers are attached to the robot's and the listener's head and arms. The motion-capturing system acquires their body motions and outputs the position data of the markers to the robot controller. Based on the position data and the commands from an operator, the robot controller plans the robot's behavior and makes the robot act accordingly.

1) Recognition Controller: This controller recognizes the listener's behaviors, such as which object the listener has a question about. However, it is difficult for the current robot to reliably hear what the listener is asking. In this paper, we adopted a semiautonomous system to focus on where a robot should stand to optimize its position to present an object. Instead of this controller, the operator informs the robot controller of the target object when the listener asks the robot to present an object. Moreover, the operator gives utterances to the robot controller based on the listener's requests.

2) State Controller: This controller controls the state, which is either Present or Listen, based on the situation. When the operator conveys the target object to the robot controller, the state changes from Listen to Present. When the operator gives a Listen command to the robot controller, the state changes from Present to Listen.

3) Position Controller: Based on the position data and the output of the state controller, this controller decides the position and orientation of the robot. The position controller consists of two modules: adjust standing position and maintain direction. Based on the state, this controller chooses the module: when the state is Present, it chooses the adjust standing-position module; when the state is Listen, it chooses the maintain-direction module.

a) Maintain-direction module: This module makes the robot maintain its direction toward the listener. Based on an order from this module, the robot faces the listener.

b) Adjust standing-position module: This module decides the optimal standing position for the information-presenting robot based on our proposed model. We set the search area for this module as shown in Fig. 8(a). This search area contains the optimal standing position for the presenter, because that position should be within 1.1 m around the listener or 1.3 m around the object. The search area is divided by a grid into possible 10-cm-square standing positions, which correspond to Px in our model. This module estimates the values of all the grid cells, which correspond to Value(Px) in our model. Finally, the module selects the position with the highest value as the optimal one, as shown in Fig. 8(a). This module then decides a path to the calculated target position, treating the positions of the computers and the listener as obstacles. We set the obstacle area as 1.0 m around each obstacle and the search area for this module as shown in Fig. 8(b). This search area, which contains the optimal standing position for the presenter, is divided by a grid into possible 20-cm-square standing positions. To search for a path, this module uses an A* algorithm, which is guaranteed to find the shortest path on the grid. An example of a path found by this algorithm is shown in Fig. 8(b). Finally, this module moves the robot along the calculated path.
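The two steps of the adjust standing-position module, i.e., evaluating Value(Px) on a 10-cm grid and planning a path on a coarser 20-cm grid with A*, could be sketched as follows. This is an illustrative reconstruction rather than the authors' code: the square search extent around the listener, the 4-connected neighborhood, and the expansion budget are assumptions, and `value_fn` stands for the scoring function sketched in Section II-D.

```python
import heapq
import math

def best_standing_position(listener, obj, p_current, value_fn, extent=3000.0, step=100.0):
    """Evaluate value_fn on a 10-cm grid around the listener and return the best cell center (mm)."""
    best_value, best_pos = float("-inf"), None
    steps = int(extent / step)
    for ix in range(-steps, steps + 1):
        for iy in range(-steps, steps + 1):
            px = (listener[0] + ix * step, listener[1] + iy * step)
            v = value_fn(px, listener, obj, p_current)
            if v > best_value:
                best_value, best_pos = v, px
    return best_pos

def plan_path(start, goal, obstacles, step=200.0, inflate=1000.0, max_expansions=20000):
    """A* on a 20-cm grid; cells within `inflate` mm of an obstacle are treated as blocked."""
    def cell(p):
        return (round(p[0] / step), round(p[1] / step))
    def blocked(c):
        x, y = c[0] * step, c[1] * step
        return any(math.hypot(x - ox, y - oy) <= inflate for ox, oy in obstacles)
    def heuristic(c, g):
        return math.hypot(c[0] - g[0], c[1] - g[1]) * step  # admissible Euclidean estimate
    start_c, goal_c = cell(start), cell(goal)
    frontier = [(heuristic(start_c, goal_c), 0.0, start_c, [start_c])]
    visited = set()
    while frontier and max_expansions > 0:
        max_expansions -= 1
        _, cost, current, path = heapq.heappop(frontier)
        if current == goal_c:
            return [(cx * step, cy * step) for cx, cy in path]
        if current in visited:
            continue
        visited.add(current)
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (current[0] + dx, current[1] + dy)
            if nxt in visited or (blocked(nxt) and nxt != goal_c):
                continue
            new_cost = cost + step
            heapq.heappush(frontier, (new_cost + heuristic(nxt, goal_c), new_cost, nxt, path + [nxt]))
    return None  # no collision-free path found within the budget
```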


Fig. 10. Experimental scene. (a) Setting 1: Listener stands near PC. (b) Setting 2: Listener stands far from PC.

Fig. 9. Gesture controller.

4) Gesture Controller: The gesture controller manipulates the robot based on the state, the results of the position controller, and the position data. When the state is Present, this controller makes the robot maintain eye contact with the listener and point at the object after the robot arrives at the target position (see Fig. 9).

5) Utterance Controller: For the experiment, we simplified this function: a human operator chose the sentences to utter from prepared candidates. When the system is made fully autonomous, the utterance controller will need a more complex form to optimize explanations to people. It must integrate the results from the speech recognizer, the previous conversation with the target person, and appropriate strategies for discussing products and exhibits depending on the target person.

IV. EVALUATION

We conducted an experiment to verify that our proposed model, based on observations of human–human interaction, is useful for an information-presenting robot. The experimental protocol was reviewed and approved by our institutional review board.

A. Method

1) Experimental Conditions: To verify the effectiveness of our proposed model, we set three conditions. In one, the information-presenting robot stands based on our proposed model. For comparison, we also prepared two other conditions where the robot only stands near the listener or near the object, since most existing robots seem to care only about the distance to the listener.
1) Near-listener condition: The robot only stands near the listener. The position is obtained from the search standing-position module so as to satisfy only the constraint of distance to a listener and to be closest to the current robot position. After arriving at the position, the robot faces the listener.
2) Near-object condition: The robot only stands near the object. The position is obtained from the search standing-position module so as to satisfy only the constraint of distance to an object and to be closest to the current robot position. After arriving at the position, the robot faces the listener.
3) O-space condition: The robot stands based on our proposed model.
The experiment had a within-subject design, and the order of all experimental trials was counterbalanced. Every participant experienced all three conditions.

2) Procedure: Twenty-two paid undergraduate students (13 men and nine women, average age 21 years) participated in this experiment. None of them had robotics as an academic major.

The experiment was performed in a 7.5 m × 10.0 m room. Due to the limitations of the motion-capturing system, participants interacted with the robot only within a 3 m × 3 m area. Four laptop computers were set in the area, as shown in Fig. 2. In this situation, the listener/customer (participant) enters the shop, and the presenter/shopkeeper robot presents the four laptop computers. To verify the effectiveness of the model in various situations, we prepared two settings. In setting 1, the listener was instructed to ask for an explanation of a nearby computer [see Fig. 10(a)]. In setting 2, the listener was instructed to ask for an explanation of a computer far from where he/she stood [see Fig. 10(b)].

In setting 1, the listener moves in front of the computer about which he/she would like to ask a question. After arriving at the position, the participant asks the robot to explain the product. In setting 2, the participant moves somewhere within the established limited area and asks the robot to present information on a computer far from his/her current standing position. The following procedure is identical in settings 1 and 2. The participant asks the robot, "Please explain this/that computer," and points to it. After receiving the participant's request, the robot begins to move toward the target position determined by each condition, while giving a brief introduction of the computer, for example, "That computer is a new SONY model." After the robot arrives at the position, the participant asks about the characteristics of the computer and points to it. The robot presents its characteristics, for example, "One characteristic of this computer is its high specs. If you want to play games, this computer is suitable for you." After the explanation, the robot asks, "Would you like to hear about some other computers?" The participant moves to the next position and asks the robot about another computer. In this way, the participant requests and receives information on four different computers. After participants experience the interaction with the robot in settings 1 and 2, they evaluate the robot once under each condition.

3) Evaluation Method: We administered a questionnaire to obtain participant impressions. Participants answered the following questions on a 1–7 scale, where 7 is the highest.
1) Comfortable for speaking: Were you comfortable with the robot's standing position when you were speaking?
2) Comfortable for listening: Were you comfortable with the robot's standing position when you were listening?
3) Likable: Did you like the robot?
The "comfortable for speaking" and "comfortable for listening" questions measured how the participants felt while speaking or listening to the robot during the interaction. The "likable" question measured how much they liked the robot during the interaction. After all the experiments, we also asked the participants about the best condition.
4) Which condition do you think was the best?


Fig. 11. Results (* denotes a significant difference at the p < 0.05 level, and + denotes a marginally significant difference at the p < 0.1 level; error bars show the standard error of the mean). (a) Comfortable for speaking. (b) Comfortable for listening. (c) Likable.

TABLE I WHICH CONDITION IS THE BEST?

B. Hypothesis and Prediction

Since the robot in the near-listener or near-object condition is concerned with only one of the O-space constraints, we hypothesized that a listener would evaluate the robot based on our proposed model more highly than the robots in the other two conditions. Based on this hypothesis, we predict the following experimental results.

Prediction: Participants will feel that the O-space condition is the most comfortable for speaking and listening and the most likable among all three conditions.

C. Verification of Our Prediction

Fig. 11 shows the results of the subjective impressions.
1) Comfortable for Speaking: We conducted a repeated-measures analysis of variance (ANOVA) that showed a significant difference between conditions (F(2, 42) = 11, p < 0.01). Multiple comparisons with the Bonferroni method showed that O-space was preferred over near-object (p < 0.05) and near-listener (p < 0.05).
2) Comfortable for Listening: We conducted a repeated-measures ANOVA that revealed a significant difference between conditions (F(2, 42) = 14.90, p < 0.01). Multiple comparisons with the Bonferroni method showed that O-space was preferred over near-object (p < 0.05) and near-listener (p < 0.05). The results also showed a trend where near-object was preferred over near-listener (p < 0.1).
3) Likable: We conducted a repeated-measures ANOVA that showed a significant difference between conditions (F(2, 42) = 4.64, p < 0.05). Multiple comparisons with the Bonferroni method showed that O-space was preferred over near-object (p < 0.05). The results also showed a trend where O-space was preferred over near-listener (p < 0.1).
Table I shows the results for the condition chosen as the best.
4) Which Condition Was the Best?: We conducted a chi-square test (χ²(2) = 14, p < 0.01) and multiple comparisons with the Ryan method. The results showed that O-space was preferred over near-object (p < 0.05) and near-listener (p < 0.05).
Participants gave the highest evaluation to the O-space condition for comfortable for speaking and comfortable for listening. Moreover, they gave higher evaluations to the O-space condition than to the near-object condition for likable.

Because there was a marginally significant difference between the O-space and near-listener conditions for likable, we found that participants marginally liked the O-space condition better than the near-listener condition. We believe that they preferred the O-space condition to the near-listener condition because most chose the O-space condition as the best condition. Based on these results, we verified our prediction that participants would feel that the O-space condition is the most comfortable when speaking and listening and the most likable among the three conditions.
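For readers who wish to run this kind of analysis on their own data, the pairwise comparisons and the chi-square test could proceed along the following lines. This is only an illustration with a hypothetical data layout; it is not the authors' analysis script, and the repeated-measures ANOVA step itself is omitted here.

```python
import numpy as np
from scipy import stats

CONDITIONS = ["near-listener", "near-object", "O-space"]

def bonferroni_pairwise(ratings):
    """Paired t-tests between all condition pairs with a Bonferroni-corrected alpha.
    `ratings` is an (n_participants x 3) array of 1-7 questionnaire scores."""
    pairs = [(i, j) for i in range(len(CONDITIONS)) for j in range(i + 1, len(CONDITIONS))]
    alpha = 0.05 / len(pairs)
    for i, j in pairs:
        t, p = stats.ttest_rel(ratings[:, i], ratings[:, j])
        print(f"{CONDITIONS[i]} vs {CONDITIONS[j]}: t = {t:.2f}, p = {p:.3f}, "
              f"significant at corrected alpha: {p < alpha}")

def best_condition_test(counts):
    """Chi-square goodness-of-fit: were the three conditions chosen as 'best' equally often?"""
    chi2, p = stats.chisquare(np.asarray(counts))
    return chi2, p
```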

V. DISCUSSION

A. Contributions

This paper reports the development of a robot that presents objects to people and appropriately controls its position with respect to the partner and the object. This capability is expected to be indispensable, since in the future, many robots will be functioning in daily situations as shopkeepers presenting products to customers or museum guides presenting information to visitors. Since several robots have already worked in real-world sites such as a museum [15], a station [16], and a shopping mall [17], our proposed model could be useful for such information-presenting robots. Psychology usually focuses on phenomena and the factors behind them; consequently, how humans assume their positions during conversations involving objects has not been studied. In this study, we established a positioning model for when a presenter explains an object to a listener. We believe that the model's findings illuminate the human-positioning mechanism.

B. O-Space Constraints for a Presenter

Our proposed model consists of four constraints to establish O-space: proximity to a listener, proximity to an object, the listener's field of view, and the presenter's field of view. We consider the reason for each constraint as follows. Concerning proximity to a listener, humans usually keep a certain distance from a partner, as Hall has already shown [2], and a presenter tries to approach a listener to make it easier for the listener to hear his/her utterances. Concerning proximity to an object, approaching an object makes it easier for a presenter to explain while using pointing gestures. Concerning the listener's field of view, looking at both the presenter and the object makes it easier for the listener to understand the presenter's explanation. Concerning the presenter's field of view, looking at both the listener and the object makes it easier for the presenter to explain the object to the listener.


Based on our observations of human–human interaction situations, we found that a presenter must ensure both the listener's and the presenter's fields of view. This finding shows that it is important for a presenter and a listener to simultaneously look at both the partner and the object during conversation. The experimental results also show the importance of both the listener's and the presenter's fields of view. The robot in the near-listener and near-object conditions sometimes prevented a listener from looking at an object because it sometimes stood between the listener and the objects. In contrast, the robot based on our proposed model did not do this because it stood so as to ensure both the listener's and the presenter's fields of view. We suppose that these differences caused the differences in participant evaluations among the three conditions.

C. Parameters of the O-Space Constraints

Because we simplified the position model for our experiment, one piece of future work is to improve it for more realistic daily settings. First, our current model should be extended to use continuous values; currently, it is limited to discrete values, which may cause some problems. For example, if a robot is just slightly farther away than the maximum on each dimension concerning distance to listeners and objects, the value of its position is suddenly lost. Thus, when the robot cannot stand within range on a dimension due to some obstacle, it cannot decide its optimal position. To solve this problem, we must extend the model to continuous values in future work, which will require further modeling of how the different constraints are combined. Second, although we focused on the static relationship between the positions of a presenter, a listener, and an object, we must be more concerned with developing a dynamic model in future work. The constraint of distance to the listener must also be improved because the interpersonal distance in this constraint is set as a constant value. Many previous studies have revealed that a robot must adjust its distance to humans based on personality or the situation. The interpersonal distance should be adjusted based on the partner's situation, etc. In addition, these distances, orientations, and body positions might be affected by the nature of the object being discussed, the importance of its identifying features, and many other task-specific factors. For example, if the presenter wants to emphasize a particular feature of an object, he/she might need to approach or face the object.

D. Effectiveness of Our Proposed Model

The experimental results show that our proposed model improves the participants' impression of the robot. We suppose that we could show a more powerful effect of our model with behavioral or task-oriented measures. For example, we might show that a robot using our model helps people remember information longer or increases the desire to buy an object more than in the other conditions. There are good human information presenters who make customers like them and convince the customers to buy products. Of course, a good presenter should have exceptional speaking skills. However, we believe that good positioning skill is also one of the important skills of a good information presenter.

E. Other Modalities in Presenting Information

Other modalities for explaining scenes have already been studied, such as pointing control [18] and gaze control [19]. We believe that positioning control strongly relates to such modalities in information-presenting situations.
Pointing control is important for a robot that presents information to a listener. The robot's pointing gesture makes it easier for a listener to understand which object the robot is talking about. Since the robot based on our model stands at a position from which it can look at both the person and the target object, it can point at the object while making eye contact with the partner without readjusting its body orientation. Gaze control is also important for an information-presenting robot. In this experiment, the robot always maintained eye contact with the listener. However, depending on the situation, humans look at target objects as well as at the listener. Kuno et al. suggested that robot head movement encourages interaction with museum visitors [20]. Thus, our future work will also include controlling the robot's eye gaze, which is enabled by our model because the robot stands at a position from which it can look at both the person and the target object.

F. Limitations

Since we only tested this system with one humanoid robot, i.e., Robovie, the findings may not apply to other robots. We do believe, however, that similar behavior will result even with other robots because the experimental method is mostly independent of Robovie's appearance, except that it has an anthropomorphic head and arms. Robovie has a relatively simple appearance compared with other humanoid robots, such as Asimo [21]. In nonverbal interaction, Kanda et al. demonstrated that people's responses were similar for different humanoid robots and humans [22]. Thus, we believe that people will behave in a similar way even if we use a humanoid robot with a different appearance, which will result in similar impression trends. In contrast, we should consider developing robots whose size differs from Robovie's. Even though Robovie, being much shorter than most humans, could still hide the object from participants, some participants did not care. Perhaps people are not concerned if a short robot stands in front of the object.

VI. CONCLUSION

We established a model for information-presenting robots to appropriately adjust their position. The model consists of four constraints to establish O-space: proximity to a listener, proximity to an object, the listener's field of view, and the presenter's field of view. Through observation of human–human interaction, we found that ensuring both the listener's and the presenter's fields of view is especially important. We implemented our model on a humanoid robot with a motion-capturing system. The experimental results verified the model's effectiveness and showed that an information-presenting robot using our model presents an object better than with simpler models. We believe that this model could be useful for information-presenting robots in our future daily life.

ACKNOWLEDGMENT

The authors would like to thank B. Mutlu of the Human–Computer Interaction Institute, Carnegie Mellon University, for his advice. This paper reports additional explanations about the experiments and discussions.

REFERENCES
[1] F. Yamaoka, T. Kanda, H. Ishiguro, and N. Hagita, "How close? A model of proximity control for information-presenting robots," in Proc. ACM/IEEE Annu. Conf. Human-Robot Interact., 2008, pp. 137–144.
[2] E. T. Hall, The Hidden Dimension: Man's Use of Space in Public and Private. London, U.K.: Bodley Head, 1966.
[3] A. Kendon, Conducting Interaction—Patterns of Behavior in Focused Encounters. Cambridge, U.K.: Cambridge Univ. Press, 1990.
[4] Y. Nakauchi and R. Simmons, "A social robot that stands in line," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2000, pp. 357–364.


[5] T. Tasaki, K. Komatani, T. Ogata, and H. Okuno, "Spatially mapping of friendliness for human–robot interaction," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2005, pp. 525–26.
[6] E. A. Sisbot et al., "A human aware mobile robot motion planner," IEEE Trans. Robot., vol. 23, no. 5, pp. 874–883, Oct. 2007.
[7] M. L. Walters, K. Dautenhahn, K. L. Koay, C. Kaouri, R. te Boekhorst, C. L. Nehaniv, I. Werry, and D. Lee, "Close encounters: Spatial distances between people and a robot of mechanistic appearance," in Proc. IEEE-RAS Int. Conf. Humanoid Robots, 2005, pp. 450–455.
[8] K. Dautenhahn et al., "How may I serve you?: A robot companion approaching a seated person in a helping context," in Proc. ACM SIGCHI/SIGART Conf. Human-Robot Interact., 2006, pp. 172–179.
[9] R. Gockley, J. Forlizzi, and R. G. Simmons, "Natural person-following behavior for social robots," in Proc. ACM/IEEE Int. Conf. Human-Robot Interact., 2007, pp. 17–24.
[10] E. Pacchierotti, H. I. Christensen, and P. Jensfelt, "Evaluation of passing distance for social robots," in Proc. IEEE Int. Workshop Robot Human Interact. Commun., 2006, pp. 315–320.
[11] K. L. Koay, D. S. Syrdal, M. L. Walters, and K. Dautenhahn, "Living with robots: Investigating the habituation effect in participants' preferences during a longitudinal human–robot interaction study," in Proc. IEEE Int. Conf. Robot Human Interact. Commun., 2007, pp. 564–569.
[12] H. Kuzuoka, K. Yamazaki, A. Yamazaki, J. Kosaka, Y. Suga, and C. Heath, "Dual ecologies of robot as communication media: Thoughts on coordinating orientations and projectability," in Proc. SIGCHI Conf. Human Factors Comput. Syst., 2004, pp. 183–190.
[13] H. Huettenrauch, K. S. Eklundh, A. Green, and E. A. Topp, "Investigating spatial relationships in human–robot interaction," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2006, pp. 5052–5059.
[14] T. Kanda, H. Ishiguro, M. Imai, and T. Ono, "Development and evaluation of interactive humanoid robots," Proc. IEEE, vol. 92, no. 11, pp. 1839–1850, Nov. 2004.
[15] R. Siegwart et al., "Robox at Expo.02: A large scale installation of personal robots," Robot. Auton. Syst., vol. 42, pp. 203–222, 2003.
[16] K. Hayashi, D. Sakamoto, T. Kanda, M. Shiomi, S. Koizumi, H. Ishiguro, T. Ogasawara, and N. Hagita, "Humanoid robots as a passive-social medium—A field experiment at a train station," in Proc. ACM Annu. Conf. Human-Robot Interact., 2007, pp. 137–144.
[17] K. Nohara, T. Tajika, M. Shiomi, T. Kanda, H. Ishiguro, and N. Hagita, "Integrating passive RFID tag and person tracking for social interaction in daily life," in Proc. Int. Symp. Robot Human Interact. Commun., 2008, pp. 545–552.
[18] O. Sugiyama, T. Kanda, M. Imai, H. Ishiguro, and N. Hagita, "Humanlike conversation with gestures and verbal cues based on a three-layer attention-drawing model," Connect. Sci., vol. 18, no. 4, pp. 379–402, 2006.
[19] B. Mutlu, J. K. Hodgins, and J. Forlizzi, "A storytelling robot: Modeling and evaluation of human-like gaze behavior," in Proc. IEEE-RAS Int. Conf. Humanoid Robots, 2006, pp. 518–523.
[20] Y. Kuno, K. Sadazuka, M. Kawashima, K. Yamazaki, A. Yamazaki, and H. Kuzuoka, "Museum guide robot based on sociological interaction analysis," in Proc. SIGCHI Conf. Human Factors Comput. Syst., 2007, pp. 1191–1194.
[21] K. Hirai, M. Hirose, Y. Haikawa, and T. Takenaka, "The development of the Honda humanoid robot," in Proc. IEEE Int. Conf. Robot. Autom., 1998, pp. 1321–1326.
[22] T. Kanda, T. Miyashita, T. Osada, Y. Haikawa, and H. Ishiguro, "Analysis of humanoid appearances in human–robot interaction," in Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., 2005, pp. 62–69.

Path Planning for Improved Visibility Using a Probabilistic Road Map


Matthew Baumann, Simon Léonard, Elizabeth A. Croft, and James J. Little
Abstract—This paper focuses on the challenges of vision-based motion planning for industrial manipulators. Our approach is aimed at planning paths that are within the sensing and actuation limits of industrial hardware and software. Building on recent advances in path planning, our planner augments probabilistic road maps with vision-based constraints. The resulting planner finds collision-free paths that simultaneously avoid occlusions of an image target and keep the target within the field of view of the camera. The planner can be applied to eye-in-hand visual-target-tracking tasks for manipulators that use point-to-point commands with interpolated joint motion.

Index Terms—Computer vision, path planning for industrial manipulators, sensor positioning, visual servoing.

I. INTRODUCTION

Integrating vision guidance into industrial-robot systems that are currently deployed on assembly lines and in factories poses a significant challenge, especially where the nature of the task is not entirely structured (e.g., random bin picking and general assembly, as opposed to predefined pick-and-place and line-following operations). Existing methods, such as visual servoing, aim to replace the position controllers that are common to most commercial systems with controllers that close the loop with visual feedback. Typical industrial robots, however, are accessed through proprietary interfaces that only accept point-to-point commands. Upgrading these interfaces to accept the velocity commands used in visual servoing requires substantial investments in programming the communication interface and the accompanying safety-monitoring systems. Thus, despite significant interest in deploying vision-guided manipulators on existing assembly lines, robotics companies are hesitant to replace current controllers with visual-servoing systems. Given the many existing robotics systems that are already deployed, manufacturers are interested in approaches that allow the current systems to be upgraded with a vision-guidance module. In this study, we present a modular approach that addresses the problem of integrating vision guidance into semi-structured industrial tasks, where several practical challenges, such as visual occlusions and whole-arm collision avoidance, are, for the most part, ignored by current visual-servoing methods.

This paper introduces a vision-based probabilistic road map (VBPRM) that harmonizes the constraints that are inherent to vision-guided robotics with the reality of the manipulators found on assembly lines. The VBPRM finds paths that avoid collisions in the workspace and maintain the visibility of a target during the motion. To achieve this, a VBPRM uses the hard constraints, which are imposed by a
Manuscript received July 9, 2009; revised October 22, 2009. Current version published February 9, 2010. This paper was recommended for publication by Associate Editor F. Lamiraux and Editor L. Parker upon evaluation of the reviewers' comments. This work was supported by Precarn, Inc., by the Natural Sciences and Engineering Research Council of Canada, and by Braintech, Inc. M. Baumann and J. J. Little are with the Department of Computer Science, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada (e-mail: mabauman@cs.ubc.ca; little@cs.ubc.ca). S. Léonard and E. A. Croft are with the Department of Mechanical Engineering, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada (e-mail: sleonard@mech.ubc.ca; ecroft@mech.ubc.ca). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TRO.2009.2035745

