
Proceedings of the 1993 IEEE/RSJ International Conference on Intelligent Robots and Systems, Yokohama, Japan, July 26-30, 1993

Robot-Sensor System Integration by Means of 2-D Vision and Ultrasonic Sensing for Localization and Recognition of Objects
Anders Nilsson and Per Holmberg
Department of Physics and Measurement Technology, Linköping University, 581 83 Linköping, Sweden

Abstract - This paper presents a robot control model consisting of a stable 2-D vision camera combined with an ultrasonic sensor mounted in the robot gripper. The calibration procedure for the robot-sensor system is presented. The proposed sensor configuration is also used for object identification. The object models used for identification as well as some practical experiments are presented. The proposed sensor configuration is an alternative to the more complex and costly 3-D vision systems for industrial robots.

1. Introduction

Camera systems have been used (more or less successfully) for some time in the manufacturing industry, performing in harsh conditions. Such systems offer great flexibility for a manufacturing cell, particularly for tasks such as automated assembly, i.e. a robot-camera system allows a work scene to be viewed from many different positions, enabling a vision system to triangulate points in three dimensions (3-D) from multiple images [1]. The information derived can be used for object recognition, part dimension inspection, determination of position relations among objects and general workpiece manipulation. Robot position control can be made if the sensors observe the location of calibration marks set on the robot arm. A system relying on one sensor only, however, isn't always enough [2], e.g. some vital part can't be seen by the camera. Our approach to this problem is to use several sensors and, by fusing the information from these sensors, obtain a more reliable result for object recognition and general workpiece manipulation. In our case we use a stable 2-D vision camera mounted above the work table and an ultrasonic range sensor mounted in a three-fingered robot gripper, in conjunction with tactile matrices mounted on the gripper. This sensor configuration is an alternative to more complex and costly 3-D vision systems for industrial robots. The requirements for a fully flexible sensor-robot system are an accurate knowledge of the sensors' positions and models for the sensors' mathematical transformations. To relate external sensor information and the robot to each other, high accuracy is needed. In this application the vision system gives the direction (direction vector from the camera) to found objects or features. Since the vision system is only 2-D, the distance from the camera to the object or feature is needed in order to estimate the correct object parameters. This distance is given by the ultrasonic range sensor, mounted in the robot gripper in such a way that it measures the distance parallel or approximately parallel to the optical axis of the camera system. Since a grasp is performed by opening the gripper wider, with respect to the object, and then closing the grip, we can allow an estimation error of about ± 1-2 mm. The paper is organized as follows. Section 2 presents the camera model and the model for the ultrasonic sensor, and briefly describes the calibration procedure. In section 3, the modelling of the objects and the identification algorithms are presented. Section 4 gives some experimental results with these models applied to a real problem.

2. Robot-sensor system integration

The sensors used in this application are a 2-D vision camera and an ultrasonic range sensor, which are described in somewhat more detail below. Outlines of the calibration procedures are given for each sensor and for the robot-sensor system.

2.1. Camera model

The camera model used is the so-called pin-hole camera [3,4,5,6], illustrated in figure 1. The mapping of an object point [Xi, Yi, Zi], with homogeneous coordinates [wiXi, wiYi, wiZi, wi], to the corresponding image point [ui, vi], with homogeneous coordinates [xi, yi, wi], can in matrix form be described as [4]

[xi  yi  wi]^T = C [wiXi  wiYi  wiZi  wi]^T    (1)

ui = xi / wi,    vi = yi / wi    (2)

where C is a 3 by 4 matrix representing the camera. Calibrating the camera means calculating (approximating) the C matrix, which is done in the following way. If C is represented

as

        | c00  c01  c02  c03 |
    C = | c10  c11  c12  c13 |
        | c20  c21  c22  c23 |

then

ui = (c00 Xi + c01 Yi + c02 Zi + c03) / (c20 Xi + c21 Yi + c22 Zi + c23)    (3)

vi = (c10 Xi + c11 Yi + c12 Zi + c13) / (c20 Xi + c21 Yi + c22 Zi + c23)    (4)

Equations (1) and (2) represent a homogeneous linear system in the variables cij, each object point [Xi, Yi, Zi] and matching image point [ui, vi] providing two linear equations of the system. To solve for the values of cij the system must be made nonhomogeneous. This is achieved by setting a nonzero variable within the matrix C equal to one. Since the term c23 has a scaling effect for 3-D homogeneous coordinate points and the calibration is done in a 2-D plane, this is a suitable choice. Thus

c00 Xi + c01 Yi + c02 Zi + c03 - c20 Xi ui - c21 Yi ui - c22 Zi ui = ui    (5)

c10 Xi + c11 Yi + c12 Zi + c13 - c20 Xi vi - c21 Yi vi - c22 Zi vi = vi    (6)

Equations (5) and (6) form the basis for the estimation of the C matrix. An estimation of the C matrix can be done using many measured points and using the least squares method to find a best-fit solution [7]. The simple camera model derived here linearly maps homogeneous world coordinates to homogeneous image coordinates. If the transformation cannot be accurately described by a linear transform, higher order polynomial functions can be used to adjust the observed image coordinates [u, v] back to the original image coordinates [x, y]. This will not be described here, but the process is similar to the one described above, see [10]. In this application the camera gives the x and y coordinates, the direction vector to search in, while the height (z coordinate) is retrieved from the ultrasonic sensor. This means that calibration of the camera in a plane approximately perpendicular to the optical axis is sufficient.
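For illustration, the least squares estimation of C from measured point pairs can be sketched as follows. This is a minimal NumPy sketch of equations (5)-(6), not the authors' implementation; the function and variable names are ours.

```python
import numpy as np

def estimate_camera_matrix(world_pts, image_pts):
    """world_pts: (N, 3) array of [X, Y, Z]; image_pts: (N, 2) array of [u, v]."""
    A, b = [], []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        # Row from equation (5); unknowns c00..c03, c10..c13, c20..c22 (c23 = 1)
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -X * u, -Y * u, -Z * u])
        b.append(u)
        # Row from equation (6)
        A.append([0, 0, 0, 0, X, Y, Z, 1, -X * v, -Y * v, -Z * v])
        b.append(v)
    c, *_ = np.linalg.lstsq(np.array(A, float), np.array(b, float), rcond=None)
    return np.append(c, 1.0).reshape(3, 4)  # reinsert c23 = 1
```

At least six point correspondences are needed for the eleven unknowns; in practice many more are used, distributed over the work space.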



2.2. Ultrasonic range sensor

The image depth is determined by use of an ultrasonic range sensor [8]. The ultrasonic sensor is mounted in the robot gripper in such a way that it measures the distance parallel or approximately parallel to the optical axis of the camera system, given that the gripper is in a predetermined orientation. Figure 2 shows the schematic placement of the ultrasonic sensors in the robot gripper. With notation according to figure 2 the distance zu is equal to

zu(k) = sqrt( (c Δt(k) / 2)^2 - d^2 )    (7)

where

zu(k) is the distance between the plane of the sensor elements and that of the work table at time k,
c is the speed of sound in air, 343 m/s at 20 °C,
Δt(k) is the measured time delay between emission and reception of the signal at time k,
d is the half distance between the transducers.
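As a small illustration of equation (7), a range reading follows directly from the measured time of flight under this geometry (separate transmitter and receiver a distance 2d apart). The sketch below uses our own names and is not the sensor driver used in the experiments.

```python
import math

def ultrasonic_range(delta_t, d, c=343.0):
    """Range zu [m] from time delay delta_t [s], half transducer spacing d [m]
    and speed of sound c [m/s], as in equation (7)."""
    half_path = c * delta_t / 2.0               # half of the total flight path
    return math.sqrt(half_path ** 2 - d ** 2)   # correct for the transducer offset
```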

A calibration procedure is used to estimate the distance function zu. The estimate is

ẑu(k) = a0 + a1 zu(k)    (8)

where a0 and a1 are unknown constants. The least mean squares (LMS) estimate of the constants is obtained by minimizing the sum of squares S [7],

S = Σk ( zref(k) - ẑu(k) )^2    (9)

where zref(k) is the reference distance of the calibration point. The standard deviation of the range estimation is better than 0.3 mm within the calibration interval, 10 - 650 mm. For more details see [8].
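The LMS fit of (8)-(9) reduces to an ordinary linear least-squares problem; the following is a hedged sketch with our own naming, not the authors' code.

```python
import numpy as np

def fit_range_calibration(z_raw, z_ref):
    """Fit z_ref ≈ a0 + a1 * z_raw in the least-squares sense, cf. (8)-(9)."""
    z_raw = np.asarray(z_raw, float)
    A = np.column_stack([np.ones_like(z_raw), z_raw])
    (a0, a1), *_ = np.linalg.lstsq(A, np.asarray(z_ref, float), rcond=None)
    return a0, a1
```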
2.3. Robot-sensor system integration

An example of a model for robot control, using a stable camera and an ultrasonic range sensor mounted in the robot gripper, is shown in figure 3, where W is the world coordinate system, B is the base coordinate system (the table upon which the robot is to work), C is the stable camera coordinate system and I is the corresponding image plane coinciding with the base coordinate system B. G and U are the robot gripper and the ultrasonic range sensor's coordinate systems, respectively. The position and orientation of the robot gripper are given by the robot controller, in world coordinates. A is an object arbitrarily located at an unknown position in the world coordinate system. To provide depth information the ultrasonic range sensor is used. It is mounted in the gripper in such a way that it measures the range parallel or approximately parallel to the optical axis of the 2-D vision camera. Consequently, the following relation between a camera coordinate (xc, yc, zc) and the corresponding image point (xi, yi) in the image plane is given, using projection corrections due to the depth information from the ultrasonic sensor, see figure 4. The equivalent Euclidean triangles in figure 4 yield

xc / xi = (H - zu) / H

and the equivalent is true for the y coordinate. The coordinates can consequently be written as

xc = xi (1 - zu / H)    (11)
yc = yi (1 - zu / H)    (12)
zc = zu                 (13)

where xc, yc, zc are the coordinates of A in the camera coordinate system, xi, yi is the corresponding image point in the image plane, and zu - H is the image depth, zu being measured by the ultrasonic sensor and H being the height of the camera above the work table. If object A's position can be determined in the camera coordinate system by the sensors, the 2-D vision camera and the ultrasonic range sensor, and transformed into world coordinates (xw, yw, zw), the robot gripper can be moved in such a way that it is possible to grasp object A. The problem is consequently to determine the transformation between the camera coordinates (xc, yc, zc) and the world coordinates (xw, yw, zw). A calibration procedure solves this problem by setting a calibration mark on the robot gripper and observing its position in the camera coordinate system by the sensors (camera and ultrasonic range sensor) at known world coordinates given by the robot controller. Given these measured calibration points it is possible to estimate the rotation matrix R (3x3) and the translation vector T (3x1). A typical description is given by the following matrix equation [9],

[xw  yw  zw]^T = R [xc  yc  zc]^T + T    (14)

where rmn are the components of the rotation matrix R describing how the world coordinate system is oriented relative to the camera coordinate system, and (Tx, Ty, Tz) are the components of the translation vector T describing how the world coordinate system is translated relative to the camera coordinate system. With an Euler angles representation the rotation matrix is given by the matrix representation in [9]. The relation between the image point (xi, yi), combined with the depth information from the ultrasonic range sensor, and the world coordinate (xw, yw, zw) can now be described by substituting (11)-(13) into (14),

[xw  yw  zw]^T = R [xi (1 - zu/H)  yi (1 - zu/H)  zu]^T + T

or in a more compact form

Xw = [R T] Xc

where Xc = [xc  yc  zc  1]^T.
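A short sketch of this projection correction and transformation, equations (11)-(14), is given below; it is illustrative only, with our own function and parameter names.

```python
import numpy as np

def image_to_world(xi, yi, zu, H, R, T):
    """Camera coordinates from (11)-(13), then world coordinates via (14).
    R is the 3x3 rotation, T the length-3 translation, H the camera height."""
    scale = 1.0 - zu / H                    # projection correction, figure 4
    Xc = np.array([xi * scale, yi * scale, zu])
    return R @ Xc + T                       # Xw = R * Xc + T
```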

Given (xw, yw, zw), (xi, yi), zu, and the camera's position above the work table H, there are 3 orientation angles (α, β, γ) and 3 translation components (Tx, Ty, Tz) which have to be estimated; altogether 6 parameters. This is a nonlinear estimation problem due to the sine and cosine components of the rotation matrix R. To simplify the estimation process we instead estimate the individual components rmn of the rotation matrix R. Now it is a linear estimation problem, but at the cost of more unknown parameters to estimate, in total 12. The rotation matrix R and the translation vector T are estimated by the well-known least mean squares method [7], which means that the sum of squares S is minimized.


S = Σi || Xw(i) - [R T] Xc(i) ||^2

The solution to the least mean squares problem is given below,

[R̂ T̂] = Xw Xc^T [Xc Xc^T]^(-1)    (19)

where Xw is the observed world coordinates given by the robot controller and Xc is the measured camera coordinates, see (11)-(13). Xc^T [Xc Xc^T]^(-1) is called the pseudoinverse of Xc if the inverse matrix exists. Only four different calibration points are necessary to find an estimate of the matrix [R T] (3x4). It is a requirement that the pseudoinverse is defined over the entire work space. To this end, use more calibration points and choose them to be distributed uniformly over the whole work space. Up to now the models are based on theoretical assumptions. In the following the correctness of our models will be verified by practical experiments, see also ref. [10]. The rotation matrix R and translation vector T are used to get the object's position on the work table, which enables us to guide the robot's tool centre point (TCP) to the object's centre. In order to rotate the gripper into position for grasping or identification, the rotation of the TCP must be calculated.
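Returning to equation (19), a compact numerical sketch of the estimation is shown below; this is our own illustrative code, not the system software, and it assumes the camera points are stacked in homogeneous form.

```python
import numpy as np

def estimate_RT(Xc, Xw):
    """Xc: (4, N) camera points [xc, yc, zc, 1]; Xw: (3, N) world points
    from the robot controller.  Returns the 3x4 matrix [R T] of (19)."""
    return Xw @ Xc.T @ np.linalg.inv(Xc @ Xc.T)  # Xw times the pseudoinverse of Xc
```

In practice np.linalg.lstsq (or pinv) is numerically preferable, and the calibration points should span the work space, as noted above.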
2.4. Gripper orientation

In order to grasp, or identify (with ultrasonics), objects at the right position and angle, the correct O, A and T angles of the robot tool coordinate system [11] must be calculated, where the O (orientation) angle is the angle formed between the world Y axis and a projection of the tool Z axis on the world XY-plane, the A (altitude) angle is the angle formed between the tool Z axis and a plane parallel to the world XY-plane, and the T (tool) angle is the angle formed between the tool Y axis and a plane parallel to the world XY-plane. Given the rotation matrix and translation vector from the calibration procedure, the O, A and T angles can be calculated for a grasp and/or ultrasonic identification according to the following [9]:

Rstart (Rstop)^(-1) = R Robject    (20)

where Rstart is the rotation of the gripper at the starting point, given by the O, A and T angles from the robot controller, Rstop is the relative rotation change of the gripper needed to obtain the desired rotation for a grasp and/or ultrasonic identification, R is the rotation between camera and robot, and Robject is the rotation of the object given by the camera system. R Robject is the desired rotation expressed in world coordinates and Rstop is the rotation change relative to Rstart needed to obtain the desired rotation:

Rstop = (R Robject)^(-1) Rstart    (21)
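Under the reading of (20)-(21) above, the required relative rotation can be computed directly; the snippet below is a hedged sketch with our own names, not the controller code.

```python
import numpy as np

def gripper_correction(R_start, R_cal, R_object):
    """Rstop = (R * Robject)^-1 * Rstart, equation (21).  R_cal is the
    camera-to-robot rotation from the calibration, R_object the object
    rotation seen by the camera, R_start the current gripper rotation."""
    return np.linalg.inv(R_cal @ R_object) @ R_start
```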
Rstart is expressed in the Ostart, Astart and Tstart angles given by the robot controller,

             | r11  r12  r13 |
    Rstart = | r21  r22  r23 |    (22)
             | r31  r32  r33 |

and using the algebraic constraints that the columns of a rotation matrix are orthonormal,

r11^2 + r21^2 + r31^2 = 1
r12^2 + r22^2 + r32^2 = 1    (23)-(25)
r13^2 + r23^2 + r33^2 = 1

(r11, r21, r31) · (r12, r22, r32) = 0
(r11, r21, r31) · (r13, r23, r33) = 0    (26)-(28)
(r12, r22, r32) · (r13, r23, r33) = 0

where · denotes the scalar product, we derive the components of Rstart as

r32 = sin(T),  r33 = sin(A),  r13 = cos(A) sin(O),  r23 = cos(A) cos(O)    (29)-(32)

The components r22 and r12 are then determined so that the constraints (24) and (28) are satisfied, which gives

r12 = -(r22 r23 + r32 r33) / r13    (34)

and the remaining column follows from the cross product

(r11, r21, r31)^T = (r12, r22, r32)^T x (r13, r23, r33)^T    (35)

where x denotes the cross product. The components of Rstart are estimated by (29)-(35) given Ostart, Astart and Tstart from the robot system; to obtain the Rstop matrix, use (21), and the Ostop, Astop and Tstop angles are then estimated from the inverses of the relations (29)-(32). The calibration matrix R and the rotation of the object Robject are given by the calibration procedure and the camera system, respectively.
3. Models for Object Identification

The essential point in this section is how to integrate sensor information from a stable 2-D vision camera and ultrasonic sensing for object identification, not to derive new identification algorithms and feature extraction methods. Therefore standard identification algorithms and straightforward feature extraction methods, based on insight into the problems, are used. Since the selection of feature vectors is important and strongly affects the design of the identification algorithm, it is preferable to find the most effective feature vector representation that classifies samples into correct classes. If this is obtained, the design of the identification algorithm can be done easily and with good performance. However, the selection of features is very much problem-oriented; figure 7 shows the identification task. These objects are randomly placed on the work table and the goal is to classify not just the object but also which view it is, as well as its location. Object A has four different classifiable views: top view, bottom view, side view with hole and three identical side views without hole. Object B has three different classifiable views: top or bottom view, side view with hole and three identical side views without hole. Object C has three different classifiable views: top view, bottom view and side view. Object D has two different classifiable views: top or bottom view and side view. Altogether there are 6*6*3*2 = 216 views of interest. These objects are designed in such a way that 2-D vision data are not enough for reliable object view classification. The side view of object B (object B lying down, not showing the hole) and the side view of object C (object C lying down) are identical in the sense of 2-D vision information. This means that there are 4*2*2*2 = 32 classifiable views based on 2-D vision information. To increase the classifiability, ultrasonic sensing can be used for surface shape determination, in the sense of determining whether the surface is planar or curved. If 2-D vision data and ultrasonic sensor data are combined, all the objects are identifiable but not all views. Our approach to identifying the remaining views is simply to grasp the object, rotate it and perform a new classification. This procedure is repeated until the view is classified. One natural feature vector representation for our classification problem, based on practical experience, for the stable 2-D vision camera information is the following.



Object: area, perimeter, max radius, min radius and hole area (no hole means that the hole area is zero). The feature extraction for the ultrasonic sensor mounted in the robot gripper is a little different. Signature analysis of the received echo signal from the object of interest is used. In this application we use the fact that the structure of the received echo signal has a different shape for planar and curved surfaces, respectively. The upper frame of figure 8 shows the received echo signal from a plane surface, object B lying down on the work table, and the lower frame shows the received echo signal from a curved surface, object C lying down on the work table. The main difference between these two frames is that the main echo energy comes from the flat surface of object B, while for object C the main echo energy comes from the background. In this typical application it is sufficient to use the distance du between the two main peaks of the received echo signal as a feature parameter. The distance du is positive if the main received echo energy is from the nearest surface, otherwise it is negative. Estimation of the reference set of feature vectors for each object view is performed in a learning phase. If the sample size is n, the feature vector for the vision data is defined, for object j and view k, as

f_jk = (1/n) [ Σ(i=1..n) f_area,jk(i),  Σ(i=1..n) f_perimeter,jk(i),  Σ(i=1..n) f_maxradius,jk(i),  Σ(i=1..n) f_minradius,jk(i),  Σ(i=1..n) f_holearea,jk(i) ]    (36)

Once the set of reference feature vectors is estimated, the class separability of the collected data is measured. That is performed in the following way: if the Euclidean distance d between two feature vectors exceeds a threshold E the vectors are separable; if the threshold E is not exceeded, the measured data does not carry enough classification information, i.e. these feature vectors are almost identical. In our application this will be the case for object B (view: lying down) and object C (view: lying down). It means that the feature vectors (vision data) representing these two views are interpreted as the same object view with "identical" feature vectors, and by using ultrasonic sensing they can be classified. Finally, the 2-D vision sensor is used for the main classification; all nonseparable feature vectors are interpreted as the same feature vector, indicating that ultrasonic sensing must be used for the final classification. This means that the robot gripper has to be moved in such a way that the ultrasonic sensors are pointing down at the object of interest. In our case it is sufficient to choose the identification algorithm to be the smallest Euclidean distance d between the set of reference feature vectors and the observed nonclassified vector. If the smallest Euclidean distance d exceeds a threshold value Et, the observed object view will be classified as unknown. The well-known Euclidean distance is

d_jk = || f_jk - f_observed ||

where d_jk is the Euclidean distance between object j, view k and the observed feature vector, f_jk is the reference feature vector for object j, view k, and f_observed is the observed nonclassified feature vector. Related references to this section are given in [12] - [14].
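As an illustration of the two classification steps, the sketches below show (i) the signed peak-distance feature du extracted from an echo signal and (ii) the nearest-reference Euclidean classifier with a reject threshold. Both are our own minimal sketches with invented names, not the software used in the experiments.

```python
import numpy as np

def echo_feature_du(echo, fs, rel_height=0.1):
    """Signed time distance [s] between the two largest local maxima of the
    rectified echo; positive when the stronger echo is the nearer one."""
    env = np.abs(np.asarray(echo, float))
    peaks = [i for i in range(1, len(env) - 1)
             if env[i] >= env[i - 1] and env[i] > env[i + 1]
             and env[i] > rel_height * env.max()]
    peaks.sort(key=lambda i: env[i], reverse=True)
    if len(peaks) < 2:
        return 0.0
    main, second = peaks[0], peaks[1]
    sign = 1.0 if main < second else -1.0   # main energy from the nearer surface?
    return sign * abs(second - main) / fs

def classify(f_obs, references, reject_threshold):
    """Smallest-Euclidean-distance classifier over a dict
    {(object, view): reference feature vector}, with a reject option."""
    f_obs = np.asarray(f_obs, float)
    best, best_d = None, np.inf
    for key, f_ref in references.items():
        d = np.linalg.norm(f_obs - np.asarray(f_ref, float))
        if d < best_d:
            best, best_d = key, d
    return best if best_d <= reject_threshold else "unknown"
```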

4. Experimental Results

The experimental platform is shown in figure 5. It consists of a Puma 562 robot equipped with a 3-finger gripper, with an ultrasonic sensor attached. Above the work space, at a height of 1450 mm, a stable 2-D vision camera is mounted. For more details see [15]. In order to use this sensor system configuration for object identification and determination of the objects' location and orientation, the robot-sensor system must be calibrated. The robot-sensor system calibration is equivalent to estimating the transformation matrix [R T], equation (19), provided that the vision camera and the ultrasonic range sensor are already calibrated. The calibration procedure looks like the following. By mounting a calibration mark on the robot gripper and measuring its coordinates (xc, yc, zc) with the camera and the range sensor at known world coordinates (xw, yw, zw) given by the robot controller, the transformation matrix [R T] is estimated. The chosen calibration points, altogether 15, form a truncated pyramid with its base plane covering the whole image plane (700x700 mm) and with a maximum height of 650 mm. A plot of the calibration error E = Xw - [R̂ T̂] Xc for the individual x, y and z components is shown in figure 6. The maximum deviation is approximately ±1.4 mm, which corresponds to about ±1 image pixel. This is reasonable, due to the digitalization process of the image. The accuracy of the calibration method is however dependent on: 1. the accuracy of the reference data given by the robot controller, and 2. how well the assumption is satisfied that the ultrasonic sound axis is parallel to the optical axis. The next task is to identify the objects shown in figure 7, randomly placed on the work table. In order to grasp the objects, their position and orientation in world coordinates must be known. By use of the equation X̂w = [R̂ T̂] Xc an estimate of the object position is given, and R̂ Rot(zc, αobject) gives the estimate of the object orientation, where αobject is the angle of the major axis of inertia of the object measured by the vision system. By use of the equation Rstart (Rstop)^(-1) = R̂ Rot(zc, αobject) the robot can be controlled in a correct way, where Rstart is the rotation of the gripper at the reference position and Rstop is the relative rotation change of the gripper needed to obtain the desired rotation corresponding to the rotation of the object. After a learning phase of 40 samples for all object views, an estimate of the reference set of feature vectors (36) is obtained. As classifier, the Euclidean distance d between the set of reference feature vectors and the observed nonclassified vector is used. In this application, the probability of correct identification is 97% due to the combination of sensor data from 2-D vision and ultrasonic sensing. The accuracy of the position and orientation estimates in world coordinates of the objects is within ±1.4 mm and ±1.0°, respectively.
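For context, the major-axis-of-inertia angle αobject can be obtained from the second-order central moments of the segmented object; the snippet below is a generic, illustrative computation with our own naming, not the vision system actually used.

```python
import numpy as np

def object_orientation(mask):
    """Angle [rad] of the major axis of inertia of a binary object mask."""
    ys, xs = np.nonzero(mask)
    x0, y0 = xs.mean(), ys.mean()
    mu20 = ((xs - x0) ** 2).mean()
    mu02 = ((ys - y0) ** 2).mean()
    mu11 = ((xs - x0) * (ys - y0)).mean()
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
```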

5. Conclusions

In this paper we have presented a robot control model consisting of a stable 2-D vision camera combined with ultrasonic sensing attached to the gripper. The calibration procedure for the robot-sensor system is derived. The calibration accuracy of the proposed system configuration is better than ±1.4 mm, or ±1 image pixel, in the work space 700x700x650 mm. This sensor configuration is also used for "3-D" object identification with very promising results. The accuracy of the position and orientation estimates in world coordinates of the identified objects is within ±1.4 mm and ±0.5°, respectively. These encouraging results are obtained due to the method of sensor data integration. This sensor configuration is consequently a viable alternative to the more complex and costly 3-D vision systems.


Acknowledgements

This work has been carried out at the Laboratory of Measurement Technology, Linköping University, headed by professor Alexander Lauber, whose support and criticism have been of great value. The authors would also like to thank Per Holmbom for software aid and valuable discussions and Ingemar Grahn for help with the hardware. This project is part of a research programme sponsored by NUTEK (grant nr 87-01956P), whose support is gratefully acknowledged.


References

[1] Gremban, K.D., Thorpe, C.E. and Kanade, T., 'Geometric Camera Calibration using Systems of Linear Equations', Proc. IEEE Int. Conf. on Robotics and Automation, 1988, vol. 1, pp. 562-567.
[2] Holmbom, P., Pedersen, O., Sandell, B. and Lauber, A., 'Fusing sensor systems: promises and problems', Sensor Review, July 1989, pp. 143-152.
[3] Ito, M., 'Robot Vision Modelling - camera modelling and camera calibration', Advanced Robotics, vol. 5, no. 3, 1991, pp. 321-335.
[4] Bowman, M.E. and Forrest, A.K., 'Transformation calibration of a camera mounted on a robot', Image and Vision Computing, vol. 5, no. 4, 1987.
[5] Lenz, R.K. and Tsai, R.Y., 'Calibrating a Cartesian Robot with Eye-on-Hand Configuration Independent of Eye-to-Hand Relationship', IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 11, no. 9, September 1989, pp. 916-928.
[6] Paquette, L., Stampfler, R., Davis, W. and Caelli, T., 'A New Camera Calibration Method For Robotic Vision', Proc. Close-Range Photogrammetry Meets Machine Vision, Zurich, Switzerland, Sept. 3-7, 1990.
[7] Åström, K.J. and Wittenmark, B., 'Computer Controlled Systems, Theory and Design', Prentice Hall, Inc., 1984, pp. 324-342.
[8] Holmberg, P., 'Robust ultrasonic range finder', Measurement Science and Technology, no. 3, 1992, pp. 1025-1037.
[9] Craig, J.J., 'Introduction to Robotics', Addison-Wesley Publishing Company, Inc., 1986.
[10] Nilsson, A. and Holmberg, P., 'Robot-Sensor Integration by Means of a Stable 2D Vision Camera and an Ultrasonic Range Sensor', Proc. IEEE Instrumentation and Measurement Technology Conference, Irvine, CA, USA, May 18-20, 1993.
[11] PUMA Mark II Robot, 500 Series Equipment Manual, Unimation, 1986.
[12] Lindstedt, G. and Olsson, G., 'Using Ultrasonics for Sensing in a Robotic Environment', Project Reports - Industrial Robots Part II, T199253, Swedish National Board for Industrial and Technical Development.
[13] Lach, M. and Ermert, H., 'An Acoustic Sensor System for Object Recognition', Sensors and Actuators A, 1991, pp. 541-547.
[14] Fukunaga, K., 'Introduction to Statistical Pattern Recognition', Academic Press, New York, 1990.
[15] Holmbom, P., Holmberg, P., Nilsson, A. and Odeberg, H., 'Multi-Sensor Integration - The Sensor-Guided Wine Server', Proc. Int. Conf. on Intelligent Robots and Systems, Raleigh, NC, USA, July 7-10, 1992, pp. 1147-1154.


Fig. 1. Camera model with its view plane and the effect of radial distortion of the lens.


Fig. 2. Schematic view of the placement of the ultrasonic sensors in the robot gripper and of the ultrasonic sound wave transmission.

Fig. 3. A robot control model with a stable 2-D vision camera combined with an ultrasonic range sensor. The corresponding coordinate systems are also shown.


Fig. 4. Projection correction due to depth information from an ultrasonic range sensor. xi is the uncorrected image point of the object A, xc is the corrected image point of the object due to depth information from the range sensor. zu is the object's height above the work table measured with the range sensor and H is the height of the camera above the work table.


Fig. 5. The experimental platform.

Fig. 6. Plot of the calibration error for the x, y and z components in world coordinates. x errors are the solid line, y errors the dashed line and z errors the dash-dotted line.



Fig. 7. The upper part of the figure shows 4 different objects which are used as test objects when sensor data integration is studied for identification tasks. The lower part of the figure shows the assembled object.

Fig. 8. Echo signals versus time [ms] (offset added in order to separate the curves). The upper frame is an echo signal from a plane surface, object B lying down, not showing the hole. The lower frame is an echo signal from a curved surface, object C lying down.

