E(y) = \left( M(y) - s_T \right)^T \left( M(y) - s_T \right)    (2)
where M(y) denotes the forward kinematics mapping from the state space to the observation space, and s_T denotes the desired sensory output expressed in the output task space.

Let K_m be a probability density function on R^m, and let σ_i denote the local bandwidth attached to the i-th sample in the estimator (4) below. We assume that the following conditions are satisfied:

(A1) K_m is continuous, symmetric and bounded;
(A2)-(A3) the conditional moments E(y_k / x) and E(y_k^2 / x) are bounded in a neighborhood around x (and thus the bias and variance of the estimator can be evaluated);
(A4) for all i, σ_i → 0 as p → ∞;
(A5) for all i, p σ_i → ∞ as p → ∞.

Assuming in addition that the underlying density is continuous and smooth enough (its first derivatives evaluated at any x are small), and given a set of p observation samples (x_i, y_i), i = 1, ..., p, the joint probability density estimator \hat{f}(x, y) obtained with the non-parametric Parzen window method can be formulated as follows:
\hat{f}(x, y) = \frac{1}{C} \sum_{i=1}^{p} K_m\left( \frac{(x - x_i)^T W_x (x - x_i)}{2\sigma_i^2} \right) K_m\left( \frac{(y - y_i)^T (y - y_i)}{2\sigma_i^2} \right)    (4)

where C is a normalizing factor, W_x a positive diagonal matrix weighting the coordinates of vector x, and σ_i the local bandwidth of the kernel K_m centered on sample (x_i, y_i). In general a Gaussian kernel is chosen, so that K_m is identified with the exponential function.

Substituting equation (4) into equation (3), we get:

\hat{E}(Y/x) = \frac{ \sum_{i=1}^{p} y_i \, K_m\left( \frac{(x - x_i)^T W_x (x - x_i)}{2\sigma_i^2} \right) }{ \sum_{i=1}^{p} K_m\left( \frac{(x - x_i)^T W_x (x - x_i)}{2\sigma_i^2} \right) }    (5)

or:

\hat{E}(Y/x) = \frac{ \sum_{i=1}^{p} y_i \, K_m(x_i, x) }{ \sum_{i=1}^{p} K_m(x_i, x) } = \frac{ \sum_{i=1}^{p} y_i \, w_i }{ \sum_{i=1}^{p} w_i }    (6)
Under assumptions (A1) to (A5), the consistency of the kernel estimator is established: the estimator \hat{E}(Y/x) tends in probability towards E(Y/x).
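To make equations (4)-(6) concrete, here is a minimal Python sketch of the estimator \hat{E}(Y/x), assuming Gaussian kernels, a diagonal weighting matrix W_x passed as the vector of its diagonal entries, and per-sample bandwidths σ_i; the function and variable names are illustrative and not taken from the paper.

    import numpy as np

    def nw_estimate(x, X, Y, Wx, sigmas):
        # Equations (5)-(6): Gaussian kernels, diagonal weighting matrix Wx
        # (given as the vector of its diagonal), per-sample bandwidths sigma_i.
        diff = X - x                                    # rows hold x_i - x
        d2 = np.einsum('ij,j,ij->i', diff, Wx, diff)    # (x - x_i)^T Wx (x - x_i)
        w = np.exp(-d2 / (2.0 * sigmas ** 2))           # kernel weights w_i
        return (w[:, None] * Y).sum(axis=0) / w.sum()   # weighted average of y_i

    # Toy usage: recover y = sin(x) from 200 noisy samples.
    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, np.pi, size=(200, 1))
    Y = np.sin(X) + 0.05 * rng.normal(size=(200, 1))
    print(nw_estimate(np.array([1.0]), X, Y, Wx=np.ones(1), sigmas=np.full(200, 0.2)))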
GRNN calculates the Nadaraya-Watson estimate [11]. GRNN is a normalized Radial Basis Function (RBF) network in which a hidden unit is centered at every training sample. The RBF units of a GRNN architecture are usually characterized by Gaussian kernels. The hidden-to-output weights are identified with the target values, so that the output is a weighted average of the target values of the training samples close to the given input case. The only parameters of the network are the widths of the kernels associated with the RBF units. These widths (often a single width is used) are called smoothing parameters or bandwidths [17][18]. They are usually chosen by cross-validation or by ad hoc methods that are not well described. GRNN is a universal approximator for smooth functions, so it should be able to solve any smooth function-approximation problem provided that enough data are given. The main drawback of GRNN is that, like kernel methods in general, it suffers badly from the lack of learning data.

4.1. Learning SMC maps using GRNN

To estimate the normalized gradient of the error, the following map f is defined:

\Delta y_s = f(y, s)    (8)

where s is the 3D directional vector towards the task s_T specified in the sensory space, and y the vector of state variables. The expected state update is estimated by the kernel regression:

\Delta y_s = \hat{E}(\Delta y / \theta) = \frac{1}{C} \sum_{i=1}^{p} \Delta y_i \, K(\theta_i, \theta)    (9)

where θ = [y, s]^T, θ_i = [y_i, s_i]^T, C is a normalizing factor, and K a variable Gaussian kernel:

K(\theta_i, \theta) = \exp\left( - \frac{(\theta - \theta_i)^T W (\theta - \theta_i)}{2\sigma^2} \right)    (10)

W is a weighting diagonal matrix used to balance the weighting of the sensory information s with the state information y, and σ is a parameter that scales the local density, both in the state space and in the sensory space: if the density is low, σ is increased and, conversely, if the density is high, σ is lowered.
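The adaptation rule for σ is stated qualitatively above. A minimal sketch of one standard way to realize it, assuming a k-nearest-neighbor bandwidth (consistent with section 5, where σ is sized so that at least 40 neighbors contribute to each evaluation of Δy), is the following; the names and the factor-2 sensory weighting are illustrative only.

    import numpy as np

    def adaptive_sigma(theta, thetas, W, k=40):
        # Distance from theta to every stored theta_i, weighted by the diagonal
        # of W; sigma is taken as the distance to the k-th nearest sample, so
        # sparse regions get a wide kernel and dense regions a narrow one.
        # (Assumption: the paper states this rule only qualitatively.)
        diff = thetas - theta
        d = np.sqrt(np.einsum('ij,j,ij->i', diff, W, diff))
        return np.partition(d, k)[k]

    # Toy usage: theta_i = [y_i, s_i] in R^9, sensory part weighted twice as much.
    rng = np.random.default_rng(2)
    thetas = rng.normal(size=(1000, 9))
    W = np.concatenate([np.ones(6), 2.0 * np.ones(3)])
    print(adaptive_sigma(np.zeros(9), thetas, W, k=40))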
4.2. Naive GRNN learning algorithm

σ is selected empirically, since an optimum value cannot be determined from a set of observations. The learning samples are then collected according to the following procedure (a sketch in code is given after the list):

- Initialisation: select a small value σ and an integer value p, and set i to 0 (σ can be a function of p).
- 1) Select randomly a state vector y, position the multi-joint system according to y, and observe the corresponding sensory outputs s.
- 2) Select a small normalized change δy, position the multi-joint system according to y + δy, and observe the change in sensory outputs δs.
- 3) Calculate Δy using θ = [y, δs]^T according to equations (9) and (10). If Δy differs from δy, save the association ([y, δs], δy) as a new learning sample (θ_i, Δy_i), create a corresponding neuron and increment i.
- 4) If i < p, loop to 1); otherwise stop.
4.3. Implementation issues

When estimating the expectation of the state update Δy given θ, distances must be computed in the d-dimensional space, where d is the dimension of the state space plus the dimension of the sensory space. When summing the Gaussian kernels (equations (9) and (10)), only the θ_i vectors belonging to the neighborhood of θ are retained. To speed up the computation, a kd-tree [14], which identifies neighborhoods in expected time logarithmic in p, can be advantageously used. (The kd-tree representation of the stored data leads to reconsidering the architecture of the GRNN so as to implement a similar neighborhood search.)
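As an illustration of this neighborhood-restricted evaluation, the sketch below uses the off-the-shelf cKDTree of scipy in place of the custom kd-tree of [14], takes W as the identity for brevity, and fills the stored samples with arbitrary random data.

    import numpy as np
    from scipy.spatial import cKDTree

    rng = np.random.default_rng(3)
    thetas = rng.normal(size=(50000, 9))   # stored prototypes theta_i (illustrative data)
    dys = rng.normal(size=(50000, 6))      # associated state updates dy_i

    tree = cKDTree(thetas)                 # built once over the stored samples

    def estimate_dy(theta, radius, sigma):
        # Equation (9) restricted to the neighborhood of theta (W = I here);
        # the radius query replaces a scan over all p samples.
        idx = tree.query_ball_point(theta, r=radius)
        if not idx:
            return None
        diff = thetas[idx] - theta
        k = np.exp(-np.einsum('ij,ij->i', diff, diff) / (2.0 * sigma ** 2))
        return k @ dys[idx] / k.sum()

    print(estimate_dy(np.zeros(9), radius=1.5, sigma=0.5))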
5. Results

The learning approach is applied to a simulated mechanical system composed of two arms and two hands, submitted to successive reaching tasks. The mechanical systems are modeled with ODE (Open Dynamics Engine) [19], with a custom 3D rendering (see Figure 4.b for the visualization of the 3D character). Each arm is composed of three joints with six degrees of freedom, and each finger is composed of three joints with four degrees of freedom.

The arms are controlled by DSMC controllers, where the mapping between y and s is learned on the basis of a gradient descent strategy. The learning processes are carried out for increasing values of the number of learning samples p. For each process, a thousand 3D spatial target positions and initial conditions have been selected randomly to test the correctness of the learning process. For these 1000 conditions, the error rate (the percentage of cases where the arm is not able to reach the target) is calculated.

The experimental settings for this test are the following. A target is considered to be reached when the residual distance between the arm end-point and the target is below 1% of the total length of the extended hand-arm chain. The size of σ is selected such that at least 40 neighbors can be provided to evaluate Δy.

The results of this test are reported in Figure 3. For about 60,000 learning samples, the map f is apparently well modeled, since the residual error rate is low (about 0.5%) and very few improvements are gained when increasing p.

Figure 3. Error rate (%) as a function of the number of learning samples p.

The fact that p can be chosen very low while maintaining good performance is a major result. Generally, for estimating a multivariate function with 9 variables (e.g. 6 degrees of freedom plus 3D coordinates), a kernel density estimator requires over 500,000 adequately selected samples. In our case, p = 80,000 seems to be sufficient for the considered task. One reasonable explanation is that the sensory-motor loop performs a time average over successive gradient estimates, which compensates for small errors due to the coarse estimation. A rough gradient mapping estimation is consequently accurate enough for the reaching task considered in our experiments.

After the learning phase, a simulation process was carried out for a tracking task, which consisted in following discrete targets extracted from a motion-capture hand trajectory according to an adaptive sub-sampling algorithm [20] (see Figure 4.a). Furthermore, the DSMC simulation is linked to this tracking task and applied to a virtual character (see Figure 4.b).
Figure 4. a) Trajectory of the human wrist in the Cartesian space with the localization of the targets; b) simulation by a virtual character for a tracking task.

For this hand-tracking task, the resulting hand trajectories obtained through dynamical simulation can be superimposed on the captured trajectories, as illustrated in Figure 4.a.
6. Conclusion

In this paper, we proposed a Dynamical Sensory-Motor Controller (DSMC) for controlling a dynamical hand-arm system. The controller combines the inversion of the kinematics model, obtained from the learning of sensory-motor mappings, with the inversion of the dynamical system using classical PID controllers. The learning of sensory-motor mappings was performed with a non-parametric learning approach (GRNN), based on a variable-kernel density estimator and the use of a kd-tree architecture to simulate neuron activation through a near-neighbor search. Despite the apparently high memory requirement of this kind of estimator, the proposed learning scheme behaves properly when used to control articulated systems with six degrees of freedom simulated in a dynamical environment. This result is obtained even though the number of learning samples is reasonably low.

7. References

[1] Kawato M., Maeda Y., Uno Y., Suzuki R. Trajectory Formation of Arm Movement by Cascade Neural Network Model Based on Minimum Torque Criterion. Biological Cybernetics, vol. 62, 1990, pp. 275-288.
[2] Wolpert D.M., Miall R.C., Kawato M. Internal models in the cerebellum. Trends in Cognitive Sciences, vol. 2, no. 9, 1998, pp. 338-347.
[3] Spoelstra J., Schweighofer N., Arbib M.A. Cerebellar learning of accurate predictive control for fast reaching movements. Biological Cybernetics, vol. 82, 2000, pp. 321-333.
[4] Bullock D., Grossberg S., Guenther F.H. A Self-Organizing Neural Model of Motor Equivalent Reaching and Tool Use by a Multijoint Arm. Journal of Cognitive Neuroscience, vol. 5, no. 4, 1993, pp. 408-435.
[5] Jordan M.I. Computational motor control. In M.S. Gazzaniga (Ed.), The Cognitive Neurosciences. Cambridge, MA: MIT Press, 1995, pp. 587-609.
[6] Werbos P.J. An overview of neural networks for control. IEEE Control Systems Magazine, January 1991.
[7] Duda R.O., Hart P.E. Pattern Classification and Scene Analysis. New York, NY: Wiley, 1973.
[8] Schaal S. Nonparametric regression for learning nonlinear transformations. In Ritter et al. (Eds.), Prerational Intelligence in Strategies, High-Level Processes and Collective Behavior. Kluwer Academic (in press).
[9] Rumelhart D.E., Hinton G.E., Williams R.J. Learning Representations by Back-Propagating Errors. Nature, vol. 323, 1986, pp. 533-536.
[10] Bishop C.M. Neural Networks for Pattern Recognition. Oxford University Press, Oxford, UK, 1995.
[11] Specht D.F. A General Regression Neural Network. IEEE Transactions on Neural Networks, vol. 2, no. 6, 1991, pp. 568-576.
[12] Specht D.F. Probabilistic Neural Networks. Neural Networks, vol. 3, 1990, pp. 109-118.
[13] Churchland P.S., Sejnowski T.J. The Computational Brain. Cambridge, MA: MIT Press, 1992.
[14] Friedman J.H., Bentley J.L., Finkel R.A. An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software, vol. 3, 1977, pp. 209-226.
[15] Gibet S., Marteau P.F. A Self-Organized Model for the Control, Planning and Learning of Nonlinear Multi-Dimensional Systems Using a Sensory Feedback. Journal of Applied Intelligence, vol. 4, 1994, pp. 337-349.
[16] Gibet S., Marteau P.F. Expressive Gesture Animation Based on Non Parametric Learning of Sensory-Motor Models. CASA 2003, Computer Animation and Social Agents, 7-9 May 2003.
[17] Watson G.S. Smooth regression analysis. Sankhya, Series A, vol. 26, 1964, pp. 359-372.
[18] Nadaraya E.A. On estimating regression. Theory of Probability and its Applications, vol. 10, 1964, pp. 186-190.
[19] Smith R. Open Dynamics Engine, 2000-2003. http://opende.sourceforge.net/
[20] Marteau P.-F., Gibet S. Adaptive Sampling of Motion Trajectories for Discrete Task-Based Analysis and Synthesis of Gesture. In Gesture in Human-Computer Interaction and Simulation, 6th International Gesture Workshop, Revised Selected Papers, LNCS vol. 3881, 2006, pp. 168-171.