
Restricted maximum likelihood procedures for the estimation of additive and

nonadditive genetic variances and covariances in multibreed populations

M. A. Elzo

J Anim Sci 1994. 72:3055-3065.


Downloaded from jas.fass.org by on June 17, 2010.


Restricted Maximum Likelihood Procedures for the Estimation of
Additive and Nonadditive Genetic Variances and Covariances
in Multibreed Populations¹

M. A. Elzo
Animal Science Department, University of Florida, Gainesville 32611

ABSTRACT: Restricted maximum-likelihood procedures were developed to estimate additive and nonadditive genetic and environmental covariances for multiple traits in multibreed populations. The computational procedure follows the expectation-maximization (EM) algorithm, where the set of equations in the maximization step is solved by successive approximations. This computational procedure does not guarantee convergence to a symmetric positive-definite covariance matrix. Thus, computer programs will need to incorporate restrictions in the maximization step to ensure positive definiteness of each covariance matrix. Additive genetic and environmental covariances were modeled in subclass form (zeros and ones in the design matrices). Nonadditive genetic covariances were modeled in regression form (any value between and including zero and one in the design matrices). Computational requirements will be larger than for intrabreed analyses. Appropriate simplifying assumptions and numerical techniques (e.g., sparse and iterative numerical techniques) will be required for the implementation of these multibreed covariance estimation procedures. Number of iterations (5 to 12) and computing times (57 to 113 min) to achieve convergence when estimating 21 genetic and environmental covariances in five small simulated multibreed data sets (two breeds, 25,200 to 50,400 calves, 120 to 135 unrelated bulls) suggest that these procedures are computationally feasible.
Key Words: Maximum Likelihood, Variance Components, Genetic Parameters, Population Structure

J. Anim. Sci. 1994. 72:3055-3065

¹Florida Agricultural Experiment Station Journal Series No. R-03642. The author thanks R. L. Quaas for his help and comments during the course of this research and D. D. Hargrove and T. A. Olson for reviewing the manuscript.
Received February 7, 1994.
Accepted June 20, 1994.

Although crossbreeding is widely practiced in the United States, the active genetic basis of the beef industry is formed by a large number of breeds (e.g., Angus, Brahman, Hereford, Limousin, Simmental) that act independently of one another. Consequently, genetic evaluation and selection of parents are still formally carried out within each breed, as evidenced by the Guidelines for Uniform Beef Improvement Programs (BIF, 1990). Unfortunately, intrabreed EPD cannot be used to compare bulls of the same or different breeds for crossbreeding purposes because they consider only additive genetic effects (each breed has a different additive genetic base) and they ignore nonadditive genetic effects (defined here as the combining ability of a bull when mated to dams of various breed compositions). If bulls are to be compared across breeds and crossbred groups, both additive and nonadditive genetic effects will need to be accounted for (Elzo and Famula, 1985). In a multibreed genetic evaluation, each breed and crossbred group may have different values for additive genetic variances and covariances (Elzo, 1990a; Lo et al., 1994). Similarly, each breed group combination may have different values for nonadditive genetic variances and covariances (Elzo, 1990b). Environmental variances and covariances may also differ across breeds and crossbred groups. The large number of sets of additive, nonadditive, and environmental covariances that need to be estimated simultaneously can be drastically reduced if multibreed covariances are assumed to be linear functions of a small set of covariances.

The current method of choice to estimate covariances using animal breeding data is REML. However, existing REML procedures can only estimate a single set of covariances. Thus, the objective of this research was to develop REML procedures for multibreed populations that 1) account for heterogeneity of covariances across genetic groups of animals, 2) express additive and nonadditive genetic and environmental covariances of genetic groups as linear combinations of a small number of covariances, and 3) simultaneously estimate all the sets of covariances
used to compute all the additive and nonadditive genetic and environmental covariances of any genetic group given a set of base breeds.

Development of the Restricted Maximum Likelihood Procedure to Estimate Covariances in Multibreed Populations

The computational procedure is based on the expectation-maximization (EM) algorithm (Dempster et al., 1977), where the maximization step is accomplished by iteration.

The description of this procedure requires the use of unfamiliar terminology as well as definitions of multibreed additive, nonadditive, environmental, and residual covariances. Thus, these preliminary aspects will be explained first.

Definition of Terms

Multibreed population: a population composed of straightbred and crossbred animals.

Breed group: a group of animals whose genetic composition falls within a range of fractions of breeds; for example, if five breed groups are constructed to group animals in a two-breed population (A = breed 1, B = breed 2), the group ranges could be as follows: group 1 = (1.0 to .81)A (.0 to .19)B, group 2 = (.80 to .61)A (.20 to .39)B, group 3 = (.60 to .41)A (.40 to .59)B, group 4 = (.40 to .21)A (.60 to .79)B, and group 5 = (.0 to .19)A (1.0 to .81)B.

Regression model: a model that defines multibreed bull nonadditive effects in terms of intra- and interbreed interactions between alleles at $l$ loci, $l = 1, \ldots, L$.

Bull model: an abbreviation of sire-maternal grandsire model.

Additive intrabreed genetic covariance: a covariance due to additive genetic effects within a breed.

Additive interbreed genetic covariance: a covariance arising from differences between intrabreed means of additive genetic effects; it is equal to twice the segregation covariance (Lo et al., 1994).

Additive multibreed genetic covariance: an additive covariance for animals in a multibreed population; equal to either an additive intrabreed genetic covariance (straightbred animals) or a weighted sum of additive intrabreed and interbreed genetic covariances (progeny of at least one crossbred parent).

Nonadditive configuration: a representation of $l$ loci using the breed of origin of the alleles. For two breeds, A and B, there are four configurations at one locus: A/A, A/B, B/A, and B/B; three configurations result if A/B and B/A are defined as one configuration. A possible set of configurations for one and two loci is shown in Elzo (1990b).

Nonadditive intraconfiguration genetic covariance: a covariance due to nonadditive genetic effects caused by the interaction between alleles of one or more breeds within a nonadditive configuration. Nonadditive configurations are to nonadditive genetic covariances as breeds are to additive genetic covariances.

Environmental intrabreed genetic covariance: a covariance due to environmental effects within a breed.

Environmental interbreed genetic covariance: a covariance arising from differences between intrabreed means of environmental effects.

Environmental multibreed genetic covariance: an environmental covariance equal to either an intrabreed environmental covariance (straightbred animals) or a weighted sum of intrabreed and interbreed environmental covariances (progeny of one or two crossbred parents).

Residual intrabreed, interbreed, and multibreed genetic covariances: weighted sums of additive and environmental intrabreed, interbreed, and multibreed covariances.

Assumptions

The following assumptions are made: 1) traits are determined by a large number of unlinked loci, 2) random segregation and assortment of alleles occur during meiosis, 3) no inbreeding, and 4) covariances remain constant over time.

Additive Multibreed Genetic Covariances

Additive genetic covariances for each breed group combination are assumed to be different. These covariances are equal to the sum of two terms. The first term is equal to the weighted sum of the intrabreed covariances for traits Y and Z, where the weights are the expected frequencies of each breed in the gth breed group combination (Elzo, 1983, 1990a; Lo et al., 1994). The second term is equal to the weighted sum of the interbreed covariances for traits Y and Z, where the weights are the sum of the product of the expected breed frequencies in the parental breed groups (Lo et al., 1994). This second term was assumed to be zero by Elzo (1990a). Inclusion of the second term in the computation of multibreed additive covariances does not affect the rules to compute the matrix of covariances among bull additive genetic effects ($G_a$) or its inverse ($G_a^{-1}$).

Thus, the additive genetic covariance between traits Y and Z for an animal in a noninbred multibreed population is

$$\mathrm{cov}_a(Y,Z)^i = \sum_{b=1}^{nb} p_b^i\,(\sigma_{aYZ})_b + \sum_{b=1}^{nb-1}\sum_{b'>b}^{nb}\left(p_b^s\,p_{b'}^s + p_b^d\,p_{b'}^d\right)(\sigma_{aYZ})_{bb'} \qquad [1]$$

where the superscripts i, s, and d correspond to individual animal, sire, and dam, the subscripts b and b' represent two breeds, and nb = number of breeds; $p_b^x$ = expected fraction of breed b in animal x, x = i, s, d; $(\sigma_{aYZ})_b$ = additive intrabreed covariance for breed b; and $(\sigma_{aYZ})_{bb'}$ = additive interbreed covariance for the pair of breeds b and b'.

Nonadditive Multibreed Genetic Covariances

Nonadditive genetic effects were accounted for by means of regression procedures. Thus, nonadditive intraconfiguration covariances between traits Y and Z at $l$ loci, $(\sigma_{nYZ})_l$, $l = 1, \ldots, L$, need to be estimated. These covariances are not a function of any other set of nonadditive covariances. This characteristic makes nonadditive covariances in regression models different from additive genetic covariances, which are assumed to be a function of intra- and interbreed additive covariances.

Environmental and Residual Multibreed Covariances

Multibreed environmental covariances could be assumed 1) to be equal for all breeds and crossbred groups, 2) to be different for each breed and crossbred group (i.e., given an environment each genotype reacts differently), and 3) something in between alternatives 1 and 2.

If multibreed environmental covariances were assumed 1) to be different across breed groups, and 2) to behave in an additive fashion, then their computation would be similar to the procedure used to compute additive genetic covariances. Thus, the multibreed environmental covariance between traits Y and Z would be

$$\mathrm{cov}_e(Y,Z)^i = \sum_{b=1}^{nb} p_b^i\,(\sigma_{eYZ})_b + \sum_{b=1}^{nb-1}\sum_{b'>b}^{nb} \cdots\,(\sigma_{eYZ})_{bb'} \qquad [2]$$

where the superscript i represents an individual animal, the subscripts b and b' represent two breeds, $(\sigma_{eYZ})_b$ = environmental intrabreed covariance for breed b, and $(\sigma_{eYZ})_{bb'}$ = environmental interbreed covariance for the pair of breeds b and b'.

The structure of the residual covariances will depend on 1) the additive model used (animal, reduced animal, sire-dam, bull, sire model), 2) the ancestors identified on an animal with records, and 3) the assumptions made with respect to multibreed environmental covariances.

An expression for the multibreed residual covariance between traits Y and Z for a bull model is the following:

$$\mathrm{cov}_v(Y,Z)^i = \cdots - .0625\,\mathrm{cov}_a(Y,Z)^{mgs} + \mathrm{cov}_e(Y,Z)^i, \qquad [3]$$

where the superscripts i, s, and mgs refer to an animal, its sire, and its maternal grandsire, the subscripts v, a, and e represent residual, additive genetic, and environmental, and $\delta_x$ = indicator equal to 1 if animal x is not identified and to 0 if animal x is identified, x = s, mgs; $\mathrm{cov}_a(Y,Z)^i = \mathrm{cov}_a(Y_D,Z_D)^i$, where the subscript D = direct genetic effects; $\mathrm{cov}_a(Y,Z)^s = \mathrm{cov}_a(Y_D,Z_D)^s$; and $\mathrm{cov}_a(Y,Z)^{mgs} = \mathrm{cov}_a(Y_D,Z_D)^{mgs}$.

Additive genetic covariances in Equation [3] are computed using Equation [1] and environmental covariances using Equation [2]. If a model includes sires and dams, then additive dam covariances (and multiplying factors $\delta_d$ and .25) will be substituted for those of the maternal grandsire.

Model

Let a bull model be

$$y = Xb + Zu + v$$

$$\mathrm{var}\begin{bmatrix} y \\ u \\ v \end{bmatrix} = \begin{bmatrix} ZGZ' + R & ZG & R \\ GZ' & G & 0 \\ R & 0 & R \end{bmatrix} \qquad [4]$$

where y = vector of observations on all traits recorded per calf; b = vector of fixed effects for the nt traits being considered; u = vector of additive and nonadditive bull genetic effects for nt traits; v = vector of residuals; X = incidence matrix relating records to elements of b; Z = incidence matrix relating records to elements of u; G = matrix of covariances among elements of u; and R = matrix of covariances among elements of v.

Bulls are assumed to be unrelated. Bull genetic effects due to different nonadditive configurations are assumed to be uncorrelated among themselves and to additive genetic effects. Thus, the matrix G is block diagonal (one block per bull), with blocks

$$\begin{bmatrix} G_{0ag} & 0 & 0 & \cdots & 0 \\ 0 & G_{0n1} & 0 & \cdots & 0 \\ 0 & 0 & G_{0n2} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & G_{0nK} \end{bmatrix}$$

where

$G_{0ag}$ = nt × nt matrix of additive direct genetic covariances for bulls of parental breed combination g, g = 1, ..., nbgcom, where nbgcom = number of different breed group combinations (nbg(nbg + 1)/2, nbg = number of breed groups); covariances in $G_{0ag}$ are computed using Equation [1];

$G_{0nk}$ = nt × nt matrix of nonadditive direct genetic covariances; k = 1, ..., K, where $K = \sum_{l=1}^{L} n_l$, and $n_l$ = number of interaction effects at $l$ loci.

Calves are assumed to be related only through their sires and(or) maternal grandsires. Thus, residual effects are correlated only within a calf. Consequently, the matrix R is block diagonal, with blocks equal to $R_{0m*}$, for m = 1, ..., M, and M = number of residual subclasses (defined according to the various intra- and interbreed genetic and environmental factors contained in them). The $R_{0m*}$ are nt × nt matrices of residual covariances with zeros in the rows and columns corresponding to missing traits in a calf. Elements of $R_{0m*}$ are computed using Equation [3].

To simplify the notation, let $\phi = [\phi_a\ \phi_n\ \phi_e]'$, where $\phi_a$ is a vector of additive genetic covariances (the $\sigma_{aYZ}$ in Equation [1]), $\phi_n$ is a vector of nonadditive genetic covariances (the $(\sigma_{nYZ})_l$), and $\phi_e$ is a vector of environmental covariances (the $\sigma_{eYZ}$ in Equation [2]). Also, let the number of 1) additive genetic covariances be $N_a$, 2) nonadditive genetic covariances be $N_n$, 3) environmental covariances be $N_e$, and 4) $N_\phi = N_a + N_n + N_e$. Thus, each covariance in $G_{0ag}$, $G_{0nk}$, and $R_{0m*}$ is a linear function of elements of $\phi$.

Computational Procedure

The derivation of the computational procedure used to obtain REML estimates of covariances ($\phi$) in multibreed populations is described in the Appendix. The computational procedure makes use of the Expectation Maximization (EM) algorithm (Dempster et al., 1977). The EM algorithm is an iterative procedure that has an expectation step (E-step) and a maximization step (M-step) in each iteration. The E-step requires the computation of sums of products of predicted values of random effects plus their corresponding error variances of predictions (EVP). In the M-step, $\phi^{(p+1)}$ is computed by iteration, where $\phi^{(p+1)}$ is the value of $\phi$ that maximizes $Q(\phi \mid \phi^{(p)})$ (Equation [3], Appendix). Thus, at convergence, the M-step produces the covariance estimates for the (p+1)th EM iteration. The M-step is accomplished by iteration because the differentiation of $Q(\phi \mid \phi^{(p)})$ with respect to $\phi$ results in a nonlinear set of equations. The computing algorithm is as follows:

Step 0. 1) Define a set of initial covariance values. 2) Compute the matrices of derivatives of additive, nonadditive, and residual covariance matrices with respect to $\phi$.

E-Step. 1) Compute the additive and nonadditive genetic and environmental sets of multibreed covariance matrices needed in the construction of the MME of the bull model. 2) Compute the predicted values of $u_a$ and $u_n$ (by solving the MME of the bull model) and v (by Equation [7], Appendix). Also, compute the EVP of $u_a$ and $u_n$ (using elements of the inverse of the MME) and v (by Equation [8], Appendix). 3) Compute $S_{0ag*}^{(p)}$ (Equation [4], Appendix), $S_{0nk*}^{(p)}$ (Equation [5], Appendix), and $S_{0m*}^{(p)}$ (Equation [6], Appendix).

M-Step. 1) Compute $\phi^{(p+1)}$ by successive approximations (i.e., Scoring iterations) using Equations [12] (Appendix). If the differences between the absolute values of the estimates of $\phi^{(p+1)}$ and $\phi^{(p)}$ are less than or equal to a vector of small values $\varepsilon$ (Equations [13] and [14], Appendix), then stop; otherwise, go back to the E-step and continue with the EM iterations.

Analyses of Simulated Data Sets

Five data sets were simulated and covariance components estimated using the methodology presented here. The purpose of these analyses was to obtain some information on the computer times and the number of EM and Scoring iterations required to achieve convergence in small data sets. Computations were carried out in an IBM RS6000 workstation, model 580, using a computer program (written in FORTRAN and compiled using the AIX XL FORTRAN Compiler/6000 without any optimization) based on the procedures described here. This program used the FSPAK sparse-matrix routines (Perez-Enciso and Misztal, personal communication) to invert the left-hand side of the MME.

Simulation of Data

Two breeds (A and B), two traits per calf, and only direct genetic effects were considered. Three additive genetic effects, one nonadditive genetic effect, three environmental effects, and two sex effects were used in the simulation of calf records. All effects, except sex of calf, were simulated as random effects. The additive genetic effects were additive intrabreed A, additive intrabreed B, and additive interbreed AB. The nonadditive genetic effect was intraconfiguration 11 (one A allele and one B allele at one locus). Environmental effects were intrabreed A, intrabreed B, and interbreed AB. Sex effects were male and female.
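
To make concrete how the three additive components just listed (intrabreed A, intrabreed B, interbreed AB) could combine into the additive covariance matrix of a calf, the following is a minimal Python sketch and not the author's FORTRAN program. It rests on several assumptions: that Equation [1] has the form described verbally in the text (intrabreed covariances weighted by the calf's expected breed fractions, interbreed covariances weighted by the sum over the two parents of the products of their breed fractions); that the calf's expected breed fractions are the average of those of its sire and dam; and that the three columns of Table 1 are ordered as variance of Trait 1, covariance between Traits 1 and 2, and variance of Trait 2. The function name `additive_multibreed_cov` and its arguments are hypothetical.

```python
from itertools import combinations


def additive_multibreed_cov(p_sire, p_dam, intrabreed, interbreed):
    """Additive covariance matrix of a calf, following the verbal
    description of Equation [1] (an assumed reading, see lead-in).

    p_sire, p_dam : expected breed fractions of sire and dam.
    intrabreed    : list of nt x nt matrices, one per breed.
    interbreed    : dict mapping a breed pair (b, b2), b < b2,
                    to its nt x nt interbreed matrix.
    """
    nb = len(p_sire)
    nt = len(intrabreed[0])
    # Assumption: calf breed fractions are the parental average.
    p_calf = [0.5 * (s + d) for s, d in zip(p_sire, p_dam)]
    cov = [[0.0] * nt for _ in range(nt)]

    def add(weight, matrix):
        # Accumulate weight * matrix into cov, element by element.
        for r in range(nt):
            for c in range(nt):
                cov[r][c] += weight * matrix[r][c]

    # First term: intrabreed covariances weighted by calf breed fractions.
    for b in range(nb):
        add(p_calf[b], intrabreed[b])
    # Second term: interbreed covariances weighted by the sum over
    # parents of the products of their breed fractions.
    for b, b2 in combinations(range(nb), 2):
        weight = p_sire[b] * p_sire[b2] + p_dam[b] * p_dam[b2]
        add(weight, interbreed[(b, b2)])
    return cov


# Additive covariance priors from Table 1 (assumed column order).
A = [[4.0, 3.0], [3.0, 40.0]]
B = [[6.0, 4.0], [4.0, 60.0]]
AB = [[2.0, 4.0], [4.0, 20.0]]

# F1 calf: straightbred A sire x straightbred B dam.
f1 = additive_multibreed_cov([1.0, 0.0], [0.0, 1.0], [A, B], {(0, 1): AB})
# F2 calf: both parents are F1 (half A, half B).
f2 = additive_multibreed_cov([0.5, 0.5], [0.5, 0.5], [A, B], {(0, 1): AB})
print(f1)
print(f2)
```

Under these assumptions the F1 calf receives no interbreed contribution because both parents are straightbred, whereas the F2 calf has the same intrabreed part plus one-half of the interbreed AB matrix. Both results are linear functions of the same nine additive covariances.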

Five breed groups in 20% intervals were defined. Between 120 and 135 unrelated bulls were generated by mating individuals of these five breed groups (15 breed group combinations). Progeny (between 25,200 and 50,400) were generated by mating these bulls to dams of all 15 breed-group combinations. There were between eight and nine bulls for each one of the 15 breed-group combinations. Maternal grandsires were chosen at random among the generated bulls (except for the avoidance of inbreeding).

The first line of each covariance in Table 1 gives the values of the priors of the covariance matrices used to simulate calf records. Although no trait names are needed, Trait 1 could represent birth weight and Trait 2 weaning weight; thus, numbers in Table 1 would be in squared kilograms.

Table 1. Covariance priors, means, and range of values of REML estimates for two traits from five simulated data sets

Covariance                                Pairs of traits
------------------------------------------------------------------------
Additive
  Intrabreed A             4.0^a           3.0            40.0
                           3.9^b           2.7            40.8
                           (1.7, 5.7)^c    (-.6, 5.6)     (17.0, 55.0)
  Intrabreed B             6.0             4.0            60.0
                           6.2             5.2            52.4
                           (3.9, 7.6)      (3.9, 6.8)     (41.7, 60.3)
  Interbreed AB            2.0             4.0            20.0
                           1.9             1.8            27.0
                           (1.2, 3.2)      (-3.3, 5.8)    (11.1, 48.7)
Nonadditive (1 locus)
  Intraconfiguration 11    3.0             4.0            30.0
                           4.8             5.8            38.9
                           (4.1, 5.3)      (5.0, 7.6)     (32.2, 53.1)
Environmental
  Intrabreed A             6.0             7.0            90.0
                           6.2             7.0            87.1
                           (4.9, 7.6)      (5.7, 9.0)     (72.4, 109.8)
  Intrabreed B             14.0            10.0           240.0
                           14.1            8.8            238.9
                           (12.9, 15.6)    (7.5, 9.4)     (233.1, 249.3)
  Interbreed AB            4.0             8.0            60.0
                           4.3             10.1           67.8
                           (3.5, 4.7)      (7.5, 12.7)    (50.4, 85.0)
------------------------------------------------------------------------
^a Covariance prior.
^b Mean of five REML estimates.
^c (Smallest, largest) value among five REML estimates.

Estimation of Covariance Components

The simulated data were analyzed using a bull model that had sex as a fixed effect and sire and maternal grandsire additive as well as nonadditive genetic effects and residual effects as random effects. The vector of unknown covariances $\phi$ had 21 elements: three additive genetic covariances for breed A, three additive genetic covariances for breed B, three interbreed AB additive genetic covariances, three interbreed AB nonadditive genetic covariances, three environmental covariances for breed A, three environmental covariances for breed B, and three interbreed AB environmental covariances.

All genetic and environmental covariances for the two traits were estimated simultaneously. The convergence criterion used for the Scoring and the EM iterations was that the maximum absolute difference between two iterations had to be less than a preset small number. This number was .01 for both the Scoring and the EM iterations.

The only set of computations that is done only once in the computer program is the construction of 1) six intrabreed and three interbreed matrices of derivatives of additive genetic covariances for each one of the 15 breed-group combinations (15 × 9 = 135 matrices), 2) one matrix of derivatives of nonadditive genetic covariances, and 3) six intrabreed and three interbreed matrices of derivatives of environmental covariances resulting from the mating of sires of 15 breed-group combinations to dams of 15 breed-group combinations (15 × 15 × 9 = 2,025 matrices).

The computations carried out in every round of EM and Scoring iterations were as follows.

E-Step. Inverses of the matrices of covariances of bull additive and nonadditive genetic effects and environmental effects were computed. The number of
matrices computed were 1) 15 additive genetic (one for each breed-group combination), 2) one nonadditive interbreed, and 3) 225 residual (one for each sire breed-group combination × dam breed-group combination subclass). These matrices were used in the E- and M-steps of this procedure.

The predicted values of the vectors $u_a$ and $u_n$ and the matrices of their EVP were computed for all bulls. Also, the predicted values of the v vectors and the matrix of their EVP were computed for all calves. These vectors and matrices were then used to compute matrices containing the sums of products of predicted values plus their EVP. The numbers of these matrices were 1) 15 matrices for bull additive genetic effects ($S_{0ag*}^{(p)}$, ag = 1, ..., 15, Equation [4], Appendix), 2) one matrix for bull nonadditive genetic effects as regressors ($S_{0n1*}^{(p)}$, Equation [5], Appendix), and 3) 225 matrices for residual effects ($S_{0m*}^{(p)}$, Equation [6], Appendix).

M-Step. Matrices $B_{aa}$, $B_{ae}$, $B_{nn}$, and $B_{ee}$, and vectors $d_a$, $d_n$, and $d_e$ were computed here. These are the submatrices and subvectors of the system of Equations [12] (Appendix). Matrices $B_{aa}$, $B_{ae}$, and $B_{ee}$ were 9 × 9. Matrix $B_{nn}$ was 3 × 3. Vector $d_a$ was 9 × 1, $d_n$ was 3 × 1, and $d_e$ was 9 × 1. The estimate of $\phi$ for the (p+1)th EM iteration was obtained by solving the resulting system of equations by successive approximations. Within each Scoring iteration, Equations [12] (Appendix) were solved by direct inversion of the $\{B_{ij}\}$ matrix and subsequent multiplication by vector $\{d_i\}$. The number of Scoring iterations to achieve convergence ranged between one and five.

Starting values were those of the covariances used to simulate the data set. These were 1) the nine additive genetic covariances (intrabreed A, intrabreed B, and interbreed AB), 2) the three covariances for nonadditive intraconfiguration 11, and 3) the nine environmental covariances (intrabreed A, intrabreed B, and interbreed AB). It took from 8 to 12 EM iterations and between 57 min and 113 min to reach convergence.

The mean and the (smallest, largest) values of the REML estimates of covariances in $\phi$ of the five simulated samples are shown in lines two and three of each covariance in Table 1. The means indicate that in these few small data sets seven covariances were underestimated, two were equal to the parameter values, and 12 were overestimated. The smallest and the largest values of the covariance estimates as a percentage of the parameter values were 7.8 and 143.5%. The average absolute difference as a percentage of the parameter values was 17%.

Discussion

The REML methodology presented here will be useful not only in multibreed populations in which breeding animals are both straightbred and crossbred but also when straightbred animals are mated to animals of other breed groups (straightbred or crossbred). In general, multibreed procedures can be applied whenever animals from several distinct subpopulations interbreed.

Modeling Aspects

The model used to present these procedures used a subclass approach to additive genetic effects, whereas a regression approach was used to account for nonadditive genetic effects. Another alternative could have been to use a regression approach to explain additive genetic effects. In the case of multibreed populations of two breeds, each bull would have up to two predicted additive genetic values: one due to its alleles from breed A and another from alleles of breed B. A third one could also be predicted for additive interbreed AB genetic effects. The matrix Z for additive genetic effects will have up to three values per bull (between 0 and 1). The equation for $Q_a(\phi \mid \phi^{(p)})$ will be similar to the one given for nonadditive genetic covariances in regression models ($Q_n(\phi \mid \phi^{(p)})$, Equation [5], Appendix).

Residual covariances were explained in terms of their additive and environmental components. Another option would have been to define residual covariances as part of the $\phi$ vector. In this case 1) the contributions of the residual function to additive genetic covariances will be zero, and 2) the set of Equations [12] (Appendix) will become block diagonal. Because of the large number of different residual covariances possible per trait, simplifying assumptions would need to be made. Possible alternatives could be 1) calves from each breed-group combination would have a unique set of residual covariances, 2) calves from each breed group would have a different residual covariance matrix, 3) residual covariances could be treated as additive genetic covariances, where each covariance would be a linear function of intrabreed and interbreed residual covariances, and 4) a single set of residual covariances is used for all calves.

Although the presentation of this methodology made use of a sire-maternal grandsire model for direct effects only, 1) more complete models (sire-dam, reduced animal, and animal) can be programmed with multibreed features, and 2) maternal effects can also be included in the model and in computer programs. Incorporation of additional animals to be evaluated in the model and of maternal effects will increase computing times substantially. In a bull model, the programming of direct and maternal genetic effects is considerably more complex than the programming of direct genetic effects alone; however, this should be simpler in the sire-dam and (reduced) animal models.

Computational Aspects

The analyses of the small simulated data presented here included only one source of nonadditive variation (intralocus interbreed). However, real data will
probably contain additional sources of nonadditive variation (e.g., intralocus intrabreed). Ideally all sources of nonadditive variation should be accounted for in the model. However, assumptions relative to the number of loci considered for nonadditive genetic covariances will usually need to be made because of cost or computational feasibility. Furthermore, which nonadditive genetic effects will be able to be included in the mixed model used to predict the u's and the v's will largely depend on the dependencies and multicollinearity that exist among them. These two factors need to be closely monitored in unbalanced data sets because missing data can cause both confounding and multicollinearity. If this happened, further simplifying assumptions may be needed to analyze those data sets.

The numbers of bulls and progeny per bull in field data sets will probably be substantially larger than the eight or nine bulls considered in the simulated data sets. Thus, the values of the covariance estimates should be closer to the parameter values than the ones obtained here, assuming that all important genetic and environmental effects were accounted for in the model.

The computing times of the simulations were probably longer than needed because the computer program used was a research tool that has not been optimized for speed. In addition, the computer program had checks at various points in the computational procedure that must have added time to each round of iteration. However, placing the covariances used in the simulation as priors is likely to reduce the number of EM iterations needed to achieve convergence. A small test was conducted to check whether 1) the number of EM iterations needed to achieve convergence and 2) the convergence values would be the same when priors were equal to and different from the simulation covariances. Two additional small data sets of similar structure and size to the five previous data sets were generated. Three runs per data set were carried out. In the first run, simulation covariances were used as priors. In the second run, the prior values used were lower than the simulation covariances. In the third run, prior values higher than the simulation covariances were used. In runs two and three, only two prior covariance matrices were used, one for all genetic effects and another for all environmental effects. The only consequence of using these low and high priors was the need for one additional EM iteration to achieve convergence; all covariance estimates at convergence were the same.

Programming Aspects

Programming these procedures is more involved than intrabreed procedures. Because the number of random genetic effects to be predicted per bull may be substantially larger than for a single breed, sparse-matrix procedures (e.g., FSPAK, Perez-Enciso and Misztal, personal communication) become a must if solutions are to be obtained directly. For very large data sets (tens of thousands to millions of animals) iterative procedures will be needed to compute predictions of u's and v's as well as suitable approximations to the EVP of the u's and v's.

Implications

The procedures to estimate covariance components developed here make possible the prediction of additive and nonadditive genetic values of animals in multibreed populations, in systematic crossbreeding programs, semen importation and, in general, when animals of several distinct subpopulations interbreed. Although their computational requirements may be substantially larger than intrabreed covariance estimation procedures, the number of covariances to be estimated can be largely decreased by using an appropriate set of assumptions.

Literature Cited

Bard, Y. 1974. Nonlinear Parameter Estimation. Academic Press, New York.
BIF. 1990. Guidelines for uniform beef improvement programs. Beef Improvement Federation, Oklahoma State Univ., Stillwater.
Dempster, A. P., N. M. Laird, and D. B. Rubin. 1977. Maximum Likelihood from Incomplete Data via the EM Algorithm. J. Royal Stat. Soc., Ser. B 39:1.
Elzo, M. A. 1983. Multibreed sire evaluation within and across countries. Ph.D. Dissertation, Univ. of California, Davis.
Elzo, M. A. 1990a. Recursive procedures to compute the inverse of the multiple trait additive genetic covariance matrix in inbred and noninbred multibreed populations. J. Anim. Sci. 68:1215.
Elzo, M. A. 1990b. Covariances among sire by breed group of dam interaction effects in multibreed sire evaluation procedures. J. Anim. Sci. 68:4079.
Elzo, M. A., and T. R. Famula. 1985. Multibreed sire evaluation procedures within a country. J. Anim. Sci. 60:942.
Fletcher, R. 1974. Methods Related to Lagrangian Functions. In: P. E. Hill and W. Murray (Ed.) Numerical Methods for Constrained Optimization. p 219. Academic Press, New York.
Harville, D. A. 1977. Maximum likelihood approaches to variance component estimation and to related problems. J. Am. Stat. Assoc. 72:320.
Harville, D. A., and T. P. Callahan. 1990. Computational aspects of likelihood-based inference for variance components. In: D. Gianola and K. Hammond (Ed.) Advances in Statistical Methods for Genetic Improvement of Livestock. p 136. Springer-Verlag, New York.
Lo, L. L., R. L. Fernando, and M. Grossman. 1994. Genotypic covariance between relatives in multibreed populations: Additive model. Theor. Appl. Genet. 87:423.
Marquardt, D. W. 1963. An algorithm for least-squares estimation of nonlinear parameters. J. Soc. Indust. Appl. Math. 11:431.
Ryan, D. M. 1974. Penalty and Barrier Functions. In: P. E. Hill and W. Murray (Ed.) Numerical Methods for Constrained Optimization. p 175. Academic Press, New York.
Sargent, R.W.H. 1974. Reduced-gradient and Projection Methods for Nonlinear Programming. In: P. E. Hill and W. Murray (Ed.) Numerical Methods for Constrained Optimization. p 149. Academic Press, New York.
Searle, S. R., G. Casella, and C. E. McCulloch. 1992. Variance Components. John Wiley & Sons, New York.
Swann, W. H. 1974. Constrained Optimization by Direct Search. In: P. E. Hill and W. Murray (Ed.) Numerical Methods for Constrained Optimization. p 191. Academic Press, New York.

Appendix

Derivation of the Computational Procedure to Estimate Covariance Components in Multibreed Populations

The computational procedure used to solve for the REML estimates of $\phi$ makes use of the expectation-maximization (EM) algorithm (Dempster et al., 1977). The EM algorithm used here is based on the general version described by Dempster et al. (1977), which relies on the function

$$Q(\phi' \mid \phi) = E[\ln f(x \mid \phi') \mid z, \phi], \qquad [1]$$

which is assumed to exist for all pairs $(\phi', \phi)$, where $x$ = complete data and $z$ = incomplete data. Also, $f(x \mid \phi)$ is assumed to be an increasing function almost everywhere.
The incomplete data used here are defined to be a linear combination of the vector of observations, $K'y$, where $K'$ is a matrix of contrasts such that $K'X = 0$. The complete data are considered to be the vectors of unknown random effects in the model ($u$ and $v$). The log-likelihood of the complete data, $L_c$, is:

$$\ln f(u_a, u_n, v \mid \phi) = \text{constant} + \ln f(u_a \mid \phi) + \ln f(u_n \mid \phi) + \ln f(v \mid \phi), \qquad [2]$$


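where

$$\ln f(u_a \mid \phi) = -\frac{1}{2} \sum_{ag=1}^{\mathit{nbgcom}} \sum_{i=1}^{n_{ag}} \left[ \ln |G_{0ag}| + u_{agi}' G_{0ag}^{-1} u_{agi} \right],$$

$$\ln f(u_n \mid \phi) = -\frac{1}{2} \sum_{i=1}^{\mathit{nbu}} \left[ \ln |G_{0nk}| + u_{nki}' G_{0nk}^{-1} u_{nki} \right],$$

and

$$\ln f(v \mid \phi) = -\frac{1}{2} \sum_{m=1}^{M} \sum_{i=1}^{n_m} \left[ \ln |R_{0m}^{*}| + v_{mi}' (R_{0m}^{*})^{-1} v_{mi} \right],$$

and $u_a$ = vector of bull additive genetic effects; $u_n$ = vector of bull nonadditive genetic effects; $n_{ag}$ = number of bulls in breed group combination $ag$; $\mathit{nbu}$ = total number of bulls; and $n_m$ = number of calves in calf group $m$.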
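The error contrasts $K'y$ that serve as incomplete data can be illustrated numerically. The following Python sketch builds one possible $K$ whose columns span the orthogonal complement of the column space of $X$, so that $K'X = 0$. The function name `error_contrasts`, the tolerance, and the use of a singular value decomposition are assumptions made for this illustration; they are not taken from the procedure described here.

```python
import numpy as np

def error_contrasts(X, tol=1e-10):
    """Return a matrix K with K'X = 0 (hypothetical helper for illustration).

    The columns of K are the left singular vectors of X that correspond to
    zero singular values, so they are orthogonal to every column of X.
    """
    X = np.asarray(X, dtype=float)
    U, s, _ = np.linalg.svd(X, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))  # numerical rank of X
    return U[:, rank:]                     # n x (n - rank) matrix

# Small design matrix: an overall mean plus two group indicators (rank 2).
X = np.array([[1, 1, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 0, 1],
              [1, 0, 1]], dtype=float)
K = error_contrasts(X)
```

Any other basis of the same space would serve equally well, because the restricted likelihood does not depend on the particular choice of $K$.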
The EM algorithm is an iterative procedure that has two steps in each iteration: 1) an expectation step (E-step), and 2) a maximization step (M-step). The E-step consists of computing $Q(\phi \mid \phi^{(p)})$, and in the M-step, $\phi^{(p+1)}$ is computed, where $\phi^{(p+1)}$ is the value of $\phi$ that maximizes $Q(\phi \mid \phi^{(p)})$. The M-step is accomplished by iteration because the differentiation of $Q(\phi \mid \phi^{(p)})$ with respect to $\phi$ results in a nonlinear set of equations. The derivation of the E-step and the M-step for the $(p+1)$th iteration is described below.
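The two steps can be sketched on a much simpler problem than the multibreed model. The following Python example is a minimal sketch, assuming a single-trait model with an overall mean, one random group effect, and a residual, so that only two variance components are estimated. The function name `em_reml_one_way`, the starting values, and the stopping rule are assumptions made for this illustration. In this toy case the M-step has a closed form, unlike the nonlinear M-step of the multibreed procedure.

```python
import numpy as np

def em_reml_one_way(y, groups, var_u=1.0, var_e=1.0, tol=1e-8, max_iter=1000):
    """Toy EM-REML for y = mu + u[group] + e (illustrative helper only)."""
    y = np.asarray(y, dtype=float)
    levels, idx = np.unique(groups, return_inverse=True)
    n, q = y.size, levels.size
    # W = [X Z]: a column of ones for mu and one indicator column per group.
    W = np.zeros((n, 1 + q))
    W[:, 0] = 1.0
    W[np.arange(n), 1 + idx] = 1.0
    for iteration in range(1, max_iter + 1):
        # Mixed-model equations scaled by var_e: (W'W + lambda * I_u) s = W'y.
        lhs = W.T @ W
        lhs[1:, 1:] += (var_e / var_u) * np.eye(q)
        C = np.linalg.inv(lhs)
        sol = C @ (W.T @ y)
        u_hat = sol[1:]
        e_hat = y - W @ sol
        # E-step: squared predictions plus error variances of prediction.
        S_u = u_hat @ u_hat + var_e * np.trace(C[1:, 1:])
        S_e = e_hat @ e_hat + var_e * np.trace(W @ C @ W.T)
        # M-step: closed form in this toy model.
        new_u, new_e = S_u / q, S_e / n
        change = max(abs(new_u - var_u), abs(new_e - var_e))
        var_u, var_e = new_u, new_e
        if change < tol:
            break
    return var_u, var_e, iteration
```

On balanced data with positive analysis-of-variance estimates, the values returned by this sketch should agree with those estimates, which gives a simple check of the implementation.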
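E-step. The function $Q(\phi \mid \phi^{(p)})$, ignoring the constant term, is

$$Q(\phi \mid \phi^{(p)}) = Q_a(\phi \mid \phi^{(p)}) + Q_n(\phi \mid \phi^{(p)}) + Q_e(\phi \mid \phi^{(p)}). \qquad [3]$$
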
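The function for additive genetic covariances is

$$Q_a(\phi \mid \phi^{(p)}) = E\left[ -\frac{1}{2} \sum_{ag=1}^{\mathit{nbgcom}} \sum_{i=1}^{n_{ag}} \left( \ln |G_{0ag}| + u_{agi}' G_{0ag}^{-1} u_{agi} \right) \,\Big|\, K'y, \phi^{(p)} \right]$$

$$= -\frac{1}{2} \sum_{ag=1}^{\mathit{nbgcom}} \left[ n_{ag} \ln |G_{0ag}| + \operatorname{tr}\left( G_{0ag}^{-1} \sum_{i=1}^{n_{ag}} E[u_{agi} u_{agi}' \mid K'y, \phi^{(p)}] \right) \right]$$

$$= -\frac{1}{2} \sum_{ag=1}^{\mathit{nbgcom}} \left[ n_{ag} \ln |G_{0ag}| + \operatorname{tr}\left( G_{0ag}^{-1} S_{0ag.}^{(p)} \right) \right], \qquad [4]$$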


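where

$$S_{0ag.}^{(p)} = \sum_{i=1}^{n_{ag}} E[u_{agi} u_{agi}' \mid K'y, \phi^{(p)}]$$

$$= \sum_{i=1}^{n_{ag}} \left( E[u_{agi} \mid K'y, \phi^{(p)}] \, E[u_{agi}' \mid K'y, \phi^{(p)}] + \operatorname{var}(u_{agi} \mid K'y, \phi^{(p)}) \right)$$

$$= \sum_{i=1}^{n_{ag}} \left[ \hat{u}_{agi} \hat{u}_{agi}' + \operatorname{var}(\hat{u}_{agi} - u_{agi}) \right],$$

and where

$\hat{u}_{agi}$ = BLUP of $u_{agi}$, and

$\operatorname{var}(\hat{u}_{agi} - u_{agi})$ = EVP of $u_{agi}$.
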
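Similarly, the function for nonadditive genetic covariances is

$$Q_n(\phi \mid \phi^{(p)}) = -\frac{1}{2} \left[ \mathit{nbu} \ln |G_{0nk}| + \operatorname{tr}\left( G_{0nk}^{-1} S_{0nk.}^{(p)} \right) \right], \qquad [5]$$

where

$$S_{0nk.}^{(p)} = \sum_{i=1}^{\mathit{nbu}} E[u_{nki} u_{nki}' \mid K'y, \phi^{(p)}] = \sum_{i=1}^{\mathit{nbu}} \left[ \hat{u}_{nki} \hat{u}_{nki}' + \operatorname{var}(\hat{u}_{nki} - u_{nki}) \right],$$

and where

$\hat{u}_{nki}$ = BLUP of $u_{nki}$, and

$\operatorname{var}(\hat{u}_{nki} - u_{nki})$ = EVP of $u_{nki}$.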

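Finally, the function for residual covariances is

$$Q_e(\phi \mid \phi^{(p)}) = -\frac{1}{2} \sum_{m=1}^{M} \left[ n_m \ln |R_{0m}^{*}| + \operatorname{tr}\left( (R_{0m}^{*})^{-1} S_{0m.}^{(p)} \right) \right], \qquad [6]$$

where

$$S_{0m.}^{(p)} = \sum_{i=1}^{n_m} E[v_{mi} v_{mi}' \mid K'y, \phi^{(p)}] = \sum_{i=1}^{n_m} \left[ \hat{v}_{mi} \hat{v}_{mi}' + \operatorname{var}(\hat{v}_{mi} - v_{mi}) \right],$$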

and where

$\hat{v}_{mi}$ = BLUP of $v_{mi}$, and

$\operatorname{var}(\hat{v}_{mi} - v_{mi})$ = EVP of $\hat{v}_{mi}$.

The EM algorithm requires the function $Q(\phi \mid \phi^{(p)})$ to increase at each EM iteration. This is accomplished by choosing $\phi^{(p+1)}$ as the value that maximizes $Q(\phi \mid \phi^{(p)})$. However, to compute $\phi^{(p+1)}$ by maximizing $Q(\phi \mid \phi^{(p)})$ in the M-step, only $S_{0ag.}^{(p)}$, $S_{0nk.}^{(p)}$, and $S_{0m.}^{(p)}$ are needed (i.e., the complete $Q(\phi \mid \phi^{(p)})$ function does not need to be computed). Thus, the quantities that need to be computed in the E-step are $S_{0ag.}^{(p)}$, $S_{0nk.}^{(p)}$, and $S_{0m.}^{(p)}$, which are functions of the predicted values of $u_a$, $u_n$, and $v$ and their respective EVP. The predicted values of $u_a$ and $u_n$ are obtained by solving the mixed-model equations (MME) for the bull model (Equation [4] in the main text) and their EVP from elements of the inverse of the left-hand side. The predicted values of $v$ are computed as

$$\hat{v} = y - X b^{\circ} - Z \hat{u}, \qquad [7]$$

and their EVP as

where the $\{C_{ij}\}$ are submatrices of the inverse of the left-hand side of the MME for the bull model.
M-step. The vector $\phi^{(p+1)}$ is computed by maximizing the function $Q(\phi \mid \phi^{(p)})$. This requires differentiating $Q_a(\phi \mid \phi^{(p)})$, $Q_n(\phi \mid \phi^{(p)})$, and $Q_e(\phi \mid \phi^{(p)})$ with respect to $\phi$ and equating the resulting set of equations to zero. The derivative of the additive genetic function is

where $G_{0ag}^{-1} G_{0ag}$, with $G_{0ag}$ written as $\sum_{j=1}^{N_a} \frac{\partial G_{0ag}}{\partial \phi_j} \phi_j$, was inserted in the first term.

A similar strategy is used to obtain the derivative of the nonadditive and residual functions. Thus, the
derivative of the nonadditive genetic function is

and the derivative of the residual function is

The set of equations to be solved in the M-step is

where


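and where

$$D_{0agi} = \frac{\partial G_{0ag}}{\partial \phi_i}, \quad i, j = 1, \ldots, N_a; \qquad D_{0mi}^{*} = \frac{\partial R_{0m}^{*}}{\partial \phi_i}, \text{ and}$$

$$B_{ae} = \left\{ \sum_{m=1}^{M} n_m \operatorname{tr}\left[ (R_{0m}^{*})^{-1} D_{0mi}^{*} (R_{0m}^{*})^{-1} D_{0mj}^{*} \right] \right\}$$

for $i = 1, \ldots, N_a$, $j = N_a + N_n + 1, \ldots, N$;

$$B_{nn} = \left\{ \mathit{nbu} \operatorname{tr}\left( G_{0nk}^{-1} D_{0nki} G_{0nk}^{-1} D_{0nkj} \right) \right\},$$

and where

$$D_{0nki} = \frac{\partial G_{0nk}}{\partial \phi_i}, \text{ and } i, j = N_a + 1, \ldots, N_a + N_n;$$

$$B_{ee} = \left\{ \sum_{m=1}^{M} n_m \operatorname{tr}\left[ (R_{0m}^{*})^{-1} D_{0mi}^{*} (R_{0m}^{*})^{-1} D_{0mj}^{*} \right] \right\}, \text{ for } i, j = N_a + N_n + 1, \ldots, N;$$

$$d_a = \left\{ \sum_{ag=1}^{\mathit{nbgcom}} \operatorname{tr}\left( G_{0ag}^{-1} D_{0agi} G_{0ag}^{-1} S_{0ag.}^{(p)} \right) + \sum_{m=1}^{M} \operatorname{tr}\left[ (R_{0m}^{*})^{-1} D_{0mi}^{*} (R_{0m}^{*})^{-1} S_{0m.}^{(p)} \right] \right\}, \text{ for } i = 1, \ldots, N_a;$$

and

$$d_e = \left\{ \sum_{m=1}^{M} \operatorname{tr}\left[ (R_{0m}^{*})^{-1} D_{0mi}^{*} (R_{0m}^{*})^{-1} S_{0m.}^{(p)} \right] \right\}, \text{ for } i = N_a + N_n + 1, \ldots, N.$$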
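The structure of these coefficients can be illustrated for the simplest case. The following Python sketch assumes a single unstructured covariance matrix $G$ with $n$ contributing vectors, builds coefficients of the form $n \operatorname{tr}(G^{-1} D_i G^{-1} D_j)$ and right-hand sides of the form $\operatorname{tr}(G^{-1} D_i G^{-1} S)$, as in the $B_{nn}$ term above, and solves the resulting linear system once. The function names `basis_matrices` and `scoring_step` and the numerical values are assumptions made for this illustration.

```python
import numpy as np

def basis_matrices(t):
    """Derivatives D_i of a symmetric t x t matrix with respect to each
    distinct element (upper triangle, row by row)."""
    mats = []
    for r in range(t):
        for c in range(r, t):
            D = np.zeros((t, t))
            D[r, c] = D[c, r] = 1.0
            mats.append(D)
    return mats

def scoring_step(G, S, n):
    """One solve of B * phi = d for a single unstructured covariance matrix."""
    G_inv = np.linalg.inv(G)
    Ds = basis_matrices(G.shape[0])
    A = [G_inv @ D @ G_inv for D in Ds]            # G^{-1} D_i G^{-1}
    B = np.array([[n * np.trace(Ai @ Dj) for Dj in Ds] for Ai in A])
    d = np.array([np.trace(Ai @ S) for Ai in A])
    phi = np.linalg.solve(B, d)
    return sum(p * D for p, D in zip(phi, Ds))     # rebuild the matrix

G_start = np.array([[2.0, 0.5], [0.5, 1.0]])       # current covariance matrix
S = np.array([[30.0, 6.0], [6.0, 12.0]])           # E-step sum of products
G_new = scoring_step(G_start, S, n=10)
```

In this simplified case a single solve returns $S/n$ whatever the starting matrix. The multibreed equations are nonlinear in $\phi$, so the corresponding solve has to be repeated until the estimates stop changing.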

Equations [12]: 1) are nonlinear in $\phi$, thus, $\phi^{(p+1)}$ must be computed iteratively (i.e., by successive approximations [Harville, 1977; Harville and Callahan, 1990]), 2) are equal to those obtained by a Scoring Algorithm applied to maximizing $Q(\phi \mid \phi^{(p)})$ (R. L. Quaas, Cornell University, personal communication), and 3) have no built-in restrictions on the values of the covariances, thus there is no guarantee that all parameters being estimated will be within the parameter space. Thus, computer programs need to incorporate restrictions to ensure that estimates of covariance matrices are symmetric positive definite (or at least symmetric positive semidefinite) at each Scoring iteration and at each EM iteration. Restriction strategies that could be considered include 1) barrier and penalty functions (Fletcher, 1974; Ryan, 1974; Harville, 1977), 2) gradient projection methods (Sargent, 1974; Harville, 1974), and 3) direct search methods (Swann, 1974).
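As a rough illustration of what such a restriction has to achieve, the following Python sketch repairs a symmetric matrix by raising any eigenvalue below a small floor. This eigenvalue clipping is a deliberately simple stand-in chosen for this example; it is not one of the barrier, penalty, gradient projection, or direct search strategies cited above, and nothing in the text indicates that it was the restriction used. The function name `force_positive_definite` and the floor value are assumptions.

```python
import numpy as np

def force_positive_definite(M, floor=1e-6):
    """Return a symmetric positive-definite matrix close to M by raising
    eigenvalues that fall below `floor` (illustrative repair only)."""
    M = np.asarray(M, dtype=float)
    sym = 0.5 * (M + M.T)                 # enforce symmetry first
    vals, vecs = np.linalg.eigh(sym)      # eigen-decomposition of symmetric matrix
    vals = np.maximum(vals, floor)        # clip small or negative eigenvalues
    return (vecs * vals) @ vecs.T         # V diag(vals) V'

# An estimate that left the parameter space: the implied correlation exceeds 1.
bad = np.array([[1.0, 2.0],
                [2.0, 1.0]])
fixed = force_positive_definite(bad)
```

A matrix that is already positive definite passes through essentially unchanged, so a check of this kind could in principle be applied after every Scoring and EM iteration.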
The convergence criterion used to stop the Scoring and the EM iterations was that the absolute change in the estimates of covariances of two successive iterations was small (Bard, 1974; Searle et al., 1992). Thus, convergence was achieved when

$$|\phi^{(p+1)} - \phi^{(p)}| \le t, \qquad [13]$$

where $t$ is a vector of small numbers. The values of $t$ can be either set in advance or computed by the program (Bard, 1974). In the second case, Bard (1974) recommended using Marquardt's (1963) expression

$$t_i = \tau_1 \left( |\phi_i^{(p)}| + \tau_2 \right), \qquad [14]$$

where $\tau_1 = 10^{-4}$ and $\tau_2 = 10^{-3}$.
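The stopping rule can be sketched in a few lines of Python. The example assumes that each threshold has the form $\tau_1(|\phi_i| + \tau_2)$, evaluated at the previous iterate, and that convergence requires every element to satisfy its own threshold; the function name `has_converged` and the sample values are illustrative assumptions.

```python
import numpy as np

def has_converged(phi_new, phi_old, tau1=1e-4, tau2=1e-3):
    """Element-wise test of |phi_new - phi_old| against thresholds
    t_i = tau1 * (|phi_old_i| + tau2) (assumed form of the thresholds)."""
    phi_new = np.asarray(phi_new, dtype=float)
    phi_old = np.asarray(phi_old, dtype=float)
    t = tau1 * (np.abs(phi_old) + tau2)   # one threshold per covariance
    return bool(np.all(np.abs(phi_new - phi_old) <= t))

# Changes of 0.0005 and 0.0001 are within the thresholds; a change of 0.01 is not.
close_enough = has_converged([10.0, -2.0], [10.0005, -2.0001])
still_moving = has_converged([10.0], [10.01])
```

Because the thresholds scale with the size of each estimate, large and small covariances are held to a similar relative precision, while $\tau_2$ keeps the threshold from collapsing to zero for estimates near zero.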
