Received: 20 January 2012 Revised: 10 April 2012 Accepted: 16 April 2012 Published online in Wiley Online Library: 12 June 2012
Abstract
BACKGROUND: The protein and amino acid contents of peanuts play a key role in determining their quality and value.
Therefore, accurate, nondestructive, quick, and automated measurement of these components would be valuable in a
commercial environment. This study explored the feasibility of determining the contents of protein and amino acids in peanuts
using near infrared–reflectance spectroscopy (NIRS).
RESULTS: A total of 141 peanut samples were collected from 12 provinces in China, and their spectra were acquired with an NIRS
system. The determination coefficient (R²) and the ratio of the standard deviation of the validation set to the standard error of
validation (RPD) were, respectively, 0.99 and 6.53 for protein, 0.88 and 2.52 for Asp, 0.83 and 3.00 for Thr, 0.86 and 2.40 for Ser,
0.87 and 2.57 for Glu, 0.88 and 2.36 for Gly, 0.88 and 3.00 for Leu, 0.89 and 2.88 for Arg, and 0.96 and 7.50 for Cys.
CONCLUSIONS: NIRS combined with multivariate calibration has significant potential in determining the protein and amino
acid contents of peanuts. This method is suitable for use in an industrial setting owing to its ease of use as well as the relatively
low cost of obtaining and running the necessary equipment.
© 2012 Society of Chemical Industry
Quality Control, Beijing 100193, People’s Republic of China
Some researchers found that amino acids, including aspartic and [...]
[...] considered for analysis in such complex food systems as peanuts, which contain water, oil, fat, and protein, among others. NIRS is ideal for quantitatively determining oil, protein and moisture by detecting C–H, N–H and O–H bonds. In addition, high scatter coefficients allow for excellent diffuse reflectance spectra of solids to be obtained.
NIRS may be applied with minimal sample preparation, and it has been used in amino acid analysis by several researchers with varying degrees of success. Van Kempen and Simmin¹² evaluated NIRS for estimating the digestible amino acid contents of several feed ingredients of animal origin. Cross-validation of their calibration models for the prediction of lysine and methionine resulted in determination coefficients (R²) ranging from 0.80 to 0.95. Williams et al.¹³ reported satisfactory results (R² = 0.66–0.96) in correlating NIRS data on ground wheat and barley with their amino acid concentrations. Wu et al.¹⁴ showed the applicability of NIRS to amino acid analysis in milled rice powder. In their study, most of the amino acid calibration models had high determination coefficients (R² = 0.85–0.98), except for those of cysteine (R² = 0.78), histidine (R² = 0.65), and methionine (R² = 0.10). Pazdernik et al.¹⁵ demonstrated that the accuracy of NIRS screening for amino and fatty acid concentrations in soybeans may be improved by grinding. Fontaine et al.¹⁶,¹⁷ obtained R² values of 0.84–0.98 for soybeans and soybean meal. Overall, the predictive ability of amino acid calibration models was found to be dependent on the type of grain, sample form (whole grain or ground), and specific amino acid. In peanut-related research, Rao et al.¹⁸ reported a high determination coefficient (R² = 0.97) for peanut oil calibration models; Sundaram et al.¹⁹ assessed the applicability of NIRS for nondestructive analysis of oil and fatty acids in peanuts. They found that most of the fatty acid calibration models had high residual predictive deviation (RPD) values. Single-kernel devices were used to determine the moisture and oil concentrations in nuts and grains by crushing a kernel of a nut or grain from the bulk sample and measuring its conductivity.²⁰,²¹ Despite these advances in research, very little is known about the use of NIRS for protein and amino acid analysis in peanuts.
Therefore, the aim of the present study was to evaluate the ability of NIRS to predict the protein and amino acid contents of peanut samples. Peanuts are a main source of plant protein for humans. Modern diet formulation methods balance rations based on amino acid contents, thereby increasing the need to develop rapid and cost-effective techniques for protein and amino acid measurement.
MATERIALS AND METHODS
Samples
One hundred and forty-one peanut samples were collected in 2011 from the following 12 provinces in China: Shandong, 89 varieties; Henan, 21 varieties; Guangdong, 8 varieties; Fujian, 5 varieties; Guangxi, 4 varieties; Hubei, 4 varieties; Jiangsu, 5 varieties; Jiangxi, 1 variety; Liaoning, 1 variety; Hunan, 1 variety; Sichuan, 1 variety and Hebei, 1 variety. Ninety-nine varieties were assigned to the calibration set, whereas the remaining 42 varieties constituted the validation set. All samples were stored at 4 °C from collection until analysis.
Chemical analysis
Samples were analyzed in duplicate and then averaged. Dry matter (moisture) was determined by accurately weighing 2 g of the sample in a tared glass, drying it in a ventilated oven for 4 h at 103 °C,²² cooling it afterwards in a desiccator and weighing again. The crude protein (CP) content of the peanut samples was measured using methods established by the American Association for Cereal Chemists,²³ and evaluated using a FOSS 2300 Nitrogen Analyzer (FOSS, Sweden) with a conversion factor of 5.46. Aspartic acid (Asp), threonine (Thr), serine (Ser), glutamic acid (Glu), glycine (Gly), leucine (Leu), arginine (Arg), and cysteine (Cys) were analyzed and measured in our department using the official method of the American Association for Cereal Chemists.²⁴
NIRS analysis
Peanut samples were scanned using a diode array analyzer (DA 7200, Perten Instruments, Huddinge, Sweden). Each sample (60 g) was placed in a 75 mm diameter cup that rotated during NIRS scanning. Absorbance readings at 5 nm wavelength increments were collected over an NIR wavelength range of 950–1650 nm. Three scans were conducted on each sample and the data were averaged before analysis.
Using multivariate regression software (Unscrambler, v. 9.7, Camo, Oslo, Norway), partial least squares (PLS) regression (in which one variable is modeled) was performed to develop prediction models for protein and amino acids. Principal components analysis (PCA) was performed before the application of PLS regression to reduce the spectral data and derive the first nine principal components in order to examine the possible grouping of samples and to detect spectral outliers.
Protein and amino acids were the dependent variables. PLS regression analysis was conducted using the spectra from the calibration dataset to develop an empirical equation for predicting the concentrations of total protein and amino acids (Asp, Thr, Ser, Glu, Gly, Leu, Arg, and Cys). The validation step was carried out using the full cross-validation method and the external validation method to ensure predictive ability and to avoid over-fitting of the data.²⁵ With cross-validation, the same samples were used for both the calibration step and the validation step, and the outlier samples were not omitted. A sample was removed from the calibration dataset, and the model was calibrated on the remaining data points. The value for the removed sample was predicted, and the prediction residual was computed. This process was repeated with another sample in the calibration set until every object had been removed once, after which all prediction residuals were combined to compute the validation residual variance and the root mean square error of cross validation (RMSEV). The standard error of calibration (SEC), the coefficient of determination in calibration (R²c), and RMSEV were calculated to evaluate the predictive ability of the models. Moreover, the standard error of prediction (SEP), bias, slope and the coefficient of determination in validation (R²v) were calculated based on external validation to evaluate the predictive ability of the models developed for the whole sample set. The ratio of the standard deviation of the validation set to the standard error of validation or prediction (RPD) was also used to evaluate the goodness of fit. RPD values usually range from 1 to 10. Higher values indicate a stronger calibration model for the accurate prediction of unknown sample composition. RPD values of 1 or less are an indication of an inadequate model. RPD values in the range 3.1–4.9 are good for initial screening purposes (where an accurate prediction is not needed when a large number of peanut batches are handled), and values greater than 5 (range 5–6.4) are good for quality control and prediction.²⁶⁻²⁸
To develop NIRS calibration models for prediction of protein and
Table 1. Statistics of the protein and amino acids in peanuts (n = 141) in the calibration and validation set and the repeatability (Sr) of reference methods
                All samples                     Calibration set                 Validation set
Constituent     n    Range        Mean ± SD     n    Range        Mean ± SD     n    Range        Mean ± SD     Sr
Crude protein   141  226.2–342.0  268.1 ± 19.6  99   237.1–342.0  282.5 ± 18.6  42   226.2–330.1  280.2 ± 22.5  5.40
ASP             141  22.6–49.0    33.4 ± 5.3    99   29.2–49.0    35.7 ± 3.7    42   22.6–45.6    29.8 ± 4.6    1.70
THR             141  5.4–9.6      7.3 ± 0.9     99   5.4–8.4      7.2 ± 0.5     42   5.4–9.6      7.4 ± 1.1     0.09
SER             141  8.8–20.8     13.8 ± 1.8    99   11.6–20.8    14.9 ± 1.6    42   8.8–18.3     12.0 ± 2.0    0.12
GLU             141  32.1–93.6    56.1 ± 10.6   99   46.4–93.6    63.1 ± 8.7    42   32.1–87.5    50.0 ± 11.9   2.34
GLY             141  10.3–23.6    16.1 ± 2.1    99   14.2–23.6    17.7 ± 1.5    42   10.3–21.6    14.5 ± 2.3    0.27
LEU             141  12.7–25.3    18.7 ± 1.5    99   15.7–25.3    18.9 ± 1.7    42   12.7–20.3    17.0 ± 2.1    0.85
ARG             141  22.8–47.3    33.5 ± 3.9    99   28.3–47.3    35.4 ± 3.6    42   22.8–46.2    30.6 ± 4.5    1.09
CYS             141  0.6–5.5      3.9 ± 0.7     99   2.5–5.5      4.0 ± 0.4     42   0.6–5.1      3.7 ± 1.0     0.13
amino acids, the entire (whole) dataset (141 samples) was split up into a calibration set and a validation set. This experimental design was chosen because a sample population could contain spectrally similar samples whose characterization, in terms of the reference analysis, could be expensive and unjustified.²⁹
RESULTS AND DISCUSSION
Materials
Basic statistics of the protein and amino acid compositions for the calibration and validation sets are summarized in Table 1. Ninety-nine spectra were assigned to the calibration set, while the remaining 42 constituted the validation set. The results reveal a broad range in protein and amino acid contents, especially in the calibration set, which is important for the development of calibration models. However, the range in the validation set was smaller for protein and most amino acids. The precision of the chemical methods was assessed in terms of the standard deviation of the difference between the replicates (i.e. repeatability; Table 1). Repeatability was calculated to evaluate the quality of the NIR predictive models. In addition, the mean and standard deviation values indicated that the formed sets were characterized by even constituent distributions, suggesting that calibration sets will weight the calibration model equally across the entire concentration range, with minimal residuals at the extremes and relatively equal weighting at the center.²⁹
Chemical composition and NIRS analysis
Figure 1 shows the raw NIRS spectra of peanut samples over the spectral range of 950–1650 nm. It contains information on the relative proportions of C–H bonds (usually from fats and oil), O–H bonds (found in water) and N–H bonds (found in protein).³⁰ Five absorption regions were observed (Fig. 2); the peaks at 950 nm were probably related to second overtone O–H and N–H stretches and third overtone C–H stretches; the peaks near 1120, 1180 and 1230 nm corresponded to the second overtone of C–H stretching vibration, the combination band of O–H stretching and O–H deformation, respectively; and the peaks around 1410 nm were related to the first overtone of O–H and N–H in amino and amide groups, which was the combination of the N–H stretching and vibration with other vibration modes of the specific molecule.¹¹,³¹ A second-derivative mathematical treatment was used to show the five peaks, which were probably related to the second overtone O–H and N–H stretches and the third overtone C–H stretches (Fig. 2). Figure 1 shows that the NIR spectra of different peanut varieties were very similar to one another, and that the overlapping bands due to overtones and combination modes make quantitative or qualitative analysis not straightforward. Therefore, chemometric methods [...]
Figure 1. NIRS spectra of peanut samples (wavelength (nm) on the abscissa and absorbance on the ordinate).
wileyonlinelibrary.com/jsfa
© 2012 Society of Chemical Industry J Sci Food Agric 2013; 93: 118–124
Contents of protein and amino acids in peanuts using reflectance spectroscopy www.soci.org
[Figure 2 plot: second-order derivative (y-axis) versus wavelength (x-axis), with absorption regions labelled A–E.]
Figure 2. The second-order derivative spectra of the peanut samples: (A) second overtone O–H and N–H stretches and third overtone C–H stretches; (B,
C, D) second overtone of C–H stretching vibration, the combination band of O–H stretching and O–H deformation, respectively; (E) first overtone of
O–H and N–H in amino and amide groups, which was the combination of the N–H stretching and vibration with other vibration modes of the specific
molecule.
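As a rough illustration of why a second-derivative treatment helps to separate overlapping bands such as regions A–E, the sketch below computes a gap-based second difference of a spectrum sampled at a fixed wavelength step. The function name `gap_second_derivative`, the gap width and the toy spectrum are assumptions made for illustration only; the smoothing and segment settings of the instrument software are not reproduced here.

```python
from typing import List


def gap_second_derivative(absorbance: List[float], gap: int = 1) -> List[float]:
    """Return the gap second difference A[i-gap] - 2*A[i] + A[i+gap].

    Points closer than `gap` to either end are dropped, so the output is
    shorter than the input by 2*gap values.
    """
    if gap < 1:
        raise ValueError("gap must be a positive integer")
    n = len(absorbance)
    return [
        absorbance[i - gap] - 2.0 * absorbance[i] + absorbance[i + gap]
        for i in range(gap, n - gap)
    ]


# Toy spectrum (hypothetical values): a sloping baseline plus one narrow band.
wavelengths = list(range(950, 1655, 5))            # 950-1650 nm in 5 nm steps
baseline = [0.001 * (w - 950) for w in wavelengths]
band = [0.5 if w == 1200 else 0.0 for w in wavelengths]
spectrum = [b + p for b, p in zip(baseline, band)]

d2 = gap_second_derivative(spectrum, gap=1)
```

In this toy case the linear baseline cancels in the second difference, while the band at 1200 nm shows up as a sharp negative minimum flanked by two positive lobes, which is the behaviour that makes derivative spectra easier to read than raw absorbance.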
[...] different calibration features.³³ When the chemical method was [...]
[...] quality control and analysis. Calibration and validation equations
Table 2. Calibration and validation statistics of the NIRS model for protein and amino acid content prediction developed for the sample set
                                Calibration                 Validation
Parameter       Math treatment  R²c   RMSE  R²    RMSEV   SEP    R²v    Bias    Slope  RPD
Crude protein   2,4,4,1         0.99  0.19  0.98  0.29    0.03   0.92   0.00    1.000  6.53
ASP                             0.88  0.18  0.6   0.34    0.21   0.84   −0.033  1.106  2.52
THR                             0.83  0.03  0.77  0.04    0.03   0.82   0.006   1.023  3.00
SER                             0.86  0.09  0.54  0.17    0.1    0.82   −0.011  1.11   2.40
GLU                             0.87  0.45  0.54  0.86    0.49   0.85   0.063   1.085  2.57
GLY                             0.88  0.09  0.55  0.18    0.11   0.82   −0.003  1.047  2.36
LEU                             0.88  0.09  0.61  0.17    0.05   0.81   −0.033  0.983  3.00
ARG                             0.89  0.16  0.66  0.29    0.17   0.86   −0.028  1.059  2.88
CYS                             0.96  0.01  0.94  0.02    0.004  0.995  0.000   0.992  7.50
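The external validation statistics reported in Table 2 can be illustrated with a short sketch. It assumes the definitions commonly used in NIRS work: bias as the mean of (predicted − reference), SEP as the bias-corrected standard deviation of the prediction errors, and RPD as the standard deviation of the reference values divided by SEP. The function names, the bands encoded in `interpret_rpd` (taken from the RPD ranges quoted in the text, with the unmentioned ranges left unclassified) and the sample values are illustrative assumptions, not the authors' software or data.

```python
import math
import statistics
from typing import Dict, Sequence


def validation_stats(reference: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compute bias, SEP and RPD for an external validation set."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    errors = [p - r for p, r in zip(predicted, reference)]
    bias = statistics.fmean(errors)
    # SEP: standard deviation of the errors after removing the bias.
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (len(errors) - 1))
    sd_ref = statistics.stdev(reference)  # sample SD of the reference values
    rpd = sd_ref / sep if sep > 0 else math.inf
    return {"bias": bias, "sep": sep, "rpd": rpd}


def interpret_rpd(rpd: float) -> str:
    """Map an RPD value onto the bands quoted in the text."""
    if rpd <= 1.0:
        return "inadequate"
    if 3.1 <= rpd <= 4.9:
        return "initial screening"
    if rpd >= 5.0:
        return "quality control"
    return "unclassified"  # ranges the text does not comment on


# Hypothetical reference and predicted values, for illustration only.
reference = [24.0, 26.0, 28.0, 30.0, 32.0]
predicted = [24.5, 25.5, 28.5, 29.5, 32.5]
stats = validation_stats(reference, predicted)
label = interpret_rpd(stats["rpd"])
```

Because RPD scales the prediction error by the spread of the reference values, a model can score a high RPD simply because the validation set covers a wide concentration range, so the statistic is best read together with SEP and bias.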
for protein content showed that NIRS and reference values were closely associated. A calibration plot between predicted values and measured values for total protein is shown in Fig. 4, which illustrates excellent accuracy. The goodness of the calibration for protein content was comparable with calibration equations previously developed for seed protein in rapeseed³⁶ and in single seeds of wheat and soybean.³⁷ The coefficient of determination in the cross-validation of protein calibration developed in the present study was excellent (R² = 0.99) and compares well with the results of Orman and Schumann,³⁸ who used three types of spectral data to predict the content of protein in maize grain; their study showed that the protein values predicted by equations developed from all three types of spectral data correlated well with reference values, having R² values ranging from 0.83 to 0.98.
[Figure 4 plot: x-axis, measured value (ticks 22–34); y-axis ticks 22–30.]
Figure 4. References measured versus NIR predicted by the PLS model in the calibration set and prediction set.
Amino acid calibration and prediction
Calibration, cross-validation, and external validation statistics for NIRS models for different sample sets are shown in Table 2, which shows the comparative results of amino acid analysis from the PLS model based on PCA feature extraction. Overall, R² values for different amino acids ranged from 0.83 to 0.96 and RPD ranged from 2.36 to 7.50. Based on the guidelines for interpreting R² outlined by Williams and Norris,³⁹ the NIR calibration equations [...]
[...] shows the scatter plot of reference values measured and predicted using NIRS with PLS models in the prediction set. The prediction of the amino acid content of peanuts using PLS regression analysis developed as a standard showed a high correlation coefficient between the values of the reference method and the NIRS prediction. Acceptable correlations between the predicted values and the reference values were also found for Met, although the coefficient of determination was significantly lower (R² = 0.49; data not shown).
In addition, linear regression between amino acid and crude protein (CP) contents for the same sample populations was calculated (Table 3). The correlation coefficient (r) of the linear regression of amino acids to CP was mostly high for the sample population and equal to or slightly lower than the results from NIRS calibration. Indeed, for Thr, NIRS explained much more of the variance (0.91) than the CP regression, which had a poorer correlation of 0.83. Rubenthaler and Bruinsma⁴⁰ were the first to report amino acid prediction using NIRS. They calibrated the ratio of Lys to CP for several small wheat populations and obtained coefficients of correlation, r, between 0.85 and 0.98. They concluded that NIRS predicted amino acids independent of CP. Workman⁴¹ presented calibration data based on 111 calibration samples that were selected to avoid spectral similarities from 400 corn samples. Using equipment and a calibration algorithm similar to those used in our laboratory, Workman⁴¹ achieved R² values between 0.62 and 0.89 for 12 amino acids. The RMSEVs obtained were in the range 0.02–0.14. Validation with 30 independent samples gave very low correlations for five amino acids (R² = 0.23–0.58) and an SEP of 0.01 (Cys and Trp) or 0.02 (Met, Lys, and Thr). With data for soybean meal as well, Workman⁴¹ nevertheless concluded that NIRS is a highly promising method for rapid amino acid measurement in major feed ingredients. Dyer and Feng⁴²,⁴³ used NIR amino acid calibrations for screening purposes in the development of genetically altered grains. Based on data from 150 corn samples, they reported the following statistical data: R² = 0.78 and RMSEV = 0.012 for methionine; RMSEV = 0.013 for Cys; R² = 0.93 and RMSEV = 0.017 for Lys and RMSEV = 0.013 for Thr. Despite good correlations, the standard errors were mostly higher than those obtained in our laboratory based on very accurate reference analysis.
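The cross-validation errors (RMSEV) compared here come from a leave-one-out procedure: each sample is removed once, the model is refitted on the remaining samples, and the removed sample is then predicted. The sketch below shows that loop with an ordinary one-variable least-squares line standing in for the PLS model, purely to keep the example self-contained; the function names `fit_line` and `loo_rmsecv` and the data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_rmsecv(x: Sequence[float], y: Sequence[float]) -> float:
    """Root mean square error of leave-one-out cross-validation."""
    residuals: List[float] = []
    for i in range(len(x)):
        # Calibrate on every sample except sample i ...
        x_train = [v for j, v in enumerate(x) if j != i]
        y_train = [v for j, v in enumerate(y) if j != i]
        intercept, slope = fit_line(x_train, y_train)
        # ... then predict the held-out sample and keep its residual.
        residuals.append(y[i] - (intercept + slope * x[i]))
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical data: one spectral feature (x) against a reference value (y).
x = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
y = [1.0, 2.1, 2.9, 4.2, 4.8, 6.1]
rmsecv = loo_rmsecv(x, y)
```

Since every residual is computed for a sample the model has not seen, the resulting error is usually somewhat larger than the calibration error, which is the pattern visible when the RMSE and RMSEV columns of Table 2 are compared.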
[Figure 5 plots: eight scatter panels (Asp, Thr, Ser, Glu, Gly, Leu, Arg, Cys); x-axis, measured value; y-axis, NIRS predicted. Fitted lines printed in the panels: Asp, Y = 0.44984 + 0.86531x; Thr, Y = 0.12201 + 0.83301x; Ser and Glu panels, Y = 0.75807 + 0.86947x and Y = 0.18891 + 0.86298x; Gly and Leu panels, Y = 0.24324 + 0.85234x and Y = 0.13165 + 0.92507x; Arg, Y = 0.36451 + 0.89119x; Cys, Y = 0.0149 + 0.96138 (as printed).]
Figure 5. References measured versus NIR predicted by the PLS Model in the calibration set and prediction set.
These results indicate that determining crude protein and amino acids using NIRS in peanuts is possible. Near-infrared regions have considerable influence on the spectra owing to the strong relationship between protein and amino acids, mainly with O–H [...]
ACKNOWLEDGEMENTS
This work was supported by the ‘Special Fund for Agro-scientific Research in the Public Interest’ (Serial No. 200903043). We thank our academic colleagues for many stimulating discussions in this [...]