This document proposes using penalty analysis based on consumer responses to check-all-that-apply (CATA) questions to identify drivers of liking and directions for product reformulation. Two studies were conducted evaluating apples and yogurts using CATA questions about sensory characteristics and an ideal product. Data was analyzed by counting consumers who did not check an attribute for their ideal product and the associated drop in liking. Partial least squares regression identified attributes whose deviation from ideal caused decreases in liking. For apples, juiciness, sweetness, flavor, firmness and crispness were most important. For yogurt, smoothness, homogeneity and creaminess were main drivers and responsible for over a 1 point drop in the 9-point scale when
Penalty analysis based on CATA questions to identify drivers of liking
and directions for product reformulation
Gastón Ares a,*, Cecilia Dauber a, Elisa Fernández a, Ana Giménez a, Paula Varela b

a Departamento de Ciencia y Tecnología de Alimentos, Facultad de Química, Universidad de la República, General Flores 2124, CP 11800, Montevideo, Uruguay
b Instituto de Agroquímica y Tecnología de Alimentos, Avda. Agustín Escardino 7, 46980 Paterna, Valencia, Spain

Article info

Article history: Received 28 August 2012. Received in revised form 27 March 2013. Accepted 27 May 2013. Available online xxxx.

Keywords: CATA; Apples; Yogurt; Consumer studies; Product optimization

Abstract

One of the most important steps of the new product development process is product optimization, which aims at identifying consumers' ideal products and directions for product reformulation. The present work proposes the application of a penalty analysis based on consumer responses to CATA questions to identify drivers of liking and directions for product reformulation. Two studies were conducted in which 74 and 119 consumers evaluated a set of samples (8 yogurts and 5 apples, respectively) using a check-all-that-apply question related to sensory characteristics and were also asked to check all the terms they considered appropriate to describe their ideal product. Data were analyzed by counting the number of consumers who did not check an attribute as they did for their ideal product, and by computing the associated mean drop in liking. A dummy variable transformation approach was proposed to build linear regression models between CATA terms and overall liking scores using Partial Least Squares (PLS). Juiciness, sweetness, apple flavor, firmness and crispiness were the most relevant attributes for consumers in the apple study. Meanwhile, in the yogurt study smoothness, homogeneity and creaminess were the main drivers of liking and were responsible for the highest penalization on overall liking (more than 1 point on the 9-point hedonic scale).
PLS regression enabled the identification of the attributes whose deviation from the ideal caused a significant decrease in overall liking. Penalty analysis on CATA questions proved to be a simple and useful approach to identify drivers of liking and directions for improving the products in both studies. Advantages and disadvantages of this approach are discussed, as well as directions for further research.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

New product development has been regarded as a strategy for gaining competitive advantage and long-term financial success (Costa & Jongen, 2006). The implementation of a market-oriented, consumer-driven approach has been recognized as the best way to develop successful products (Grunert, Baadsgaard, Larsen, & Madsen, 1996; Stewart-Knox & Mitchell, 2003). The main stages of a consumer-driven new product development process are: identification of consumer needs, development of an idea to address those needs, product design to substantiate the idea, and the product's market introduction (Urban & Hauser, 1993). Within product design, one key step is the selection of a product formulation that is aligned as much as possible with consumer sensory preferences (van Kleef, van Trijp, & Luning, 2006). In this context, one of the main challenges for Sensory and Consumer Science is to provide actionable information for making specific changes in product formulation, and not just product descriptions (Moskowitz & Hartmann, 2008).

Over the years, many strategies have been used in new product development to identify the sensory attributes that drive consumer preferences and the characteristics of the ideal product, i.e. the product that maximizes consumer liking (Lagrange & Norback, 1987).
A popular approach has been the application of preference mapping, which consists of a group of techniques that relate consumer liking scores of a large set of products to their sensory characteristics as evaluated by a trained assessor panel (van Kleef et al., 2006). Considering the time and resources associated with creating and training assessor panels, particularly for specific applications during new product development, consumer-based sensory characterization has gained popularity in the last decade (Varela & Ares, 2012). Moreover, because trained assessors may describe the product differently from consumers and/or evaluate attributes that may be irrelevant for consumers, consumer-driven sensory characterization of products could have greater external validity (ten Kleij & Musters, 2003). Thus, product optimization is increasingly being performed by asking consumers to describe the sensory characteristics of food products.

Just-about-right (JAR) scales have been one of the first and simplest consumer-based approaches to get information about the optimum intensity of sensory attributes (Popper & Kroll, 2005).

0950-3293/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.foodqual.2013.05.014
* Corresponding author. Tel.: +598 29248003.
E-mail address: gares@fq.edu.uy (G. Ares).

Food Quality and Preference xxx (2013) xxx-xxx
Contents lists available at SciVerse ScienceDirect
Journal homepage: www.elsevier.com/locate/foodqual

Please cite this article in press as: Ares, G., et al. Penalty analysis based on CATA questions to identify drivers of liking and directions for product reformulation. Food Quality and Preference (2013), http://dx.doi.org/10.1016/j.foodqual.2013.05.014

In this approach consumers are asked to evaluate a set of attributes as deviations from their ideal, by indicating whether the intensity of each attribute is too strong, too weak or just-about-right (Lawless & Heymann, 2010). Penalty analysis on data from JAR scales has been used to identify the sensory attributes that have the largest influence on consumer liking and to identify directions for product reformulation (Plaehn & Horne, 2008). As an alternative, Xiong and Meullenet (2006) introduced a partial least squares (PLS) regression approach to study the relative influence of attributes on consumer liking. Penalty analysis on data from JAR scales enables the identification of the products that are closer to the ideal, the direction in which an attribute should be changed if it is not at its optimum or JAR level, and how much liking is affected when an attribute is not JAR (Lesniauskas & Carr, 2004).

Despite their popularity and the fact that they provide actionable information, the application of JAR scales in product optimization has raised several concerns. This type of task could make consumers focus on sensory characteristics that they would not normally attend to (Popper & Kroll, 2005), leading to changes in their hedonic perception (Ares, Barreiro, & Giménez, 2009; Epler, Chambers, & Kemp, 1998; Popper, Rosentock, Schraidt, & Kroll, 2004).
Intensity questions have been reported to have a smaller influence on consumer liking and have been recommended for product optimization by some authors (Moskowitz, 2001; Popper et al., 2004). Considering that consumers are able to rate attribute intensity (Husson, Le Dien, & Pagès, 2001; Moskowitz, 1996; Worch, Lê, & Punter, 2009) and assuming that they have an implicit ideal in their minds (Moskowitz, 2003), Van Trijp, Punter, Mickartz, and Kruithof (2007) proposed the Ideal Profile method for identifying ideal products. In this approach consumers are asked to directly rate attribute intensity for their ideal product using unstructured scales. Although this method has been shown to provide accurate descriptions of ideal products that are similar to the most liked products (Worch, Dooley, Meullenet, & Punter, 2010; Worch, Lê, Punter, & Pagès, 2012a, 2012b) and actionable information for product reformulation similar to that provided by JAR scales, it could be difficult and not intuitive for consumers to rate the ideal intensity of a large set of attributes using scales.

Check-all-that-apply (CATA) questions have been gaining popularity for sensory characterization of food products by consumers due to their simplicity and ease of use (Adams, Williams, Lancaster, & Foley, 2007; Ares, Barreiro, Deliza, Giménez, & Gámbaro, 2010; Ares, Varela, Rado, & Giménez, 2011a; Dooley, Lee, & Meullenet, 2010; Plaehn, 2012). In this approach, consumers are presented with a list of terms and are asked to select all the terms that they consider appropriate for the product. The relevance of each term is determined by calculating its frequency of use. CATA questions have been reported to be a quick, simple and easy method to gather information about consumer perception of the sensory characteristics of food products, having a smaller influence on liking scores than just-about-right or intensity questions (Adams et al., 2007).
Plaehn (2012) proposed a penalty analysis on data from CATA questions to identify the relative importance of emotional attributes on overall liking scores of a set of citrus flavored sodas. Considering that CATA questions have been used to identify the sensory characteristics of consumers' ideal product (Ares, Varela, Rado, & Giménez, 2011b; Cowden, Moore, & Vanluer, 2009), a penalty analysis approach could be used to identify how much overall liking is reduced because of the deviations in sensory profiles between real and ideal products, as detected by a CATA question.

In this context, the aim of the present work was to identify drivers of liking and directions for product reformulation by applying a penalty analysis based on consumer responses to CATA questions about a set of samples and their ideal product.

2. Materials and methods

Two studies were carried out in which consumers were asked to answer a CATA question to describe a set of samples and their ideal product. In the first study consumers were asked to score their texture liking and to describe the texture of eight yogurts formulated following a factorial design. In the second study consumers evaluated their overall liking of five commercial apple cultivars and completed a CATA question which included odor, flavor and texture characteristics. Penalty analysis based on consumer responses to the products compared to their ideal product was used to identify drivers of liking and directions for product reformulation.

2.1. Study 1: yogurt study

2.1.1. Samples

Eight yogurts were formulated by modifying the fat content of the milk, and the concentration of gelatin and modified starch (National 465, National Starch, Trombudo Central, Brasil), following a 2^3 full factorial design. These variables have been previously reported to affect yogurt texture (Tamime & Robinson, 1991).
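As a rough illustration of the 2^3 full factorial layout, the sketch below enumerates the eight formulations from the two levels of each factor listed in Table 1. The function and constant names are illustrative assumptions, not part of the study's protocol.

```python
from itertools import product

# Factor levels (in %) as listed in Table 1; the names are illustrative.
FAT_LEVELS = (0.1, 2.6)       # milk fat content
STARCH_LEVELS = (0.0, 1.0)    # modified starch concentration
GELATIN_LEVELS = (0.0, 0.5)   # gelatin concentration


def full_factorial():
    """Return the 2^3 formulations as (sample, fat, starch, gelatin) tuples."""
    runs = product(FAT_LEVELS, STARCH_LEVELS, GELATIN_LEVELS)
    return [(i, fat, starch, gel)
            for i, (fat, starch, gel) in enumerate(runs, start=1)]


design = full_factorial()
for row in design:
    print(row)
```

With the factors ordered as fat, starch, gelatin, the enumeration reproduces the sample numbering of Table 1 (sample 1 has no thickeners and skimmed milk; sample 8 has whole milk, 1% starch and 0.5% gelatin).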
Sample formulations (Table 1) were selected in order to obtain a set of yogurts with a range of different texture characteristics, based on previous studies (Ares et al., 2007), the usual formulation of yogurts commercialized in the Uruguayan market, and results from preliminary tests.

Yogurts were prepared using 8% commercial sugar and 2% powdered skimmed milk. The rest of the formulation consisted of gelatin, modified starch, and skimmed pasteurized milk (0.1% fat content) or whole pasteurized milk (2.6% fat content), as shown in Table 1.

Yogurts were prepared using a Thermomix TM 31 (Vorwerk Mexico S. de R.L. de C.V., Mexico D.F., Mexico). The solid ingredients were mixed with the milk, previously heated to 50 °C. The dispersion was mixed for 1 min under gentle agitation (100 rpm), heated to 90 °C for 5 min and cooled to 42 °C. Then, the mix was placed in glass containers and inoculated with 1 mL of lactic cultures, prepared by dispersing lyophilized cultures (Yo-Mix 205 LYO 250 DCU, Danisco, France) in UHT skimmed milk to a concentration of 250 DCU per liter.

Fermentation was carried out in a temperature-controlled oven at (42 ± 1) °C and stopped when the sample reached a pH of 4.55 (after 5-6 h, depending on the formulation). When the final pH was reached, the coagulum was broken by agitating each yogurt for 1 min using the Thermomix TM 31 at 100 rpm. After that, yogurts were placed in glass containers, cooled under agitation to 25 °C in a water bath at 5 °C, and then stored refrigerated (4 5 °C) for 24 h prior to their evaluation.

Table 1. Formulation of the yogurts used in Study 1.

| Sample | Milk fat content (%) | Concentration of modified starch (%) | Concentration of gelatin (%) |
|---|---|---|---|
| 1 | 0.1 | 0 | 0 |
| 2 | 0.1 | 0 | 0.5 |
| 3 | 0.1 | 1 | 0 |
| 4 | 0.1 | 1 | 0.5 |
| 5 | 2.6 | 0 | 0 |
| 6 | 2.6 | 0 | 0.5 |
| 7 | 2.6 | 1 | 0 |
| 8 | 2.6 | 1 | 0.5 |

2.1.2. Consumer testing

Consumers (n = 74) were recruited among students, professors and workers from the School of Chemistry (Universidad de la República, Montevideo, Uruguay) based on their yogurt consumption (at least once a week) and their interest and availability to participate in the study. Their ages ranged from 18 to 67 years and 64% were female. Cash incentives were not provided.

Testing was conducted in standard sensory booths under artificial daylight type illumination, temperature control (22-24 °C) and air circulation. Samples were presented in a monadic series, in closed plastic containers labeled with three-digit random numbers, at room temperature. Twenty grams of each sample were served to assessors at 10 °C in closed odorless plastic containers labeled with three-digit random numbers. Sample presentation order followed a complete block design balanced for carry-over and position effects. Still mineral water was used for rinsing between samples.

Consumers were first asked to score their texture liking using a horizontal 9-point hedonic scale anchored at "dislike very much" (1) and "like very much" (9). Next, they completed a CATA question with 16 terms related to texture characteristics of yogurts. The terms were selected based on previous qualitative consumer studies (Giménez & Ares, 2010) and were the following: smooth, viscous, homogenous, liquid, lumpy, creamy, sticky, rough, gummy, thick, gelatinous, firm, heterogeneous, consistent, runny, and mouth-coating. Consumers were asked to try each yogurt sample and to check all the terms that they considered appropriate to describe its texture. Then, consumers were asked to check all the terms they considered appropriate to describe the texture of their ideal yogurt.

2.2. Study 2: apple study

2.2.1. Samples

Five commercial apple cultivars available in Uruguay were used: crisp pink, fuji, granny smith, red delicious and royal gala. All were provided by a fruit and vegetable wholesale supplier located in Montevideo, Uruguay. Apples were removed from a cool storage room at 5 °C 24 h prior to testing and placed at room temperature. Each fruit was cleaned with a wet cloth and cut into quarters approximately 5 min before tasting. If any bruising or visual defect was observed the sample was discarded.

2.2.2. Consumer testing

Consumers (n = 119) were randomly recruited among people walking through the City Hall of Montevideo (Uruguay) based on their apple consumption (at least once a week) and their interest in participating. Their ages ranged from 18 to 75 years and 67% were female. Cash incentives were not used.

Testing was conducted in standard sensory booths under artificial daylight type illumination, temperature control (22-24 °C) and air circulation. Samples were presented monadically, in plastic containers labeled with three-digit random numbers, at room temperature. Sample presentation order followed a complete block design balanced for carry-over and position effects. Water was available for rinsing between samples.

Consumers were first asked to score their overall liking using a horizontal 9-point hedonic scale anchored at "dislike very much" (1) and "like very much" (9). Next, they completed a CATA question with 15 terms related to sensory characteristics of apples. Consumers were asked to try the sample and then to check all the terms that they considered appropriate to describe each apple. The terms were selected based on previous literature (Andani, Jaeger, Wakeling, & MacFie, 2001; Daillant-Spinnler, MacFie, Betys, & Hedderley, 1996; Jaeger, Andani, Wakeling, & MacFie, 1998) and preliminary consumer studies.
The terms considered in the CATA question included texture, flavor and odor characteristics: firm, sour, odorless, juicy, crispy, tasteless, sweet, flavorsome, mealy, bitter, coarse, apple flavor, apple odor, soft and astringent. After testing each sample, consumers were asked to complete the CATA question to describe their ideal apple.

2.3. Data analysis

Overall liking scores were analyzed using analysis of variance (ANOVA) considering sample as a fixed source of variation and consumer as a random effect. Cluster analysis was applied on centered and standardized overall liking scores from Study 2 in order to identify consumer segments with different preference patterns, considering Euclidean distances and Ward aggregation.

Frequency of use of each sensory attribute was determined by counting the number of consumers that used that term to describe each sample. Cochran's Q test (Manoukian, 1986; Parente, Manzoni, & Ares, 2011) was carried out to identify significant differences between samples for each of the terms included in the CATA question. In Study 2, Fisher's exact test (Fisher, 1954) was used to determine significant differences between clusters in the frequency of use of each term for describing the ideal product.

Correspondence analysis (CA) was used to get a bi-dimensional representation of the samples and the relationship between samples and terms from the CATA question. This analysis was performed on the frequency table containing the samples in rows and the terms from the CATA question in columns. The ideal product was considered as a supplementary individual in the analysis. This option is available in R language.

A multiple factor analysis for contingency tables (MFACT) was used to investigate the relationship between responses to the CATA question of the two consumer groups identified in the cluster analysis (Bécue-Bertaut & Pagès, 2004). The frequency table of each consumer segment was considered as a separate group of variables in the analysis.
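The frequency counts and Cochran's Q statistic used to compare samples for a single CATA term can be sketched as follows. This is a minimal standard-library illustration with made-up data, not the XLStat or R routine used in the study; the function names and the example responses are assumptions.

```python
def term_frequencies(checks):
    """Count how many consumers checked the term for each sample.

    checks[i][j] is 1 if consumer i checked the term for sample j, else 0.
    """
    n_samples = len(checks[0])
    return [sum(row[j] for row in checks) for j in range(n_samples)]


def cochran_q(checks):
    """Cochran's Q statistic for one term across k related samples."""
    k = len(checks[0])
    col_totals = term_frequencies(checks)       # checks per sample
    row_totals = [sum(row) for row in checks]   # checks per consumer
    grand_total = sum(row_totals)
    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        raise ValueError("Q is undefined when no consumer discriminates between samples")
    return numerator / denominator


# Hypothetical responses of 4 consumers to 3 samples for a single term.
example = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
print(term_frequencies(example))
print(cochran_q(example))
```

Under the null hypothesis of equal frequency of use across samples, Q is compared with a chi-square distribution with k - 1 degrees of freedom.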
The RV coefficient between the configurations of both clusters was also calculated.

Penalty analysis was carried out on consumer responses to determine the drop in overall liking associated with a deviation from the ideal for each attribute from the CATA question. CATA data are usually coded as binary data, assigning 1 or 0 if a term is checked or not checked to describe a product, respectively. In the present work, a dummy variable approach was used to indicate whether an attribute was used to describe the product as in the ideal product (0) or differently (1). Therefore, for each attribute the percentage of consumers who used it differently for describing each product and the ideal was determined, as well as the mean drop in liking associated with that deviation from the ideal. A one-factor Kruskal-Wallis test was performed with each CATA dummy variable as the factor and overall liking as the dependent variable, in order to determine if deviation from the ideal for each attribute caused a significant decrease in overall liking (Plaehn, 2012).

Furthermore, a partial least squares (PLS) regression was used to estimate the weight of the deviation from the ideal of each term from the CATA question, following a similar approach to that proposed by Xiong and Meullenet (2006). In this model absolute liking scores were considered as the dependent variable and the dummy variables indicating if consumers described the product differently from their ideal as regressors. Only attributes which were considered as deviating from the ideal by at least 20% of the consumers were considered, as suggested by Xiong and Meullenet (2006) and Plaehn (2012).

All significance tests were performed at a significance level of 0.05. Statistical analyses were performed using XLStat 2009 (Addinsoft, Paris, France) and R language (R Development Core Team, 2007) using FactoMineR (Lê, Josse, & Husson, 2008).

3. Results

3.1. Study 1: yogurt samples

3.1.1. Texture liking scores

Significant differences in the texture liking scores of the yogurt samples were found (F = 13.19, p < 0.0001). As shown in Table 2, average texture liking scores were low, ranging from 3.5 (SD = 2.2) to 5.9 (SD = 1.9). Samples 2, 4, 5, 6 and 8 had the highest texture liking scores (5.2-5.9), whereas samples 1 and 3 were the least preferred by consumers.

Table 2. Mean texture liking scores and standard deviations (between brackets) for the yogurt samples evaluated in Study 1.

| Sample | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| Texture liking (n = 74) | 4.2 c,d (2.1) | 5.6 a (1.9) | 3.5 d (2.2) | 5.2 a,b (1.9) | 5.6 a (2.1) | 5.9 a (1.9) | 4.4 b,c (2.3) | 5.3 a,b (1.9) |

Mean texture liking scores with different superscripts are significantly different according to Tukey's test for a confidence level of 95%.

3.1.2. CATA counts

Significant differences (p ≤ 0.05) were found in the frequency with which 14 out of the 16 terms of the CATA question were used to describe the yogurt samples, suggesting that consumers perceived differences in the sensory characteristics of the evaluated yogurts (Table 3). The ideal yogurt was described as smooth, homogeneous, creamy, consistent and thick, which indicates that these were the main drivers of liking for this type of product, in agreement with Pohjanheimo and Sandell (2009) and Bayarri, Carbonell, Barrios, and Costell (2011).

According to their texture, samples were sorted into three main groups, as shown in the sample representation in the first and second dimensions of the CA (Fig. 1).
A first group of yogurts, composed of samples 3 and 7, was located at positive values of the first dimension and negative values of the second dimension, being mainly described as heterogeneous, lumpy and rough. These two samples had a similar formulation and only differed in their fat content; they both included 1% of modified starch and did not include gelatin. Samples 1 and 5 were located at positive values of the first and second dimension and were described as runny and liquid by consumers, which could be explained by the fact that these samples did not include modified starch and gelatin in their formulation (Table 1). Finally, samples 2, 6, 4 and 8, which were formulated with 0.5% gelatin, were located at negative values of the second dimension and were described as thick, consistent, firm and gelatinous.

As shown in Fig. 1, the ideal yogurt was characterized by the terms smoothness, creaminess and homogeneity. As expected, the position of the ideal product was close to the samples which showed the highest texture liking scores and relatively far from the least preferred samples (Table 2).

3.1.3. Penalty analysis

Fig. 2 shows the mean drops in texture liking as a function of the proportion of consumers that checked an attribute differently than for the ideal product for three yogurt samples. As shown, the penalty analysis enabled the identification of directions for product improvement for each of the samples. In the case of sample 1, the attributes with the highest mean drop and deviation from the ideal were Homogeneous, Consistent and Thick. Table 3 indicates that it is necessary to increase the homogeneity and thickness of this sample, since the frequency of use of these attributes to describe sample 1 was lower than for the ideal product. Furthermore, in the case of sample 3 the main sensory problems were associated with its smoothness, lumpiness, homogeneity and creaminess, which made it largely deviate from the ideal yogurt.
Finally, for sample 6 the percentage of consumers who stated that the attributes deviated from the ideal was lower than for samples 1 and 3, in agreement with the higher liking score of this sample. The main deviations from the ideal and penalties for this sample were related to its creaminess and smoothness. As shown in Table 3, deviation from the ideal in those attributes was associated with a lower frequency of mention, which can be linked to a lower intensity when compared to the ideal yogurt. It is worth mentioning, though, that in some products there are attributes that consumers never perceive as sufficient, among which creaminess is a typical example. In a product where creaminess is characteristic (ice-cream, sauces, soups, etc.) consumers might always report it as not creamy enough. This has been found when consumers use JAR scales (Moskowitz, 2003; Rothman, 2007), and would probably have an influence when using other kinds of scales or even CATA questions, when consumers rate products by comparing them with the expectations for an ideal product that they have in their minds.

Regression coefficients of the PLS regression models for the eight yogurt samples are shown in Table 4. For all samples, deviation from the ideal significantly affected overall liking for only a subset of attributes. Smoothness was the only attribute in which deviation from the ideal significantly decreased overall liking for all samples. For sample 1, overall liking scores significantly decreased because smoothness, lumpiness, homogeneity, roughness, mouth-coating, heterogeneity and its liquid consistency deviated from the ideal. Meanwhile, in the case of sample 2 overall liking significantly decreased due to the deviation from the ideal in smoothness, homogeneity and creaminess.
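The dummy-variable penalty computation described in Section 2.3 and plotted in Fig. 2 can be sketched for a single attribute and a single sample as follows. This is a minimal illustration under stated assumptions: the mean drop is taken here as the mean liking of consumers whose description matched their ideal minus the mean liking of those whose description deviated from it, and the function name and example data are hypothetical.

```python
from statistics import mean


def penalty_for_attribute(product_checks, ideal_checks, liking):
    """Return (% of consumers deviating from their ideal, mean drop in liking).

    product_checks[i] and ideal_checks[i] are the 0/1 CATA responses of
    consumer i for the product and for the ideal; liking[i] is the
    consumer's 9-point liking score for the product.
    """
    # Dummy variable: 0 if the term was used as for the ideal, 1 otherwise.
    deviates = [int(p != i) for p, i in zip(product_checks, ideal_checks)]
    liking_dev = [score for score, d in zip(liking, deviates) if d == 1]
    liking_ok = [score for score, d in zip(liking, deviates) if d == 0]
    pct_deviating = 100.0 * len(liking_dev) / len(liking)
    if not liking_dev or not liking_ok:
        return pct_deviating, None  # mean drop undefined with an empty group
    return pct_deviating, mean(liking_ok) - mean(liking_dev)


# Hypothetical data for the term "creamy" and five consumers.
product = [1, 0, 0, 1, 0]
ideal = [1, 1, 1, 1, 0]
scores = [8, 5, 4, 7, 6]
pct, drop = penalty_for_attribute(product, ideal, scores)
print(pct, drop)
```

Attributes would then be retained for further modeling only when the deviating share reaches the 20% threshold mentioned in the data analysis section.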
According to Xiong and Meullenet (2006), one of the main advantages of PLS-based penalty analysis is information about the maximum potential improvement in overall liking, which is calculated as the difference between the model's intercept and the actual mean liking score. Although the confidence interval of the intercept can be broad, the estimation of the maximum potential improvement is usually in agreement with average liking scores. As shown in Table 4, the maximum potential increase in overall liking if the attributes that deviated from the ideal were modified ranged from 1.3 to 3.0. This information supports decisions on product reformulation based on the potential gain in consumer overall liking scores.

By combining Table 4 with the description of the samples and the ideal yogurt presented in Table 3, it is possible to identify recommendations for product improvement for each of the evaluated samples. Lumpiness was the attribute with the highest regression coefficient in the PLS model of sample 1, suggesting that it was the main attribute to be modified to improve it. Similarly, in the case of sample 8, creaminess was the attribute with the highest regression coefficient. Considering that this attribute was less frequently used to describe this sample than the ideal product, it would be recommended to increase its creaminess.

Table 3. Frequency (%) with which the terms of the CATA question were used by consumers to describe the eight yogurt samples and their ideal product, and results from Cochran's Q test for comparison between samples.

| Attribute | Significance | Ideal | Sample 6 | Sample 5 | Sample 2 | Sample 8 | Sample 4 | Sample 7 | Sample 1 | Sample 3 |
|---|---|---|---|---|---|---|---|---|---|---|
| Smooth | *** | 92 | 64 | 62 | 53 | 45 | 38 | 23 | 41 | 12 |
| Creamy | ** | 86 | 38 | 35 | 35 | 38 | 36 | 32 | 16 | 18 |
| Homogeneous | *** | 80 | 57 | 26 | 39 | 43 | 49 | 5 | 20 | 8 |
| Consistent | *** | 41 | 45 | 11 | 45 | 55 | 57 | 20 | 0 | 9 |
| Thick | *** | 38 | 43 | 8 | 32 | 51 | 49 | 30 | 3 | 23 |
| Firm | *** | 20 | 45 | 1 | 36 | 65 | 47 | 8 | 0 | 1 |
| Runny | *** | 18 | 5 | 47 | 11 | 0 | 3 | 15 | 55 | 20 |
| Viscous | ns | 12 | 12 | 14 | 8 | 15 | 7 | 7 | 5 | 18 |
| Mouth-coating | * | 9 | 19 | 14 | 11 | 16 | 16 | 24 | 15 | 30 |
| Liquid | *** | 3 | 1 | 45 | 4 | 0 | 3 | 22 | 73 | 23 |
| Heterogenous | | 3 | 7 | 18 | 19 | 0 | 4 | 42 | 32 | 49 |
| Lumpy | *** | 1 | 11 | 26 | 7 | 8 | 11 | 61 | 32 | 57 |
| Gelatinous | *** | 0 | 22 | 0 | 30 | 26 | 31 | 0 | 1 | 4 |
| Sticky | ns | 0 | 4 | 3 | 4 | 8 | 3 | 8 | 3 | 14 |
| Rough | *** | 0 | 7 | 9 | 5 | 11 | 16 | 46 | 24 | 46 |
| Gummy | ns | 0 | 1 | 1 | 0 | 5 | 5 | 7 | 1 | 4 |

Samples are arranged in descending texture liking order from left to right.
*** Indicates significant differences between samples according to Cochran's Q test at p ≤ 0.001.
** Indicates significant differences between samples according to Cochran's Q test at p ≤ 0.01.
* Indicates significant differences between samples according to Cochran's Q test at p ≤ 0.05.
ns Indicates no significant differences between samples according to Cochran's Q test (p > 0.05).

Fig. 1. Representation of the yogurt samples, the ideal product and the terms in the first and second dimensions of the correspondence analysis of the CATA counts of Study 1.

Fig. 2. Mean drops in overall liking as a function of the percentage of consumers that checked an attribute differently than for the ideal product for three of the yogurt samples of Study 1. Attributes highlighted in bold correspond to those for which more than 20% of the consumers considered that it deviated from the ideal and which caused a significant decrease in texture liking according to the Kruskal-Wallis test for a 95% confidence level.
For some sam- ples (samples 3, 5, 6 and 7) there were several attributes with sim- ilar weights with regard to their inuence on texture liking. A summary of the recommendations for improvement of each prod- uct is shown in Table 5. 3.2. Study 2: apple samples 3.2.1. Overall liking scores Signicant differences in the overall liking scores of the apple cultivars were found (F = 12.34, p < 0.0001). As shown in Table 6, Table 4 Percentage of consumers (%) who describe each yogurt sample as different from the ideal for each of the attributes included in the CATA question, regression coefcients (RC) and intercept of PLS models. Term Sample 1 Sample 2 Sample 3 Sample 4 Sample 5 Sample 6 Sample 7 Sample 8 % RC % RC % RC % RC % RC % RC % RC % RC Smooth 62 0.15 50 0.24 82 0.17 59 0.21 41 0.16 41 0.20 77 0.14 53 0.14 Lumpy 31 0.31 8 55 0.10 12 27 0.15 12 59 ns 9 Viscous 18 12 16 14 20 ns 14 16 22 0.15 Homogeneous 65 0.13 49 0.18 77 0.08 39 0.17 59 ns 28 0.16 74 0.10 36 ns Liquid 73 0.14 4 26 0.09 5 45 0.18 4 24 ns 3 Thick 38 ns 32 ns 34 ns 46 ns 35 ns 41 ns 32 ns 43 ns Gelatinous 1 30 ns 4 31 ns 0 22 ns 0 26 ns Firm 20 ns 41 ns 22 ns 41 ns 19 46 ns 26 0.14 55 0.15 Sticky 3 4 14 ns 3 3 4 8 ns 8 Creamy 73 ns 57 0.18 69 0.10 58 0.32 59 ns 51 0.16 57 0.19 57 0.35 Rough 24 0.17 5 46 0.09 16 9 7 46 0.14 11 Consistent 41 ns 45 ns 39 ns 41 ns 38 ns 39 0.17 39 ns 45 0.18 Mouth-coating 22 0.13 12 34 0.10 20 ns 18 15 28 0.11 18 Gummy 1 0 4 5 1 1 7 5 Runny 51 ns 23 ns 30 0.09 18 35 0.13 23 ns 27 0.11 18 Heterogenous 35 0.15 22 ns 49 0.12 7 18 9 45 0.20 3 Intercept 7.2 7.2 6.3 7.0 6.9 7.3 7.3 7.4 Mean texture liking 4.2 5.6 3.5 5.2 5.6 5.9 4.4 5.3 Mean drop * 3.0 1.8 2.8 1.8 1.3 1.4 2.9 2.1 : Indicates that the attribute was not included in the PLS model because less than 20% of the consumers considered that it deviated from the ideal; ns: corresponds to non- signicant coefcients. * Mean drop is calculated as the intercept of the model minus the actual texture liking score. 
Table 5 Summary of the recommendations for reformulating the texture of the eight yogurts considered in Study 1, based on results from PLS modeling (Table 4) and consumer responses to the CATA question (Table 3). Sample Main recommendations for reformulation 1 Reduce lumpiness and roughness. Increase smoothness, homogeneity and thickness (to reduce deviation in liquid) 2 Increase smoothness, homogeneity and creaminess 3 Increase smoothness, homogeneity, consistency and creaminess. Reduce lumpiness and roughness and heterogeneity 4 Increase creaminess, smoothness and homogeneity 5 Increase smoothness and consistency (to reduce deviation in Liquid). Reduce lumpiness 6 Increase smoothness, homogeneity, creaminess. Reduce consistency 7 Increase smoothness, creaminess and homogeneity. Reduce rmness, roughness, mouth-coating and heterogeneity 8 Reduce rmness, consistency and viscosity. Increase smoothness and creaminess Table 6 Mean overall liking scores and standard deviations (between brackets) for the apple cultivars evaluated in Study 2, at the aggregate level and for the two consumer segments identied using Cluster analysis. Sample Global (n = 119) Cluster 1 (n = 79) Cluster 2 (n = 40) Crisp pink 7.2 b (2.1) 7.7 c (1.9) 6.3 b (2.2) Fuji 7.1 b (2.1) 7.4 c (1.9) 6.1 b (2.1) Granny smith 5.7 a (2.5) 6.4 b (2.3) 4.2 a (2.3) Red delicious 6.2 a (2.6) 5.2 a (2.3) 8.2 c (1.1) Royal gala 5.7 a (2.3) 5.2 a (2.2) 6.7 b (1.9) Mean overall liking scores with different superscripts are signicantly different according to Tukeys test for a condence level of 95%. Table 7 Frequency (%) with which the terms of the CATA question were used by consumers to describe the ve apple cultivars and their ideal apple, and results from Cochrans Q test for comparison between samples. 
Attribute Sample Ideal Crisp pink Fuji Red delicious Royal gala Granny Smith Juicy *** 92 63 76 48 51 49 Firm *** 79 68 70 18 19 66 Sweet *** 77 32 39 61 31 5 Flavorsome *** 76 43 44 31 25 25 Apple avor *** 69 45 40 37 25 14 Crispy *** 64 66 55 11 16 46 Apple odor *** 39 13 8 8 5 8 Sour *** 22 52 12 3 7 80 Astringent *** 7 8 7 1 3 16 Soft *** 6 1 2 45 49 2 Mealy *** 5 1 0 58 36 1 Coarse *** 3 3 1 24 15 2 Bitter *** 2 5 10 3 6 18 Odourless *** 1 13 14 14 22 14 Tasteless *** 0 4 9 10 31 8 Samples are arranged in descending texture liking order from left to right. *** Indicates signicant differences between samples according to Cochrans Q test at p 6 0.001. 6 G. Ares et al. / Food Quality and Preference xxx (2013) xxxxxx Please cite this article in press as: Ares, G., et al. Penalty analysis based on CATA questions to identify drivers of liking and directions for product refor- mulation. Food Quality and Preference (2013), http://dx.doi.org/10.1016/j.foodqual.2013.05.014 on the aggregate level overall liking scores ranged from 5.7 (SD = 2.5) to 7.2 (SD = 2.1); being crisp pink and fuji the preferred cultivars. Cluster analysis on overall liking scores enabled the identica- tion of two consumer segments with different preference patterns. Cluster 1 was composed of 79 consumers who clearly preferred crisp pink and fuji apples and rejected royal gala and red delicious (Table 6). On the other hand, the remaining 40 consumers (Cluster 2) preferred red delicious apples, rating their overall liking using an average score higher than 8, whereas they disliked slightly granny smith. 3.2.2. CATA counts Signicant differences (p h0.001) were found in the frequency with which all the terms included in the CATA question were used to describe the apple samples, suggesting that consumers per- ceived large differences in the sensory characteristics of the evalu- ated apple cultivars (Table 7). 
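The comparison of term frequencies between samples relies on Cochran's Q test. The following is a minimal sketch of the textbook Q statistic for one CATA term, not the authors' own code; the function name `cochran_q` and the small 0/1 matrix are hypothetical illustration data, with one row per consumer and one column per sample.

```python
def cochran_q(rows):
    """Cochran's Q for one CATA term.

    rows: list of per-consumer lists of 0/1 values, one entry per sample
          (1 = the consumer checked the term for that sample).
    Returns (Q, degrees_of_freedom); Q is compared with a chi-square
    distribution with k - 1 degrees of freedom.
    """
    k = len(rows[0])                                   # number of samples
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    n = sum(row_totals)                                # total number of checks
    denom = k * n - sum(t * t for t in row_totals)
    if denom == 0:
        raise ValueError("Q is undefined: every consumer answered identically across samples")
    q = (k - 1) * (k * sum(c * c for c in col_totals) - n * n) / denom
    return q, k - 1


# Hypothetical data: 6 consumers x 3 samples for a single term.
checks = [
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]
q, df = cochran_q(checks)
print(q, df)  # 6.0 2
```

With 2 degrees of freedom the 5% critical value of the chi-square distribution is about 5.99, so in this toy example the term would be used with significantly different frequency across the three samples.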
Sample representation in the first and second dimensions of the CA showed that, according to both consumer segments, the apples were sorted into three groups (Fig. 3). A first group, located at negative values of the first and second dimensions, was composed of the firm and crispy apples, crisp pink and fuji. Royal gala and red delicious formed a second group, being described as mealy, soft and coarse by both clusters. Finally, granny smith apples were located in a distinct position due to their sourness, bitterness and astringency. Although the sensory maps of the samples (RV coefficient = 0.91, p < 0.0001) and their general description were similar, the clusters differed in the location of their ideal apple and in how they used some of the terms of the CATA question.

As shown in Fig. 3, the location of the ideal apple for Clusters 1 and 2 was clearly different. For consumers in Cluster 1 the sensory characteristics of the ideal apple were similar to those of cultivars fuji and crisp pink, whereas for Cluster 2 the ideal apple was intermediate between fuji and red delicious, being closer to the latter. The location of the ideal products for both consumer segments was in agreement with overall liking scores. The difference in the location of the ideal apple of the two consumer segments could be explained by their responses to the CATA question.

Fig. 3. Representation of the samples, the ideal apple and the terms in the first and second dimensions of the correspondence analysis of the CATA counts of Study 2 for the two identified consumer segments with different preference patterns: (a) Cluster 1 (n = 79) and (b) Cluster 2 (n = 40).
As shown in Table 8, significant differences between clusters were identified in the frequency with which 4 terms of the CATA question were used to describe the ideal apple. Consumers in Cluster 1 used the terms firm, sour and crispy significantly more frequently, and the term soft significantly less frequently, than consumers in Cluster 2, which indicates differences in their drivers of liking. Consumers in Cluster 1 preferred firmer, crisper and more sour apples than consumers in Cluster 2, as shown in Table 6 and Fig. 3. Cluster 1 clearly preferred fuji and crisp pink apples, which were characterized by their firmness and crispiness, whereas they rejected red delicious and royal gala apples, which were described as soft and mealy. The opposite trend was found for Cluster 2. Daillant-Spinnler et al. (1996) also found consumer segmentation when testing 12 southern-hemisphere varieties of apples, with patterns according to whether a sweet, hard apple or a juicy, acidic apple was preferred.

Regarding the use of the terms from the CATA question, the representation of the terms in the first and second dimensions of the MFACT showed that the clusters differed in the way in which they described the samples, in particular in how they used some terms related to complex sensory attributes, such as apple flavor, flavorsome, apple odor and tasteless (Fig. 4). These attributes were located far from each other for the two clusters, suggesting that they were used differently. As shown in Fig. 3, consumers in both segments associated flavor and odor intensity with their preferred apple cultivars. The terms flavorsome and apple flavor were associated with crisp pink and fuji for Cluster 1, whereas they were associated with royal gala and red delicious for Cluster 2. A similar trend was found for the term tasteless, which was associated with red delicious and royal gala apples for Cluster 1 and with granny smith for consumers in Cluster 2.
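The between-cluster comparison of how often a term was used for the ideal apple (Table 8) is based on Fisher's exact test. The sketch below implements the standard two-sided test for a 2x2 table in plain Python; it is an illustration rather than the authors' procedure, and the counts used for the term firm (70 of 79 and 24 of 40) are assumptions reconstructed from the rounded percentages reported for the two clusters.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. clusters); columns are 'term checked' / 'not checked'.
    The p-value sums the hypergeometric probabilities of all tables with the
    same margins that are no more likely than the observed one.
    """
    r1, r2 = a + b, c + d          # row totals
    c1 = a + c                     # first column total
    n = r1 + r2

    def prob(x):
        # Probability of observing x in the top-left cell given the margins.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Assumed counts for the term "firm" on the ideal apple:
# Cluster 1: 70 of 79 consumers (about 89%), Cluster 2: 24 of 40 (60%).
p_firm = fisher_exact_two_sided(70, 79 - 70, 24, 40 - 24)
print(p_firm < 0.01)  # True
```

Because the counts are back-calculated from percentages, the exact p-value obtained this way may differ slightly from the one reported in the table.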
On the other hand, it is interesting to highlight that the rest of the terms of the CATA question, which corresponded to simpler sensory attributes, were located close to each other for the two clusters, suggesting that they were used in a similar way by consumers in both clusters (Fig. 4). The RV coefficient between term configurations for the two clusters was 0.65 (p = 0.0006), lower than the RV coefficient between sample configurations (RV = 0.91). This suggests that although the clusters did not differ in the perception of similarities and differences among apple cultivars, they differed in the way in which they used some of the terms to describe them.

Table 8
Frequency (%) with which the terms of the CATA question were used by the two identified consumer segments to describe their ideal product, and p-values of Fisher's exact test for differences between clusters.

               Frequency of use (%)
Term           Cluster 1 (n = 79)  Cluster 2 (n = 40)  p (Fisher's exact test)
Juicy          92                  93                  >0.999
Firm *         89                  60                  0.001
Sweet          76                  80                  0.653
Flavorsome     80                  68                  0.176
Apple flavor   67                  60                  0.676
Crispy *       75                  43                  0.001
Apple odor     41                  38                  0.844
Sour *         29                  8                   0.009
Astringent     9                   93                  0.265
Soft *         0                   18                  0.001
Mealy          4                   8                   0.662
Coarse         1                   5                   0.545
Bitter         1                   3                   >0.999
Odorless       0                   3                   0.336
Tasteless      0                   0                   1

Terms marked with an asterisk correspond to those for which significant differences in frequency of use between clusters existed according to Fisher's exact test.

Fig. 4. Representation of the terms from the CATA question in the first and second dimensions of the multiple factor analysis performed on CATA counts of Study 2 for the two identified consumer segments with different preference patterns.

3.2.3. Penalty analysis

Fig. 5 shows the mean drops in overall liking as a function of the proportion of consumers that checked an attribute differently than for the ideal apple across all samples for the two consumer segments identified in cluster analysis. Except for apple odor, at the aggregate level deviation from the ideal in all the attributes included in the CATA question caused a significant drop in overall liking for consumers in Cluster 1. The attributes that caused the highest decrease in overall liking were tasteless, coarse, soft, mealy, juicy and firm, which indicates that texture attributes had the highest relevance for the hedonic perception of these consumers. On the other hand, flavor attributes were the most relevant for consumers in Cluster 2, who penalized samples that deviated from the ideal in sweetness, taste intensity and bitterness. It is also interesting to highlight that deviation from the ideal in the terms firm, crispy, mealy and coarse did not cause a significant drop in overall liking for Cluster 2, meaning that consumers in this group would most probably be prepared to sacrifice texture in favor of their preferred apple taste.

Regression coefficients of the PLS regression models for the five apple samples and the two consumer segments are shown in Table 9. For all samples, deviation from the ideal significantly affected overall liking for only a subset of attributes. Moreover, clear differences were identified between the clusters. For example, Cluster 1 significantly decreased overall liking scores for crisp pink apples due to deviation from the ideal in firmness, juiciness and sweetness, whereas Cluster 2 penalized deviation from the ideal in juiciness, sweetness, sourness and flavorsome.
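The mean drops plotted in penalty analysis can be computed directly from the raw responses. The sketch below is one plausible reading of the procedure described above, under stated assumptions: a consumer "deviates" on an attribute when it is checked for the sample but not for the ideal, or vice versa; the mean drop is the mean liking of non-deviating consumers minus that of deviating consumers; and attributes with fewer than 20% deviating consumers are skipped. The function name `penalty_table` and the five-consumer data set are hypothetical, and the significance test (Kruskal-Wallis in the text) is left out.

```python
from statistics import mean


def penalty_table(liking, product, ideal, attributes, min_share=0.20):
    """Share of deviating consumers and mean drop in liking per attribute.

    liking:  list of liking scores for one sample, one per consumer
    product: list of sets of terms each consumer checked for the sample
    ideal:   list of sets of terms each consumer checked for the ideal product
    Returns {attribute: (share_deviating, mean_drop)}.
    """
    n = len(liking)
    result = {}
    for attr in attributes:
        # A consumer deviates when sample and ideal disagree on this term.
        deviates = [(attr in p) != (attr in i) for p, i in zip(product, ideal)]
        dev_scores = [s for s, d in zip(liking, deviates) if d]
        ok_scores = [s for s, d in zip(liking, deviates) if not d]
        share = len(dev_scores) / n
        if share < min_share or not ok_scores:
            continue  # too few deviating consumers, or no reference group
        result[attr] = (share, mean(ok_scores) - mean(dev_scores))
    return result


# Hypothetical responses from five consumers for one yogurt sample.
liking = [8, 7, 4, 3, 6]
product = [{"smooth"}, {"smooth"}, {"lumpy"}, {"lumpy"}, {"smooth", "lumpy"}]
ideal = [{"smooth"}] * 5

for attr, (share, drop) in penalty_table(liking, product, ideal, ["smooth", "lumpy", "gummy"]).items():
    print(f"{attr}: {share:.0%} deviate, mean drop {drop:.2f}")
```

In this toy data set nobody deviates on gummy, so that term is dropped, mirroring the rule that attributes below the 20% threshold are not analysed.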
By looking at the maximum potential increase in overall liking if the attributes that deviated from the ideal were modified, it seems clear that it is not worthwhile to suggest improvements in the sensory characteristics of fuji and red delicious apples for consumers in Cluster 1 and Cluster 2, respectively.

Furthermore, by studying Table 9 together with the description of the samples, their ideal apple (Table 7) and the overall liking ratings (Table 6), it is possible to identify recommendations for product improvement for consumers in Clusters 1 and 2. Consumers in Cluster 1 preferred crisp pink and fuji apples (Table 6). Juiciness was the attribute whose deviation from the ideal had the highest weight in decreasing overall liking for crisp pink apples, indicating that this cultivar would be better liked by these consumers if its juiciness were increased (Table 9). Meanwhile, the main direction for improvement in the case of fuji apples for consumers in Cluster 1 was related to the term tasteless, which indicates the need for an increase in flavor intensity. On the other hand, consumers in Cluster 2 clearly preferred red delicious apples, which could be improved by increasing their sweetness and reducing their softness (Tables 7 and 9). However, improving these cultivars (fuji for Cluster 1 and red delicious for Cluster 2) would not lead to a large increase in overall liking scores, as previously discussed. For consumers in Cluster 2 it would be recommended to improve royal gala apples by increasing their sweetness and juiciness and reducing their softness. These changes would lead to a potential increase in overall liking of 2.1 points on the 9-point hedonic scale. A summary of the recommendations for changing each apple cultivar for the two consumer segments is shown in Table 10.

Fig. 5. Mean drops in overall liking as a function of the percentage of consumers that checked an attribute differently than for the ideal product at the aggregate level for the two identified consumer segments with different preference patterns. Attributes highlighted in bold correspond to those for which deviation from the ideal caused a significant decrease in overall liking according to the Kruskal-Wallis test at a 95% confidence level.

4. Discussion and conclusions

Sensory methodologies that aim at identifying ideal products based on consumer descriptions are widely used in new product development to obtain actionable directions for product improvement and are gaining in popularity (Worch et al., 2012a, 2012b). According to Van Trijp et al. (2007), methods that rely on consumer self-reported attribute ideals or deviation from the ideal deliver more realistic ideal points than regression-based techniques.

The present work proposed the application of a new penalty-based method to consumer responses to a CATA question used to describe the samples and their ideal product, as an extension of the approach suggested by Plaehn (2012) when working with the emotional profile of drinks. Consumers are only asked to describe the samples and their ideal product using a CATA question. Compared to methodologies that rely on the use of scales, this approach would be simpler and easier for consumers to use and could also potentially have a smaller impact on hedonic scores than JAR or intensity scales (Adams et al., 2007). Apart from its simplicity for consumers, an advantage of the method is that it could be applied with a small set of products, unlike regression-based methods, which require larger sample sets.
However, it must be considered that the number of samples should be 5 or more if factorial techniques such as CA or MFA are to be used for data analysis.

Asking consumers to describe their ideal product using a CATA question is a flexible and simple add-on to a hedonic ballot. Its main advantage is that it provides information about consumer perception of the sensory characteristics of the products and also makes it possible to identify the sensory characteristics of the consumers' ideal product, both at the aggregate level and for consumer segments with different preference patterns. This enables the identification of drivers of liking for a set of products based exclusively on consumer perception and without the need for regression techniques. In the studies included in the present article, the description of the ideal product provided by consumers was similar to that of the samples with the highest liking scores, which indicates the validity of the information provided by consumers when describing their ideal product using CATA questions. Similar results have been reported when asking consumers to describe their ideal product by rating attribute intensity using scales in the Ideal Profile Method (Worch et al., 2010, 2012a, 2012b). Further research is needed to investigate the stability of consumer descriptions of their ideal product using a CATA question within a session and between sessions.

Table 9
Percentage of consumers (%) who described each apple sample as different from the ideal for each of the attributes included in the CATA question, regression coefficients (RC) and intercept of PLS models for the two consumer segments identified in cluster analysis. C1 = Cluster 1, C2 = Cluster 2; each cell gives % followed by RC.

                     Crisp pink          Fuji                Granny smith        Royal gala          Red delicious
Term                 C1        C2        C1        C2        C1        C2        C1        C2        C1        C2
Firm                 42 0.13   35 ns     38 ns     30 ns     44 ns     38 ns     82 0.10   50 ns     84 0.09   53 ns
Juicy                45 0.31   53 0.23   37 0.16   35 ns     53 0.13   60 ns     55 0.15   50 0.25   65 0.14   38 ns
Sweet                59 0.16   70 0.23   50 0.13   70 0.19   76 0.14   78 ns     65 ns     55 0.25   59 0.09   23 0.36
Bitter               23 ns     10 -      27 0.18   10 -      32 ns     28 0.33   26 0.09   10 -      26 ns     3  -
Apple odor           47 ns     40 ns     48 ns     33 ns     47 ns     33 ns     50 ns     40 ns     49 ns     30 ns
Sour                 49 ns     65 0.17   43 ns     15 -      69 0.15   63 0.30   48 ns     8  -      43 ns     10 -
Crispy               36 ns     40 ns     49 0.13   40 0.19   49 ns     43 ns     73 0.09   45 ns     75 ns     33 ns
Flavorsome           49 ns     53 0.14   54 ns     58 0.22   64 0.16   58 ns     72 0.07   50 ns     70 0.08   55 ns
Coarse               23 ns     5  -      23 ns     5  -      22 ns     10 -      37 ns     8  -      46 0.15   18 -
Soft                 22 ns     18 ns     23 ns     18 -      23 ns     18 -      55 0.13   48 0.21   60 ns     30 0.43
Odorless             30 ns     20 ns     30 ns     18 -      31 ns     20 ns     38 0.08   25 ns     31 0.09   20 ns
Tasteless            22 ns     10 -      27 0.29   13 -      24 ns     18 -      52 0.16   15 -      31 0.12   5  -
Mealy                24 ns     10 -      24 ns     8  -      24 ns     10 -      53 0.09   33 ns     71 0.15   48 ns
Apple flavor         48 ns     55 ns     51 0.11   50 ns     64 ns     70 ns     65 ns     50 ns     58 0.11   43 ns
Astringent           26 ns     13 -      30 ns     13 -      35 0.14   18 -      29 0.09   8  -      29 ns     3  -
Intercept            9.0       8.2       7.6       8.2       8.4       5.2       8.8       8.6       7.8       8.8
Mean overall liking  7.7       6.3       7.4       6.1       6.4       4.2       5.2       8.2       5.2       6.7
Mean drop *          1.3       1.9       0.2       2.1       2.0       1.0       3.6       0.4       2.6       2.1

-: Indicates that the attribute was not included in the PLS model because less than 20% of the consumers considered that it deviated from the ideal; ns: corresponds to non-significant coefficients.
* Mean drop is calculated as the intercept of the model minus the actual overall liking score.

Table 10
Summary of the recommendations for changes in the five apple cultivars considered in Study 2, based on results from PLS modeling (Table 9) and consumer responses to the CATA question (Table 7), for the two consumer segments identified in cluster analysis.

Cluster  Sample         Main recommended changes
1        Crisp pink     Increase juiciness, sweetness and firmness
1        Fuji           Changes are not necessary
1        Granny smith   Increase flavorsome, sweetness and juiciness. Reduce sourness
1        Royal gala     Increase taste intensity (to reduce deviation in flavorsome, tasteless and odorless), juiciness, crispiness and firmness. Reduce bitterness, softness, astringency and mealiness
1        Red delicious  Reduce coarseness and mealiness. Increase taste intensity (to reduce deviation in apple flavor, flavorsome, tasteless and odorless), firmness, juiciness and sweetness
2        Crisp pink     Increase juiciness, sweetness and taste intensity (to reduce deviation in flavorsome). Reduce sourness
2        Fuji           Increase sweetness and taste intensity (to reduce deviation in flavorsome). Reduce crispiness
2        Granny smith   Reduce bitterness and sourness
2        Royal gala     Increase juiciness and sweetness. Reduce softness
2        Red delicious  Changes are not necessary

Penalty analysis based on the comparison of consumer perception of the samples and their ideal product provided information about the impact of deviation from the ideal on liking scores, directly from consumers. A graphical representation of the relationships between overall liking scores and deviation from the ideal product was obtained, as well as the potential improvement in overall liking scores and information about the impact of deviation from the ideal for each attribute. Meanwhile, the direction of the sensory changes needed to reduce the deviation from the ideal was obtained from the difference between the percentage of consumers who used a term for describing the samples and the percentage who used it for the ideal product. This type of analysis made it possible to give specific and actionable recommendations for each product based on the influence of deviation from the ideal on overall liking. In addition, PLS modeling provided information about the potential for improvement of each product, enabling a realistic decision as to the value of reformulation.
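The two quantities used to turn the models into recommendations are simple arithmetic, and a short sketch may make them concrete. The potential improvement is the PLS intercept minus the mean liking score, and the direction of change follows from the sign of the difference between the frequency of use of a term for the sample and for the ideal. The numbers below are taken from Tables 3 and 4 for the yogurt samples; the function names and the "increase"/"reduce" labels are illustrative assumptions.

```python
def potential_improvement(intercept, mean_liking):
    """Maximum potential gain in liking: model intercept minus observed mean."""
    return round(intercept - mean_liking, 1)


def direction_of_change(pct_sample, pct_ideal):
    """Suggest the direction of reformulation for one term.

    pct_sample: % of consumers who used the term for the sample
    pct_ideal:  % of consumers who used the term for the ideal product
    """
    if pct_sample < pct_ideal:
        return "increase"
    if pct_sample > pct_ideal:
        return "reduce"
    return "no change"


# Intercept and mean texture liking for yogurt samples 1, 3 and 8 (Table 4).
models = {1: (7.2, 4.2), 3: (6.3, 3.5), 8: (7.4, 5.3)}
for sample, (intercept, liking) in models.items():
    print(sample, potential_improvement(intercept, liking))
# 1 3.0
# 3 2.8
# 8 2.1

# Frequencies of use from Table 3: lumpy for sample 1 (32%) vs ideal (1%),
# creamy for sample 8 (38%) vs ideal (86%).
print(direction_of_change(32, 1))   # reduce
print(direction_of_change(38, 86))  # increase
```

These outputs match the recommendations in Table 5 to reduce lumpiness in sample 1 and to increase creaminess in sample 8.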
The main disadvantage of the method is that information about attribute intensity and about the degree of difference between the products and the ideal for each consumer is not gathered. In the present study, differences in the influence of deviation from the ideal when the product is less or more intense in each specific attribute were not considered. However, the PLS dummy approach could easily be extended by considering two different dummy variables for each attribute: one that indicates whether the attribute is used to describe the product and not the ideal, and a second one that indicates whether the attribute is used to describe the ideal and not the product. A similar approach has been used by Xiong and Meullenet (2006) when dealing with JAR scales.

Another drawback could be the way in which the terms included in the CATA question are selected. If the terms are not chosen appropriately, some drivers of liking or disliking might be missed, although this limitation is inherent to all attribute-based descriptive techniques.

Further research and comparison with other optimization techniques, such as the ideal profile method and JAR scales, would be needed. Apart from comparing ideal products and recommendations for product reformulation, it would be necessary to compare the methodologies in terms of ease of use and time required for completing the task. It would also be necessary to study the minimum number of consumers needed for obtaining reliable product spaces from CATA questions. Considering that the proposed penalty-based approach relies on overall liking scores, working with the usual number of consumers considered in hedonic tests (100-120) (Hough et al., 2006) seems reasonable for obtaining a reliable identification of drivers of liking and directions for product reformulation. However, the number of consumers to be included in the study also depends on the number of segments that are to be identified.
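The two-dummy extension of the PLS dummy approach mentioned above can be sketched as a simple encoding step. The code below only builds the predictor row for one consumer and one sample; fitting the PLS model itself is not shown. The function name `split_dummies` and the column suffixes `_sample_only` and `_ideal_only` are hypothetical names chosen for this illustration.

```python
def split_dummies(product_terms, ideal_terms, attributes):
    """Encode deviation from the ideal with two dummy variables per attribute.

    <attr>_sample_only = 1 when the term describes the sample but not the ideal
    <attr>_ideal_only  = 1 when the term describes the ideal but not the sample
    Both are 0 when sample and ideal agree on the term.
    """
    row = {}
    for attr in attributes:
        in_sample = attr in product_terms
        in_ideal = attr in ideal_terms
        row[f"{attr}_sample_only"] = int(in_sample and not in_ideal)
        row[f"{attr}_ideal_only"] = int(in_ideal and not in_sample)
    return row


# Hypothetical consumer: finds the sample sour and firm, wants it sweet and firm.
row = split_dummies({"sour", "firm"}, {"sweet", "firm"}, ["sour", "sweet", "firm"])
print(row)
# {'sour_sample_only': 1, 'sour_ideal_only': 0,
#  'sweet_sample_only': 0, 'sweet_ideal_only': 1,
#  'firm_sample_only': 0, 'firm_ideal_only': 0}
```

Keeping the two directions in separate columns lets the regression estimate a different penalty for a sample that has too much of an attribute than for one that has too little, which a single "differs from ideal" dummy cannot do.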
Due to the methodological nature of the present work, only 74 consumers were considered for Study 1, which does not compromise its validity.

Another interesting issue that arose from the results is related to differences between consumer segments in the way they describe the evaluated products. Consumers tended to associate odor and flavor terms, such as apple odor and apple flavor, with their preferred samples, indicating that their evaluation of these terms was strongly affected by their preference patterns. Ares et al. (2010) and Lado, Vicente, Manzzioni, and Ares (2010) also reported that consumer segments with different preference patterns differed in the way in which they used some terms of a CATA question to describe samples. In particular, Lado et al. (2010) found that the main differences between consumer segments were observed in the terms related to total odor and flavor intensity. The finding that the use of sensory terms differs depending on the preference pattern is indeed worthy of further investigation. Does preference in some way bias the description? Do consumers idealize the sensory characteristics of the perfect product in their minds? What kind of attributes would be affected by this? Or is it simply a matter of attribute definition? From the present study, and also from Lado et al. (2010), it seems that attributes that are less well defined and that describe typicality might be the ones most affected. Attributes like firm, crispy, soft, mealy or sweet were used similarly by both clusters in this study, while other attributes like apple flavor were the ones that differed. For a consumer, the apple flavor in their preferred or ideal apple might be a particular flavor that they would rate more intensely or tick more frequently in a CATA question, even if another sample has a more intense flavor that does not correspond to the profile they have in mind of how an apple should taste.
This result suggests the need for further research on the selection of terms to be included in a CATA question, and particularly on the validity of consumer evaluations of complex sensory attributes. On the other hand, the fact that consumers' responses to a CATA question might be influenced by their preference patterns makes the inclusion of information about the ideal product interesting for better understanding consumer perception of the sensory characteristics of a set of products and for the identification of their drivers of liking.

Acknowledgements

The authors are indebted to Comisión Sectorial de Investigación Científica (CSIC) from Universidad de la República for financial support and to Comisión Administradora del Mercado Modelo for providing the apple samples used in Study 2. The authors are also grateful to the Spanish Ministry of Science and Innovation for the contract awarded to the author P. Varela (Juan de la Cierva Program). The authors would like to thank Lucía Antúnez, Alejandra Sapolinski and Leticia Vidal for their help with data collection in Study 1.

References

Adams, J., Williams, A., Lancaster, B., & Foley, M. (2007). Advantages and uses of check-all-that-apply response compared to traditional scaling of attributes for salty snacks. In 7th Pangborn Sensory Science Symposium, 12-16 August 2007, Minneapolis, USA.
Andani, Z., Jaeger, S. R., Wakeling, I. N., & MacFie, H. J. H. (2001). Mealiness in apples: Towards a multilingual consumer vocabulary. Journal of Food Science, 66, 872-879.
Ares, G., Barreiro, C., Deliza, R., Giménez, A., & Gámbaro, A. (2010). Application of a check-all-that-apply question to the development of chocolate milk desserts. Journal of Sensory Studies, 25, 67-86.
Ares, G., Barreiro, C., & Giménez, A. (2009). Comparison of attribute liking and JAR scales to evaluate the adequacy of sensory attributes of milk desserts. Journal of Sensory Studies, 24, 664-676.
Ares, G., Gonçalvez, D., Pérez, C., Reolón, G., Segura, N., Lema, P., et al. (2007). Influence of gelatin and starch on the instrumental and sensory texture of stirred yogurt. International Journal of Dairy Technology, 60, 263-269.
Ares, G., Varela, P., Rado, G., & Giménez, A. (2011a). Are consumer profiling techniques equivalent for some product categories? The case of orange-flavoured powdered drinks. International Journal of Food Science and Technology, 46, 1600-1608.
Ares, G., Varela, P., Rado, G., & Giménez, A. (2011b). Identifying ideal products using three different consumer profiling methodologies. Comparison with external preference mapping. Food Quality and Preference, 22, 581-591.
Bayarri, S., Carbonell, I., Barrios, E. X., & Costell, E. (2011). Impact of sensory differences on consumer acceptability of yoghurt and yoghurt-like products. International Dairy Journal, 21, 111-118.
Bécue-Bertaut, M., & Pagès, J. (2004). A principal axes method for comparing contingency tables: MFACT. Computational Statistics & Data Analysis, 45, 481-503.
Costa, A. I. A., & Jongen, W. M. F. (2006). New insights into consumer-led food product development. Trends in Food Science & Technology, 17, 457-465.
Cowden, J., Moore, K., & Vanluer, K. (2009). Application of check-all-that-apply response to identify and optimize attributes important to consumers' ideal product. In 8th Pangborn Sensory Science Symposium, 26-30 July 2009, Florence, Italy.
Daillant-Spinnler, B., MacFie, H. J. H., Betys, P. K., & Hedderley, D. (1996). Relationships between perceived sensory properties and major preference directions of 12 varieties of apples from the Southern Hemisphere. Food Quality and Preference, 7, 113-126.
Dooley, L., Lee, Y. S., & Meullenet, J. F. (2010). The application of check-all-that-apply (CATA) consumer profiling to preference mapping of vanilla ice cream and its comparison to classical external preference mapping. Food Quality and Preference, 21, 394-401.
Epler, S., Chambers, E., IV, & Kemp, K. E. (1998). Hedonic scales are better predictors than just-about-right scales of optimal sweetness in lemonade. Journal of Sensory Studies, 13, 191-197.
Fisher, R. A. (1954). Statistical methods for research workers. Edinburgh: Oliver and Boyd.
Giménez, A., & Ares, G. (2010). Identification of consumers' texture vocabulary of milk desserts and yogurts using a free listing task. In VI Simposio Iberoamericano de Análisis Sensorial SENSIBER 2010, 19-21 August 2010, São Paulo, Brazil.
Grunert, K. G., Baadsgaard, A., Larsen, H. H., & Madsen, T. K. (1996). Market orientation in food and agriculture. Boston, MA: Kluwer.
Hough, G., Wakeling, I., Mucci, A., Chambers, E., IV, Méndez Gallardo, I., et al. (2006). Number of consumers necessary for sensory acceptability tests. Food Quality and Preference, 17, 522-526.
Husson, F., Le Dien, S., & Pagès, J. (2001). Which value can be granted to sensory profiles given by consumers? Methodology and results. Food Quality and Preference, 12, 291-296.
Jaeger, S. R., Andani, Z., Wakeling, I. N., & MacFie, H. J. H. (1998). Consumer preferences for fresh and aged apples: A cross-cultural comparison. Food Quality and Preference, 9, 355-366.
Lado, J., Vicente, E., Manzzioni, A., & Ares, G. (2010). Application of a check-all-that-apply question for the evaluation of strawberry cultivars from a breeding program. Journal of the Science of Food and Agriculture, 90, 2268-2275.
Lagrange, V., & Norback, J. P. (1987). Product optimization and the acceptor set size. Journal of Sensory Studies, 2, 119-136.
Lawless, H. T., & Heymann, H. (2010). Sensory evaluation of food. Principles and practices (2nd ed.). New York: Springer, pp. 227-253.
Lê, S., Josse, J., & Husson, F. (2008). FactoMineR: An R package for multivariate analysis. Journal of Statistical Software, 25(1), 1-18.
Lesniauskas, R. O., & Carr, B. T. (2004). Workshop summary: Data analysis: getting the most out of just-about-right data. Food Quality and Preference, 15, 891-899.
Manoukian, E. B. (1986). Mathematical nonparametric statistics. New York, NY: Gordon & Breach.
Moskowitz, H. R. (1996). Experts versus consumers: A comparison. Journal of Sensory Studies, 11, 19-37.
Moskowitz, H. R. (2001). Sensory directionals for pizza: A deeper analysis. Journal of Sensory Studies, 16, 583-600.
Moskowitz, H. R. (2003). The just-about-right scale - Do panellists know their ideal point? In H. R. Moskowitz, A. M. Muñoz, & M. C. Gacula (Eds.), Viewpoints and controversies in sensory science and consumer product testing. Massachusetts: Food & Nutrition Press.
Moskowitz, H. R., & Hartmann, J. (2008). Consumer research: Creating a solid base for innovative strategies. Trends in Food Science & Technology, 19, 581-589.
Parente, M. E., Manzoni, A. V., & Ares, G. (2011). External preference mapping of commercial antiaging creams based on consumers' responses to a check-all-that-apply question. Journal of Sensory Studies, 26, 158-166.
Plaehn, D. (2012). CATA penalty/reward. Food Quality and Preference, 24, 141-152.
Plaehn, D., & Horne, J. (2008). A regression-based approach for testing significance of just-about-right variable penalties. Food Quality and Preference, 19, 21-32.
Pohjanheimo, T., & Sandell, M. (2009). Explaining the liking for drinking yoghurt: The role of sensory quality, food choice motives, health concern and product information. International Dairy Journal, 19, 459-466.
Popper, R., & Kroll, D. (2005). Just-about-right scales in consumer research. Chemosense, 7, 4-6.
Popper, R., Rosentock, W., Schraidt, M., & Kroll, B. J. (2004). The effect of attribute questions on overall liking ratings. Food Quality and Preference, 15, 853-858.
R Development Core Team (2007). R: A language and environment for statistical computing. 3-900051-07-0. Vienna, Austria: R Foundation for Statistical Computing. Rothman, L. (2007). The use of just-about-right scales in food product development and reformulation. In H. MacFie (Ed.), Consumer-led food product development. Washington, DC: CRC Press. Stewart-Knox, B. J., & Mitchell, P. (2003). What separates the winners from the losers in new food product development? Trends in Food Science & Technology, 14, 5864. Tamime, A. Y., & Robinson, R. K. (1991). Yogur ciencia y tecnologa. Zaragoza, Spain: Acribia, S.A. ten Kleij, F., & Musters, P. A. D. (2003). Text analysis of open-ended survey responses: A complementary method to preference mapping. Food Quality and Preference, 14, 4352. Urban, G. L., & Hauser, J. R. (1993). Marketing of new products (2nd ed.). Hemel Hempstead: Prentice-Hall. van Kleef, E., van Trijp, H. C. M., & Luning, P. (2006). Internal versus external preference analysis: An exploratory study on end-user evaluation. Food Quality and Preference, 17, 387399. Van Trijp, H. C., Punter, P. H., Mickartz, F., & Kruithof, L. (2007). The quest for the ideal product: Comparing different methods and approaches. Food Quality and Preference, 18, 729740. Varela, P., & Ares, G. (2012). Sensory proling, the blurred line between sensory and consumer science. A review of novel methods for product characterization. Food Research International, In press, doi:10.1016/j.foodres.2012.06.037. Worch, T., Dooley, L., Meullenet, J. F., & Punter, P. H. (2010). Comparison of PLS dummy variables and Fishborne method to determine optimal product characteristics from ideal proles. Food Quality and Preference, 21, 10771087. Worch, T. W., L, S., & Punter, P. (2009). How reliable are consumers? Comparison of sensory proles from consumers and experts. Food Quality and Preference, 21, 309318. Worch, T., L, S., Punter, P., & Pags, J. (2012a). 
Extension of the consistency of the data obtained by the Ideal Prole Method: Would the ideal products be more liked than the tested products? Food Quality and Preference, 26, 7480. Worch, T., L, S., Punter, P., & Pags, J. (2012b). Assessment of the consistency of ideal proles according to non-ideal data for IPM. Food Quality and Preference, 24, 99110. Xiong, R., & Meullenet, J. F. (2006). A PLS dummy variable approach to assess the impact of JAR attributes on liking. Food Quality and Preference, 17, 188198. 12 G. Ares et al. / Food Quality and Preference xxx (2013) xxxxxx Please cite this article in press as: Ares, G., et al. Penalty analysis based on CATA questions to identify drivers of liking and directions for product refor- mulation. Food Quality and Preference (2013), http://dx.doi.org/10.1016/j.foodqual.2013.05.014