American Association of Teachers of Spanish and Portuguese is collaborating with JSTOR to digitize, preserve
and extend access to Hispania.
Jens H. Clegg
Indiana University-Purdue University Fort Wayne, USA
Abstract: The teaching of the Spanish noun gender system to students is based on a set of generalizations
that the last phoneme, or sound, of a noun is an excellent predictor of the gender of that noun (Bull 1965).
These generalized norms have been refined over the years and can be found in most textbooks. The norms
are taught to students who then apply them to nouns and can deduce the gender of the noun. However,
students still have difficulty determining the gender of nouns accurately, a fact which may indicate that
the rules for gender assignment are inadequate. This article examines the pros and cons of the current set
of generalizations and applies them to a set of highly frequent Spanish nouns. As a result of this analysis,
a newly refined set of generalizations based on frequent nouns is proposed. These new generalizations
may be easier for students to learn and correctly apply to deduce the gender of Spanish nouns.
Introduction
In Spanish, all nouns have a specific grammatical gender that is an integral part of the word.
The specific gender that corresponds to each noun may, or may not, have a direct relationship
to the gender of the referent. However, the designation "gender" is used to describe
this grammatical function. The gender of nouns is important since Spanish morphosyntactic
rules mandate that all descriptors of the noun agree in gender with that noun. This creates a
challenge when an English-speaking student learns Spanish. In English, nouns have no specific
grammatical gender and, as a result, the student has to learn a new and unfamiliar grammatical
concept. This can result in a developmental error since it is a new and foreign concept for second
language learners. There are two potential ways for students to learn the correct gender for each
noun. The first is the memorization of the specific gender as part of the individual noun. This
approach requires the student to store the information for the gender of each noun together with
the noun in their memory.
The second method is a set of generalized norms that allow the student to predict the
gender of the noun by applying the norms. Bull (1965) studied the patterns for gender of all of
the Spanish nouns in a Spanish/English dictionary. Bull's purpose was to discover generalized
noun gender norms that could be taught to students instead of using rote memorization to learn
the gender of each individual noun. He focused his study on the terminal, or final, grapheme
(letter) and in some cases groups of graphemes of the words. Bull found that words that end
in -a, -d, -ción, -sis, and -itis will be feminine in gender 98% of the time, whereas words ending
in any other grapheme (-o, -r, -n, -s, etc.) will be assigned a masculine gender 96% of
the time. In this group of other graphemes associated with masculine gender are those which
would be atypical or unusual for Spanish phonology, such as words ending in -b (el club 'the
club'), or -j (el reloj 'the watch'). Bull found that these atypical endings are associated with
masculine gender and those findings have been confirmed in other studies (Clegg 2010; Clegg
and Waltermire 2009). Bull claims that students who learn these norms and follow them will
Bergen (1978) further studied and confirmed the findings of Bull (1965), refining them by
adding four additional generalizations. Bergen found that the endings -umbre (la servidumbre
'servitude'), -ie (la especie 'the species'), and -z (la luz 'the light') should be associated with
feminine gender. Bergen also suggested that nouns of Greek origin ending in -ma (el problema
'the problem') and -ta (el planeta 'the planet') should be associated with masculine gender.
Bergen then coined the term LONERS as an acronym for endings associated with masculine
gender. In his study, Bergen (1978) also began to refer to the endings as phonemes or terminal
phonemes, hereafter TPs, instead of the graphemes that Bull (1965) had used.
Teschner and Russell (1984) looked in depth at the findings of Bull (1965) and the refinements
of Bergen (1978) and found that there were some inaccuracies. Teschner and Russell used
a much larger dictionary (89,000 words) and carefully excluded words that were ambivalent for
gender. They first point out that students are unaware of words of Greek origin and that, further,
there are many words ending in -ma that have feminine gender. They also comment that the new
refinements -umbre and -ie are low productivity forms and are not frequent enough to justify the
creation of separate norms. Based on the larger dictionary, Teschner and Russell conclude that
the endings -z, -n, and -s are in fact indeterminate for gender since they only slightly favor one
gender or the other. They do, however, confirm that -a, -d, and -ión are feminine endings and
that -o, -r, and -e are masculine endings. Teschner and Russell (1984) simplified the form to
-ión instead of the -ción proposed by Bull (1965) because their results indicated that regardless
of the letter before -ión (-ción, -gión, -sión, etc.) the form is overwhelmingly feminine (99.4%).
Based on the findings of Bull (1965), Bergen (1978), and Teschner and Russell (1984),
it appears that there is a clear correlation between the terminal grapheme/TP, or group of
graphemes/TPs, and the gender of Spanish nouns. Using those findings, most Spanish textbooks
include a section on Spanish noun gender and teach the students a variation of this norms-based
system. For a few examples, see Iguina and Dozier (2008: 33-35), Castells et al. (2010:
39-40), or Blanco and Tocaimaza-Hatch (2007: 378). Each textbook is a little different, but all
are based primarily on the findings of Bull (1965). Table 1 summarizes the norms.
This system of norms has been used for over forty years now with little change. The purpose
of the norms based approach was to teach students a simpler way to learn the Spanish noun
gender system. The question then is: how is it working? Colleagues will tell you that, in their
experience, students are still committing many errors and routinely misapplying the norms.
One simple reason is that teachers love to put irregular forms on the exam. Thus, students who
apply the norms without memorizing the exceptions will incorrectly identify the gender of the
exceptional nouns on tests and may lose faith in the system. However, there are larger problems
with the system that can be easily explained and are, in fact, a result of misunderstanding the
norms based system.
To implement the norms based system correctly, students must understand how the system
works and in what situations the system applies. The norms based system only applies to
inanimate nouns. Animate nouns, or nouns with a biological gender, are not covered by the
norms. These animate nouns, such as el hombre 'the man' or la mujer 'the woman' have a
specific biological gender that is not necessarily related to the TP. If the students applied the
TP based norms hombre would turn out masculine, as it should, but so would mujer based on
the masculine TP /r/. Also included in this group of animate nouns, to which the norms do not
apply, are the terms used for professions such as maestro/maestra 'male/female teacher'. These
terms referring to profession can be quite complicated (Flores Epperson and Ranson 2010) and
must be learned separately from the norms. It is important that students understand that these
norms only apply to inanimate nouns and not animate nouns.
There is also a second group of nouns to which these norms do not apply: epicene, or
ambivalent gender nouns. For example, with el capital 'investment capital' vs. la capital
'capital city', the gender of the noun changes based on meaning. Other examples include el
policía 'the policeman' vs. la policía 'the police force', and el cura 'the priest' vs. la cura 'the
cure'. Another group of nouns to which these norms do not apply are nouns that begin with
accented /a/ such as el/las agua/s 'the water/s' and el/las alma/s 'the soul/s'. These nouns use
the masculine article in the singular and the feminine article in the plural. This change is made
for the phonemic reason of avoiding the sequence la plus stressed [a]. In both of these cases,
the TP based gender norms do not apply and students must simply memorize these forms. The
challenge is that these forms may not always be listed in textbooks or if they are listed there
endings -ión, and -umbre, what are they? Bull (1965) clearly labels his norms as "final letter"
(109). Subsequent researchers changed them to TPs even though they include groups of letters
like -ma and -ta. This causes great confusion for the student. Is camión 'truck' -ión or -n? If
the student chooses -ión, they will get the wrong gender. Another example of this is -s vs. -z.
Students are taught that -z is pronounced /s/, but are the norms grapheme based or phoneme
based? For the noun nariz, is the ending -s or -z? If they apply the TP based norms, they will
incorrectly use masculine gender. This lack of clarity in the basis of the norms and also the
length of the endings may cause a great deal of confusion for students.
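The ambiguity described above can be made concrete with a short sketch. The Python below is illustrative only and is not part of the original study: the function name `predict`, its two switches, and the reduced ending lists are assumptions chosen to cover the examples in this section. It suggests how the same noun can receive a different gender depending on whether the student matches a group of letters such as -ion and whether final -z is treated as the sound /s/.

```python
import unicodedata

# Reduced, illustrative ending lists (assumptions made for this sketch).
FEMININE_LETTERS = ("a", "d", "z")
FEMININE_GROUPS = ("ion", "umbre")


def strip_accents(word: str) -> str:
    """Return the word in lowercase without diacritics (camión -> camion)."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def predict(noun: str, match_groups: bool, z_sounds_like_s: bool) -> str:
    """Predict gender from the ending under one reading of the norms."""
    word = strip_accents(noun)
    # Reading 1: letter groups such as -ion take priority over single letters.
    if match_groups and word.endswith(FEMININE_GROUPS):
        return "feminine"
    last = word[-1]
    # Reading 2: a phoneme-based student hears final -z as /s/.
    if z_sounds_like_s and last == "z":
        last = "s"
    return "feminine" if last in FEMININE_LETTERS else "masculine"


# camión is masculine; matching the group -ion predicts the wrong gender.
print(predict("camión", match_groups=True, z_sounds_like_s=False))
print(predict("camión", match_groups=False, z_sounds_like_s=False))
# nariz is feminine; treating -z as /s/ predicts the wrong gender.
print(predict("nariz", match_groups=True, z_sounds_like_s=False))
print(predict("nariz", match_groups=True, z_sounds_like_s=True))
```

Under these assumptions, the prediction for camión and for nariz flips with the reading the student happens to adopt, which is the confusion at issue here.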
A further misunderstanding can be found by returning to the findings of Teschner and Russell
(1984). These authors point out that students are incapable of identifying words of Greek
origin and thus erroneously apply the -ma and -ta norm to all words with those endings. As a
result, the common exception el problema 'the problem' is correct as is el planeta 'the planet'.
However, the feminine words firma 'signature', pluma 'pen', crema 'cream', bicicleta 'bicycle',
and many others would all be incorrect. Are there enough examples of masculine words ending
in -ma and -ta to justify this norm?
The category of atypical TPs is another category that causes some confusion. Spanish
phonotactics prefers that words end in the following phonemes: /a, e, o, l, n, r, s, d, θ/. There
are exceptions to this tendency, but they are few and are frequently borrowings from another
language like el club 'the club' mentioned previously. The norms state that atypical TPs, those
other than the nine cited above, are associated with masculine gender. The issue here is what the
students know. Students are generally not trained linguists and are not aware of what is typical
and what is atypical. The English language provides no help here since English phonotactics
has no restrictions on its TPs. This fact makes this norm difficult for students to understand
and apply correctly.
The final challenge with the present norms is the data upon which they are based. The
norms are formulated based on lists of nouns from massive dictionaries. In Bull's (1965)
case, he used 38,000 nouns as a basis for the study. Teschner and Russell (1984) used all of
the nouns from the 1956 edition of El diccionario de la lengua española (RAE). These large
numbers of nouns far exceed the number known and used by native speakers, let alone language
learners. Does the use of all of the possible nouns, including those seldom known or used,
change how the norms turn out? In their revision of the norms, Teschner and Russell (1984:
136) removed the TP -e from the masculine category and called it indeterminate. The authors
recognized that there were many -e ending TP nouns that were masculine but found that there
were a disproportionate number of highly frequent -e nouns that were feminine. This indicates
that frequent nouns may have different TP gender patterns when considered separately from
Given all the ambiguity and potential confusion and error caused by the TP based norms
in use today, a fresh look at the situation is in order. The present study is an effort to improve
Methodology
To determine the effect of the relative frequency of words on the TP based norms, the
frequent nouns of the Spanish language were identified using a frequency dictionary (Davies
2006). This dictionary is based on a corpus of twenty million words that represent oral and
written Spanish from all social classes, genders, and ages from the majority of the Spanish
speaking world. The frequency dictionary contains a list of the 5,000 most common words in
Spanish. Davies (2006: vii) affirms that, for English, on average only 5,000 words comprise
95% of written text and that only 1,000 words account for 85% of oral speech. This would
indicate that the nouns found in the 5,000 most frequent words may represent the majority of
what students should learn. Because of this, these frequent nouns would also be the perfect set
In the top 5,000 words, there are 2,507 nouns, just over half of the words in the top 5,000.
Of those 2,507 words, sixteen are ambivalent or epicene nouns and 279 are animate (biological)
nouns. Since the norms do not apply to animate or ambivalent nouns, these 295 words were
excluded from the analysis. A copy of the excluded animate nouns, epicene/ambivalent nouns,
and nouns that begin with stressed [a] can be found in Appendices A and B of this article. This
leaves 2,212 nouns on which to test the norms.
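The exclusion arithmetic can be checked in a few lines. This is only a worked illustration of the counts reported in this section; the variable names are assumptions introduced for the example.

```python
# Counts reported in the Methodology section.
top_words = 5000
nouns_in_top_words = 2507
ambivalent_nouns = 16
animate_nouns = 279

# Nouns outside the scope of the norms, and the set left for testing.
excluded = ambivalent_nouns + animate_nouns
tested = nouns_in_top_words - excluded

print(excluded)  # 295
print(tested)    # 2212
# Share of the top 5,000 words that are nouns ("just over half").
print(round(100 * nouns_in_top_words / top_words, 1))  # 50.1
```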
Each of the nouns was analyzed using the norms summarized in Table 1, cited previously,
to determine the accuracy of the present norms for frequent nouns. The only difference being
that all nouns ending in -ma and -ta were analyzed together, not just those of Greek origin. This
was done because, as was discussed previously, students would not be able to tell the difference
Results
Table 2 below shows the results for the TPs associated with masculine gender. Overall,
the norms are only 75.4% accurate, which is a far cry from the 97% accuracy rate predicted by
Bull (1965). We also note that there are 85 words out of 1,006 that are exceptions to the norm.
For the prototypical member of the category, -o, there was only one exception, la mano 'the
hand' which is a classic example taught in most classes. The endings -n, -r, and -s were also
very accurate with 97%, 97%, and 100% accuracy rates respectively.
The endings that were less than 90% accurate included the atypicals, -l, -e, -ma, and -ta. For
the atypicals, there were five masculine nouns ending in the phonemes /x/, /u/, /b/, and /i/:
reloj 'watch', espíritu 'spirit', club 'club', taxi 'taxi', and whiskey 'whiskey'.
There were two feminine nouns tribu 'tribe' and ley 'law'. The accuracy rate for the atypical
ending group was 71%, but there are only six total examples. It is clear that this category has
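Table 2. Results for the endings associated with masculine gender
Ending       Masculine nouns   Accuracy   Exceptions
-l                  37           78.7%        10
-o                 684           99.9%         1
-n                  44           97.8%         1
-e                 132           79.0%        35
-r                  73           97.3%         2
-s                  15          100.0%         0
Atypicals            5           71.0%         2
-ma                 15           48.5%        16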
For the ending -l, there were 37 masculine nouns and 10 feminine nouns for a 78.7%
accuracy rate. This is not as high as 90%, but it is still a good generalization for students. A
good generalization is one that would allow a student to get at least 70% of the forms correct
by applying the norm. The ending -e which was deemed indeterminate by Teschner and Russell
(1984) has 35 exceptions and a 79% accuracy rate. Again, the rate is not 90%, but it is still a
good generalization.
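The accuracy rates quoted here follow from dividing the nouns that conform to a norm by all nouns with that ending. The sketch below is an illustration under that reading, not the author's own procedure; the function names and the `threshold` parameter are assumptions, with the 70% cut-off taken from the definition of a good generalization given above.

```python
def accuracy_rate(conforming: int, exceptions: int) -> float:
    """Percentage of nouns with an ending whose gender matches the norm."""
    return round(100 * conforming / (conforming + exceptions), 1)


def is_good_generalization(conforming: int, exceptions: int,
                           threshold: float = 70.0) -> bool:
    """Apply the 70% criterion for a 'good generalization'."""
    return accuracy_rate(conforming, exceptions) >= threshold


# -l: 37 masculine nouns, 10 feminine exceptions
print(accuracy_rate(37, 10))            # 78.7
# -e: 132 masculine nouns, 35 feminine exceptions
print(accuracy_rate(132, 35))           # 79.0
print(is_good_generalization(132, 35))  # True
```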
The last two endings, -ma and -ta, had very low accuracy rates, with 48% and 5% respectively.
For -ma there were fifteen masculine nouns:
The fact that there are more feminine words in -ma than masculine is evidence that this norm
is not a good generalization for students. The ending -ta is far less accurate with only one
masculine noun, planeta 'planet', and nineteen feminine nouns:
pista, derrota, conquista, peseta, etiqueta, bicicleta, manta, alerta, maleta, siesta, apuesta,
chaqueta, camiseta, imprenta, órbita, autopista, acta, pinta, grieta.
Clearly, this generalization is not accurate and should be removed from the norms.
Now we turn to the analysis of the endings, or TPs, that are associated with feminine gender
found in Table 3. Overall, the norms do achieve the promised 90% accuracy, but it is lower than
what was predicted by Bull (1965). Again, the prototypical ending -a is highly accurate with only
four exceptions, the masculine nouns el día 'the day', el mapa 'the map', el mediodía 'midday',
and el tranvía 'the tram', indicating that this is a good generalization for students.
In the extraction of the frequent -d ending nouns, it was noted that a large proportion of
the words, in fact, ended in -ud and -ad, as in ciudad 'city' and actitud 'attitude'. Morin (2006)
points out that most nouns ending in /d/ are actually part of the morphemes -tad or -tud, which
are categorically feminine. To ensure that this factor did not affect the analysis, words ending
in -d were analyzed separately from words ending in -ad and -ud, as can be seen in Table 3.
Table 3. Results for the endings associated with feminine gender
Ending       Feminine nouns    Accuracy   Exceptions
-a                 661           99.4%         4
-d                   4          100.0%         0
-is                  5           55.6%         4
-z                  17           81.0%         4
-umbre               4          100.0%         0
What is surprising is that there are only four nouns total for the ending -d when the -ad
and -ud endings are removed. These four nouns are pared 'wall', red 'web', sed 'thirst', and
merced 'mercy', all of which are feminine. Though there are few tokens, the norm is 100%
accurate with no exceptions. For -ud and -ad, there were 117 nouns, all of which are feminine.
These findings indicate that whether a noun ends in -ud, -ad, or -d, it is associated with feminine
gender. This shows that there is no need to distinguish -ud and -ad from -d and that students
nearly 50/50 masculine and feminine. There were five feminine nouns, crisis 'crisis', tesis
'thesis', hipótesis 'hypothesis', dosis 'dosage', and síntesis 'synthesis', and four masculine
nouns, análisis 'analysis', énfasis 'emphasis', paréntesis 'parentheses', and tenis 'tennis shoes'.
For this ending, the generalization is not accurate and will not help students.
As was predicted by Bergen (1978), the ending -z is associated with feminine gender with
an 81% accuracy rate. There were four exceptions, the masculine nouns matiz 'nuance', arroz
'rice', lápiz 'pencil', and maíz 'corn'. Although not at the 90% level, this generalization can
still be helpful to students. The ending -umbre was highly accurate with all examples being
feminine. There were only four words, but the generalization is still a good one.
When the results for the norms described in Table 1 are combined, masculine and feminine,
we have an overall accuracy rate of 83% with a total of 98 exceptions that students must
memorize. Thus, if students memorize and correctly apply the norms in Table 1, they will be
correct on 2,116 out of 2,215 nouns and only need to memorize 99 exceptions. However, we
still have the potential problem of misunderstanding or inaccurately applying the norms.
To resolve this problem, I propose a revised set of norms found in Table 4. The revised
norms are completely grapheme, or letter, based. Students understand letters and it may be
easier for them to apply these new norms since they are consistent. There are still two groups
of graphemes, -ión and -umbre, which are both easily identified by students. The two acronyms
L-O-N-E-R-S and A-D-Z-ION-UMBRE, as seen in Table 4, represent the different endings.
Based on the results of the analysis, some generalizations were eliminated. The endings -ma
and -ta for words of Greek origin were clearly not accurate, and were hard to apply, so they
were eliminated. All words ending in -a are now associated with feminine gender. Students
will have to memorize the common exceptions like problema 'problem' and then assume that
all other words ending in -a are feminine. On the feminine side the ending -is was eliminated
since it is a low productivity form and not very accurate. In the revised norms, all words ending
in -s are associated with masculine gender.
The only other group that was modified was the atypical group. This group is now labeled
as "anything else". Students can easily learn the two acronyms, LONERS and ADZIONUMBRE
and assume that anything not fitting into those norms is associated with masculine gender. To
simplify even further, if students memorize only the acronym ADZIONUMBRE they can then
correctly assume that all other endings are associated with masculine nouns. This is a far easier
To test the accuracy rate of the proposed new norms, they were applied to the same 2,212
frequent nouns and new accuracy rates and lists of exceptions were created. Table 5 shows
the results for the masculine endings. Overall, the accuracy ratings for the masculine endings
are eleven percentage points higher with the proposed new norms and there are thirty fewer
exceptions. The revised norms for the masculine endings are a better generalization.
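Table 5. Accuracy rates for the revised masculine endings
Ending          Masculine nouns   Accuracy   Exceptions
-l                     37           78.7%        10
-o                    684           99.9%         1
-n                     44           97.8%         1
-e                    132           79.0%        35
-r                     73           97.3%         2
-s                     19           79.2%         5
Anything else           5           71.0%         2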
Table 6 has the accuracy rates for the revised feminine endings. Overall, the accuracy rate
is five points higher, but the number of exceptions has risen by twelve. This rise can be seen in
the -a endings and is due to the seventeen -ma and -ta words of Greek origin that are now in
that category. This set of endings is a good generalization for students and the higher number
of exceptions is acceptable due to the increase in accuracy for both the masculine and feminine
results.
Table 6. Accuracy rates for the revised feminine endings
Ending       Feminine nouns    Accuracy   Exceptions
-a                 696           97.1%        20
-d                 121          100.0%         0
-z                  17           81.0%         4
-umbre               4          100.0%         0
Combining both masculine and feminine results for the revised norms overall, there is
an 8% increase in accuracy as well as a decrease of 18 exceptions that students would have to
learn. Combining the better accuracy, lower number of exceptions, and the much simpler and
easier to apply norms, the proposed new norms in Table 4 appear to be a better way to teach
Spanish noun gender norms to students. If the students learn the exceptions found in Appendix
C and then apply the revised norms there should be an increase in accuracy and in the ease of
application.
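As a rough illustration of how the revised norms might be applied, the following Python sketch checks a small exception list first and then falls back on the rule that A-D-Z-ION-UMBRE endings are feminine and anything else is masculine. It is a minimal sketch, not the procedure used in this study: the names `predict_gender`, `FEMININE_ENDINGS`, and `EXCEPTIONS` are assumptions, the exception list holds only a handful of the nouns discussed in this article rather than the full list in Appendix C, and the rule is meant for inanimate nouns only.

```python
import unicodedata

# A-D-Z-ION-UMBRE: endings associated with feminine gender in the revised norms.
FEMININE_ENDINGS = ("a", "d", "z", "ion", "umbre")

# Illustrative subset of memorized exceptions (accents removed for lookup).
EXCEPTIONS = {
    "problema": "masculine",
    "planeta": "masculine",
    "dia": "masculine",
    "camion": "masculine",
    "lapiz": "masculine",
    "mano": "feminine",
    "noche": "feminine",
}


def normalize(noun: str) -> str:
    """Lowercase the noun and remove diacritics so that -ión matches 'ion'."""
    decomposed = unicodedata.normalize("NFD", noun.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def predict_gender(noun: str) -> str:
    """Predict the gender of an inanimate noun under the revised norms."""
    word = normalize(noun)
    if word in EXCEPTIONS:  # memorized exceptions take priority
        return EXCEPTIONS[word]
    if word.endswith(FEMININE_ENDINGS):
        return "feminine"
    return "masculine"  # L-O-N-E-R-S and anything else


for noun in ("ciudad", "luz", "nación", "servidumbre", "reloj", "día", "mano"):
    print(noun, predict_gender(noun))
```

Because the exception list is checked before the endings, a noun such as día is handled by memorization, while a noun missing from the list, such as crisis, would still be predicted as masculine by the -s norm.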
Conclusions
The norms for learning noun gender that have been in place over the last forty years have
helped many students to learn the Spanish noun gender system reasonably well. However, the
potential misunderstandings caused by the nature of the norms may cause complications as well
as potentially higher error rates for students. The use of modern technology to refine the norms
and quantifiably base them on frequent forms allows for greater norm accuracy and relevancy
to modern speech. Changing the nature of the norms to a graphemic- (or letter-) based system
will make the norms more accurate as well as potentially more easily applied and understood
by students. Basing the revised norms on frequent nouns and eliminating inaccurate endings
also increases their accuracy and decreases exceptions. The use of the acronyms LONERS
and ADZIONUMBRE for the masculine and feminine endings, respectively, is a good way to
help students memorize the norms. In fact, the norms can be simplified to the statement that
ADZIONUMBRE is feminine and all other endings are masculine. Teachers and textbook
authors need to be aware of the nature of the norms and also be sure to provide students with
the necessary lists of exceptions and the training required to use them. If students learn these
acronyms and memorize the exceptions found in Appendix C, they have the potential to be more
accurate with gender assignment in Spanish. In the future, research studies that test whether
these norms are more easily learned and accurately applied by students are needed.
WORKS CITED
Bergen, John J. (1978). "A Simplified Approach for Teaching the Gender of Spanish Nouns." Hispania
61.4: 865-76. Print.
Castells, Matilde, Elizabeth Guzmán, Paloma Lapuerta, and Judith Liskin-Gasparro. (2010). Mosaicos:
Spanish as a World Language. Upper Saddle River, NJ: Prentice Hall. Print.
Clegg, Jens H. (2010). "Native Spanish Speaker Intuition in Noun Gender Assignment." Language Design:
Journal of Theoretical and Experimental Linguistics 12: 5-18. Web. 18 Apr. 2010.
Clegg, Jens H., and Mark Waltermire. (2009). "Gender Assignment to English-origin Nouns in the Spanish
of the Southwestern United States." International Journal of the Linguistic Association of the
Southwest 28.1: 1-17. Print.
Davies, Mark. (2006). A Frequency Dictionary of Spanish. New York: Routledge. Print.
Flores Epperson, Belén, and Diana Ranson. (2010). "¿La química, la químico o el químico? Cómo llamar
a una mujer profesional." Hispania 93.3: 399-412. Print.
Iguina, Zulma, and Eleanor Dozier. (2008). Manual de gramática: Grammar reference for students of
Spanish. Boston: Thomson-Heinle. Print.
Morin, Regina. (2006). "Spanish Gender Assignment in Computer and Internet Related Loan Words."
Journal of Italian Linguistics 18.2: 143-69. Print.
Real Academia Española. El diccionario de la lengua española. (1956). 18th ed. Madrid: Espasa-Calpe.
Print.
Teschner, Richard V., and William M. Russell. (1984). "The Gender Patterns of Spanish Nouns: An Inverse
APPENDICES
Frequency Frequency
Rank Gender Noun Rank Gender Noun
1460 La gallina
Appendix B: Ambiguous (Epicene) and Nouns that Begin with Stressed [a]
Ambiguous Nouns
39 El/Las agua/s
-a
4 El día
29 El problema
75 El tema
134 El sistema
149 El programa
408 El idioma
680 El clima
827 El poema
901 El planeta
1006 El mapa
1034 El síntoma
1097 El drama
1156 El fantasma
1166 El esquema
1187 El panorama
1255 El mediodía
1896 El lema
1909 El aroma
2234 El tranvía
2499 El enigma
-ión
1188 El camión
-z
1067 El matiz
1519 El arroz
1605 El lápiz
1612 El maíz
-l
345 La piel
578 La señal
778 La moral
894 La cárcel
1247 La sal
445 La miel
1899 La central
2010 La catedral
2418 La inicial
2492 La cal
-o
21 La mano
-n
49 La razón
-e
7 La parte
25 La gente
40 La noche
57 La tarde
59 La calle
64 La frente
76 La clase
116 La especie
129 La muerte
162 La base
172 La serie
212 La sangre
265 La suerte
295 La carne
305 La fuente
310 La frase
326 La corriente
392 La mente
429 La fe
562 La leche
619 La superficie
852 La clave
984 La llave
993 La torre
1077 La fase
1133 La nieve
1211 La nave
1320 La fiebre
1412 La índole
1505 La catástrofe
1873 La variante
1949 La peste
2108 La sede
2400 La pirámide
2490 La higiene
-r
365 La labor
375 La flor
-s
441 La crisis
962 La tesis
1164 La hipótesis
1248 La dosis
1321 La síntesis
113 La ley
1434 La tribu